GitHub user aray commented on a diff in the pull request:

    https://github.com/apache/spark/pull/9429#discussion_r43811041
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/Analyzer.scala ---
    @@ -205,45 +205,30 @@ class Analyzer(
             GroupingSets(bitmasks(a), a.groupByExprs, a.child, a.aggregations)
           case x: GroupingSets =>
             val gid = AttributeReference(VirtualColumn.groupingIdName, IntegerType, false)()
    -        // We will insert another Projection if the GROUP BY keys contains the
    -        // non-attribute expressions. And the top operators can references those
    -        // expressions by its alias.
    -        // e.g. SELECT key%5 as c1 FROM src GROUP BY key%5 ==>
    -        //      SELECT a as c1 FROM (SELECT key%5 AS a FROM src) GROUP BY a
    -
    -        // find all of the non-attribute expressions in the GROUP BY keys
    -        val nonAttributeGroupByExpressions = new ArrayBuffer[Alias]()
    -
    -        // The pair of (the original GROUP BY key, associated attribute)
    -        val groupByExprPairs = x.groupByExprs.map(_ match {
    -          case e: NamedExpression => (e, e.toAttribute)
    -          case other => {
    -            val alias = Alias(other, other.toString)()
    -            nonAttributeGroupByExpressions += alias // add the non-attributes expression alias
    -            (other, alias.toAttribute)
    -          }
    -        })
     
    -        // substitute the non-attribute expressions for aggregations.
    -        val aggregation = x.aggregations.map(expr => expr.transformDown {
    -          case e => groupByExprPairs.find(_._1.semanticEquals(e)).map(_._2).getOrElse(e)
    -        }.asInstanceOf[NamedExpression])
    +        val aliasedGroupByExprPairs = x.groupByExprs.map{
    +          case a @ Alias(expr, _) => (expr, a)
    +          case expr: NamedExpression => (expr, Alias(expr, expr.name)())
    --- End diff --
    
    I believe I need a new Alias here because we really have two versions of
    the expression: the original, and the version manipulated by the Generator,
    with nulls inserted per the bitmask. In the Aggregate's 'aggregation' list,
    the grouping columns need to refer to the manipulated version, while the
    'real' aggregates need to refer to the original version.
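
To illustrate the distinction, here is a minimal sketch in plain Scala with no Spark dependency. It is not the Analyzer code from this PR. `Row`, `Expanded`, `expand`, and `sumAByGroup` are hypothetical names, and the bitmask convention (a set bit keeps the column) is an assumption made for the example and may not match Spark's. It only shows why a grouping column and an aggregate over the same input column have to read different versions of it.

```scala
// Hypothetical stand-in for one input row; not a Spark type.
final case class Row(a: Int, b: Int)

// One expanded row: the original columns, plus copies of the grouping
// columns with nulls (None) inserted according to the bitmask.
final case class Expanded(orig: Row, groupA: Option[Int], groupB: Option[Int], gid: Int)

object GroupingSetsSketch {
  // Assumed convention: bit 0 set keeps column a, bit 1 set keeps column b.
  def expand(rows: Seq[Row], bitmasks: Seq[Int]): Seq[Expanded] =
    for {
      row  <- rows
      mask <- bitmasks
    } yield Expanded(
      orig   = row,
      groupA = if ((mask & 1) != 0) Some(row.a) else None,
      groupB = if ((mask & 2) != 0) Some(row.b) else None,
      gid    = mask
    )

  // Group on the manipulated copies, but sum the ORIGINAL value of a.
  def sumAByGroup(rows: Seq[Row], bitmasks: Seq[Int]): Map[(Option[Int], Option[Int], Int), Int] =
    expand(rows, bitmasks)
      .groupBy(e => (e.groupA, e.groupB, e.gid))
      .map { case (key, es) => key -> es.map(_.orig.a).sum }
}
```

With rows `Row(1, 10)` and `Row(2, 10)` and bitmasks `Seq(3, 0)`, the grand-total group has `None` for both grouping columns, yet its sum over `a` is still computed from the original values. If the aggregate read the nulled copy instead, that total would be lost.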

