Yin Huai created SPARK-10167:
--------------------------------

             Summary: We need to explicitly use transformDown when rewriting 
aggregation results
                 Key: SPARK-10167
                 URL: https://issues.apache.org/jira/browse/SPARK-10167
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
            Reporter: Yin Huai
            Priority: Minor


Right now, we explicitly use transformDown at 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala#L105
 and 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala#L130.
 We also need to be explicit about using transformDown at 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala#L300
 and 
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/aggregate/utils.scala#L334
 (right now, transform means transformDown). The reason we need 
transformDown is that when we rewrite final aggregate results, we should 
always match aggregate functions first. If we used transformUp, we could 
match a grouping expression first whenever grouping expressions appear as 
children of aggregate functions.
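
To illustrate the ordering issue, here is a minimal sketch of a
Catalyst-like expression tree (the classes Expr, Attr, Sum, and ResultRef
are hypothetical stand-ins, not Spark's actual API). The rewrite rule
replaces a known expression with a reference to its computed result, the
way the final-result rewrite looks up aggregate expressions. With
transformDown the whole aggregate function matches first; with transformUp
the grouping expression inside it is rewritten first, so the aggregate
expression no longer matches the lookup:

```scala
// Hypothetical mini expression tree sketching transformDown vs transformUp.
sealed trait Expr {
  def children: Seq[Expr]
  def withChildren(newChildren: Seq[Expr]): Expr

  // Pre-order: apply the rule to this node first, then recurse into children.
  def transformDown(rule: PartialFunction[Expr, Expr]): Expr = {
    val afterRule = rule.applyOrElse(this, identity[Expr])
    afterRule.withChildren(afterRule.children.map(_.transformDown(rule)))
  }

  // Post-order: recurse into children first, then apply the rule to this node.
  def transformUp(rule: PartialFunction[Expr, Expr]): Expr = {
    val withNewChildren = withChildren(children.map(_.transformUp(rule)))
    rule.applyOrElse(withNewChildren, identity[Expr])
  }
}

case class Attr(name: String) extends Expr {
  def children: Seq[Expr] = Nil
  def withChildren(c: Seq[Expr]): Expr = this
}
case class Sum(child: Expr) extends Expr {
  def children: Seq[Expr] = Seq(child)
  def withChildren(c: Seq[Expr]): Expr = Sum(c.head)
}
case class ResultRef(name: String) extends Expr {
  def children: Seq[Expr] = Nil
  def withChildren(c: Seq[Expr]): Expr = this
}

object TransformDemo {
  val key = Attr("key")

  // Rewrite by equality lookup, as when mapping each original aggregate
  // function / grouping expression to its result attribute.
  val resultMap: Map[Expr, Expr] = Map(
    Sum(key) -> ResultRef("sum_result"),
    key      -> ResultRef("grouping_key")
  )
  val rewrite: PartialFunction[Expr, Expr] = {
    case e if resultMap.contains(e) => resultMap(e)
  }

  def main(args: Array[String]): Unit = {
    // SUM(key): the grouping expression is a child of the aggregate function.
    val expr = Sum(key)

    // transformDown matches the aggregate function first, as intended.
    assert(expr.transformDown(rewrite) == ResultRef("sum_result"))

    // transformUp rewrites the grouping expression first; the resulting
    // Sum(ResultRef("grouping_key")) no longer matches the lookup, so the
    // aggregate function is never replaced.
    assert(expr.transformUp(rewrite) == Sum(ResultRef("grouping_key")))
  }
}
```

This is why the rewrite sites must stay pinned to transformDown rather
than rely on transform's current default.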

There is nothing wrong with master right now. We just want to make sure we 
will not introduce bugs if the behavior of transform ever changes from 
transformDown to transformUp, which I think is very unlikely (but just in 
case).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
