cloud-fan commented on PR #41817:
URL: https://github.com/apache/spark/pull/41817#issuecomment-1619046100

   It seems the problem is not well understood. Look at the classdoc of the
LCA (lateral column alias) rule:
   ```
    * ** Example for Aggregate:
    * Before rewrite:
    * Aggregate [dept#14] [dept#14 AS a#12, 'a + 1, avg(salary#16) AS b#13, 'b + avg(bonus#17)]
    * +- Child [dept#14,name#15,salary#16,bonus#17]
    *
    * After phase 1:
    * Aggregate [dept#14] [dept#14 AS a#12, lca(a) + 1, avg(salary#16) AS b#13, lca(b) + avg(bonus#17)]
    * +- Child [dept#14,name#15,salary#16,bonus#17]
    *
    * After phase 2:
    * Project [dept#14 AS a#12, lca(a) + 1, avg(salary)#26 AS b#13, lca(b) + avg(bonus)#27]
    * +- Aggregate [dept#14] [avg(salary#16) AS avg(salary)#26, avg(bonus#17) AS avg(bonus)#27,dept#14]
    *    +- Child [dept#14,name#15,salary#16,bonus#17]
    *
    * Now the problem falls back to the lateral alias resolution in Project.
   ```
   
   The key is that the rule leaves the LCA references in the newly added
Project above the Aggregate. I think LCA in grouping expressions needs a very
different solution: we probably need to add a Project under the Aggregate, so
that the LCA is evaluated before the grouping expressions are executed.
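
   To illustrate the ordering issue only, here is a toy sketch in plain Python
(not Spark code). It assumes a hypothetical query that groups by `a + 1`, where
`a` is an alias of `dept` defined in the same SELECT list. The names
`project_below` and `aggregate` are made up for this example and do not
correspond to any Catalyst API. The point is that the alias has to be
materialized by a step that runs before the grouping key is computed.

```python
from collections import defaultdict

# Toy input standing in for Child [dept, salary]; the values are made up.
rows = [
    {"dept": 1, "salary": 100.0},
    {"dept": 1, "salary": 300.0},
    {"dept": 2, "salary": 50.0},
]


def project_below(rows):
    """Hypothetical 'Project under Aggregate': evaluate the alias first."""
    out = []
    for r in rows:
        a = r["dept"]  # dept AS a
        # a + 1 refers to the alias, so it can only be computed after `a`.
        out.append({"a": a, "a_plus_1": a + 1, "salary": r["salary"]})
    return out


def aggregate(rows, keys):
    """Group rows by the given key columns and compute avg(salary)."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[k] for k in keys)].append(r["salary"])
    return {k: sum(v) / len(v) for k, v in groups.items()}


# The grouping key `a_plus_1` exists only because the projection ran first.
result = aggregate(project_below(rows), ["a_plus_1"])
print(result)  # {(2,): 200.0, (3,): 50.0}
```

   A Project placed above the Aggregate runs after the groups are already
formed, so it cannot supply a grouping key in this toy model.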
   
   Can we revert https://github.com/apache/spark/pull/41804 first and think of 
the full story later?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

