sigmod edited a comment on pull request #31913:
URL: https://github.com/apache/spark/pull/31913#issuecomment-825318373


   Posted a message on the git commit page and copy it here.
   
   Hi @peter-toth
   
   @cloud-fan @maryannxue @rednaxelafx and I had an offline discussion about PR.
   
   Here's a summary:
   
   (1) This PR would bring complexities when we add more optimization around 
grouping columns down the road. E.g., if we want to reorder grouping columns in 
order to reuse partitioning properties or optimize away certain grouping 
columns, we would need to re-generate the ordinals in grouping refs.
   (2) It seems safe to pull out non-attribute grouping expressions from an 
Aggregate and put them in a Project below the Aggregate. Then, grouping 
expressions in the Aggregate are only AttributeReferences. The rest of 
Optimizer rules can still match the Aggregate as before. We do need to pay 
attention to rules that collapse Aggregate and Project. In the planner, we can 
combine Aggregate and Project, so that the physical plan doesn't change.
   
   Thus, would you mind reverting this PR and try the proposal in (2)? Thanks 
in advance!
   
   Best,
   Yingyi


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to