[GitHub] [spark] prakharjain09 commented on pull request #30426: [SPARK-33486][SQL] Collapse Partial and Final physical aggregation nodes together whenever possible

GitBox Thu, 19 Nov 2020 09:44:47 -0800


prakharjain09 commented on pull request #30426:
URL: https://github.com/apache/spark/pull/30426#issuecomment-730533709



   @maropu Thanks for pointing out the old PR and JIRA. Yes, SPARK-12978 seems 
related to SPARK-33486.
   
   > Btw, have you checked if this optimization could make some queries (e.g., 
TPCDS) faster?
   
   I ran an impact analysis on TPCDS at scale factor 100 and didn't find a 
noticeable improvement. In most TPCDS queries, the 1st HashAggregate (HA) 
reduces the row count significantly, so the 2nd HA takes little time after that.
   
   But we have seen good improvements in some customer queries, 
specifically when HA-1 doesn't reduce the row count significantly.
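
To make the HA-1 / HA-2 point concrete, here is a toy sketch in plain Python. It is not Spark code and not taken from the PR: `partial_aggregate`, `final_aggregate`, and `two_phase` are hypothetical stand-ins for the partial and final aggregation steps, and the data is made up. It only counts how many rows the first step passes on to the second for low-cardinality versus nearly unique grouping keys.

```python
from collections import defaultdict

def partial_aggregate(partition):
    """Per-partition pre-aggregation: sum values by key (stand-in for HA-1)."""
    acc = defaultdict(int)
    for key, value in partition:
        acc[key] += value
    return list(acc.items())

def final_aggregate(partial_rows):
    """Merge the partial sums by key (stand-in for HA-2)."""
    acc = defaultdict(int)
    for key, value in partial_rows:
        acc[key] += value
    return dict(acc)

def two_phase(partitions):
    """Run both steps; return (result, rows into HA-1, rows out of HA-1)."""
    rows_in = sum(len(p) for p in partitions)
    partial = [row for p in partitions for row in partial_aggregate(p)]
    return final_aggregate(partial), rows_in, len(partial)

# Low-cardinality keys (4 distinct keys): HA-1 collapses most rows.
low = [[(i % 4, 1) for i in range(1000)] for _ in range(2)]

# Nearly unique keys: HA-1 passes every row through unchanged.
high = [[(p * 1000 + i, 1) for i in range(1000)] for p in range(2)]

low_result, low_in, low_out = two_phase(low)
high_result, high_in, high_out = two_phase(high)

print(f"low cardinality:  {low_in} rows in, {low_out} rows out of HA-1")
print(f"high cardinality: {high_in} rows in, {high_out} rows out of HA-1")
```

In the low-cardinality case the second step sees only a handful of rows, which matches the TPCDS observation above. In the high-cardinality case the first step does a full pass of hashing work without shrinking the input, which is the situation where doing the aggregation once would plausibly help.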





