cloud-fan edited a comment on issue #24215: [SPARK-27229][SQL] GroupBy 
Placement in Intersect Distinct
URL: https://github.com/apache/spark/pull/24215#issuecomment-476959548
 
 
   I'm afraid this may cause a big performance regression, as aggregation is
expensive and here we do it twice.
   
   Aggregate needs to build a hash map, which may spill to disk if the data is
large. If the aggregate doesn't reduce the data much, we pay that cost without
any benefit and hit a regression.
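
   To make the concern concrete, here is a rough sketch in plain Python (not
Spark code). It assumes the proposed plan runs a distinct aggregate on each
input before the semi join, while the existing plan runs a single distinct
aggregate after the semi join. The function names and the "rows fed into hash
aggregation" metric are illustrative assumptions, not taken from the PR.

```python
def intersect_distinct_agg_after(left, right):
    """Semi join first, then one distinct aggregate over the join output."""
    right_keys = set(right)  # stands in for the build side of the semi join
    joined = [row for row in left if row in right_keys]
    agg_input_rows = len(joined)  # rows fed into the single hash aggregate
    return set(joined), agg_input_rows


def intersect_distinct_agg_before(left, right):
    """Assumed pushed-down plan: distinct on each input, then semi join."""
    left_distinct = set(left)    # first hash aggregate, sees every left row
    right_distinct = set(right)  # second hash aggregate, sees every right row
    agg_input_rows = len(left) + len(right)
    return left_distinct & right_distinct, agg_input_rows


# Inputs with no duplicates: the early aggregates do not reduce the data.
left = list(range(0, 1000))
right = list(range(990, 2000))

result_after, rows_after = intersect_distinct_agg_after(left, right)
result_before, rows_before = intersect_distinct_agg_before(left, right)

assert result_after == result_before
print(rows_after, rows_before)  # 10 2010
```

   Both plans return the same rows, but in this toy case the pushed-down plan
hashes 2010 rows instead of 10. With heavily duplicated inputs the picture can
flip, which is why the outcome depends on how much the aggregate reduces the
data.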
   
   Unless there is a proposal to avoid the regression, I think we shouldn't
merge it.
