yjshen commented on issue #12596:
URL: https://github.com/apache/datafusion/issues/12596#issuecomment-2372599393

   > Introduce the partitioned hashtable in partial aggregation, and we partition the data before inserting it into the hashtable.
   > And we push them into final aggregation partition by partition afterwards, rather than splitting them again in repartition and merging them again in coalesce.
   
   I'm not clear on how this proposal works. Could you please explain why it 
provides performance benefits compared to partial aggregation, exchange, and 
final aggregation? Is the proposal aimed specifically at accelerating 
high-cardinality aggregation, or is it intended to improve aggregation 
performance in general?
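
One possible reading of the quoted proposal, as a minimal sketch rather than DataFusion's actual implementation: the partial aggregation keeps one hash table per output partition, so the final aggregation for partition `p` only has to merge table `p` from each partial result, with no separate repartition step. The names `partition_of`, `partial_aggregate`, and `final_aggregate` are hypothetical, the aggregate is assumed to be a `SUM` over string keys, and only the standard library is used.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical helper: map a group key to one of `n` output partitions.
/// `DefaultHasher::new()` uses fixed keys, so every input stream agrees
/// on the partition of a given key.
fn partition_of(key: &str, n: usize) -> usize {
    let mut h = DefaultHasher::new();
    key.hash(&mut h);
    (h.finish() as usize) % n
}

/// Partial aggregation (SUM) for one input stream, keeping one hash table
/// per output partition instead of a single table.
fn partial_aggregate(rows: &[(&str, i64)], n: usize) -> Vec<HashMap<String, i64>> {
    let mut tables = vec![HashMap::new(); n];
    for (key, value) in rows {
        let p = partition_of(key, n);
        *tables[p].entry(key.to_string()).or_insert(0) += *value;
    }
    tables
}

/// Final aggregation for output partition `p`: merge table `p` of every
/// partial result. Rows are not hashed and split again here, because each
/// key already sits at the same partition index in every partial result.
fn final_aggregate(partials: &[Vec<HashMap<String, i64>>], p: usize) -> HashMap<String, i64> {
    let mut out = HashMap::new();
    for tables in partials {
        for (key, sum) in &tables[p] {
            *out.entry(key.clone()).or_insert(0) += *sum;
        }
    }
    out
}
```

Under this reading, each row is hashed once, on the way into the partial hash table. Whether that actually beats partial aggregation followed by an exchange and final aggregation is exactly the open question above, and the sketch does not answer it.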


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

