mingmwang commented on PR #6034:
URL: 
https://github.com/apache/arrow-datafusion/pull/6034#issuecomment-1521324807

   @ozankabak @mustafasrepo 
   
   I strongly suggest having a separate implementation (Exec) for streaming 
aggregation. This is similar to how we separate `HashJoinExec` / 
`SortMergeJoinExec` and `UnionExec` / `InterleaveExec`.
   With the split of physical plans, each plan will clearly convey which 
physical operators it is actually composed of.
   With the split of physical plans, we can keep each operator's code base 
(HashAggregation and SortAggregation) relatively simple. 
   We can further keep a relatively lightweight grouping state for each 
operator. The memory layout of the grouping state is critical for performance. 
For hash aggregation performance, we currently still have a huge gap 
compared with DuckDB.
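
   To illustrate the point about grouping state, here is a minimal sketch in 
plain Rust (not DataFusion code). The function names `hash_sum` and 
`sorted_sum` and the `(key, value)` row shape are hypothetical and chosen only 
for this example. It assumes a single `u32` group key and a SUM aggregate: the 
hash variant has to hold one accumulator per distinct key until the input ends, 
while the sorted (streaming) variant only holds the current group.

```rust
use std::collections::HashMap;

/// Hash aggregation sketch: one accumulator per distinct key is kept
/// in memory until the whole input has been consumed.
fn hash_sum(rows: &[(u32, i64)]) -> HashMap<u32, i64> {
    let mut state: HashMap<u32, i64> = HashMap::new();
    for &(key, value) in rows {
        *state.entry(key).or_insert(0) += value;
    }
    state
}

/// Sort (streaming) aggregation sketch: the input is assumed to be sorted
/// by key, so the grouping state is just the current (key, sum) pair and
/// a finished group can be emitted as soon as the key changes.
fn sorted_sum(rows: &[(u32, i64)]) -> Vec<(u32, i64)> {
    let mut out = Vec::new();
    let mut current: Option<(u32, i64)> = None;
    for &(key, value) in rows {
        if let Some((k, sum)) = current.as_mut() {
            if *k == key {
                // Same group as the previous row: update in place.
                *sum += value;
                continue;
            }
        }
        // Key changed (or first row): emit the finished group, start a new one.
        if let Some(done) = current.take() {
            out.push(done);
        }
        current = Some((key, value));
    }
    // Emit the last open group, if any.
    if let Some(done) = current {
        out.push(done);
    }
    out
}
```

   With two separate operators, each one can pick the state layout that suits 
it instead of sharing one general-purpose structure.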


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
