Rachelint commented on PR #11758:
URL: https://github.com/apache/datafusion/pull/11758#issuecomment-2282832444

   > > I agree, finally it should be a big change which switches the group 
values and related states managed by block like duckdb , and I am working on 
this(#11931).
   > 
   > I think personally suggest sketching out what this would look like in a 
first PR without worrying about getting all the tests passing / compiling etc.
   > 
   > If we try to port all code at once to being managed in blocks it is going 
to be a very large change
   > 
   > I am thinking maybe we can have a incremental approach (like for example 
separately adding the ability to do blocked emission for 
https://docs.rs/datafusion/latest/datafusion/logical_expr/trait.GroupsAccumulator.html
 and
   > 
   > 
https://github.com/apache/datafusion/blob/fd237f8705b18fa089fdfb8dd5b04655ccb4d691/datafusion/physical-plan/src/aggregates/group_values/mod.rs#L37-L55
   > 
   > If we can set this up so that we get the pattern and a few common 
implementations setup in the first PR then we can make subsequent PRs to port 
over the other parts of the aggregation 🤔
   
   Yes, I want to do the similar things for 
   
   > incremental approach
   
   Yes, I am planning to make it  incrementally, too.
   
   For the `GroupValues`s, I make the compatibility work here to make it able 
to process the single big block or the multiple small blocks. And I am just 
trying to impl the block based managing for `GroupValuesRows` impls now, other 
impls will keep the old logic until we optimize them.
   
https://github.com/apache/datafusion/blob/56399f7bd87c70b94be8b39d38fb5484289d3915/datafusion/physical-plan/src/aggregates/row_hash.rs#L654-L683
   
   For the `GroupAccumulator`s, I impl a function to convert the `new block 
based group idx` to the `old flat group idx`, so the `GroupAccumulator`s can 
almost keep the old process logic. Based on this, we can swith them to the new 
style  incrementally.
   
https://github.com/apache/datafusion/blob/56399f7bd87c70b94be8b39d38fb5484289d3915/datafusion/physical-plan/src/aggregates/group_values/mod.rs#L60
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to