[GitHub] [arrow-datafusion] alamb opened a new pull request, #6932: Consolidate `BoundedAggregateStream`

via GitHub Wed, 12 Jul 2023 10:20:12 -0700


alamb opened a new pull request, #6932:
URL: https://github.com/apache/arrow-datafusion/pull/6932


   Draft as it builds on https://github.com/apache/arrow-datafusion/pull/6904 
and isn't ready for review yet
   
   # Which issue does this PR close?
   
   Closes https://github.com/apache/arrow-datafusion/issues/6798
   
   # Rationale for this change
   See https://github.com/apache/arrow-datafusion/issues/6798 -- basically 
`BoundedAggregateStream` is largely copy/pasted from the old group by hash 
implementation, and it prevents me from deleting RowFormat
   
   Also, I think this will have the very nice side benefit of being 
significantly faster (due to not using `ScalarValue` to compare order by keys)
   
   # What changes are included in this PR?
   1. Add `GroupOrdering` to encapsulate tracking the state of any ordering
   2. Update `GroupedAggregateStream` to use GroupOrdering when available
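
   As a rough illustration of item 1, the sketch below shows one way ordering state could be tracked so that finished groups are emitted early when the input is sorted on the group keys. It is not the code in this PR: `SketchGroupOrdering` and its methods `new_group`, `emit_to`, and `remove_groups` are hypothetical names chosen for the example, and it only models the unsorted and fully sorted cases.

```rust
/// Hypothetical, simplified stand-in for the `GroupOrdering` idea.
/// It records how far a sorted input has progressed so that completed
/// groups can be emitted before the input is exhausted.
#[derive(Debug)]
enum SketchGroupOrdering {
    /// Input is not sorted on the group keys: nothing can be emitted early.
    None,
    /// Input is fully sorted on the group keys. `current` is the index of
    /// the group seen most recently; every group before it is complete.
    Full { current: Option<usize> },
}

impl SketchGroupOrdering {
    /// Record that rows for group `group_index` were just seen.
    fn new_group(&mut self, group_index: usize) {
        if let SketchGroupOrdering::Full { current } = self {
            *current = Some(group_index);
        }
    }

    /// Number of leading groups that can no longer receive new rows,
    /// or `None` if nothing can be emitted yet.
    fn emit_to(&self) -> Option<usize> {
        match self {
            SketchGroupOrdering::None => None,
            SketchGroupOrdering::Full { current: Some(n) } if *n > 0 => Some(*n),
            SketchGroupOrdering::Full { .. } => None,
        }
    }

    /// Called after the first `n` groups were emitted; shifts the tracked
    /// index down so it still points at the in-progress group.
    fn remove_groups(&mut self, n: usize) {
        if let SketchGroupOrdering::Full { current: Some(c) } = self {
            *c = c.saturating_sub(n);
        }
    }
}
```

   In this sketch the aggregate stream would call `new_group` as it assigns group indexes, check `emit_to` after each input batch, and call `remove_groups` once those groups have been output. With an unsorted input (`None`) it falls back to emitting everything at the end.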
   
   Still todo:
   - [ ] Remove `BoundedAggregateStream` code
   
   I plan to remove the Row format in a follow-on PR to keep this one 
reasonably sized
   
   # Performance Results
   TODO
   
   
   # Are these changes tested?
   Existing tests
   
   # Are there any user-facing changes?
   Faster performance, smaller code size
   


