alamb opened a new pull request, #6932: URL: https://github.com/apache/arrow-datafusion/pull/6932
Draft as it builds on https://github.com/apache/arrow-datafusion/pull/6904 and isn't ready for review yet

# Which issue does this PR close?

Closes https://github.com/apache/arrow-datafusion/issues/6798

# Rationale for this change

See https://github.com/apache/arrow-datafusion/issues/6798 -- basically this is a lot of copy/paste code from the old group by hash implementation which prevents me from deleting RowFormat

Also, I think this will have the very nice side benefit of being significantly faster (due to not using `ScalarValue` to compare order by keys)

# What changes are included in this PR?

1. Add `GroupOrdering` to encapsulate tracking the state of any ordering
2. Update `GroupedAggregateStream` to use `GroupOrdering` when available

Still todo:
- [ ] Remove `BoundedAggregateStream` code

I plan to remove the Row format as a follow on PR to keep this one reasonably sized

# Performance Results

TODO

# Are these changes tested?

Existing tests

# Are there any user-facing changes?

Faster performance, smaller code size
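To illustrate the idea behind `GroupOrdering` described under "What changes are included in this PR?", here is a minimal sketch of how ordering state could be tracked using group indices alone, without comparing key values. It is not the code in this PR: the variants, the `EmitTo` type, and the method names `new_groups`, `emit_to`, and `remove_groups` are all hypothetical and chosen only for illustration. It assumes that when the input is fully sorted on the group keys, every group before the most recently seen one can receive no more rows.

```rust
/// Hypothetical sketch only; not the `GroupOrdering` added by this PR.
/// Describes which groups can be emitted early.
#[derive(Debug, PartialEq)]
enum EmitTo {
    /// The first `n` groups are complete and can be emitted.
    First(usize),
}

/// Tracks ordering state by group index rather than by key value.
#[derive(Debug)]
enum GroupOrdering {
    /// Input is not sorted on the group keys: nothing can be emitted early.
    None,
    /// Input is fully sorted on the group keys (assumption of this sketch).
    /// `current` is the index of the group that may still receive rows.
    Full { current: Option<usize> },
}

impl GroupOrdering {
    /// Record the group index assigned to each row of a new batch.
    /// With sorted input, the last index is the only group still open.
    fn new_groups(&mut self, group_indices: &[usize]) {
        if let GroupOrdering::Full { current } = self {
            if let Some(&last) = group_indices.last() {
                *current = Some(last);
            }
        }
    }

    /// Report how many leading groups are complete, if any.
    fn emit_to(&self) -> Option<EmitTo> {
        match self {
            GroupOrdering::None => None,
            GroupOrdering::Full { current: Some(n) } if *n > 0 => Some(EmitTo::First(*n)),
            GroupOrdering::Full { .. } => None,
        }
    }

    /// Shift the tracked index down after the first `n` groups were emitted.
    fn remove_groups(&mut self, n: usize) {
        if let GroupOrdering::Full { current: Some(c) } = self {
            *c = c.saturating_sub(n);
        }
    }
}
```

In this sketch, an aggregation stream would call `new_groups` after assigning group indices to a batch, emit the groups reported by `emit_to`, and then call `remove_groups` so the remaining indices start from zero again. Because only integer indices are compared, no per-row `ScalarValue` comparison is needed.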
