avantgardnerio opened a new pull request, #7192: URL: https://github.com/apache/arrow-datafusion/pull/7192
## Which issue does this PR close?

Closes #7191.

## Rationale for this change

Described in issue.

## What changes are included in this PR?

1. A new `GroupedPriorityQueueAggregateStream` aggregation
2. A new `limit` property on `AggregateExec`
3. An optimizer rule to copy the limit from the `SortExec` if applicable

## Are these changes tested?

Not yet.

## Are there any user-facing changes?

1. Some Top K queries should no longer crash
2. I probably broke other things, so this is a draft

## Notes

This is a draft PR to begin driving discussion. It is not yet complete in many ways, some of which are:

1. the `OwnedRow` code is not columnar, vectorized, etc.
2. the `TreeMap` should probably be a `BinaryHeap`
3. we should probably figure out how to accumulate with the existing Accumulators
4. filters are not yet applied
5. it's possible this should be a whole new `Exec` node, not just a new `Stream` type
6. probably much more!

That's why I'm throwing this PR down as a straw man to drive discussion.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
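For readers new to the idea, here is a minimal sketch of a limit-aware grouped aggregation, assuming a query shaped like `SELECT key, MAX(val) ... GROUP BY key ORDER BY MAX(val) DESC LIMIT k`. It is not the code in this PR: the function `top_k_max` and its row representation are hypothetical, and it uses the standard library's `BTreeSet` as the ordered structure rather than anything from DataFusion. The point it illustrates is that only `k` groups need to be held in memory at once.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical helper (not part of DataFusion): returns the `k` groups with
/// the largest MAX(value), largest first, while tracking at most `k` groups.
fn top_k_max(rows: &[(&str, i64)], k: usize) -> Vec<(String, i64)> {
    if k == 0 {
        return Vec::new();
    }
    // Current MAX for each group that is still being tracked.
    let mut best: HashMap<String, i64> = HashMap::new();
    // The same groups ordered by (value, key); the smallest entry comes first.
    let mut ordered: BTreeSet<(i64, String)> = BTreeSet::new();

    for &(key, val) in rows {
        match best.get(key).copied() {
            // Tracked group with a new maximum: reposition it in the ordering.
            Some(old) if val > old => {
                ordered.remove(&(old, key.to_string()));
                ordered.insert((val, key.to_string()));
                best.insert(key.to_string(), val);
            }
            // Tracked group, value does not raise its maximum: nothing to do.
            Some(_) => {}
            // Untracked group: admit it if there is room, or if it beats the
            // smallest tracked group, which is then evicted.
            None => {
                if ordered.len() < k {
                    ordered.insert((val, key.to_string()));
                    best.insert(key.to_string(), val);
                } else if let Some(min) = ordered.iter().next().cloned() {
                    if val > min.0 {
                        ordered.remove(&min);
                        best.remove(&min.1);
                        ordered.insert((val, key.to_string()));
                        best.insert(key.to_string(), val);
                    }
                }
            }
        }
    }

    // Emit in descending order of the aggregate.
    ordered.into_iter().rev().map(|(v, g)| (g, v)).collect()
}
```

Eviction is safe here only because the aggregate is `MAX` and the sort is descending: an evicted group can re-enter only with a value above the current threshold, and that value is then its true maximum. Other aggregates or sort orders would need a different argument, so treat this as an illustration of the limit pushdown rather than a general design.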
