avantgardnerio opened a new pull request, #7192:
URL: https://github.com/apache/arrow-datafusion/pull/7192

   ## Which issue does this PR close?
   
   Closes #7191.
   
   ## Rationale for this change
   
   Described in issue.
   
   ## What changes are included in this PR?
   
   1. A new `GroupedPriorityQueueAggregateStream` aggregation
   2. A new `limit` property on `AggregateExec`
   3. An optimizer rule to copy the limit from the `SortExec` if applicable
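
To make the idea concrete, here is a minimal sketch of limit-aware grouped aggregation. It is not the code in this PR: `top_k_max` is a hypothetical helper, and it assumes a query shaped like `SELECT key, MAX(val) ... GROUP BY key ORDER BY MAX(val) DESC LIMIT k`. It keeps at most `limit` groups in an ordered set and evicts the smallest one when a better group arrives, so memory stays bounded by the limit rather than by the number of distinct groups.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical helper (not part of DataFusion): returns the `limit` groups
/// with the largest MAX(value), ordered by that value descending.
fn top_k_max(rows: &[(&str, i64)], limit: usize) -> Vec<(String, i64)> {
    if limit == 0 {
        return Vec::new();
    }
    // Current MAX for each group that is still being tracked.
    let mut best: HashMap<String, i64> = HashMap::new();
    // The same groups ordered by (value, key); the first entry is the smallest.
    let mut ordered: BTreeSet<(i64, String)> = BTreeSet::new();

    for &(key, val) in rows {
        match best.get(key).copied() {
            // Tracked group with a new maximum: reposition it in the ordered set.
            Some(old) if val > old => {
                ordered.remove(&(old, key.to_string()));
                ordered.insert((val, key.to_string()));
                best.insert(key.to_string(), val);
            }
            // Tracked group, value does not raise its maximum: nothing to do.
            Some(_) => {}
            None => {
                if ordered.len() == limit {
                    // Full: a new group only qualifies if it beats the smallest.
                    let smallest = ordered.iter().next().cloned().unwrap();
                    if val <= smallest.0 {
                        continue;
                    }
                    // Evict the smallest group. This is safe for MAX because
                    // the threshold only ever rises, so if the group returns
                    // it does so with a value larger than anything dropped.
                    ordered.remove(&smallest);
                    best.remove(&smallest.1);
                }
                ordered.insert((val, key.to_string()));
                best.insert(key.to_string(), val);
            }
        }
    }
    // Largest first, matching ORDER BY ... DESC.
    ordered.into_iter().rev().map(|(v, k)| (k, v)).collect()
}
```

The eviction argument only holds when the aggregate and the sort direction agree (MAX with DESC, or MIN with ASC after flipping the comparison); other combinations would need every group to be kept.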
   
   ## Are these changes tested?
   
   Not yet.
   
   ## Are there any user-facing changes?
   
   1. Some Top K queries should no longer crash
   2. I probably broke other things so this is a draft
   
   ## Notes
   
   This is a draft PR to begin driving discussion. It is not yet complete in 
many ways, some of which are:
   
   1. the `OwnedRow` code is not columnar, vectorized, etc
   2. the `TreeMap` should probably be a `BinaryHeap`
   3. we should probably figure out how to accumulate with the existing 
Accumulators?
   4. filters are not yet applied
   5. it's possible this should be a whole new `Exec` node, not just a new 
`Stream` type
   6. probably much more! That's why I'm throwing this PR down as a straw man 
to drive discussion
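
For note 2, a bounded heap might look roughly like the sketch below. It uses only `std::collections::BinaryHeap` and `std::cmp::Reverse`; `top_k_values` is a hypothetical name, and the sketch covers plain values only, not groups. The standard `BinaryHeap` has no way to update an entry in place, so the grouped case would still need some way to handle a group whose aggregate changes.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Hypothetical helper: the `k` largest values, in descending order.
fn top_k_values(values: &[i64], k: usize) -> Vec<i64> {
    // `Reverse` turns the max-heap into a min-heap, so `pop` removes the
    // smallest value currently retained.
    let mut heap: BinaryHeap<Reverse<i64>> = BinaryHeap::with_capacity(k + 1);
    for &v in values {
        heap.push(Reverse(v));
        if heap.len() > k {
            heap.pop(); // drop the smallest so at most `k` values remain
        }
    }
    let mut out: Vec<i64> = heap.into_iter().map(|Reverse(v)| v).collect();
    out.sort_unstable_by(|a, b| b.cmp(a)); // largest first
    out
}
```

Each push and pop costs O(log k), so a pass over n values is O(n log k) with O(k) memory.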

