[GitHub] [arrow-datafusion] gruuya commented on issue #7149: Top-K query optimization suboptimal memory usage without custom allocators

via GitHub Wed, 02 Aug 2023 05:52:21 -0700


gruuya commented on issue #7149:
URL: 
https://github.com/apache/arrow-datafusion/issues/7149#issuecomment-1662157300


   > Ah, this might be a good improvement. I thought it did it like this 
before, but this might have been changed more recently. I think doing this 
should be as fast (as long as we avoid sorting twice).
   
   So I did a first pass along these lines; you can see the results 
here: https://github.com/apache/arrow-datafusion/pull/7180
   
   Besides the memory improvement, the query times are actually slightly better 
than before. However, it's only a draft at the moment, since I think there are 
some sub-optimal cases. For instance, previously the sorting was done in 
parallel via #6308, which is not the case here. Relatedly, in the absence of a 
`fetch` parameter the algorithm should probably default to the previous approach.
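
To illustrate the general idea of branching on `fetch`, here is a minimal standalone sketch. It is not the code from the PR and does not use DataFusion's APIs: the function name `sort_with_optional_fetch` and the plain `Vec<i64>` input are assumptions made purely for illustration. With a `fetch` limit it keeps only the `k` smallest values in a bounded max-heap, so memory stays proportional to `k`; without one it falls back to a full sort.

```rust
use std::collections::BinaryHeap;

/// Hypothetical helper (not a DataFusion API): returns the smallest `fetch`
/// values in ascending order, or the fully sorted input when `fetch` is `None`.
fn sort_with_optional_fetch(values: Vec<i64>, fetch: Option<usize>) -> Vec<i64> {
    match fetch {
        // No limit: sort everything, as a plain full sort would.
        None => {
            let mut all = values;
            all.sort_unstable();
            all
        }
        // A limit of zero never yields any rows.
        Some(0) => Vec::new(),
        Some(k) => {
            // Max-heap holding at most `k` values: the smallest seen so far.
            let mut heap: BinaryHeap<i64> =
                BinaryHeap::with_capacity(k.min(values.len()));
            for v in values {
                if heap.len() < k {
                    heap.push(v);
                } else if let Some(&largest) = heap.peek() {
                    // Replace the current largest only if `v` is smaller.
                    if v < largest {
                        heap.pop();
                        heap.push(v);
                    }
                }
            }
            // Produces the retained values in ascending order.
            heap.into_sorted_vec()
        }
    }
}
```

Calling `sort_with_optional_fetch(vec![5, 1, 4, 2, 3], Some(2))` keeps at most two values in the heap at any time, whereas passing `None` sorts the whole input. A real operator would work on record batches and arbitrary sort keys, and would also need to account for the parallel sorting mentioned above.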


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
