alamb opened a new issue #416:
URL: https://github.com/apache/arrow-datafusion/issues/416


   **Is your feature request related to a problem or challenge? Please describe 
what you are trying to do.**
   The new sort-preserving merge operator, introduced in #379, likely has room 
for performance improvement.
   
   
   **Describe the solution you'd like**
   1. Create a benchmark for the merging operator 
   2. Optimize the operator to improve the benchmark results as appropriate
   
   Here is a suggestion from @jhorstmann 
(https://github.com/apache/arrow-datafusion/pull/379/files#r637948151), filed 
as a separate ticket so it doesn't get lost:
   
   For a larger number of partitions, storing the cursors in a BinaryHeap, 
sorted by their current item, would be beneficial.
   
   A Rust implementation of that approach can be seen in [this blog post and 
the first comment under it][1]. I have implemented the same approach in Java 
before. I agree with @alamb, though, that we should make it work first and 
optimize later.
   
   [1]: https://dev.to/creativcoder/merge-k-sorted-arrays-in-rust-1b2f
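
For illustration, the heap-based idea above can be sketched roughly as follows. This is a minimal standalone example, not the operator's actual code: the function name `merge_sorted_partitions` is hypothetical, and it assumes each partition is an in-memory `Vec<i64>` that is already sorted in ascending order. A cursor is modelled as a `(current item, partition index, offset)` tuple.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Merge already-sorted partitions into one sorted Vec.
/// Hypothetical helper for illustration only; not part of DataFusion's API.
fn merge_sorted_partitions(partitions: &[Vec<i64>]) -> Vec<i64> {
    // Each heap entry is a cursor: (current item, partition index, offset).
    // BinaryHeap is a max-heap, so wrap entries in Reverse to pop the
    // smallest current item first.
    let mut heap = BinaryHeap::with_capacity(partitions.len());
    for (part_idx, part) in partitions.iter().enumerate() {
        // Empty partitions contribute no cursor.
        if let Some(&first) = part.first() {
            heap.push(Reverse((first, part_idx, 0usize)));
        }
    }

    let total: usize = partitions.iter().map(Vec::len).sum();
    let mut out = Vec::with_capacity(total);

    // Repeatedly emit the smallest current item, then advance that cursor.
    while let Some(Reverse((item, part_idx, offset))) = heap.pop() {
        out.push(item);
        let next = offset + 1;
        if let Some(&next_item) = partitions[part_idx].get(next) {
            heap.push(Reverse((next_item, part_idx, next)));
        }
    }
    out
}
```

With k partitions, each output row costs one pop and at most one push, both O(log k), rather than a comparison against every cursor. Ties between equal items are broken by partition index because of the tuple ordering; a real operator would compare sort-key columns of record batches instead of plain integers.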
   

