Dandandan commented on PR #15380: URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2809275560
Interesting! I think it's pretty close, but it still concatenates the input arrays in Phase 3. To avoid that, you can:

* create a `Vec<(usize, usize)>` based on the input batches, each element being `(batch_id, row_id)`, e.g. `(0, 1)`, `(0, 2)`, `(0, 3)`
* use the global indices to look up the `(batch_id, row_id)` for each sorted index and collect them into a new `Vec`
* use `interleave` on all arrays rather than `take`
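A rough sketch of the three steps above, using plain `Vec`s instead of Arrow arrays so it stands alone. The names `build_row_locations`, `to_interleave_indices`, and `interleave_rows` are made up for illustration and are not DataFusion or arrow-rs APIs. `interleave_rows` only imitates the index shape that the arrow `interleave` kernel accepts, a slice of `(array index, row index)` pairs. Row ids are assumed to be 0-based here.

```rust
/// Step 1: build one (batch_id, row_id) pair per input row, in global row order.
/// `batch_lens[i]` is the number of rows in batch `i`.
fn build_row_locations(batch_lens: &[usize]) -> Vec<(usize, usize)> {
    let total: usize = batch_lens.iter().sum();
    let mut locations = Vec::with_capacity(total);
    for (batch_id, &len) in batch_lens.iter().enumerate() {
        for row_id in 0..len {
            locations.push((batch_id, row_id));
        }
    }
    locations
}

/// Step 2: map each sorted global index to its (batch_id, row_id) pair.
/// `sorted_global` is assumed to come from the existing sort over all rows.
fn to_interleave_indices(
    sorted_global: &[usize],
    locations: &[(usize, usize)],
) -> Vec<(usize, usize)> {
    sorted_global.iter().map(|&g| locations[g]).collect()
}

/// Step 3 (hypothetical stand-in for an `interleave`-style kernel): gather rows
/// straight from the per-batch columns, without concatenating them first.
fn interleave_rows<T: Clone>(batches: &[Vec<T>], indices: &[(usize, usize)]) -> Vec<T> {
    indices
        .iter()
        .map(|&(batch_id, row_id)| batches[batch_id][row_id].clone())
        .collect()
}
```

With real Arrow data, step 3 would presumably be one `interleave` call per output column, passing the same index vector each time, so the concatenated copy of the input is never built.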