Dandandan commented on PR #15380: URL: https://github.com/apache/datafusion/pull/15380#issuecomment-2804048977
> But this PR, we also concat some batches into one batch, do you mean we can also use the indices from each batch to one batch just like the merge phase?

I mean that, theoretically, we don't have to `merge` at all, since all the batches are in memory. Merging is useful for sorting streams of data, but I expect that sorting the batches first and then running a custom merge implementation is slower than a single sorting pass based on the Rust std unstable sort (which is optimized for doing a minimal number of comparisons quickly).
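
To make the two strategies concrete, here is a minimal sketch using only the Rust standard library. It is not DataFusion code: plain `Vec<i64>` values stand in for record batches, and the function names `sort_concatenated` and `sort_then_merge` are hypothetical, chosen only for illustration. Both functions produce the same sorted output; the sketch shows the difference in structure and makes no claim about measured performance.

```rust
use std::cmp::Reverse;
use std::collections::BinaryHeap;

/// Single pass: gather every in-memory batch into one buffer and sort it once
/// with the standard library's unstable sort.
/// (Hypothetical helper; `Vec<i64>` stands in for a record batch.)
fn sort_concatenated(batches: &[Vec<i64>]) -> Vec<i64> {
    let mut all: Vec<i64> = batches.iter().flatten().copied().collect();
    all.sort_unstable();
    all
}

/// Two phases: sort each batch on its own, then k-way merge the sorted
/// batches with a min-heap. (Hypothetical helper, same assumptions as above.)
fn sort_then_merge(batches: &[Vec<i64>]) -> Vec<i64> {
    let mut sorted: Vec<Vec<i64>> = batches.to_vec();
    for batch in sorted.iter_mut() {
        batch.sort_unstable();
    }

    // Heap entries are (value, batch index, offset within that batch);
    // `Reverse` turns the max-heap into a min-heap.
    let mut heap = BinaryHeap::new();
    for (i, batch) in sorted.iter().enumerate() {
        if let Some(&first) = batch.first() {
            heap.push(Reverse((first, i, 0usize)));
        }
    }

    let total: usize = sorted.iter().map(Vec::len).sum();
    let mut out = Vec::with_capacity(total);
    while let Some(Reverse((value, i, pos))) = heap.pop() {
        out.push(value);
        // Refill the heap with the next element from the same batch, if any.
        if let Some(&next) = sorted[i].get(pos + 1) {
            heap.push(Reverse((next, i, pos + 1)));
        }
    }
    out
}
```

The second function does strictly more bookkeeping (per-batch sorts plus a heap-driven merge), which is the overhead I would expect to avoid when everything already fits in memory.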