suremarc commented on PR #13296: URL: https://github.com/apache/datafusion/pull/13296#issuecomment-2499072825
> The implementation is really nice. I'm wondering whether it would be convenient to move the stream concat logic into `StreamingMergeBuilder`, like
>
> ```rust
> let result = StreamingMergeBuilder::new()
>     .with_streams(inputs)
>     .with_statistics_by_stream(stats)
>     .build(); // Concat non-overlapping input streams here
> ```
>
> Now `SortExec` is implemented as
> 1. Sort several small runs
> 2. Create an internal `SortPreservingMergeStream` to merge all small runs.
>
> This way, sort queries can also benefit from this work with some additional effort

This didn't occur to me, but I think it would be a great change. On the other hand, I'm considering whether it would make more sense in a follow-on PR. In any case, there's a lot of statistics-related work that will need to be done before this PR is mergeable, unfortunately.
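
For illustration only, here is a rough sketch of what "concat non-overlapping input streams" could mean, assuming each stream comes with min/max statistics for a single integer sort key. `KeyRange` and `chain_non_overlapping` are hypothetical names invented for this sketch and are not DataFusion APIs. `with_statistics_by_stream` above is likewise only the method proposed in the quoted comment. The sketch greedily groups streams into chains whose key ranges do not overlap, so each chain could be concatenated and only the chains would need to go through the merge.

```rust
/// Hypothetical per-stream statistics: min and max of the sort key.
/// Real statistics would be richer; i64 keeps the sketch small.
#[derive(Debug, Clone, Copy, PartialEq)]
struct KeyRange {
    min: i64,
    max: i64,
}

/// Groups stream indices into chains of non-overlapping key ranges.
/// Streams in one chain can be read back to back in sorted order,
/// so a merge would only need one input per chain.
fn chain_non_overlapping(stats: &[KeyRange]) -> Vec<Vec<usize>> {
    // Visit streams in order of their smallest key.
    let mut order: Vec<usize> = (0..stats.len()).collect();
    order.sort_by_key(|&i| (stats[i].min, stats[i].max));

    let mut chains: Vec<Vec<usize>> = Vec::new();
    for i in order {
        // Find the first chain whose last stream ends at or before this
        // stream's first key. Touching ranges (max == min) are treated as
        // non-overlapping, since ties may appear in either order.
        let slot = chains.iter().position(|chain| {
            chain
                .last()
                .map_or(false, |&last| stats[last].max <= stats[i].min)
        });
        match slot {
            Some(c) => chains[c].push(i),
            None => chains.push(vec![i]),
        }
    }
    chains
}

fn main() {
    let stats = [
        KeyRange { min: 0, max: 10 },
        KeyRange { min: 5, max: 15 },
        KeyRange { min: 10, max: 20 },
        KeyRange { min: 16, max: 30 },
    ];
    // Four input streams collapse into two chains to merge.
    println!("{:?}", chain_non_overlapping(&stats));
}
```

With the sample ranges above, streams 0 and 2 form one chain and streams 1 and 3 form another, so a merge would see two inputs instead of four. Whether this grouping belongs inside the builder or in the planner is exactly the open question in this thread.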