alamb commented on issue #4169: URL: https://github.com/apache/arrow-datafusion/issues/4169#issuecomment-1311576873
> One question for SortPreservingMergeExec, does it always require the inputs to be sorted? What if the inputs are not sorted, what kind of output stream can the SortPreservingMergeExec generate?

`SortPreservingMergeExec` is a fairly classic multi-column merge operator. This comment describes it well: https://github.com/apache/arrow-datafusion/blob/5883e43db6c16d3ac3616302606849abbfbc86eb/datafusion/core/src/physical_plan/sorts/sort_preserving_merge.rs#L54-L80

If the inputs are not sorted, the output will be incorrect (specifically, some of the rows may be lost).

`SortPreservingMergeExec` produces a sorted output stream without re-sorting its input.

One use case for this operator outside of IOx might be to implement `UNION` (which removes duplicates) of two subqueries that were already sorted.
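
To illustrate why a merge of this kind depends on sorted inputs, here is a minimal Python sketch of a generic k-way merge over rows keyed by two columns. It is not DataFusion's implementation; the function name `merge_sorted_streams`, the `Row` type, and the sample data are all hypothetical. In this sketch unsorted input shows up as out-of-order output, whereas the symptom in `SortPreservingMergeExec` itself may differ (the comment above mentions lost rows).

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

# Hypothetical row type: a two-column sort key (a, b), compared lexicographically.
Row = Tuple[int, str]


def merge_sorted_streams(streams: List[Iterable[Row]]) -> Iterator[Row]:
    """Merge already-sorted streams without re-sorting them.

    Only the current head row of each stream is compared, so the result is
    globally sorted only if every input stream is itself sorted.
    """
    iters = [iter(s) for s in streams]
    heap = []
    for idx, it in enumerate(iters):
        first = next(it, None)
        if first is not None:
            heap.append((first, idx))
    heapq.heapify(heap)

    while heap:
        # Emit the smallest head row, then refill from the same stream.
        row, idx = heapq.heappop(heap)
        yield row
        nxt = next(iters[idx], None)
        if nxt is not None:
            heapq.heappush(heap, (nxt, idx))


# Sorted inputs: the merged output is globally sorted.
sorted_a = [(1, "a"), (1, "c"), (3, "a")]
sorted_b = [(1, "b"), (2, "z"), (3, "b")]
merged = list(merge_sorted_streams([sorted_a, sorted_b]))

# One unsorted input: the merge silently produces an incorrectly ordered result.
unsorted_a = [(3, "a"), (1, "a"), (1, "c")]
bad = list(merge_sorted_streams([unsorted_a, sorted_b]))
```

The merge never looks past the head of each stream, which is what lets it avoid a full sort and also why it cannot detect or repair an unsorted input.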
