Dandandan opened a new pull request, #21335: URL: https://github.com/apache/datafusion/pull/21335
## Which issue does this PR close? N/A — performance optimization ## Rationale for this change `SortPreservingMergeExec` currently processes every row through the loser tree, even when an entire batch from the winning stream is strictly less than all other streams' current values. This is common for: - Time-series data with range-partitioned inputs - After hash repartition + sort where partitions have distinct value ranges - Any workload where input batches don't overlap ## What changes are included in this PR? **Batch pass-through optimization**: After the loser tree selects a winner, if the winner's cursor is at the start of a new batch, check whether the batch's last value is strictly less than the runner-up's current value (found via O(log K) loser-tree walk). Two fast paths when the check succeeds: 1. **Zero-copy**: When `in_progress` buffer is empty and the full batch qualifies, `RecordBatch::slice` returns the batch directly — no interleave, no row-level copying. 2. **Bulk-push**: When partial rows are already buffered, append all qualifying rows at once via `push_rows`, skipping O(remaining × log K) loser-tree comparisons. ### Files changed - `cursor.rs`: Add `remaining()`, `last_cmp()`, `advance_by()`, `is_at_start()` to `Cursor` - `builder.rs`: Add `push_rows()` (bulk append) and `take_batch_slice()` (zero-copy emission) - `merge.rs`: Add `find_runner_up()`, `can_batch_pass_through()`, modify `poll_next_inner` - `sort_preserving_merge.rs`: 3 new tests covering non-overlapping, partial overlap, and fetch limits ## Are these changes tested? Yes — 3 new test cases plus all 18 existing tests pass: - `test_batch_pass_through_multi_batch`: Multiple non-overlapping batches per partition - `test_batch_pass_through_with_fetch`: Fetch limit that cuts through a pass-through batch - `test_batch_pass_through_partial_overlap`: Mixed overlap across 3 partitions ## Are there any user-facing changes? No API changes. Existing behavior is preserved — the optimization is strictly an internal fast path that produces identical output. 🤖 Generated with [Claude Code](https://claude.com/claude-code) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
