alamb commented on code in PR #21182:
URL: https://github.com/apache/datafusion/pull/21182#discussion_r3033611632
##########
datafusion/physical-plan/src/sorts/sort_preserving_merge.rs:
##########
@@ -366,7 +366,7 @@ impl ExecutionPlan for SortPreservingMergeExec {
.map(|partition| {
let stream =
self.input.execute(partition, Arc::clone(&context))?;
- Ok(spawn_buffered(stream, 1))
+ Ok(spawn_buffered(stream, 16))
Review Comment:
Why is this changed?
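
For context on what the second argument controls, here is a minimal sketch of a bounded producer/consumer buffer. It is not DataFusion's `spawn_buffered`; the function name `spawn_buffered_sketch`, the `u32` items, and the use of a standard-library thread and `sync_channel` are all assumptions made for illustration. The capacity limits how many items the producer may queue ahead of the consumer before it blocks, which is the trade-off (memory held versus producer stalls) that a change from 1 to 16 would affect.

```rust
use std::sync::mpsc::{sync_channel, Receiver};
use std::thread;

// Hypothetical stand-in for a buffered stream: a producer thread pushes
// items into a bounded channel and blocks once `capacity` items are queued.
fn spawn_buffered_sketch(items: Vec<u32>, capacity: usize) -> Receiver<u32> {
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        for item in items {
            // `send` blocks while the buffer is full; it fails only if the
            // receiver has been dropped, in which case we stop producing.
            if tx.send(item).is_err() {
                break;
            }
        }
    });
    rx
}
```

With either capacity the consumer sees the same items in the same order; only the amount of read-ahead differs.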
##########
datafusion/physical-optimizer/src/pushdown_sort.rs:
##########
@@ -95,7 +100,14 @@ impl PhysicalOptimizerRule for PushdownSort {
// Each node type defines its own pushdown behavior via try_pushdown_sort()
match sort_input.try_pushdown_sort(required_ordering)? {
SortOrderPushdownResult::Exact { inner } => {
- // Data source guarantees perfect ordering - remove the Sort operator
+ // Data source guarantees perfect ordering - remove the Sort operator.
+ // Preserve the fetch (LIMIT) from the original SortExec so the
+ // data source can stop reading early.
+ let inner = if let Some(fetch) = sort_exec.fetch() {
+ inner.with_fetch(Some(fetch)).unwrap_or(inner)
Review Comment:
Is this removal of a fetch a bug fix? Or is it only needed for this branch?
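
To make the question concrete, here is a small sketch of the `with_fetch(..).unwrap_or(inner)` pattern from the hunk above. The `Node` struct, its `supports_fetch` flag, and the `preserve_fetch` helper are hypothetical stand-ins, not DataFusion types. The sketch only assumes that `with_fetch` returns `None` when a node cannot absorb a limit.

```rust
// Hypothetical stand-in for a plan node (not DataFusion's ExecutionPlan).
#[derive(Debug, Clone, PartialEq)]
struct Node {
    supports_fetch: bool,
    fetch: Option<usize>,
}

impl Node {
    // Returns a copy carrying the limit, or None if this node cannot
    // absorb a limit at all.
    fn with_fetch(&self, fetch: Option<usize>) -> Option<Node> {
        if self.supports_fetch {
            Some(Node { fetch, ..self.clone() })
        } else {
            None
        }
    }
}

// Mirrors the shape of the diff: carry the sort's fetch onto the inner
// node when possible, otherwise fall back to the inner node unchanged.
fn preserve_fetch(inner: Node, sort_fetch: Option<usize>) -> Node {
    match sort_fetch {
        Some(fetch) => inner.with_fetch(Some(fetch)).unwrap_or(inner),
        None => inner,
    }
}
```

Note that in this sketch the fallback branch returns a node without the limit, so the fetch is lost whenever the inner node cannot accept it. Whether the PR handles that case elsewhere is not visible in the quoted hunk.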
##########
datafusion/sqllogictest/test_files/sort_pushdown.slt:
##########
@@ -1100,16 +1100,15 @@ CREATE EXTERNAL TABLE reversed_parquet(id INT, value INT)
STORED AS PARQUET
LOCATION 'test_files/scratch/sort_pushdown/reversed/';
-# Test 4.1: SortExec must be present because files are not in inter-file order
+# Test 4.1: Files sorted by statistics → non-overlapping → SortExec eliminated
Review Comment:
Maybe it is worth pointing out that we expect these files not to need a sort
because they are already sorted (and thus the plan should not contain a
SortExec).
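
As an illustration of the "sorted by statistics, non-overlapping" condition the new test comment refers to, here is a rough sketch. The `FileStats` struct and the `non_overlapping_after_sort` function are hypothetical and are not taken from the PR. The sketch assumes each file is internally sorted and exposes min/max statistics for the sort column.

```rust
// Hypothetical per-file min/max statistics for the sort column.
#[derive(Debug, Clone, Copy)]
struct FileStats {
    min: i64,
    max: i64,
}

// Orders files by their statistics, then checks that each file ends
// no later than the next one begins. If that holds, reading the files in
// this order yields globally sorted output without a separate sort step.
fn non_overlapping_after_sort(mut files: Vec<FileStats>) -> bool {
    files.sort_by_key(|f| (f.min, f.max));
    files.windows(2).all(|pair| pair[0].max <= pair[1].min)
}
```

Files listed in reverse order, such as ranges 7..9, 4..6, 1..3, pass this check once reordered, while ranges 1..5 and 4..8 overlap and would still need a sort.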