xudong963 commented on code in PR #18868:
URL: https://github.com/apache/datafusion/pull/18868#discussion_r2558449538


##########
datafusion/datasource-parquet/src/opener.rs:
##########
@@ -407,8 +407,12 @@ impl FileOpener for ParquetOpener {
                     .add_matched(n_remaining_row_groups);
             }
 
-            let mut access_plan = row_groups.build();
+            // Prune by limit
+            if let Some(limit) = limit {

Review Comment:
   
https://github.com/apache/datafusion/pull/18868/commits/52f012fe695567c84fbee1786489c24b42cb5c3b
   
   We can't use `limit` directly in opener.rs to decide whether to prune by 
limit: even when the limit has been pushed down to the scan, the query may 
still have ordering semantics. 
   
   For example:
   `select * from t where t > 1 order by a limit 1`
   
   If table t's file is sorted by a, the `enforce_sorting` physical optimizer 
rule can remove the sort because t already satisfies the ordering requirement, 
and the `limit_pushdown` physical optimizer rule then pushes the limit down to 
the scan. At that point the scan sees only a limit, with no sign that the 
result must follow the order of a.
   
   So we need to decide whether the limit is order-sensitive at the logical 
plan level, before the sort is removed. The commit does this check in the 
`limit_pushdown` logical optimizer rule.
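   
   To make the concern concrete, here is a minimal sketch of the decision, 
not the code in this PR. `RowGroup`, `row_groups_to_scan`, and the 
`order_sensitive` flag are hypothetical names used only for illustration; the 
sketch assumes limit pruning keeps just enough fully matched row groups to 
cover the limit, and that the planner tells the opener whether the limit is 
order-sensitive.

```rust
/// Hypothetical summary of one row group after statistics-based pruning.
#[derive(Debug, Clone, Copy)]
struct RowGroup {
    /// Number of rows in the row group.
    num_rows: usize,
    /// True if statistics prove that every row passes the predicate.
    fully_matched: bool,
}

/// Returns the indices of the row groups that still have to be scanned.
///
/// Limit pruning is applied only when a limit is present and the limit is
/// not order-sensitive. Otherwise every row group is kept, because a skipped
/// row group could contain the rows that come first in the required order.
fn row_groups_to_scan(
    groups: &[RowGroup],
    limit: Option<usize>,
    order_sensitive: bool,
) -> Vec<usize> {
    let all: Vec<usize> = (0..groups.len()).collect();

    let limit = match limit {
        Some(l) if !order_sensitive => l,
        // No limit, or the limit depends on the ordering: do not prune.
        _ => return all,
    };

    // Keep fully matched row groups until they cover the limit.
    let mut kept = Vec::new();
    let mut rows = 0;
    for (i, g) in groups.iter().enumerate() {
        if g.fully_matched {
            kept.push(i);
            rows += g.num_rows;
            if rows >= limit {
                return kept;
            }
        }
    }

    // Fully matched row groups alone cannot satisfy the limit: keep all.
    all
}
```

   With three row groups of 100 rows each, where only the last two are fully 
matched, a limit of 1 keeps just row group 1 when the limit is not 
order-sensitive. For the `order by a limit 1` query above, row group 0 may hold 
the smallest matching value of a, so all three row groups have to stay.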



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

