zhuqi-lucas commented on code in PR #19433:
URL: https://github.com/apache/datafusion/pull/19433#discussion_r2642470624


##########
datafusion/physical-expr-common/src/sort_expr.rs:
##########
@@ -458,6 +458,21 @@ impl LexOrdering {
         })
     }
 
+    /// Checks if `other` is a prefix of this `LexOrdering`.
+    pub fn is_prefix(&self, other: &LexOrdering) -> bool {

Review Comment:
   I meet some similar topic for the sort pushdown, and submitted a 
[PR](https://github.com/apache/datafusion/pull/19446#issue-3752083820), i 
prefer to use eq_properties to do similar things, it will handle more cases 
after my try in my internal project. 
   
   
   The ordering_satisfy will handle more cases than prefix.



##########
datafusion/catalog-listing/src/table.rs:
##########
@@ -336,17 +336,64 @@ impl ListingTable {
         self.options.format.file_source(table_schema)
     }
 
-    /// If file_sort_order is specified, creates the appropriate physical 
expressions
+    /// Creates output ordering from user-specified file_sort_order or derives
+    /// from file orderings when user doesn't specify.
+    ///
+    /// If user specified `file_sort_order`, that takes precedence.
+    /// Otherwise, attempts to derive common ordering from file orderings in
+    /// the provided file groups.
     pub fn try_create_output_ordering(
         &self,
         execution_props: &ExecutionProps,
+        file_groups: &[FileGroup],
     ) -> datafusion_common::Result<Vec<LexOrdering>> {
-        create_lex_ordering(
-            &self.table_schema,
-            &self.options.file_sort_order,
-            execution_props,
-        )
+        // If user specified sort order, use that
+        if !self.options.file_sort_order.is_empty() {
+            return create_lex_ordering(
+                &self.table_schema,
+                &self.options.file_sort_order,
+                execution_props,
+            );
+        }
+
+        // Otherwise, try to derive from file orderings
+        Ok(derive_common_ordering_from_files(file_groups))
+    }
+}
+
+/// Derives a common ordering from file orderings across all file groups.
+///
+/// Returns the common ordering if all files have compatible orderings,
+/// otherwise returns an empty Vec (no ordering).
+fn derive_common_ordering_from_files(file_groups: &[FileGroup]) -> 
Vec<LexOrdering> {
+    // Collect all file orderings
+    let mut all_orderings: Vec<&LexOrdering> = Vec::new();
+    for group in file_groups {
+        for file in group.iter() {
+            if let Some(ordering) = &file.ordering {
+                all_orderings.push(ordering);
+            } else {
+                // If any file has no ordering, we can't derive a common 
ordering
+                return vec![];
+            }
+        }
+    }
+
+    if all_orderings.is_empty() {
+        return vec![];
     }
+
+    // Check that all orderings are identical
+    let first = all_orderings[0];
+    for ordering in &all_orderings[1..] {
+        if *ordering != first {

Review Comment:
   Do we want to support more relax cases, for example to use similar to:
   
   
https://github.com/apache/datafusion/blob/e6faacbf58f3db9321b3bff1fbcf351385492ced/datafusion/physical-expr/src/equivalence/properties/mod.rs#L533
 
   



##########
datafusion/core/tests/physical_optimizer/filter_pushdown/mod.rs:
##########
@@ -1829,7 +1829,7 @@ async fn test_topk_dynamic_filter_pushdown_integration() {
 COPY  (
   SELECT 1372708800 + value AS t
   FROM generate_series(0, 99999)
-  ORDER BY t
+  ORDER BY t + 1

Review Comment:
   Why this changes to t+1?



##########
datafusion/sqllogictest/test_files/parquet.slt:
##########
@@ -885,3 +885,42 @@ WHERE b = 2;
 
 statement ok
 DROP TABLE t;
+

Review Comment:
    It seems only test ASC NULLS FIRST. Better to add test cases for:
        - DESC ordering
        - Multi-column ordering
        - Files with no ordering metadata



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to