ozankabak commented on code in PR #5035:
URL: https://github.com/apache/arrow-datafusion/pull/5035#discussion_r1090117567


##########
datafusion/core/src/physical_plan/union.rs:
##########
@@ -220,55 +220,55 @@ impl ExecutionPlan for UnionExec {
     }
 
     fn output_ordering(&self) -> Option<&[PhysicalSortExpr]> {
-        let first_input_ordering = self.inputs[0].output_ordering();
-        // If the Union is not partition aware and all the input ordering spec 
strictly equal with the first_input_ordering
-        // Return the first_input_ordering as the output_ordering
-        //
-        // It might be too strict here in the case that the input ordering are 
compatible but not exactly the same.
-        // For example one input ordering has the ordering spec 
SortExpr('a','b','c') and the other has the ordering
-        // spec SortExpr('a'), It is safe to derive the out ordering with the 
spec SortExpr('a').
-        if !self.partition_aware
-            && first_input_ordering.is_some()
-            && self
-                .inputs
-                .iter()
-                .map(|plan| plan.output_ordering())
-                .all(|ordering| {
-                    ordering.is_some()
-                        && sort_expr_list_eq_strict_order(
-                            ordering.unwrap(),
-                            first_input_ordering.unwrap(),
-                        )
-                })
-        {
-            first_input_ordering
+        // If the Union is partition aware, there is no output ordering.
+        // Otherwise, the output ordering is the "meet" of its input orderings.
+        // The meet is the finest ordering that satisfied by all the input
+        // orderings, see https://en.wikipedia.org/wiki/Join_and_meet.
+        if self.partition_aware {
+            return None;
+        }
+        // To find the meet, we first find the smallest input ordering.
+        let mut smallest: Option<&[PhysicalSortExpr]> = None;

Review Comment:
   @mingmwang, I spent a little bit of time thinking how this can be 
implemented. Ignoring equivalences, I came up with this:
   ```rust
   let mut idx = 0;
   let first = orderings[0];
   loop {
       for ordering in orderings.iter() {
           if idx >= ordering.len() {
               return Some(ordering);
           } else if ordering[idx] != first[idx] {
               return if idx > 0 {
                   Some(&ordering[..idx])
               } else {
                   None
               };
           }
       }
       idx += 1;
   }
   ```
   I wrote some tests for this and it seems to work for various cases 
(including your example). What do you think? 
   
   Circling back to equivalences, I am not sure if such handling is necessary 
or sensible in this case -- maybe you can provide some insight about that. If 
such handling is necessary, we will need to replace the not-equality check 
`ordering[idx] != first[idx]` with an equivalence-aware check. However, at this 
point, I am not sure what to replace it with (if it is indeed necessary). If 
you can help out with these points, we can quickly make a PR and get the fix 
in. Thanks.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to