ozankabak commented on code in PR #5035:
URL: https://github.com/apache/arrow-datafusion/pull/5035#discussion_r1090071827


##########
datafusion/core/src/physical_plan/union.rs:
##########
@@ -220,55 +220,55 @@ impl ExecutionPlan for UnionExec {
     }
 
     fn output_ordering(&self) -> Option<&[PhysicalSortExpr]> {
-        let first_input_ordering = self.inputs[0].output_ordering();
-        // If the Union is not partition aware and all the input ordering spec 
strictly equal with the first_input_ordering
-        // Return the first_input_ordering as the output_ordering
-        //
-        // It might be too strict here in the case that the input ordering are 
compatible but not exactly the same.
-        // For example one input ordering has the ordering spec 
SortExpr('a','b','c') and the other has the ordering
-        // spec SortExpr('a'), It is safe to derive the out ordering with the 
spec SortExpr('a').
-        if !self.partition_aware
-            && first_input_ordering.is_some()
-            && self
-                .inputs
-                .iter()
-                .map(|plan| plan.output_ordering())
-                .all(|ordering| {
-                    ordering.is_some()
-                        && sort_expr_list_eq_strict_order(
-                            ordering.unwrap(),
-                            first_input_ordering.unwrap(),
-                        )
-                })
-        {
-            first_input_ordering
+        // If the Union is partition aware, there is no output ordering.
+        // Otherwise, the output ordering is the "meet" of its input orderings.
+        // The meet is the finest ordering that satisfied by all the input
+        // orderings, see https://en.wikipedia.org/wiki/Join_and_meet.
+        if self.partition_aware {
+            return None;
+        }
+        // To find the meet, we first find the smallest input ordering.
+        let mut smallest: Option<&[PhysicalSortExpr]> = None;

Review Comment:
   Regarding 1, the initial implementation treats input orderings "atomically", 
so it finds the meet in that sense. I think you are suggesting we find the meet 
by "looking inside". So the initial implementation is overly conservative; i.e. 
wrong in a relatively harmless way (it will lose fine ordering information but 
not come up with a wrong ordering). Nevertheless, I agree that we should fix it.
   
   Regarding 2, what would be your suggestion? Should we check ordering without 
any normalization?
   
   Happy to fix this quickly based on your feedback. Thanks for the detailed 
review, and no worries about the late response.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to