Jefffrey commented on code in PR #8780:
URL: https://github.com/apache/arrow-datafusion/pull/8780#discussion_r1463097872


##########
datafusion/optimizer/src/simplify_expressions/simplify_exprs.rs:
##########
@@ -93,16 +93,33 @@ impl SimplifyExpressions {
             .map(|input| Self::optimize_internal(input, execution_props))
             .collect::<Result<Vec<_>>>()?;
 
-        let expr = plan
-            .expressions()
-            .into_iter()
-            .map(|e| {
-                // TODO: unify with `rewrite_preserving_name`
-                let original_name = e.name_for_alias()?;
-                let new_e = simplifier.simplify(e)?;
-                new_e.alias_if_changed(original_name)
-            })
-            .collect::<Result<Vec<_>>>()?;
+        let expr = match plan {
+            // Canonicalize step won't reorder expressions in a Join on clause.
+            // The left and right expressions in a Join on clause are not commutative.

Review Comment:
   ```suggestion
               // Canonicalize step won't reorder expressions in a Join on clause.
               // The left and right expressions in a Join on clause are not commutative,
               // since the order of the columns must match the order of the children.
   ```
   
   Just an extra bit of clarification
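
   To illustrate the point of the suggested comment, here is a minimal, self-contained sketch. It does not use DataFusion types: column names are plain strings, and `keys_match_children` and `canonicalize` are hypothetical helpers invented for this example. It assumes the canonicalize step orders the two sides of an equality lexically, which is a simplification.

   ```rust
   use std::collections::HashSet;

   /// Hypothetical check: each pair is (left-child column, right-child column),
   /// so the first element must exist in the left schema and the second in the right.
   fn keys_match_children(
       on: &[(&str, &str)],
       left: &HashSet<&str>,
       right: &HashSet<&str>,
   ) -> bool {
       on.iter().all(|&(l, r)| left.contains(l) && right.contains(r))
   }

   /// Hypothetical stand-in for a canonicalize step that treats `a = b` as
   /// commutative and puts the lexically smaller side first.
   fn canonicalize<'a>(on: &[(&'a str, &'a str)]) -> Vec<(&'a str, &'a str)> {
       on.iter()
           .map(|&(l, r)| if l <= r { (l, r) } else { (r, l) })
           .collect()
   }
   ```

   With a left child exposing `t2.a` and a right child exposing `t1.a`, the pair `("t2.a", "t1.a")` passes the check, while its canonicalized form `("t1.a", "t2.a")` no longer lines up with the children.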



##########
datafusion/physical-plan/src/joins/utils.rs:
##########
@@ -293,9 +293,17 @@ fn check_join_set_is_valid(
     let right_missing = on_right.difference(right).collect::<HashSet<_>>();
 
     if !left_missing.is_empty() | !right_missing.is_empty() {
-        return plan_err!(
-            "The left or right side of the join does not have all columns on \"on\": \nMissing on the left: {left_missing:?}\nMissing on the right: {right_missing:?}"
-        );
+        let on_left_reverse = &on.iter().map(|on| on.1.clone()).collect::<HashSet<_>>();
+        let left_missing_reverse =
+            on_left_reverse.difference(left).collect::<HashSet<_>>();
+        let on_right_reverse = &on.iter().map(|on| on.0.clone()).collect::<HashSet<_>>();
+        let right_missing_reverse =
+            on_right_reverse.difference(right).collect::<HashSet<_>>();
+        if !left_missing_reverse.is_empty() | !right_missing_reverse.is_empty() {
+            return plan_err!(
+                "The left or right side of the join does not have all columns on \"on\": \nMissing on the left: {left_missing:?}\nMissing on the right: {right_missing:?}"
+            );
+        }

Review Comment:
   Is this change still required, now that the expressions in a Join's `on` clause should no longer be reordered?


