Re: [PR] feat: support normalized expr in CSE [datafusion]

via GitHub Thu, 14 Nov 2024 09:16:56 -0800


zhuliquan commented on code in PR #13315:
URL: https://github.com/apache/datafusion/pull/13315#discussion_r1842616030



##########
datafusion/expr/src/expr.rs:
##########
@@ -1674,6 +1674,69 @@ impl Expr {
     }
 }
 
+impl Normalizeable for Expr {
+    fn can_normalize(&self) -> bool {
+        #[allow(clippy::match_like_matches_macro)]
+        match self {
+            Expr::BinaryExpr(BinaryExpr {
+                op:
+                    _op @ (Operator::Plus
+                    | Operator::Multiply
+                    | Operator::BitwiseAnd
+                    | Operator::BitwiseOr
+                    | Operator::BitwiseXor
+                    | Operator::Eq
+                    | Operator::NotEq),
+                ..
+            }) => true,
+            _ => false,
+        }
+    }
+}
+
+impl NormalizeEq for Expr {
+    fn normalize_eq(&self, other: &Self) -> bool {
+        match (self, other) {
+            (
+                Expr::BinaryExpr(BinaryExpr {
+                    left: self_left,
+                    op: self_op,
+                    right: self_right,
+                }),
+                Expr::BinaryExpr(BinaryExpr {
+                    left: other_left,
+                    op: other_op,
+                    right: other_right,
+                }),
+            ) => {
+                if self_op != other_op {
+                    return false;
+                }
+
+                if matches!(
+                    self_op,
+                    Operator::Plus
+                        | Operator::Multiply
+                        | Operator::BitwiseAnd
+                        | Operator::BitwiseOr
+                        | Operator::BitwiseXor
+                        | Operator::Eq
+                        | Operator::NotEq
+                ) {
+                    (self_left.normalize_eq(other_left)
+                        && self_right.normalize_eq(other_right))
+                        || (self_left.normalize_eq(other_right)
+                            && self_right.normalize_eq(other_left))
+                } else {
+                    self_left.normalize_eq(other_left)
+                        && self_right.normalize_eq(other_right)
+                }
+            }
+            (_, _) => self == other,

Review Comment:
   Hi @peter-toth, apologies for the delayed commit. I've added more arms to the 
`normalize_eq` function to handle commutative `BinaryExpr` comparisons inside other 
expressions. While working on this, I also noticed that other expressions could 
benefit from normalization. For example, with `InList` and `CaseWhen` 
expressions, we can ignore the order of elements.
   
   You can see the relevant code here:
   
https://github.com/apache/datafusion/blob/cc11692226da7e5dd49caaee2a8c3e66af920d4c/datafusion/expr/src/expr.rs#L2013
   
https://github.com/apache/datafusion/blob/cc11692226da7e5dd49caaee2a8c3e66af920d4c/datafusion/expr/src/expr.rs#L2034-L2036
   
   In this case, I think the `normalize_eq(&self, other: &Self) -> bool` trait 
method is not the best way to handle this scenario, because its time complexity 
is almost exponential. It seems better to normalize the expressions first and 
then compare them. Do you have any suggestions on how to approach this?
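   
   To make the normalize-then-compare idea concrete, here is a minimal sketch. It uses a toy `ToyExpr` type rather than DataFusion's `Expr`, and every name in it (`ToyExpr`, `Op`, `normalize`, `normalized_eq`, and the small constructor helpers) is hypothetical and not part of this PR. The assumption is that a derived `Ord` gives a total order we can use to put commutative operands and `InList` elements into a canonical order, so each expression is rewritten once and then compared structurally.

```rust
// Toy types for illustration only; this is NOT DataFusion's `Expr`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum Op {
    Plus,
    Minus,
    Multiply,
    Eq,
}

#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum ToyExpr {
    Column(String),
    Literal(i64),
    Binary { left: Box<ToyExpr>, op: Op, right: Box<ToyExpr> },
    InList { expr: Box<ToyExpr>, list: Vec<ToyExpr>, negated: bool },
}

// Operators whose operands may be swapped without changing the result.
fn is_commutative(op: Op) -> bool {
    matches!(op, Op::Plus | Op::Multiply | Op::Eq)
}

// Rewrite an expression bottom-up into a canonical form:
// commutative operands are ordered, and `InList` elements are sorted.
fn normalize(expr: &ToyExpr) -> ToyExpr {
    match expr {
        ToyExpr::Binary { left, op, right } => {
            let (mut l, mut r) = (normalize(left), normalize(right));
            if is_commutative(*op) && l > r {
                std::mem::swap(&mut l, &mut r);
            }
            ToyExpr::Binary { left: Box::new(l), op: *op, right: Box::new(r) }
        }
        ToyExpr::InList { expr, list, negated } => {
            let mut items: Vec<ToyExpr> = list.iter().map(normalize).collect();
            items.sort();
            ToyExpr::InList {
                expr: Box::new(normalize(expr)),
                list: items,
                negated: *negated,
            }
        }
        // Leaves are already in canonical form.
        other => other.clone(),
    }
}

// Each side is normalized once, then compared with plain structural equality.
fn normalized_eq(a: &ToyExpr, b: &ToyExpr) -> bool {
    normalize(a) == normalize(b)
}

// Small constructor helpers to keep the usage readable.
fn col(name: &str) -> ToyExpr {
    ToyExpr::Column(name.to_string())
}

fn lit(value: i64) -> ToyExpr {
    ToyExpr::Literal(value)
}

fn bin(left: ToyExpr, op: Op, right: ToyExpr) -> ToyExpr {
    ToyExpr::Binary { left: Box::new(left), op, right: Box::new(right) }
}

fn in_list(expr: ToyExpr, list: Vec<ToyExpr>) -> ToyExpr {
    ToyExpr::InList { expr: Box::new(expr), list, negated: false }
}
```

   With this shape, `normalized_eq(&bin(col("a"), Op::Plus, col("b")), &bin(col("b"), Op::Plus, col("a")))` holds, while the same pair built with `Op::Minus` does not. The comparison never has to try both operand orders at every level, which is where the pairwise approach gets expensive. The sketch only sorts list elements and does not deduplicate them; whether the real implementation should do either is an open question.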
   
   




