Jefffrey commented on code in PR #8780:
URL: https://github.com/apache/arrow-datafusion/pull/8780#discussion_r1445268405
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -224,6 +226,62 @@ impl<S: SimplifyInfo> ExprSimplifier<S> {
}
}
+#[allow(rustdoc::private_intra_doc_links)]
+/// Canonicalize any BinaryExprs that are not in canonical form
+/// <literal> <op> <col> is rewritten to <col> <op> <literal> (remember to
switch the operator)
+/// <col> <op> <literal> is canonical
+/// <col1> <op> <col2> is rewritten so that the name of col1 sorts higher than
col2 (b > a would be canonicalized to a < b)
Review Comment:
Might need to upgrade the docs here, for example remove the `remember to
switch the operator` line (or change it to clarify that operator would be
switched)
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -224,6 +226,62 @@ impl<S: SimplifyInfo> ExprSimplifier<S> {
}
}
+#[allow(rustdoc::private_intra_doc_links)]
+/// Canonicalize any BinaryExprs that are not in canonical form
+/// <literal> <op> <col> is rewritten to <col> <op> <literal> (remember to
switch the operator)
+/// <col> <op> <literal> is canonical
+/// <col1> <op> <col2> is rewritten so that the name of col1 sorts higher than
col2 (b > a would be canonicalized to a < b)
+struct Canonicalizer {}
+
+impl Canonicalizer {
+ fn new() -> Self {
+ Self {}
+ }
+}
+
+impl TreeNodeRewriter for Canonicalizer {
+ type N = Expr;
+
+ fn mutate(&mut self, expr: Expr) -> Result<Expr> {
+ if let Expr::BinaryExpr(BinaryExpr { left, op, right }) = expr {
+ // Case 1, <col1> <op> <col2>
+ let mut new_expr = BinaryExpr {
+ left: left.clone(),
+ op: op.clone(),
+ right: right.clone(),
+ };
Review Comment:
I wonder if there is a cleaner way to do this, without having to clone here
(since this will always clone for `Expr;:BinaryExpr` even if we don't end up
swapping)
Could try an approach without relying on mutability of `new_expr`
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -224,6 +226,62 @@ impl<S: SimplifyInfo> ExprSimplifier<S> {
}
}
+#[allow(rustdoc::private_intra_doc_links)]
+/// Canonicalize any BinaryExprs that are not in canonical form
+/// <literal> <op> <col> is rewritten to <col> <op> <literal> (remember to
switch the operator)
+/// <col> <op> <literal> is canonical
+/// <col1> <op> <col2> is rewritten so that the name of col1 sorts higher than
col2 (b > a would be canonicalized to a < b)
+struct Canonicalizer {}
+
+impl Canonicalizer {
+ fn new() -> Self {
+ Self {}
+ }
+}
+
+impl TreeNodeRewriter for Canonicalizer {
+ type N = Expr;
+
+ fn mutate(&mut self, expr: Expr) -> Result<Expr> {
+ if let Expr::BinaryExpr(BinaryExpr { left, op, right }) = expr {
+ // Case 1, <col1> <op> <col2>
+ let mut new_expr = BinaryExpr {
+ left: left.clone(),
+ op: op.clone(),
+ right: right.clone(),
+ };
+ match (left.as_ref(), right.as_ref()) {
Review Comment:
You could even push the `op.swap()` into this match, to remove the nested if
blocks, e.g.
```rust
match (left.as_ref(), right.as_ref(), op.swap()) {
(Expr::Column(_), Expr::Column(_), Some(swapped_op)) => {
todo!(); // swap based on name, directly using the swapped_op
}
(Expr::Literal(_), Expr::Column(_), Some(swapped_op)) => {
todo!(); // swap to have column on left
}
_ => todo!(),
}
```
You could then even add the name check of left vs right as an if condition
to the branch
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -224,6 +226,62 @@ impl<S: SimplifyInfo> ExprSimplifier<S> {
}
}
+#[allow(rustdoc::private_intra_doc_links)]
+/// Canonicalize any BinaryExprs that are not in canonical form
+/// <literal> <op> <col> is rewritten to <col> <op> <literal> (remember to
switch the operator)
+/// <col> <op> <literal> is canonical
+/// <col1> <op> <col2> is rewritten so that the name of col1 sorts higher than
col2 (b > a would be canonicalized to a < b)
+struct Canonicalizer {}
+
+impl Canonicalizer {
+ fn new() -> Self {
+ Self {}
+ }
+}
+
+impl TreeNodeRewriter for Canonicalizer {
+ type N = Expr;
+
+ fn mutate(&mut self, expr: Expr) -> Result<Expr> {
+ if let Expr::BinaryExpr(BinaryExpr { left, op, right }) = expr {
+ // Case 1, <col1> <op> <col2>
+ let mut new_expr = BinaryExpr {
+ left: left.clone(),
+ op: op.clone(),
+ right: right.clone(),
+ };
+ match (left.as_ref(), right.as_ref()) {
+ (Expr::Column(_a), Expr::Column(_b)) => {
+ let left_name = left.canonical_name();
+ let right_name = right.canonical_name();
+ if left_name < right_name {
Review Comment:
I think this should be the other way around, if you want to be consistent
with the docs
```rust
/// <col1> <op> <col2> is rewritten so that the name of col1 sorts higher
than col2 (b > a would be canonicalized to a < b)
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]