ion-elgreco opened a new issue, #14943: URL: https://github.com/apache/datafusion/issues/14943
### Describe the bug
I was working on improvements to merge in delta-rs, adding an expr simplifier
to our early pruning filter. While doing this I noticed that expressions of
this type: `s.foo = 'a' and 'a' = s.foo` are not simplified directly into
`s.foo = 'a'` but rather into `s.foo = 'a' and s.foo = 'a'`, even when you set
max cycles to a ludicrous number such as 100.
If swapping the lhs and rhs of an expression makes it equal to an existing
expression, it should be removed in one cycle.
See one of our logs after using the simplifier:
Before simplifying
```
[crates/core/src/operations/merge/mod.rs:864:5] commit_predicate.clone() = Some(
    "unique_row_hash BETWEEN 'new_hash' AND 'new_hash' AND 202502 = month_id AND month_id = 202502 AND 20250226 = date_id AND date_id = 20250226",
)
```
After simplifying once with max cycles 100
```
[crates/core/src/operations/merge/mod.rs:886:5] commit_predicate.clone() = Some(
    "unique_row_hash >= 'new_hash' AND unique_row_hash <= 'new_hash' AND month_id = 202502 AND month_id = 202502 AND date_id = 20250226 AND date_id = 20250226",
)
```
### To Reproduce
```
Some(
    BinaryExpr(
        BinaryExpr {
            left: BinaryExpr(
                BinaryExpr {
                    left: BinaryExpr(
                        BinaryExpr {
                            left: Column(
                                Column {
                                    relation: Some(
                                        Bare {
                                            table: "t",
                                        },
                                    ),
                                    name: "unique_row_hash",
                                },
                            ),
                            op: GtEq,
                            right: Literal(
                                Utf8("new_hash"),
                            ),
                        },
                    ),
                    op: And,
                    right: BinaryExpr(
                        BinaryExpr {
                            left: Column(
                                Column {
                                    relation: Some(
                                        Bare {
                                            table: "t",
                                        },
                                    ),
                                    name: "unique_row_hash",
                                },
                            ),
                            op: LtEq,
                            right: Literal(
                                Utf8("new_hash"),
                            ),
                        },
                    ),
                },
            ),
            op: And,
            right: BinaryExpr(
                BinaryExpr {
                    left: BinaryExpr(
                        BinaryExpr {
                            left: BinaryExpr(
                                BinaryExpr {
                                    left: BinaryExpr(
                                        BinaryExpr {
                                            left: Column(
                                                Column {
                                                    relation: Some(
                                                        Bare {
                                                            table: "t",
                                                        },
                                                    ),
                                                    name: "month_id",
                                                },
                                            ),
                                            op: Eq,
                                            right: Literal(
                                                Int32(202502),
                                            ),
                                        },
                                    ),
                                    op: And,
                                    right: BinaryExpr(
                                        BinaryExpr {
                                            left: Column(
                                                Column {
                                                    relation: Some(
                                                        Bare {
                                                            table: "t",
                                                        },
                                                    ),
                                                    name: "month_id",
                                                },
                                            ),
                                            op: Eq,
                                            right: Literal(
                                                Int64(202502),
                                            ),
                                        },
                                    ),
                                },
                            ),
                            op: And,
                            right: BinaryExpr(
                                BinaryExpr {
                                    left: Column(
                                        Column {
                                            relation: Some(
                                                Bare {
                                                    table: "t",
                                                },
                                            ),
                                            name: "date_id",
                                        },
                                    ),
                                    op: Eq,
                                    right: Literal(
                                        Int32(20250226),
                                    ),
                                },
                            ),
                        },
                    ),
                    op: And,
                    right: BinaryExpr(
                        BinaryExpr {
                            left: Column(
                                Column {
                                    relation: Some(
                                        Bare {
                                            table: "t",
                                        },
                                    ),
                                    name: "date_id",
                                },
                            ),
                            op: Eq,
                            right: Literal(
                                Int64(20250226),
                            ),
                        },
                    ),
                },
            ),
        },
    ),
)
```
### Expected behavior
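The simplifier should go further and remove the duplicated comparisons, e.g.
`month_id = 202502 AND month_id = 202502` should become `month_id = 202502`.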
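To make the requested behavior concrete, here is a minimal, std-only sketch of the rule described above: rewrite `literal = column` as `column = literal`, then drop any conjunct that has already been seen. The `Expr` enum and the functions `canonicalize`, `split_conjunction` and `dedup_conjuncts` are hypothetical names invented for this illustration. They are not DataFusion's types or its simplifier, and this is not a proposed patch.

```rust
use std::collections::HashSet;

/// Toy expression type: a hypothetical stand-in, NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Expr {
    Column(String),
    Literal(i64),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Rewrite `literal = column` as `column = literal`; leave everything else as is.
fn canonicalize(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(l, r) => match (*l, *r) {
            (lit @ Expr::Literal(_), col @ Expr::Column(_)) => {
                Expr::Eq(Box::new(col), Box::new(lit))
            }
            (l, r) => Expr::Eq(Box::new(l), Box::new(r)),
        },
        other => other,
    }
}

/// Flatten nested `And` nodes into a flat list of conjuncts, left to right.
fn split_conjunction(expr: Expr, out: &mut Vec<Expr>) {
    match expr {
        Expr::And(l, r) => {
            split_conjunction(*l, out);
            split_conjunction(*r, out);
        }
        other => out.push(other),
    }
}

/// Canonicalize each conjunct, keep only its first occurrence, and rebuild
/// a left-deep `And` chain from what remains.
fn dedup_conjuncts(expr: Expr) -> Expr {
    let mut conjuncts = Vec::new();
    split_conjunction(expr, &mut conjuncts);

    let mut seen = HashSet::new();
    let mut unique = Vec::new();
    for conjunct in conjuncts {
        let conjunct = canonicalize(conjunct);
        if seen.insert(conjunct.clone()) {
            unique.push(conjunct);
        }
    }

    unique
        .into_iter()
        .reduce(|acc, e| Expr::And(Box::new(acc), Box::new(e)))
        .expect("split_conjunction always yields at least one conjunct")
}
```

Under these assumptions, passing the toy equivalent of `202502 = month_id AND month_id = 202502` to `dedup_conjuncts` yields the single comparison `month_id = 202502` in one pass. The toy uses a single integer literal type; the dump under To Reproduce shows `Int32` and `Int64` literals on the two copies of each comparison, which this sketch does not model.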
### Additional context
_No response_
