Jefffrey opened a new issue, #8724:
URL: https://github.com/apache/arrow-datafusion/issues/8724
### Is your feature request related to a problem or challenge?
Given this case:
```
DataFusion CLI v34.0.0
❯ CREATE TABLE kumachan (wakana string) AS VALUES ('ookuma');
0 rows in set. Query took 0.004 seconds.
❯ explain select * from kumachan where wakana = 'ookuma' and 'ookuma' = wakana;
+---------------+-------------------------------------------------------------------------------+
| plan_type     | plan                                                                          |
+---------------+-------------------------------------------------------------------------------+
| logical_plan  | Filter: kumachan.wakana = Utf8("ookuma") AND Utf8("ookuma") = kumachan.wakana |
|               |   TableScan: kumachan projection=[wakana]                                     |
| physical_plan | CoalesceBatchesExec: target_batch_size=8192                                   |
|               |   FilterExec: wakana@0 = ookuma AND ookuma = wakana@0                         |
|               |     MemoryExec: partitions=1, partition_sizes=[1]                             |
|               |                                                                               |
+---------------+-------------------------------------------------------------------------------+
2 rows in set. Query took 0.004 seconds.
❯
```
I would expect the final plan to simplify the `FilterExec` predicate to just
`wakana@0 = ookuma` (or `ookuma = wakana@0`), since both sides of the `AND` are
the same comparison with its operands swapped, which makes the conjunction
redundant.
### Describe the solution you'd like
Possible solutions:
1. Introduce a new optimizer rule that reorders every `BinaryExpr` whose `op` is
`Operator::Eq` into a consistent operand order. It would be placed high in the
rules list (during analysis?) so that downstream rules (like
simplify_expressions) can recognize the two predicates as equal and simplify
the condition.
2. Modify `BinaryExpr` to have a custom `PartialEq`/`Eq` implementation
that disregards the order of its `left`/`right` fields when checking equality:
https://github.com/apache/arrow-datafusion/blob/9a6cc889a40e4740bfc859557a9ca9c8d043891e/datafusion/expr/src/expr.rs#L208-L217
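To make option 2 concrete, here is a rough sketch of an order-insensitive equality check. `Operand` and `EqExpr` are hypothetical toy types made up for illustration, not DataFusion's `Expr`/`BinaryExpr`. The sketch models only an `=` node; a real `BinaryExpr` covers all operators, so the swap would only be valid for commutative ones. Note that `Hash` has to be hand-written as well, so that it stays consistent with the custom `PartialEq`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

// Hypothetical toy operand type (NOT DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
enum Operand {
    Column(String),
    Literal(String),
}

// Hypothetical stand-in for a `BinaryExpr` whose operator is always `=`.
#[derive(Debug, Clone)]
struct EqExpr {
    left: Operand,
    right: Operand,
}

impl PartialEq for EqExpr {
    // Equal if the operands match in the same order or in swapped order.
    fn eq(&self, other: &Self) -> bool {
        (self.left == other.left && self.right == other.right)
            || (self.left == other.right && self.right == other.left)
    }
}

impl Eq for EqExpr {}

impl Hash for EqExpr {
    // Combine the operand hashes with a commutative operation (addition),
    // so that `a = b` and `b = a` hash identically, as `Eq` requires.
    fn hash<H: Hasher>(&self, state: &mut H) {
        let hash_one = |operand: &Operand| {
            let mut hasher = DefaultHasher::new();
            operand.hash(&mut hasher);
            hasher.finish()
        };
        hash_one(&self.left)
            .wrapping_add(hash_one(&self.right))
            .hash(state);
    }
}
```

With these impls, an `EqExpr` for `wakana = 'ookuma'` compares equal to one for `'ookuma' = wakana`, and both land in the same `HashSet` slot.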
Option 1 seems the cleaner solution, since we don't have to muck with a manual
implementation of `PartialEq`/`Eq`.
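As an illustration of option 1, here is a minimal sketch of a canonicalization pass. The `Expr` enum and the `canonicalize`/`simplify` functions are hypothetical toys, not DataFusion's actual types or optimizer rule API. The ordering used (the derived `Ord`) is an arbitrary assumption; any total order would do as long as it is applied consistently.

```rust
// Toy expression type for illustration only (NOT DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
enum Expr {
    Column(String),
    Literal(String),
    Eq(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

/// Rewrite every `Eq` node so its operands appear in a fixed order
/// (smaller operand first, according to the derived `Ord`).
fn canonicalize(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(l, r) => {
            let (l, r) = (canonicalize(*l), canonicalize(*r));
            if l <= r {
                Expr::Eq(Box::new(l), Box::new(r))
            } else {
                Expr::Eq(Box::new(r), Box::new(l))
            }
        }
        Expr::And(l, r) => Expr::And(
            Box::new(canonicalize(*l)),
            Box::new(canonicalize(*r)),
        ),
        other => other,
    }
}

/// Stand-in for a downstream simplification: `A AND A` becomes `A`.
/// It relies on plain structural equality, so it only fires once both
/// sides have the same operand order.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::And(l, r) if l == r => *l,
        other => other,
    }
}
```

In this toy model, `simplify` leaves `wakana = 'ookuma' AND 'ookuma' = wakana` untouched, while `simplify(canonicalize(..))` reduces it to the single comparison, with no change to how equality is defined.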
### Describe alternatives you've considered
_No response_
### Additional context
_No response_