alamb commented on issue #8724:
URL: 
https://github.com/apache/arrow-datafusion/issues/8724#issuecomment-1877868335

   I am marking this as a good first issue, but it is really a medium-sized project.
   
   However, I think it is well specified and the existing code is straightforward to extend.
   
   The goal is to add this simplification directly to 
[ExprSimplifier](https://docs.rs/datafusion/latest/datafusion/optimizer/simplify_expressions/struct.ExprSimplifier.html#)
   
   ## Canonicalize
   
   First canonicalize any BinaryExprs so:
   1. `<literal> <op> <col>` is rewritten to `<col> <op> <literal>` (remember to switch the operator, e.g. `<` becomes `>`)
   2. `<col1> <op> <col2>` is rewritten so that the name of col1 sorts before the name of col2 (`b > a` would be canonicalized to `a < b`)
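
To make the two rules concrete, here is a minimal sketch. It uses a toy `Expr` and `Op` type defined in the snippet itself, not DataFusion's real `Expr`/`Operator`, and the names `canonicalize`, `col`, `lit` and `binary` are hypothetical helpers for illustration. It only looks at a single top-level comparison; a real implementation would apply the same rewrite while walking the whole expression tree.

```rust
// Toy types for illustration only; DataFusion's real `Expr` is much richer.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Binary { left: Box<Expr>, op: Op, right: Box<Expr> },
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op { Eq, NotEq, Lt, LtEq, Gt, GtEq }

impl Op {
    /// Operator that keeps the meaning unchanged once the operands are swapped.
    fn swap(self) -> Op {
        match self {
            Op::Lt => Op::Gt,
            Op::LtEq => Op::GtEq,
            Op::Gt => Op::Lt,
            Op::GtEq => Op::LtEq,
            // `=` and `!=` are symmetric, so they stay as they are.
            Op::Eq | Op::NotEq => self,
        }
    }
}

// Hypothetical constructors to keep the examples short.
fn col(name: &str) -> Expr { Expr::Column(name.to_string()) }
fn lit(value: i64) -> Expr { Expr::Literal(value) }
fn binary(left: Expr, op: Op, right: Expr) -> Expr {
    Expr::Binary { left: Box::new(left), op, right: Box::new(right) }
}

/// Canonicalize one comparison:
/// rule 1: `<literal> <op> <col>` becomes `<col> <swapped op> <literal>`
/// rule 2: `<col1> <op> <col2>` is ordered so the smaller column name is on the left
fn canonicalize(expr: Expr) -> Expr {
    match expr {
        Expr::Binary { left, op, right } => {
            let should_swap = match (left.as_ref(), right.as_ref()) {
                (Expr::Literal(_), Expr::Column(_)) => true,
                (Expr::Column(l), Expr::Column(r)) => l > r,
                _ => false,
            };
            if should_swap {
                Expr::Binary { left: right, op: op.swap(), right: left }
            } else {
                Expr::Binary { left, op, right }
            }
        }
        other => other,
    }
}
```

With this sketch, `canonicalize(binary(lit(3), Op::Lt, col("B")))` yields the same value as `binary(col("B"), Op::Gt, lit(3))`, so `3 < B` and `B > 3` end up structurally identical and can be compared with plain equality.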
   
   ## Remove redundancy
   
   1. For any chain of `<expr1> AND <expr2> AND <expr3>` remove any duplicate `expr`s, keeping one copy of each
   2. For any chain of `<expr1> OR <expr2> OR <expr3>` remove any duplicate `expr`s, keeping one copy of each
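
One way this step could look, as a rough sketch: flatten the chain into its operands, keep the first occurrence of each, and rebuild the chain. The `Expr` type below is a toy defined in the snippet (each `Pred` stands for an already canonicalized predicate), and `flatten`, `dedup`, `pred`, `and` and `or` are hypothetical names, not DataFusion APIs. Note that the rebuilt chain is left-nested, which may differ from the nesting of the input.

```rust
// Toy types for illustration only; `Pred` stands in for an already
// canonicalized predicate such as `A = 1`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Pred(String),
    And(Box<Expr>, Box<Expr>),
    Or(Box<Expr>, Box<Expr>),
}

// Hypothetical constructors to keep the examples short.
fn pred(text: &str) -> Expr { Expr::Pred(text.to_string()) }
fn and(left: Expr, right: Expr) -> Expr { Expr::And(Box::new(left), Box::new(right)) }
fn or(left: Expr, right: Expr) -> Expr { Expr::Or(Box::new(left), Box::new(right)) }

/// Collect the operands of an AND chain (`is_and == true`) or an OR chain.
/// Anything that is not a link of the requested kind is pushed as one operand.
fn flatten(expr: Expr, is_and: bool, out: &mut Vec<Expr>) {
    match expr {
        Expr::And(l, r) if is_and => {
            flatten(*l, is_and, out);
            flatten(*r, is_and, out);
        }
        Expr::Or(l, r) if !is_and => {
            flatten(*l, is_and, out);
            flatten(*r, is_and, out);
        }
        other => out.push(other),
    }
}

/// Remove duplicate operands from AND / OR chains, keeping the first occurrence.
fn dedup(expr: Expr) -> Expr {
    if let Expr::Pred(_) = expr {
        return expr;
    }
    let is_and = matches!(expr, Expr::And(..));

    let mut operands = Vec::new();
    flatten(expr, is_and, &mut operands);

    let mut unique: Vec<Expr> = Vec::new();
    for operand in operands {
        // Simplify nested chains of the other kind before comparing.
        let operand = dedup(operand);
        if !unique.contains(&operand) {
            unique.push(operand);
        }
    }

    // Rebuild a left-nested chain from the remaining operands.
    unique
        .into_iter()
        .reduce(|acc, e| if is_and { and(acc, e) } else { or(acc, e) })
        .expect("a chain has at least one operand")
}
```

The linear `contains` scan keeps the sketch short and only needs `PartialEq`; for long chains a hash set of already seen operands would be the more obvious choice, assuming the expression type is hashable.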
   
   So for example I would expect the following to be simplified:
   
   ```
   A=1 AND 1 = A AND A = 3 --> A = 1 AND A = 3
   ```
   
   
   ```
   (A=1 AND (B> 3 OR 3 < B)) --> (A = 1 AND B > 3)
   ```
   