alamb commented on issue #15387:
URL: https://github.com/apache/datafusion/issues/15387#issuecomment-2779011846

   Here is a simpler reproducer showing that the `x = x` filter is still present in the physical plan, even though it could be replaced with `x IS NOT NULL`:
   
   ```sql
   > create table foo (x int)
   as values (1), (2), (null);
   0 row(s) fetched.
   Elapsed 0.041 seconds.
   
   > select * from foo;
   +------+
   | x    |
   +------+
   | 1    |
   | 2    |
   | NULL |
   +------+
   3 row(s) fetched.
   Elapsed 0.002 seconds.
   
   > select * from foo where x = x;
   +---+
   | x |
   +---+
   | 1 |
   | 2 |
   +---+
   2 row(s) fetched.
   Elapsed 0.009 seconds.
   
   
   > select * from foo where x IS NOT NULL;
   +---+
   | x |
   +---+
   | 1 |
   | 2 |
   +---+
   2 row(s) fetched.
   Elapsed 0.002 seconds.
   
   > explain select * from foo where x = x;
   +---------------+-------------------------------+
   | plan_type     | plan                          |
   +---------------+-------------------------------+
   | physical_plan | ┌───────────────────────────┐ |
   |               | │    CoalesceBatchesExec    │ |
   |               | │    --------------------   │ |
   |               | │     target_batch_size:    │ |
   |               | │            8192           │ |
   |               | └─────────────┬─────────────┘ |
   |               | ┌─────────────┴─────────────┐ |
   |               | │         FilterExec        │ |
   |               | │    --------------------   │ |
   |               | │      predicate: x = x     │ |
   |               | └─────────────┬─────────────┘ |
   |               | ┌─────────────┴─────────────┐ |
   |               | │       DataSourceExec      │ |
   |               | │    --------------------   │ |
   |               | │         bytes: 176        │ |
   |               | │       format: memory      │ |
   |               | │          rows: 1          │ |
   |               | └───────────────────────────┘ |
   |               |                               |
   +---------------+-------------------------------+
   1 row(s) fetched.
   Elapsed 0.006 seconds.
   ```
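
As a rough illustration of why the two filters above return the same rows, here is a minimal Rust sketch of SQL three-valued logic. It does not use any DataFusion code: `sql_eq`, `sql_is_not_null`, `filter_keeps`, and `filter_rows` are hypothetical helpers, and `Option<i32>` stands in for a nullable `int` column.

```rust
/// SQL `=` under three-valued logic: a NULL (None) on either side yields NULL.
fn sql_eq(a: Option<i32>, b: Option<i32>) -> Option<bool> {
    match (a, b) {
        (Some(l), Some(r)) => Some(l == r),
        _ => None,
    }
}

/// SQL `IS NOT NULL` always yields TRUE or FALSE, never NULL.
fn sql_is_not_null(a: Option<i32>) -> Option<bool> {
    Some(a.is_some())
}

/// A WHERE clause keeps a row only when the predicate is exactly TRUE;
/// both FALSE and NULL drop the row.
fn filter_keeps(predicate: Option<bool>) -> bool {
    predicate == Some(true)
}

/// Returns the values of `x` that survive filtering with `pred`.
fn filter_rows(
    rows: &[Option<i32>],
    pred: fn(Option<i32>) -> Option<bool>,
) -> Vec<Option<i32>> {
    rows.iter()
        .copied()
        .filter(|x| filter_keeps(pred(*x)))
        .collect()
}
```

Calling `filter_rows(&[Some(1), Some(2), None], |x| sql_eq(x, x))` and `filter_rows(&[Some(1), Some(2), None], sql_is_not_null)` both keep only the two non-null rows, matching the query output above. Note that the two predicates agree only as filters: for a NULL input `x = x` evaluates to NULL while `x IS NOT NULL` evaluates to FALSE, so it is worth checking whether a rewrite also has to be valid outside of a `WHERE` clause.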
   
   
   So a sketch for the solution to this issue is:
   1. Add a rule in `ExprSimplifier` that rewrites `<expr> = <expr>` --> `<expr> IS NOT NULL` in this match statement: https://github.com/apache/datafusion/blob/2cd6ed99dab90ca73497374e860f89e6fe83af1d/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs#L730-L738
   2. Add tests in slt (sqllogictest) files, perhaps based on the queries above
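
The shape of the rule in step 1 could look roughly like the toy sketch below. This is not DataFusion's `Expr` type or the real `ExprSimplifier` API: the `Expr` enum and `simplify` function are hypothetical stand-ins, and the sketch assumes that structural equality of the two sides is enough to trigger the rewrite. A real rule would likely also need to consider things such as volatile expressions and nullability.

```rust
/// Toy expression tree (hypothetical; NOT DataFusion's `Expr`).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(Option<i64>),
    Eq(Box<Expr>, Box<Expr>),
    IsNotNull(Box<Expr>),
}

/// Rewrites `<expr> = <expr>` into `<expr> IS NOT NULL` when both sides
/// are structurally identical; every other expression is returned unchanged.
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(left, right) => {
            // Simplify the children first so nested cases are handled too.
            let left = simplify(*left);
            let right = simplify(*right);
            if left == right {
                Expr::IsNotNull(Box::new(left))
            } else {
                Expr::Eq(Box::new(left), Box::new(right))
            }
        }
        Expr::IsNotNull(inner) => Expr::IsNotNull(Box::new(simplify(*inner))),
        other => other,
    }
}
```

With this sketch, simplifying `Eq(Column("x"), Column("x"))` produces `IsNotNull(Column("x"))`, while `Eq(Column("x"), Column("y"))` is left alone.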
   
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
