alamb commented on issue #15387: URL: https://github.com/apache/datafusion/issues/15387#issuecomment-2779011846
Here is a simpler reproducer showing the `x = x` filter is still present and can be replaced with `x IS NOT NULL`:

```sql
> create table foo (x int) as values (1), (2), (null);
0 row(s) fetched.
Elapsed 0.041 seconds.

> select * from foo;
+------+
| x    |
+------+
| 1    |
| 2    |
| NULL |
+------+
3 row(s) fetched.
Elapsed 0.002 seconds.

> select * from foo where x = x;
+---+
| x |
+---+
| 1 |
| 2 |
+---+
2 row(s) fetched.
Elapsed 0.009 seconds.

> select * from foo where x IS NOT NULL;
+---+
| x |
+---+
| 1 |
| 2 |
+---+
2 row(s) fetched.
Elapsed 0.002 seconds.

> explain select * from foo where x = x;
+---------------+-------------------------------+
| plan_type     | plan                          |
+---------------+-------------------------------+
| physical_plan | ┌───────────────────────────┐ |
|               | │    CoalesceBatchesExec    │ |
|               | │    --------------------   │ |
|               | │     target_batch_size:    │ |
|               | │            8192           │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │         FilterExec        │ |
|               | │    --------------------   │ |
|               | │     predicate: x = x      │ |
|               | └─────────────┬─────────────┘ |
|               | ┌─────────────┴─────────────┐ |
|               | │       DataSourceExec      │ |
|               | │    --------------------   │ |
|               | │         bytes: 176        │ |
|               | │       format: memory      │ |
|               | │          rows: 1          │ |
|               | └───────────────────────────┘ |
|               |                               |
+---------------+-------------------------------+
1 row(s) fetched.
Elapsed 0.006 seconds.
```

So a sketch for the solution to this issue is:

1. Add a rule in `ExprSimplifier` for `<expr> = <expr>` --> `<expr> IS NOT NULL` in this match statement: https://github.com/apache/datafusion/blob/2cd6ed99dab90ca73497374e860f89e6fe83af1d/datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs#L730-L738
2. Add tests in slt (perhaps like above)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org
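
As a rough illustration of step 1 in the sketch above, here is a minimal, self-contained Rust model of the `<expr> = <expr>` --> `<expr> IS NOT NULL` rewrite. The `Expr` enum and the `simplify` function are hypothetical stand-ins invented for this example. They are not DataFusion's `Expr` type or the real `ExprSimplifier` match statement, which may need extra guards (for example for volatile expressions). Note also that `x = x` evaluates to NULL rather than false when `x` is NULL, so the two forms agree only where NULL and false are treated alike, as in the `WHERE` filter of the reproducer.

```rust
// Toy expression tree: a hypothetical stand-in, NOT DataFusion's `Expr`.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Eq(Box<Expr>, Box<Expr>),
    IsNotNull(Box<Expr>),
}

// Hypothetical helper for building column references.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

// Rewrite `<expr> = <expr>` into `<expr> IS NOT NULL`, bottom-up.
// Assumes the result is used in a filter context (see the note above).
fn simplify(expr: Expr) -> Expr {
    match expr {
        Expr::Eq(left, right) => {
            // Simplify the children first so nested cases are handled too.
            let left = simplify(*left);
            let right = simplify(*right);
            if left == right {
                // Both sides are structurally identical: keep one side only.
                Expr::IsNotNull(Box::new(left))
            } else {
                Expr::Eq(Box::new(left), Box::new(right))
            }
        }
        Expr::IsNotNull(inner) => Expr::IsNotNull(Box::new(simplify(*inner))),
        // Columns have nothing to simplify.
        other => other,
    }
}
```

With these definitions, `simplify(Expr::Eq(Box::new(col("x")), Box::new(col("x"))))` returns `Expr::IsNotNull(Box::new(col("x")))`, while an equality between two different columns is returned unchanged.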