kazuyukitanimura opened a new pull request #34055: URL: https://github.com/apache/spark/pull/34055
### What changes were proposed in this pull request?

This PR proposes to improve simplifications of `EqualTo/EqualNullSafe` binary comparators when one side is a boolean literal. For example: `EqualTo(predicate, TrueLiteral) => predicate`, `EqualNullSafe(predicate, TrueLiteral) => And(predicate, IsNotNull(predicate))`.

This PR helps push down the filter and reduce unnecessary IO.

### Why are the changes needed?

The following query does not push down the filter in the current implementation:

```
SELECT * FROM t WHERE (a AND b) = true
```

although the following equivalent query pushes down the filter as expected:

```
SELECT * FROM t WHERE (a AND b)
```

That is because the first query creates `EqualTo(And(a, b), TrueLiteral)`, which is not in a form that we can push down. However, we should be able to simplify it to `And(a, b)`.

It is fair for Spark SQL users to expect `(a AND b) = true` to perform the same as `(a AND b)`.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Added unit tests:

```
build/sbt "testOnly *BooleanSimplificationSuite -- -z SPARK-36721"
```
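
As a sanity check on the two rewrites described under "What changes were proposed", the sketch below models SQL three-valued logic in plain Python and exhaustively compares both sides of each rule for `predicate = a AND b`. It is a standalone illustration, not Spark code: `None` stands in for NULL, and the helper names (`sql_and`, `equal_to`, `equal_null_safe`, `is_not_null`, `rewrites_hold`) are hypothetical names chosen to mirror the Catalyst expressions mentioned above.

```python
from itertools import product

# SQL booleans modelled as True, False, or None (NULL). Assumption: this is a
# toy model of three-valued logic, not Spark's Catalyst implementation.
VALUES = (True, False, None)


def sql_and(a, b):
    """Three-valued AND: False dominates, then NULL, otherwise True."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def equal_to(a, b):
    """SQL `=`: the result is NULL when either side is NULL."""
    if a is None or b is None:
        return None
    return a == b


def equal_null_safe(a, b):
    """SQL `<=>`: never NULL; NULL <=> NULL is true, NULL <=> non-NULL is false."""
    if a is None or b is None:
        return a is None and b is None
    return a == b


def is_not_null(a):
    """SQL `IS NOT NULL`: always a definite True or False."""
    return a is not None


def rewrites_hold():
    """Check both rewrites for every combination of a and b, including NULLs."""
    for a, b in product(VALUES, repeat=2):
        p = sql_and(a, b)
        # EqualTo(p, TrueLiteral) => p
        if equal_to(p, True) is not p:
            return False
        # EqualNullSafe(p, TrueLiteral) => And(p, IsNotNull(p))
        if equal_null_safe(p, True) is not sql_and(p, is_not_null(p)):
            return False
    return True
```

The NULL case is the one that separates the two rules: `equal_to(None, True)` stays NULL, matching the bare predicate, while `equal_null_safe(None, True)` is false, which is why the null-safe rewrite needs the extra `IsNotNull(predicate)` conjunct.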
