kazuyukitanimura opened a new pull request #33865: URL: https://github.com/apache/spark/pull/33865
### What changes were proposed in this pull request?

This PR proposes to add `BooleanType` support to the `UnwrapCastInBinaryComparison` optimizer rule, which currently supports `NumericType` only.

The main idea is to treat `BooleanType` as a 1-bit integer so that we can reuse all the optimizations already defined in `UnwrapCastInBinaryComparison`.

In addition, this change replaces the existing simplification in `TypeCoercion`, which has carried a `TODO` to move it to the optimizer for many years.

This work is an extension of SPARK-24994 and SPARK-32858.

### Why are the changes needed?

Without this PR, Spark cannot properly optimize the filter in the following case:
```
SELECT * FROM t WHERE boolean_field = 2
```
The above query creates a filter of `cast(boolean_field, int) = 2`. The cast prevents the filter from being pushed down. In contrast, this PR creates a `false` filter and returns early, since no row can match (the result is empty).

Even the following case is not pushed down properly in the current implementation:
```
SELECT * FROM t WHERE boolean_field = 1
```
The above query should be able to push down the filter `boolean_field=true`; however, because the existing optimization in `TypeCoercion` is incompatible with the physical planner, the filter pushdown fails and all rows have to be read first. With this PR, `UnwrapCastInBinaryComparison` takes care of the optimization and properly pushes down the filter.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

Passed existing tests
```
build/sbt "catalyst/test"
build/sbt "sql/test"
```
Added unit tests
```
build/sbt "catalyst/testOnly *UnwrapCastInBinaryComparisonSuite -- -z SPARK-36607"
build/sbt "sql/testOnly *UnwrapCastInComparisonEndToEndSuite -- -z SPARK-36607"
```
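
For readers unfamiliar with the rule, the "boolean as a 1-bit integer" idea can be sketched as a toy model in plain Python. This is not Spark's Scala implementation: the function name `unwrap_boolean_equality` is hypothetical, only the `=` comparison is covered, the `= 0` case is inferred from the 1-bit analogy rather than stated in this PR, and null handling is ignored.

```python
def unwrap_boolean_equality(column: str, literal: int) -> str:
    """Toy rewrite of `cast(column as int) = literal` for a boolean column.

    Hypothetical helper for illustration only; it returns the rewritten
    predicate as a string and does not model nulls.
    """
    # A boolean viewed as a 1-bit integer can only hold 0 or 1.
    min_value, max_value = 0, 1
    if literal < min_value or literal > max_value:
        # No boolean value can match, so the predicate folds to a constant.
        return "false"
    # In range: compare the column directly, with no cast left around it.
    return f"{column} = {'true' if literal == 1 else 'false'}"


if __name__ == "__main__":
    for lit in (2, 1, 0):
        rewritten = unwrap_boolean_equality("boolean_field", lit)
        print(f"cast(boolean_field as int) = {lit}  ->  {rewritten}")
```

The point of the sketch is that once the cast is gone, the remaining predicate is either a constant or a plain comparison on the column, which is the shape that filter pushdown needs.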
