kazuyukitanimura opened a new pull request #33930:
URL: https://github.com/apache/spark/pull/33930
### What changes were proposed in this pull request?
This PR proposes to add more `Not` operator simplifications in
`BooleanSimplification` by applying the following rules:
- Not(null) == null
  - e.g. IsNull(Not(...)) can be IsNull(...)
- (Not(a) = b) == (a = Not(b))
  - e.g. Not(...) = true can be (...) = false
- (a != b) == (a = Not(b))
  - e.g. (...) != true can be (...) = false
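
As a sanity check on the rules above, the following minimal sketch enumerates every combination of `true`, `false`, and `null` for two boolean operands and confirms that each rewrite yields the same result under SQL three-valued logic. It is an illustration only, not part of the patch: Python's `None` stands in for SQL `NULL`, and the helper names (`sql_not`, `sql_eq`, `sql_neq`, `is_null`, `check_rules`) are hypothetical.

```python
from itertools import product

# SQL booleans under three-valued logic: True, False, or None (standing in for NULL).
VALUES = (True, False, None)


def sql_not(a):
    """NOT a: NULL stays NULL."""
    return None if a is None else (not a)


def sql_eq(a, b):
    """a = b: NULL if either operand is NULL."""
    return None if a is None or b is None else (a == b)


def sql_neq(a, b):
    """a != b, modelled as NOT (a = b)."""
    return sql_not(sql_eq(a, b))


def is_null(a):
    """IS NULL: always returns a definite True or False."""
    return a is None


def check_rules():
    """Return True if all three rewrites agree for every input combination."""
    for a, b in product(VALUES, repeat=2):
        # Rule 1: IsNull(Not(a)) can be IsNull(a)
        if is_null(sql_not(a)) != is_null(a):
            return False
        # Rule 2: (Not(a) = b) == (a = Not(b))
        if sql_eq(sql_not(a), b) != sql_eq(a, sql_not(b)):
            return False
        # Rule 3: (a != b) == (a = Not(b))
        if sql_neq(a, b) != sql_eq(a, sql_not(b)):
            return False
    return True
```

Calling `check_rules()` covers all nine operand pairs. The third rule relies on both operands being boolean, since `Not(b)` is only meaningful for boolean `b`.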
### Why are the changes needed?
This PR simplifies SQL statements that include `Not` operators.
In addition, the following query does not push down the filter in the
current implementation:
```
SELECT * FROM t WHERE (not boolean_col) <=> null
```
although the following equivalent query pushes down the filter as expected.
```
SELECT * FROM t WHERE boolean_col <=> null
```
That is because the first query creates `IsNull(Not(boolean_col))` in the
current implementation, which can be simplified further to
`IsNull(boolean_col)`.
This PR helps optimize such cases.
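
The rewrite described above can be sketched on a toy expression tree. This is a simplified model, not Catalyst's actual classes or the code in this PR: `Col`, `Not`, `IsNull`, and `simplify` are hypothetical names defined here for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Col:
    """A column reference (hypothetical stand-in for an attribute)."""
    name: str


@dataclass(frozen=True)
class Not:
    """Logical negation of a child expression."""
    child: object


@dataclass(frozen=True)
class IsNull:
    """Null check on a child expression."""
    child: object


def simplify(expr):
    """Bottom-up rewrite that turns IsNull(Not(x)) into IsNull(x)."""
    if isinstance(expr, Not):
        return Not(simplify(expr.child))
    if isinstance(expr, IsNull):
        child = simplify(expr.child)
        # Not(x) is NULL exactly when x is NULL, so any Not directly
        # under IsNull can be dropped without changing the result.
        while isinstance(child, Not):
            child = child.child
        return IsNull(child)
    return expr
```

For example, `simplify(IsNull(Not(Col("boolean_col"))))` returns `IsNull(Col("boolean_col"))`, the same shape the second query produces directly.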
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Added unit tests
```
build/sbt "testOnly *BooleanSimplificationSuite -- -z SPARK-36665"
```