GitHub user mgyucht opened a pull request:
https://github.com/apache/spark/pull/21425
Add unit tests for NOT IN subquery around null values
## What changes were proposed in this pull request?
This PR adds several unit tests along the `cols NOT IN (subquery)` pathway.
A scattering of existing tests covers this codepath, but there does not seem
to be a unified unit test anywhere for the correctness of null-aware anti
joins. I have also added a brief explanation of how this expression behaves
in SubquerySuite. Lastly, I made some clarifying changes to the NOT IN
pathway in RewritePredicateSubquery.
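
For context on why `NOT IN` needs null-aware handling, here is a minimal sketch of the SQL three-valued logic involved. It uses Python's built-in `sqlite3` module as a stand-in rather than Spark, and the tables `t` and `s` and columns `a` and `b` are hypothetical, not taken from the tests in this PR. It assumes Spark follows the same standard SQL semantics for these three cases.

```python
import sqlite3

# In-memory database with a hypothetical outer table t and subquery table s.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t (a INTEGER);
    CREATE TABLE s (b INTEGER);
    INSERT INTO t VALUES (1), (2), (NULL);
    INSERT INTO s VALUES (2), (NULL);
""")


def not_in_rows(connection):
    """Return the rows of t that survive `a NOT IN (SELECT b FROM s)`."""
    query = "SELECT a FROM t WHERE a NOT IN (SELECT b FROM s)"
    return connection.execute(query).fetchall()


# Case 1: the subquery contains a NULL. For every outer row the predicate
# is either FALSE or NULL (unknown), so no row passes the filter.
with_null = not_in_rows(conn)

# Case 2: the subquery has no NULLs. Only a = 1 is definitely not in {2};
# the outer NULL row still evaluates to NULL and is filtered out.
conn.execute("DELETE FROM s WHERE b IS NULL")
without_null = not_in_rows(conn)

# Case 3: the subquery is empty. NOT IN is TRUE for every outer row,
# including the row where a is NULL.
conn.execute("DELETE FROM s")
empty_subquery = not_in_rows(conn)

print(with_null, without_null, empty_subquery)
```

A plain anti join on `a = b` would return the row `a = 1` in the first case, which is why the rewrite has to treat nulls specially.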
## How was this patch tested?
Added unit tests. There should be no behavioral change in this PR.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mgyucht/spark-1 spark-24381
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/21425.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #21425
----
commit d6040ea0028754c7fe39ddcebb6bd027749acc4e
Author: Miles Yucht <miles@...>
Date: 2018-05-24T15:16:37Z
Add tests, and small clean-up of the NOT IN pathway
----