GitHub user mgaido91 opened a pull request:
https://github.com/apache/spark/pull/20717
[SPARK-23564][SQL] Add isNotNull check for left anti and outer joins
## What changes were proposed in this pull request?
Some conditions can be added to the join condition of LEFT ANTI and OUTER
joins in order to optimize queries. So far this has not been done, because we
use only constraints which can be enforced on the output of the operator (in
this case, the JOIN).
Some `isNotNull` conditions can be enforced on one side of the join even
though they are not valid constraints on the output of the join. This PR adds
these conditions in the `Optimizer` phase, which can improve performance in
some cases.
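
To illustrate why such a condition is safe on one side but not on the join output, here is a minimal sketch in plain Python. It is not the code in this PR and does not use Spark. It assumes an equi-join on a single key, models SQL NULL as `None`, and represents each row by its join key only. The helper names (`sql_eq`, `left_outer_join`, `left_anti_join`) are hypothetical.

```python
# None models SQL NULL; an equality comparison involving NULL never matches.
def sql_eq(x, y):
    return x is not None and y is not None and x == y


def left_outer_join(left, right):
    """Each row is just its join key; unmatched left rows are padded with None."""
    out = []
    for l in left:
        matches = [r for r in right if sql_eq(l, r)]
        if matches:
            out.extend((l, r) for r in matches)
        else:
            out.append((l, None))
    return out


def left_anti_join(left, right):
    """Keep the left rows that have no match on the right."""
    return [l for l in left if not any(sql_eq(l, r) for r in right)]


left = [1, 2, None]
right = [2, 3, None]

# The isNotNull condition applied to the right side only, before the join.
right_not_null = [r for r in right if r is not None]

# Pre-filtering the right side leaves both join results unchanged.
outer_same = left_outer_join(left, right) == left_outer_join(left, right_not_null)
anti_same = left_anti_join(left, right) == left_anti_join(left, right_not_null)

# The outer join output still contains NULL right keys, so the same
# condition is not a valid constraint on the output of the join.
outer_has_null_right = any(r is None for _, r in left_outer_join(left, right))
```

In this model, right-side rows with a NULL key can never satisfy the equality, so dropping them early changes nothing, while the null-padded rows of the outer join show why the condition cannot be inferred from output constraints alone.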
## How was this patch tested?
Added unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mgaido91/spark SPARK-23564
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20717.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20717
----
commit 45fbb851e76eeaa45c9926571059274efca2441a
Author: Marco Gaido <marcogaido91@...>
Date: 2018-03-02T16:27:18Z
[SPARK-23564][SQL] Add isNotNull check for left anti and outer joins
----