GitHub user viirya opened a pull request:
https://github.com/apache/spark/pull/11691
[SPARK-13854][SQL] Add constraints to outer join
JIRA: https://issues.apache.org/jira/browse/SPARK-13854
## What changes were proposed in this pull request?
Currently, a left outer join keeps only the constraints from its left side,
and a right outer join keeps only the constraints from its right side. For a
full outer join, the constraint set is empty.
This is fewer constraints than actually hold for the join operator.
For example, for a left outer join, the constraints from the right side
should also be inherited, with a slight modification, in addition to the
constraints from the left side.
Consider the following join:
val tr1 = LocalRelation('a.int, 'b.int, 'c.int).subquery('tr1)
val tr2 = LocalRelation('a.int, 'd.int, 'e.int).subquery('tr2)
tr1.where('a.attr > 10)
  .join(tr2.where('d.attr < 100), LeftOuter, Some("tr1.a".attr === "tr2.a".attr))
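The constraints should include not only "a" > 10 and "a" is not null, but
also ("d" is null || "d" < 100): a matched right row has already passed the
"d" < 100 filter, and an unmatched left row is padded with nulls on the right.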
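The reasoning can be checked with a small model in plain Scala. This is only a sketch, not the code in this patch: it does not use Catalyst, and the names `Tr1`, `Tr2`, `leftOuterJoin`, `satisfies`, and `demo` are hypothetical, as is the sample data. A null right side is modeled as `None`, and because the columns are plain `Int` values, "a" is not null holds trivially.

```scala
// Plain Scala model (no Spark dependency) of the left outer join in the example.
final case class Tr1(a: Int, b: Int, c: Int)
final case class Tr2(a: Int, d: Int, e: Int)

object OuterJoinConstraints {
  // Left outer join on tr1.a == tr2.a. An unmatched left row is kept and its
  // right side is None, which stands in for the null-padded right columns.
  def leftOuterJoin(left: Seq[Tr1], right: Seq[Tr2]): Seq[(Tr1, Option[Tr2])] =
    left.flatMap { l =>
      val matches = right.filter(_.a == l.a)
      if (matches.isEmpty) Seq((l, Option.empty[Tr2]))
      else matches.map(r => (l, Option(r)))
    }

  // The proposed constraint on the join output: a > 10 && (d is null || d < 100).
  def satisfies(row: (Tr1, Option[Tr2])): Boolean = {
    val (l, r) = row
    l.a > 10 && r.forall(_.d < 100) // forall on None is true: the "d is null" case
  }

  // Hypothetical sample data; the filters mirror the where clauses in the example.
  def demo(): Seq[(Tr1, Option[Tr2])] = {
    val tr1 = Seq(Tr1(5, 0, 0), Tr1(11, 1, 1), Tr1(12, 2, 2), Tr1(13, 3, 3))
    val tr2 = Seq(Tr2(11, 50, 0), Tr2(12, 200, 0), Tr2(14, 10, 0))
    leftOuterJoin(tr1.filter(_.a > 10), tr2.filter(_.d < 100))
  }
}
```

Every row returned by `OuterJoinConstraints.demo()` passes `satisfies`, including the rows whose right side is missing. The plain predicate "d" < 100 would not be a valid constraint on those rows, which is why the right-side constraint has to be weakened with the null check.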
## How was this patch tested?
Three tests in `ConstraintPropagationSuite` are modified for this PR.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/viirya/spark-1 join-constraints
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/11691.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #11691
----
commit ab7ea3734777f378c7b2d809595f52a4bf8b4ce2
Author: Liang-Chi Hsieh <[email protected]>
Date: 2016-03-14T06:11:38Z
init import.
commit f9a3ed035fd1b072f07f78bc63a2fdc8cddb3c7b
Author: Liang-Chi Hsieh <[email protected]>
Date: 2016-03-14T06:32:32Z
Merge remote-tracking branch 'upstream/master' into join-constraints
commit deae036525bc58e6c81d41a8cadfa8b33f7f9d74
Author: Liang-Chi Hsieh <[email protected]>
Date: 2016-03-14T06:36:24Z
Fix.
commit 1aa85500725a4a4d5a55583cc400d7b7d4171c37
Author: Liang-Chi Hsieh <[email protected]>
Date: 2016-03-14T07:57:22Z
Add constraints to outer join.
----