GitHub user JoshRosen opened a pull request:
https://github.com/apache/spark/pull/7904
[SPARK-7165] [WIP] [SQL] Use sort merge join for outer join
This patch extends the SortMergeJoin operator so that it can perform outer
joins (left, right, and full). This patch is an updated version of #5717 by
@adrian-wang, which, in turn, was an updated version of #5208.
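
For readers new to the technique, here is a minimal standalone sketch of how a
sort merge join can emit outer-join rows. It is an illustration only, not the
code in this patch: the function name `sort_merge_outer_join`, the
`(key, value)` row shape, and the `how` parameter are all hypothetical. It
assumes non-null, comparable keys and ignores any extra join condition.

```python
def sort_merge_outer_join(left, right, how="full"):
    """Join two lists of (key, value) pairs on key.

    `how` is 'left', 'right', or 'full' (hypothetical parameter).
    Unmatched sides are padded with None.
    """
    # Sort both inputs by key; sorted() is stable, so input order
    # is preserved among rows that share a key.
    left = sorted(left, key=lambda kv: kv[0])
    right = sorted(right, key=lambda kv: kv[0])
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            # Left row has no match on the right.
            if how in ("left", "full"):
                out.append((lk, left[i][1], None))
            i += 1
        elif lk > rk:
            # Right row has no match on the left.
            if how in ("right", "full"):
                out.append((rk, None, right[j][1]))
            j += 1
        else:
            # Find the run of rows with this key on each side,
            # then emit the cross product of the two runs.
            i_end = i
            while i_end < len(left) and left[i_end][0] == lk:
                i_end += 1
            j_end = j
            while j_end < len(right) and right[j_end][0] == lk:
                j_end += 1
            for a in range(i, i_end):
                for b in range(j, j_end):
                    out.append((lk, left[a][1], right[b][1]))
            i, j = i_end, j_end
    # Whatever remains on either side is unmatched.
    if how in ("left", "full"):
        out.extend((k, v, None) for k, v in left[i:])
    if how in ("right", "full"):
        out.extend((k, None, v) for k, v in right[j:])
    return out
```

The only difference from an inner sort merge join is that rows skipped during
the merge are emitted with a None-padded side instead of being dropped.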
I'm opening this pull request now so that Jenkins can run the tests; a number
of TODOs from earlier versions of this patch still need to be investigated or
resolved:
- [ ] Address comment related to `boundCondition`:
https://github.com/apache/spark/pull/5717#discussion_r32891269
- [ ] Explore whether planner change should be moved to a different rule:
https://github.com/apache/spark/pull/5717#discussion-diff-32891271
- [ ] Address other bound condition comment:
https://github.com/apache/spark/pull/5717#discussion-diff-32891298
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/JoshRosen/spark outer-join-smj
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/7904.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #7904
----
commit bd261d04ef120cb05deb32259903b2f816c53bba
Author: Daoyuan Wang <[email protected]>
Date: 2015-06-19T04:47:30Z
use sort merge join for outer join
commit 07f96b5e9a0d2a79498589976f287f38eacbd11e
Author: Daoyuan Wang <[email protected]>
Date: 2015-06-19T06:22:46Z
rebase
commit 4152e864a19490ee6c5db7f9d3589fed7693efde
Author: Daoyuan Wang <[email protected]>
Date: 2015-07-30T07:12:49Z
bring it up to date
commit fd73084f40a6eb74fab98d062c13c54e049ef055
Author: Daoyuan Wang <[email protected]>
Date: 2015-07-30T08:04:04Z
fix default setting change
commit f5200790e1f5ddf4d3cb2cb68dfdfff45473aaf4
Author: Daoyuan Wang <[email protected]>
Date: 2015-07-31T05:18:04Z
fix style
commit bcdbadb30bf04533ca2e85fb0e843cf9bbc26a3e
Author: Josh Rosen <[email protected]>
Date: 2015-08-03T20:36:04Z
Merge remote-tracking branch 'origin/master' into outer-join-smj
commit 47329eebad88d71bb58fc5b759e8a9d6b21d22fa
Author: Josh Rosen <[email protected]>
Date: 2015-08-03T21:06:41Z
Remove old TODO
----