GitHub user davies opened a pull request:
https://github.com/apache/spark/pull/12820
[SPARK-14781] [SQL] support nested predicate subquery
## What changes were proposed in this pull request?
To support nested predicate subqueries, this PR introduces an internal
join type, LeftSemiPlus, which emits all rows from the left side plus an
additional column indicating whether any row from the right side matched
(it is not null-aware right now). This additional column can be used to
replace the subquery in the Filter.
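The semantics described above can be modelled outside Spark. The following is only a minimal sketch in plain Python, not the implementation in this PR: the function `left_semi_plus`, the sample rows, and the equality condition are all hypothetical and exist only to show the shape of the output.

```python
def left_semi_plus(left, right, condition):
    """Model of the described join: keep every left row and append a
    boolean that says whether at least one right row matched.
    Hypothetical helper for illustration; not Spark code."""
    right_rows = list(right)
    return [(l, any(condition(l, r) for r in right_rows)) for l in left]


# Hypothetical sample data.
left = [1, 2, 3]
right = [3, 4]

joined = left_semi_plus(left, right, lambda l, r: l == r)

# The extra column can stand in for the subquery inside a filter,
# e.g. `a < 2 OR EXISTS (...)` becomes `a < 2 OR matched`.
kept = [l for l, matched in joined if l < 2 or matched]
```

Like the join type described here, this model is not null-aware: it treats the match flag as a plain boolean.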
In theory, all predicate subqueries could use this join type, but it is
slower than LeftSemi and LeftAnti, so it is only used for nested subqueries
(subqueries inside OR).
For example, the following SQL nests two subqueries inside OR:
```sql
SELECT a FROM t WHERE EXISTS (select 0) OR EXISTS (select 1)
```
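The idea of replacing each subquery with a boolean column and filtering on those columns can be checked on a small scale. The sketch below uses Python's standard `sqlite3` module rather than Spark, so it says nothing about the plan Spark produces; the table contents and the column names `exists1` and `exists2` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE t (a INTEGER);
    INSERT INTO t VALUES (1), (2), (3);
    """
)

# The query from the example above, run as written.
original = conn.execute(
    "SELECT a FROM t WHERE EXISTS (SELECT 0) OR EXISTS (SELECT 1) ORDER BY a"
).fetchall()

# Hand-written equivalent: compute one existence flag per subquery as an
# extra column, then filter on the OR of those flags.
rewritten = conn.execute(
    """
    SELECT a FROM (
        SELECT a,
               EXISTS (SELECT 0) AS exists1,
               EXISTS (SELECT 1) AS exists2
        FROM t
    )
    WHERE exists1 OR exists2
    ORDER BY a
    """
).fetchall()

conn.close()
```

Both queries return the same rows here, since each subquery yields one row and every row of `t` therefore passes the filter.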
This PR also fixes a bug where predicate subqueries were pushed down
through joins (they should not be).
Nested null-aware subqueries are still not supported, for example
`a > 3 OR b NOT IN (select bb from t)`.
TODO: add more tests for subquery
## How was this patch tested?
Added unit tests.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/davies/spark or_exists
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/12820.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12820
----
commit 0c9f26c943e70894d9ad18b8dac2792b5d6fd92b
Author: Davies Liu <[email protected]>
Date: 2016-05-01T07:48:50Z
support nested predicate subquery
----