GitHub user ioana-delaney opened a pull request:
https://github.com/apache/spark/pull/20868
[SPARK-23750][SQL] Inner Join Elimination based on Informational RI
constraints
## What changes were proposed in this pull request?
This transformation detects referential integrity (RI) joins and eliminates the
parent (PK) table when the query references none of its columns other than the
PK columns.
**Example:**
```SQL
select fact.c1
from fact, dim1, dim2
where fact.c1 = dim1.pk /* FK = PK */ and
fact.c2 = dim2.pk /* FK = PK */ and
dim1.pk = 10 and
dim2.pk like 'abc%'
```
**Internal optimized query after join elimination:**
```SQL
select fact.c1
from fact
where fact.c1 = 10 and fact.c2 like 'abc%'
```
The transformation will apply under the following restrictions:
- No columns from the parent table are retrieved.
- No columns from the parent table other than the PK columns are referenced
in the predicates.
- Conservatively, the only references allowed to the parent table are local
predicates on PK columns and equi-joins between PK columns and other tables.
- The join is directly above a base table access, i.e. there are no aliases or
other expressions above the base table access.
- Other restrictions on string data types
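
As a rough illustration of the first three restrictions, here is a minimal Python sketch applied to the example query above. It is not the Catalyst rule from this patch: `Col`, `Query`, `can_eliminate` and `eliminate` are hypothetical names invented for the sketch. It assumes a single-column PK and joins recorded as `(fk_col, pk_col)` pairs, and it ignores the base-table-access and string data type restrictions as well as any nullability handling.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Col:
    """Hypothetical column reference: table name plus column name."""
    table: str
    name: str


@dataclass
class Query:
    """Hypothetical, heavily simplified query model (not a Spark plan)."""
    select: List[Col]                 # retrieved columns
    joins: List[Tuple[Col, Col]]      # equi-joins as (fk_col, pk_col)
    filters: List[Tuple[Col, str]]    # local predicates as (col, "op literal")


def can_eliminate(q: Query, parent: str, pk: str) -> bool:
    """Check the first three restrictions for dropping `parent`."""
    pk_col = Col(parent, pk)
    # No columns from the parent table are retrieved.
    if any(c.table == parent for c in q.select):
        return False
    # Predicates and joins reference the parent only through its PK column.
    referenced = [c for c, _ in q.filters] + [c for j in q.joins for c in j]
    if any(c.table == parent and c != pk_col for c in referenced):
        return False
    # Assumption of this sketch: exactly one FK = PK join reaches the parent.
    pk_joins = [j for j in q.joins if pk_col in j]
    return len(pk_joins) == 1 and pk_joins[0][1] == pk_col


def eliminate(q: Query, parent: str, pk: str) -> Query:
    """Drop `parent` and move its PK predicates onto the matching FK column."""
    if not can_eliminate(q, parent, pk):
        return q
    pk_col = Col(parent, pk)
    fk_col = next(f for f, p in q.joins if p == pk_col)
    joins = [j for j in q.joins if j[1] != pk_col]
    filters = [(fk_col if c == pk_col else c, pred) for c, pred in q.filters]
    return Query(q.select, joins, filters)


# The example query from this description.
q = Query(
    select=[Col("fact", "c1")],
    joins=[(Col("fact", "c1"), Col("dim1", "pk")),
           (Col("fact", "c2"), Col("dim2", "pk"))],
    filters=[(Col("dim1", "pk"), "= 10"),
             (Col("dim2", "pk"), "like 'abc%'")],
)
for dim in ("dim1", "dim2"):
    q = eliminate(q, dim, "pk")
```

After the loop, `q` has no joins left and its filters refer to `fact.c1` and `fact.c2`, mirroring the optimized query shown earlier. Selecting any `dim1` column, or filtering on a non-PK column of `dim1`, would make `can_eliminate` return `False` and leave the query unchanged.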
## How was this patch tested?
A new test suite HiveRIJElimSuite.scala was introduced.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ioana-delaney/spark rijelim
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20868.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20868
----
commit 9d1f0f1841b4f7828534036c1b2cef4ef7f1d84a
Author: Ioana Delaney <ioanamdelaney@...>
Date: 2018-03-20T19:55:19Z
[SPARK-23750] Add dependent DDL changes from SPARK-21784.
commit 0d189ab49b2dcb748b51f875f1a04e6b2fb9f69b
Author: Ioana Delaney <ioanamdelaney@...>
Date: 2018-03-20T23:29:11Z
[SPARK-23750] Join elimination rewrite based on RI constraints.
----