GitHub user ioana-delaney opened a pull request: https://github.com/apache/spark/pull/20868

    [SPARK-23750][SQL] Inner Join Elimination based on Informational RI constraints

    ## What changes were proposed in this pull request?

    This transformation detects RI joins and eliminates the parent/PK table if none of its columns, other than the PK columns, are referenced in the query.

    **Example:**

    ```SQL
    select fact.c1
    from fact, dim1, dim2
    where fact.c1 = dim1.pk /* FK = PK */
      and fact.c2 = dim2.pk /* FK = PK */
      and dim1.pk = 10
      and dim2.pk like 'abc%'
    ```

    **Internal optimized query after join elimination:**

    ```SQL
    select fact.c1
    from fact
    where fact.c1 = 10
      and fact.c2 like 'abc%'
    ```

    The transformation will apply under the following restrictions:

    - No columns from the parent table are retrieved.
    - No columns from the parent table other than the PK columns are referenced in the predicates.
    - Conservatively, only allow local predicates on PK columns or equi-joins between PK columns and other tables.
    - The join is directly above a base table access, i.e. no aliases or other expressions above the base table access.
    - Other restrictions on string data types.

    ## How was this patch tested?

    A new test suite HiveRIJElimSuite.scala was introduced.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/ioana-delaney/spark rijelim

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/20868.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #20868

----
commit 9d1f0f1841b4f7828534036c1b2cef4ef7f1d84a
Author: Ioana Delaney <ioanamdelaney@...>
Date:   2018-03-20T19:55:19Z

    [SPARK-23750] Add dependent DDL changes from SPARK-21784.

commit 0d189ab49b2dcb748b51f875f1a04e6b2fb9f69b
Author: Ioana Delaney <ioanamdelaney@...>
Date:   2018-03-20T23:29:11Z

    [SPARK-23750] Join elimination rewrite based on RI constraints.

----
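
As an informal illustration of why the rewrite in the example preserves results, the following sketch runs both queries from the PR description against a tiny in-memory SQLite database and compares the rows. This is not the patch's code and does not touch Spark: the table contents, the `attr` columns, and the `run_both` helper are assumptions made up for the illustration. SQLite also enforces the foreign keys here, whereas the constraints in the PR are informational only.

```python
import sqlite3

# Queries taken from the PR description (with plain ASCII quotes).
ORIGINAL = """
    select fact.c1
    from fact, dim1, dim2
    where fact.c1 = dim1.pk
      and fact.c2 = dim2.pk
      and dim1.pk = 10
      and dim2.pk like 'abc%'
"""
REWRITTEN = """
    select fact.c1
    from fact
    where fact.c1 = 10
      and fact.c2 like 'abc%'
"""


def run_both():
    """Build hypothetical fact/dim tables and run both query forms."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        PRAGMA foreign_keys = ON;
        CREATE TABLE dim1 (pk INTEGER PRIMARY KEY, attr TEXT);
        CREATE TABLE dim2 (pk TEXT PRIMARY KEY, attr TEXT);
        -- Every fact row references an existing parent row (RI holds).
        CREATE TABLE fact (
            c1 INTEGER NOT NULL REFERENCES dim1(pk),
            c2 TEXT NOT NULL REFERENCES dim2(pk)
        );
        INSERT INTO dim1 VALUES (10, 'a'), (20, 'b');
        INSERT INTO dim2 VALUES ('abc1', 'x'), ('abd', 'y'), ('xyz', 'z');
        INSERT INTO fact VALUES
            (10, 'abc1'), (10, 'xyz'), (20, 'abc1'), (10, 'abc1');
    """)
    original_rows = sorted(conn.execute(ORIGINAL).fetchall())
    rewritten_rows = sorted(conn.execute(REWRITTEN).fetchall())
    conn.close()
    return original_rows, rewritten_rows


if __name__ == "__main__":
    original_rows, rewritten_rows = run_both()
    print("with joins:   ", original_rows)
    print("joins removed:", rewritten_rows)
    print("same result:  ", original_rows == rewritten_rows)
```

Because each fact row matches exactly one row in each parent table, the joins neither drop nor duplicate fact rows, so the predicates on `dim1.pk` and `dim2.pk` can be moved onto `fact.c1` and `fact.c2`. If the constraint did not actually hold (for example, a fact row with no matching parent), the two forms could return different results.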