GitHub user wangyum opened a pull request: https://github.com/apache/spark/pull/22250
[SPARK-25259][SQL] left/right join support push down during-join predicates ## What changes were proposed in this pull request? Prepare data: ```sql create temporary view EMPLOYEE as select * from values ("000010", "HAAS", "A00"), ("000010", "THOMPSON", "B01"), ("000030", "KWAN", "C01"), ("000110", "LUCCHESSI", "A00"), ("000120", "O'CONNELL", "A))"), ("000130", "QUINTANA", "C01") as EMPLOYEE(EMPNO, LASTNAME, WORKDEPT); create temporary view DEPARTMENT as select * from values ("A00", "SPIFFY COMPUTER SERVICE DIV.", "000010"), ("B01", "PLANNING", "000020"), ("C01", "INFORMATION CENTER", "000030"), ("D01", "DEVELOPMENT CENTER", null) as EMPLOYEE(DEPTNO, DEPTNAME, MGRNO); create temporary view PROJECT as select * from values ("AD3100", "ADMIN SERVICES", "D01"), ("IF1000", "QUERY SERVICES", "C01"), ("IF2000", "USER EDUCATION", "E01"), ("MA2100", "WELD LINE AUDOMATION", "D01"), ("PL2100", "WELD LINE PLANNING", "01") as EMPLOYEE(PROJNO, PROJNAME, DEPTNO); ``` For the below SQL, we can push `DEPTNO='E01'` to right side to reduce data reading: ```sql SELECT PROJNO, PROJNAME, P.DEPTNO, DEPTNAME FROM PROJECT P LEFT OUTER JOIN DEPARTMENT D ON P.DEPTNO = D.DEPTNO AND P.DEPTNO='E01'; ``` Optimized SQL is equivalent to: ```sql SELECT PROJNO, PROJNAME, P.DEPTNO, DEPTNAME FROM PROJECT P LEFT OUTER JOIN (SELECT * FROM DEPARTMENT WHERE DEPTNO='E01') D ON P.DEPTNO = D.DEPTNO AND P.DEPTNO='E01'; ``` This pr enhancement `PushPredicateThroughJoin` to support this feature. ## How was this patch tested? unit tests You can merge this pull request into a Git repository by running: $ git pull https://github.com/wangyum/spark SPARK-25259 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/22250.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #22250 ---- commit f9b32d5d044a899529959ad5042f8cf95c789ea8 Author: Yuming Wang <yumwang@...> Date: 2018-08-28T06:18:05Z left/right join support push down during-join predicates ---- --- --------------------------------------------------------------------- To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org