Chenzhao Guo created SPARK-22671:
------------------------------------

             Summary: SortMergeJoin reads more data when wholeStageCodegen is 
off than when it is on
                 Key: SPARK-22671
                 URL: https://issues.apache.org/jira/browse/SPARK-22671
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0
            Reporter: Chenzhao Guo


In SortMergeJoin (with wholeStageCodegen), an optimization already exists: if 
the left table of a partition is empty, there is no need to read the right 
table of the corresponding partition. This benefits the case in which many 
partitions of the left table are empty and the right table is big.

In the code path without wholeStageCodegen, however, this optimization is not 
applied, so the right table is still read for those partitions.
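
To illustrate the difference described above, here is a minimal toy sketch in 
Python. It is not Spark's implementation: CountingSource, 
merge_join_partition and the skip_right_if_left_empty flag are hypothetical 
names made up for this example, and the "flag off" branch only models a path 
that consumes the right input without first checking the left one.

```python
from typing import Iterable, Iterator, List, Tuple

Row = Tuple[int, str]  # (join key, payload)


class CountingSource:
    """Hypothetical stand-in for one partition; counts rows actually read."""

    def __init__(self, rows: List[Row]) -> None:
        self.rows = rows
        self.rows_read = 0

    def __iter__(self) -> Iterator[Row]:
        for row in self.rows:
            self.rows_read += 1
            yield row


def merge_join_partition(
    left: Iterable[Row],
    right: Iterable[Row],
    skip_right_if_left_empty: bool,
) -> List[Tuple[int, str, str]]:
    """Inner sort-merge join of one partition; both inputs sorted by key."""
    left_rows = list(left)
    # The optimization from the report: an empty left side can produce no
    # output for an inner join, so the right side need not be read at all.
    if skip_right_if_left_empty and not left_rows:
        return []

    # Toy model of the other path: the right side is consumed regardless.
    right_rows = list(right)

    out: List[Tuple[int, str, str]] = []
    i = j = 0
    while i < len(left_rows) and j < len(right_rows):
        left_key, right_key = left_rows[i][0], right_rows[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            # Pair the current left row with every right row sharing its key.
            k = j
            while k < len(right_rows) and right_rows[k][0] == left_key:
                out.append((left_key, left_rows[i][1], right_rows[k][1]))
                k += 1
            i += 1
    return out


if __name__ == "__main__":
    big_right = [(1, "a"), (2, "b"), (3, "c")]

    with_skip = CountingSource(big_right)
    merge_join_partition(CountingSource([]), with_skip, True)

    without_skip = CountingSource(big_right)
    merge_join_partition(CountingSource([]), without_skip, False)

    print("rows read with the shortcut:", with_skip.rows_read)
    print("rows read without the shortcut:", without_skip.rows_read)
```

Both calls return an empty result for the empty left partition; only the 
number of right-side rows read differs, which is the extra cost the report 
is about.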



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

