Cheng Su created SPARK-34729:
--------------------------------

             Summary: Faster execution for broadcast nested loop join (left 
semi/anti with no condition)
                 Key: SPARK-34729
                 URL: https://issues.apache.org/jira/browse/SPARK-34729
             Project: Spark
          Issue Type: Sub-task
          Components: SQL
    Affects Versions: 3.2.0
            Reporter: Cheng Su


For `BroadcastNestedLoopJoinExec` left semi and left anti joins without a 
condition, when the left side is broadcast, we currently check whether every 
row from the broadcast side has a match by iterating over the broadcast side 
many times: 
[https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/joins/BroadcastNestedLoopJoinExec.scala#L256-L275]
This is unnecessary: because there is no condition, we only need to check 
whether the stream side is empty. This Jira is created to add that 
optimization, which can significantly speed up the affected queries.
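
To illustrate the reasoning above, here is a minimal sketch in plain Python. It is not Spark code: the function names and the list-based row model are assumptions made for illustration only. With no join condition, every left row matches as soon as the other side has at least one row, so the result depends only on whether the stream side is empty.

```python
from typing import Iterable, List, TypeVar

L = TypeVar("L")
R = TypeVar("R")

# Hypothetical helper names; this models the join semantics, not Spark's API.


def naive_no_condition_join(left: List[L], right: List[R], anti: bool) -> List[L]:
    """Per-row check: scan the right side once for every left row."""
    result = []
    for row in left:
        # Without a condition, any right row counts as a match.
        has_match = any(True for _ in right)
        if has_match != anti:
            result.append(row)
    return result


def fast_no_condition_join(left: List[L], stream: Iterable[R], anti: bool) -> List[L]:
    """Single emptiness check on the stream side decides the whole result."""
    sentinel = object()
    stream_is_empty = next(iter(stream), sentinel) is sentinel
    if anti:
        # Left anti: keep all left rows only when nothing can match.
        return list(left) if stream_is_empty else []
    # Left semi: keep all left rows only when something can match.
    return [] if stream_is_empty else list(left)
```

Both functions return the same rows for the same inputs; the second one simply replaces the repeated scans with one emptiness check, which is the shortcut this issue proposes.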


