[ 
https://issues.apache.org/jira/browse/SPARK-44307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Hyukjin Kwon updated SPARK-44307:
---------------------------------
    Target Version/s:   (was: 3.4.1)

> Bloom filter is not added for left outer join if the left side table is 
> smaller than broadcast threshold.
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: SPARK-44307
>                 URL: https://issues.apache.org/jira/browse/SPARK-44307
>             Project: Spark
>          Issue Type: Bug
>          Components: Optimizer
>    Affects Versions: 3.4.1
>            Reporter: mahesh kumar behera
>            Priority: Major
>
> In a left outer join, a shuffle join is used even if the left-side table is 
> small enough to be broadcast. This follows from the semantics of the left 
> outer join: if the left side were broadcast, the result would be wrong. The 
> bloom filter injection does not take this into account. While injecting the 
> bloom filter, if the left side is smaller than the broadcast threshold, the 
> bloom filter is not added, on the assumption that the left side will be 
> broadcast and no bloom filter is needed. As a result, the bloom filter 
> optimization is missed for a left outer join with a small left side and a 
> huge right-side table.
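
The reasoning in the description can be illustrated with a small toy model.
This is only a sketch under stated assumptions, not Spark's optimizer code:
every name below (BROADCAST_THRESHOLD, BROADCASTABLE_SIDES, both functions)
is hypothetical, and the join-type-aware check is one possible way to express
the expected behaviour, not necessarily how the issue is or will be fixed.

```python
# Toy model only: none of these names exist in Spark.
BROADCAST_THRESHOLD = 10 * 1024 * 1024  # bytes; illustrative value

# Sides that may act as the broadcast (build) side for each join type.
BROADCASTABLE_SIDES = {
    "inner": {"left", "right"},
    "left_outer": {"right"},   # broadcasting the left side would give wrong results
    "right_outer": {"left"},   # mirror image of the left outer case
}


def is_small(size_bytes):
    """True if a relation is at or below the broadcast threshold."""
    return size_bytes <= BROADCAST_THRESHOLD


def skips_bloom_filter_size_only(left_size, right_size):
    """Behaviour described in the report: only the sizes are checked."""
    return is_small(left_size) or is_small(right_size)


def skips_bloom_filter_join_aware(join_type, left_size, right_size):
    """Skip the bloom filter only if a small side may actually be broadcast."""
    sides = BROADCASTABLE_SIDES[join_type]
    left_ok = "left" in sides and is_small(left_size)
    right_ok = "right" in sides and is_small(right_size)
    return left_ok or right_ok


small = 1024                # 1 KiB left side
huge = 100 * 1024 ** 3      # 100 GiB right side
```

In this model, skips_bloom_filter_size_only(small, huge) skips the bloom
filter for the left outer join even though the join must still shuffle, while
skips_bloom_filter_join_aware("left_outer", small, huge) keeps it.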



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
