[
https://issues.apache.org/jira/browse/SPARK-44307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Hyukjin Kwon updated SPARK-44307:
---------------------------------
Target Version/s: (was: 3.4.1)
> Bloom filter is not added for left outer join if the left side table is
> smaller than broadcast threshold.
> ---------------------------------------------------------------------------------------------------------
>
> Key: SPARK-44307
> URL: https://issues.apache.org/jira/browse/SPARK-44307
> Project: Spark
> Issue Type: Bug
> Components: Optimizer
> Affects Versions: 3.4.1
> Reporter: mahesh kumar behera
> Priority: Major
>
> For a left outer join, a shuffle join is used even if the left-side table is
> small enough to be broadcast. This follows from the semantics of the left
> outer join: broadcasting the left side would produce wrong results. The
> bloom filter injection logic does not account for this. When injecting the
> bloom filter, if the left side is smaller than the broadcast threshold, no
> bloom filter is added, on the assumption that the left side will be
> broadcast and a bloom filter is unnecessary. As a result, the bloom filter
> optimization is missed for a left outer join with a small left side and a
> huge right-side table.
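
The decision described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Spark's optimizer code: the names `JoinSide`, `fits_broadcast`, `is_probably_shuffle_join_size_only`, and `is_probably_shuffle_join` are hypothetical, and the join-type sets are a simplification covering only the join types listed. It contrasts a size-only check (the behaviour reported in this issue) with a check that also asks whether the join type allows a given side to be the broadcast side.

```python
from dataclasses import dataclass

# Hypothetical, simplified model: join types for which each side may be
# the broadcast (build) side. A left outer join must preserve every left
# row, so only the right side can be broadcast.
BUILD_LEFT_OK = {"inner", "right_outer"}
BUILD_RIGHT_OK = {"inner", "left_outer", "left_semi", "left_anti"}


@dataclass
class JoinSide:
    size_in_bytes: int  # estimated size of this side of the join


def fits_broadcast(side: JoinSide, threshold: int) -> bool:
    """True if the side is small enough to be broadcast by size alone."""
    return 0 <= side.size_in_bytes <= threshold


def is_probably_shuffle_join_size_only(left: JoinSide, right: JoinSide,
                                       threshold: int) -> bool:
    """Reported behaviour: any small side is assumed to be broadcast."""
    return not fits_broadcast(left, threshold) and not fits_broadcast(right, threshold)


def is_probably_shuffle_join(join_type: str, left: JoinSide, right: JoinSide,
                             threshold: int) -> bool:
    """Join-type-aware check: a small side counts only if it may be built."""
    can_broadcast_left = join_type in BUILD_LEFT_OK and fits_broadcast(left, threshold)
    can_broadcast_right = join_type in BUILD_RIGHT_OK and fits_broadcast(right, threshold)
    return not (can_broadcast_left or can_broadcast_right)


if __name__ == "__main__":
    threshold = 10 * 1024 * 1024           # 10 MiB, chosen for illustration
    small_left = JoinSide(1 * 1024 * 1024)  # 1 MiB
    huge_right = JoinSide(100 * 1024 ** 3)  # 100 GiB

    # Size-only check concludes "not a shuffle join", so no bloom filter is added.
    print(is_probably_shuffle_join_size_only(small_left, huge_right, threshold))
    # Join-type-aware check recognises that the join will still be shuffled.
    print(is_probably_shuffle_join("left_outer", small_left, huge_right, threshold))
```

In this model, a bloom filter would be considered only when the join is probably a shuffle join. The size-only check rules it out for the small-left, huge-right left outer join, while the join-type-aware check keeps it as a candidate.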