Stamatis Zampetakis created HIVE-24251:
------------------------------------------
Summary: Improve bloom filter size estimation for multi-column semijoin reducers
Key: HIVE-24251
URL: https://issues.apache.org/jira/browse/HIVE-24251
Project: Hive
Issue Type: Improvement
Components: Query Planning
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
There are various cases where the expected size of the bloom filter is
significantly underestimated, making the semijoin reducer completely
ineffective. This is more relevant for multi-column semijoin reducers, since the
current
[code|https://github.com/apache/hive/blob/d61c9160ffa5afbd729887c3db690eccd7ef8238/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBloomFilter.java#L273]
does not take multiple columns into account.
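
To illustrate why an underestimated entry count makes the filter ineffective, here is a minimal sketch based on the textbook bloom filter sizing formulas. It is not the Hive implementation: the class name, the method names, the 5% target false positive probability, and the entry counts are all assumptions chosen for illustration.

```java
// Illustrative sketch only: textbook bloom filter math, not Hive's code.
// All names and numbers below are hypothetical.
public final class BloomFilterSizing {
    private BloomFilterSizing() {}

    /** Bits needed for n expected entries at false positive probability p: m = -n ln(p) / (ln 2)^2. */
    static long optimalNumBits(long expectedEntries, double fpp) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedEntries * Math.log(fpp) / (ln2 * ln2));
    }

    /** Number of hash functions for m bits and n expected entries: k = (m / n) ln 2, at least 1. */
    static int optimalNumHashFunctions(long expectedEntries, long numBits) {
        return Math.max(1, (int) Math.round((double) numBits / expectedEntries * Math.log(2)));
    }

    /** False positive probability after inserting n actual entries: (1 - e^(-k n / m))^k. */
    static double effectiveFpp(long actualEntries, long numBits, int numHashFunctions) {
        double fillRatio = 1 - Math.exp(-(double) numHashFunctions * actualEntries / numBits);
        return Math.pow(fillRatio, numHashFunctions);
    }

    public static void main(String[] args) {
        long estimatedEntries = 1_000;   // assumed planner estimate
        long actualEntries = 100_000;    // assumed number of distinct keys at runtime
        double targetFpp = 0.05;         // assumed target false positive probability

        // The filter is sized from the estimate, then filled with the actual keys.
        long numBits = optimalNumBits(estimatedEntries, targetFpp);
        int numHashes = optimalNumHashFunctions(estimatedEntries, numBits);

        System.out.printf("bits=%d, hash functions=%d%n", numBits, numHashes);
        System.out.printf("fpp if estimate is right: %.4f%n",
                effectiveFpp(estimatedEntries, numBits, numHashes));
        System.out.printf("fpp with actual entries:  %.4f%n",
                effectiveFpp(actualEntries, numBits, numHashes));
    }
}
```

With these assumed numbers, a filter sized for the estimate stays close to the target rate when the estimate is right, but is almost fully saturated once it receives a hundred times more keys, so nearly every probe passes and the reducer filters out next to nothing.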