Stamatis Zampetakis created HIVE-24251:
------------------------------------------

             Summary: Improve bloom filter size estimation for multi column semijoin reducers
                 Key: HIVE-24251
                 URL: https://issues.apache.org/jira/browse/HIVE-24251
             Project: Hive
          Issue Type: Improvement
          Components: Query Planning
            Reporter: Stamatis Zampetakis
            Assignee: Stamatis Zampetakis


There are various cases where the expected size of the bloom filter is significantly underestimated, making the semijoin reducer completely ineffective. This is more relevant for multi-column semijoin reducers, since the current [code|https://github.com/apache/hive/blob/d61c9160ffa5afbd729887c3db690eccd7ef8238/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDAFBloomFilter.java#L273] does not take them into account.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
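
To illustrate why an underestimated entry count makes a bloom filter ineffective, here is a minimal sketch based on the textbook bloom filter sizing formulas. It is not the code from GenericUDAFBloomFilter; the class name, method names, entry counts, and target false-positive probability are all hypothetical and chosen only for illustration.

```java
// Illustrative sketch only: textbook bloom filter formulas, not Hive's implementation.
class BloomFilterSizing {

    // Optimal number of bits for n expected entries and target
    // false-positive probability p: m = -n * ln(p) / (ln 2)^2
    static long optimalNumBits(long expectedEntries, double fpp) {
        double ln2 = Math.log(2);
        return (long) Math.ceil(-expectedEntries * Math.log(fpp) / (ln2 * ln2));
    }

    // Optimal number of hash functions: k = (m / n) * ln 2, at least 1
    static int optimalNumHashes(long expectedEntries, long numBits) {
        return Math.max(1, (int) Math.round((double) numBits / expectedEntries * Math.log(2)));
    }

    // Approximate false-positive rate after inserting n entries:
    // (1 - e^(-k * n / m))^k
    static double falsePositiveRate(long insertedEntries, long numBits, int numHashes) {
        double fillRatio = 1 - Math.exp(-(double) numHashes * insertedEntries / numBits);
        return Math.pow(fillRatio, numHashes);
    }

    public static void main(String[] args) {
        long estimatedEntries = 10_000;   // hypothetical planner estimate
        long actualEntries = 1_000_000;   // hypothetical true number of distinct keys
        double targetFpp = 0.05;          // hypothetical target false-positive probability

        // The filter is sized from the estimate, not from the true key count.
        long bits = optimalNumBits(estimatedEntries, targetFpp);
        int hashes = optimalNumHashes(estimatedEntries, bits);

        System.out.printf("bits=%d hashes=%d%n", bits, hashes);
        System.out.printf("fpp if the estimate is right: %.4f%n",
                falsePositiveRate(estimatedEntries, bits, hashes));
        System.out.printf("fpp with the actual entries:  %.4f%n",
                falsePositiveRate(actualEntries, bits, hashes));
    }
}
```

With these hypothetical numbers, a filter sized for the estimate stays near the target rate only if the estimate holds. Once it receives 100 times more keys, nearly every bit is set, the false-positive rate approaches 1, and the filter lets almost every probed row through.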