Deepak Jaiswal created HIVE-15676:
-------------------------------------
Summary: Remove Bloom Filters from semi join reduction if they are
too big.
Key: HIVE-15676
URL: https://issues.apache.org/jira/browse/HIVE-15676
Project: Hive
Issue Type: Improvement
Reporter: Deepak Jaiswal
Assignee: Deepak Jaiswal
Bloom filters can themselves become very large when the row count is high, and
aggregating them in reducers can be even more expensive. For example, a bloom
filter for 100M rows can be as large as 170MB, so aggregating 100 such filters
in a single reducer could take up to 17GB of memory.
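
The arithmetic above can be checked with a rough sketch. It uses the textbook
bloom filter sizing formula, m = -n * ln(p) / (ln 2)^2 bits, and is not Hive's
actual implementation. The function names, the false-positive probability, and
the size threshold are all hypothetical, chosen only for illustration. The
formula gives the raw bit-array size, so it will not necessarily reproduce the
170MB figure quoted in this issue.

```python
import math


def bloom_filter_bytes(expected_entries: int, fpp: float) -> int:
    """Raw bit-array size in bytes for a textbook bloom filter.

    Uses m = -n * ln(p) / (ln 2)^2 bits, rounded up to whole bytes.
    The fpp (false-positive probability) is an assumed input, not a value
    taken from Hive.
    """
    if expected_entries <= 0 or not (0.0 < fpp < 1.0):
        raise ValueError("expected_entries must be > 0 and 0 < fpp < 1")
    bits = math.ceil(-expected_entries * math.log(fpp) / (math.log(2) ** 2))
    return (bits + 7) // 8


def aggregation_bytes(filter_bytes: int, num_filters: int) -> int:
    """Worst-case memory if a reducer holds all incoming filters at once."""
    return filter_bytes * num_filters


def should_drop_filter(filter_bytes: int, max_bytes: int) -> bool:
    """Hypothetical check: drop the filter when it exceeds a size budget."""
    return filter_bytes > max_bytes


if __name__ == "__main__":
    # Figures quoted in the issue: 170MB per filter, 100 filters.
    quoted = 170 * 10**6
    print(aggregation_bytes(quoted, 100) / 10**9, "GB")

    # Textbook estimate for 100M rows at an assumed fpp of 0.05.
    print(bloom_filter_bytes(100_000_000, 0.05) / 10**6, "MB")

    # Assumed budget of 100MB per filter (hypothetical threshold).
    print(should_drop_filter(quoted, 100 * 10**6))
```

A size check like should_drop_filter would let the planner skip the semi join
reduction when the estimated filter is too large, which is the behavior this
issue proposes.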