Deepak Jaiswal created HIVE-15676:
-------------------------------------

             Summary: Remove Bloom Filters from semi join reduction if it is too big.
                 Key: HIVE-15676
                 URL: https://issues.apache.org/jira/browse/HIVE-15676
             Project: Hive
          Issue Type: Improvement
            Reporter: Deepak Jaiswal
            Assignee: Deepak Jaiswal

Bloom filters can become very large when the row count is high, and aggregating them in reducers can be even more expensive. For example, a bloom filter for 100M rows can be as large as 170 MB; aggregating 100 such filters in a reducer could take 17 GB of memory.
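
The arithmetic above can be sketched as follows. This is only an illustration, not Hive's implementation: it uses the textbook optimal-size formula m = -n ln(p) / (ln 2)^2, and the function names, the false-positive probability, and the size threshold are all hypothetical. The issue does not state which false-positive probability lies behind the 170 MB figure, so the sketch takes the per-filter size as an input when estimating the reducer cost.

```python
import math


def bloom_filter_bytes(expected_entries: int, fpp: float) -> int:
    """Textbook optimal bloom filter size in bytes for `expected_entries`
    items at false-positive probability `fpp` (assumed formula, not
    necessarily the sizing Hive uses)."""
    bits = math.ceil(-expected_entries * math.log(fpp) / (math.log(2) ** 2))
    return math.ceil(bits / 8)


def aggregation_bytes(filter_bytes: int, num_filters: int) -> int:
    """Worst-case memory if one reducer holds every incoming filter at once."""
    return filter_bytes * num_filters


def should_drop_filter(expected_entries: int, fpp: float, max_bytes: int) -> bool:
    """Hypothetical policy: skip semi join reduction when the estimated
    filter size exceeds `max_bytes`."""
    return bloom_filter_bytes(expected_entries, fpp) > max_bytes


if __name__ == "__main__":
    # Figures quoted in the issue: 100 filters of 170 MB each.
    total = aggregation_bytes(170_000_000, 100)
    print(f"aggregate: {total / 1e9:.0f} GB")

    # Size estimate for 100M rows at an assumed fpp of 0.05.
    size = bloom_filter_bytes(100_000_000, 0.05)
    print(f"estimated filter size: {size / 1e6:.1f} MB")
    print("drop filter:", should_drop_filter(100_000_000, 0.05, 50_000_000))
```

Because the size grows linearly with the row count, a check like this could run at planning time from the estimated row count alone, before any filter is built.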