Deepak Jaiswal created HIVE-15676:
-------------------------------------

             Summary: Remove Bloom Filters from semi join reduction if it is too big.
                 Key: HIVE-15676
                 URL: https://issues.apache.org/jira/browse/HIVE-15676
             Project: Hive
          Issue Type: Improvement
            Reporter: Deepak Jaiswal
            Assignee: Deepak Jaiswal


Bloom filters can themselves become very large when the row count is high, and 
aggregating such Bloom filters in reducers can be even more expensive. For 
example, a Bloom filter for 100M rows can be as big as 170MB. Aggregating 100 
such filters in a reducer could end up taking 17GB of memory.
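
The arithmetic above can be sketched as follows. This is only a rough 
illustration, not Hive's implementation: it assumes the textbook sizing formula 
for a standard Bloom filter, m = -n * ln(p) / (ln 2)^2 bits, and decimal 
megabytes. The function names, the 0.05 false-positive probability, and the 
1GB budget are hypothetical values chosen for the example; the issue does not 
state the parameters behind the 170MB figure.

```python
import math


def bloom_filter_bytes(n_rows: int, fpp: float) -> int:
    """Approximate size in bytes of a standard Bloom filter.

    Uses the textbook optimum m = -n * ln(p) / (ln 2)^2 bits.
    Real implementations may round up or add per-object overhead.
    """
    bits = -n_rows * math.log(fpp) / (math.log(2) ** 2)
    return math.ceil(bits / 8)


def aggregation_bytes(filter_bytes: int, num_filters: int) -> int:
    """Memory needed if a reducer holds num_filters filters at once."""
    return filter_bytes * num_filters


def exceeds_budget(filter_bytes: int, num_filters: int, max_bytes: int) -> bool:
    """Hypothetical check: is the aggregate footprint over the budget?"""
    return aggregation_bytes(filter_bytes, num_filters) > max_bytes


if __name__ == "__main__":
    # Figures quoted in the issue: 170MB per filter, 100 filters.
    per_filter = 170 * 10**6
    total = aggregation_bytes(per_filter, 100)
    print(f"aggregate: {total / 10**9:.0f} GB")

    # Textbook estimate for 100M rows at an assumed fpp of 0.05.
    estimate = bloom_filter_bytes(100_000_000, 0.05)
    print(f"estimate at fpp=0.05: {estimate / 10**6:.1f} MB")

    # Hypothetical 1GB budget for the aggregated filters.
    print("over budget:", exceeds_budget(per_filter, 100, 10**9))
```

The per-filter size grows linearly with the row count, and the aggregate grows 
linearly with the number of filters held at once, which is why a size check 
before building the filter is attractive.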



