Gopal V created HIVE-11306:
------------------------------

             Summary: Add a bloom-1 filter for Hybrid MapJoin spills
                 Key: HIVE-11306
                 URL: https://issues.apache.org/jira/browse/HIVE-11306
             Project: Hive
          Issue Type: Improvement
          Components: Hive
    Affects Versions: 1.3.0, 2.0.0
            Reporter: Gopal V
            Assignee: Gopal V


HIVE-9277 implemented spillable joins for Tez, which suffer from a corner-case
performance issue when joining a wide small table against a narrow big table
(for example, a user info table joined against an events stream).

Spilling the wide table causes extra IO, even though the number of distinct
values (nDV) of the join key might only be in the thousands.

A cheap bloom-1 filter would give a large performance gain for such queries
by sharply cutting the spill IO cost of the big-table spills.
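
To illustrate the idea, here is a minimal sketch of a bloom-1 (single-word,
blocked) filter. It is not the Hive implementation: the class name
Bloom1Filter, the choice of 4 bits per key, the 64-bit word size and the
murmur3-style mixer are all assumptions made for this example, and it assumes
join keys have already been reduced to a 64-bit hash.

```java
/**
 * Sketch of a "bloom-1" filter: each key maps to ONE 64-bit word and sets a
 * few bits inside it, so a probe costs a single memory access.
 * Hypothetical class, not part of Hive.
 */
final class Bloom1Filter {
    private static final int BITS_PER_KEY = 4; // assumed, not tuned
    private final long[] words;
    private final int mask;

    Bloom1Filter(int minWords) {
        if (minWords < 1 || minWords > (1 << 30)) {
            throw new IllegalArgumentException("minWords out of range: " + minWords);
        }
        // Round up to a power of two so the word index is a cheap bit mask.
        int n = 1;
        while (n < minWords) {
            n <<= 1;
        }
        this.words = new long[n];
        this.mask = n - 1;
    }

    /** 64-bit finalizer from MurmurHash3, used here only to scramble the key hash. */
    private static long mix(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Builds the in-word bit pattern from the low 24 bits (6 bits per position). */
    private static long pattern(long h) {
        long bits = 0L;
        for (int i = 0; i < BITS_PER_KEY; i++) {
            bits |= 1L << (int) ((h >>> (6 * i)) & 63);
        }
        return bits;
    }

    /** The upper 32 bits of the mixed hash select the word. */
    private int wordIndex(long h) {
        return ((int) (h >>> 32)) & mask;
    }

    void add(long keyHash) {
        long h = mix(keyHash);
        words[wordIndex(h)] |= pattern(h);
    }

    /** False means the key was definitely never added; true means "maybe". */
    boolean mightContain(long keyHash) {
        long h = mix(keyHash);
        long bits = pattern(h);
        return (words[wordIndex(h)] & bits) == bits;
    }
}
```

Under these assumptions, the filter would be populated with the small-table
join keys as they are loaded, and a big-table row whose key fails
mightContain() could be dropped instead of being written to a spill file.
False positives only cost an unnecessary spill; there are no false negatives.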



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
