Gopal V created HIVE-11306:
------------------------------

             Summary: Add a bloom-1 filter for Hybrid MapJoin spills
                 Key: HIVE-11306
                 URL: https://issues.apache.org/jira/browse/HIVE-11306
             Project: Hive
          Issue Type: Improvement
          Components: Hive
    Affects Versions: 1.3.0, 2.0.0
            Reporter: Gopal V
            Assignee: Gopal V

HIVE-9277 implemented spillable joins for Tez, which suffer from a corner-case performance issue when joining a wide small table against a narrow big table (for example, a user info table joined against an events stream).

Spilling the wide table causes extra IO on the big-table side as well, even though the number of distinct values (nDV) of the join key might only be in the thousands.

A cheap bloom-1 filter would give a large performance gain for such queries by cutting down the spill IO costs for the big-table spills.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
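
To illustrate the proposal above, here is a minimal sketch of a bloom-1 style filter, assuming the usual meaning of the term: every key maps to a single 64-bit word and sets a few bits inside that word, so a lookup touches one memory location. The class name `Bloom1Filter`, the constant `K`, and the sizing of roughly 16 bits per key are hypothetical choices made for this example. They are not taken from the Hive code base or from any patch on this issue.

```java
/** Hypothetical bloom-1 sketch for illustration; not the Hive implementation. */
class Bloom1Filter {
    private static final int K = 4;   // bits set per key (assumed value)
    private final long[] words;       // each 64-bit word is one filter block
    private final int mask;           // words.length - 1, length is a power of two

    Bloom1Filter(int expectedKeys) {
        // Assumed sizing: about 16 bits per key, i.e. one word per 4 keys,
        // rounded up to a power of two so the index is a cheap bit mask.
        int target = Math.max(1, (expectedKeys + 3) / 4);
        int size = Integer.highestOneBit(target);
        if (size < target) {
            size <<= 1;
        }
        this.words = new long[size];
        this.mask = size - 1;
    }

    /** 64-bit finalizer from MurmurHash3, used to spread the key hash. */
    private static long mix(long h) {
        h ^= h >>> 33;
        h *= 0xff51afd7ed558ccdL;
        h ^= h >>> 33;
        h *= 0xc4ceb9fe1a85ec53L;
        h ^= h >>> 33;
        return h;
    }

    /** Builds the K-bit pattern for one word from the low 6*K hash bits. */
    private static long pattern(long h) {
        long bits = 0L;
        for (int i = 0; i < K; i++) {
            int pos = (int) ((h >>> (6 * i)) & 63);
            bits |= 1L << pos;
        }
        return bits;
    }

    /** Records a small-table join key hash. */
    void add(long keyHash) {
        long h = mix(keyHash);
        int idx = (int) (h >>> 32) & mask;   // upper bits pick the word
        words[idx] |= pattern(h);
    }

    /** Returns false only if the key was definitely never added. */
    boolean mightContain(long keyHash) {
        long h = mix(keyHash);
        int idx = (int) (h >>> 32) & mask;
        long p = pattern(h);
        return (words[idx] & p) == p;
    }
}
```

Under these assumptions, the filter would be populated with the small-table join keys while the hash table is built. Before a big-table row headed for a spilled partition is written to disk, `mightContain` would be checked, and for an inner join a row that fails the check could be dropped instead of spilled. With a join-key nDV in the thousands, this sizing keeps the filter to a few kilobytes.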