Kunal Khatua created DRILL-7141:
-----------------------------------
Summary: Hash-Join (and Agg) should always spill to disk the least
used partition
Key: DRILL-7141
URL: https://issues.apache.org/jira/browse/DRILL-7141
Project: Apache Drill
Issue Type: Improvement
Components: Execution - Relational Operators
Affects Versions: 1.15.0
Reporter: Kunal Khatua
Assignee: Boaz Ben-Zvi
Fix For: Future
When the probe-side data for a hash join is skewed, it is preferable to keep
the corresponding build-side partition in memory.
Currently, with the spill-to-disk feature, the partition to spill is chosen
at random. As a result, heavily skewed probe-side data can also end up
spilling, because the corresponding hash-table partition is no longer in memory.
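
A minimal sketch of the proposed selection policy, for illustration only. The
names below (SpillVictimChooser, Partition, expectedProbeRows) are hypothetical
and are not Drill's actual classes or fields. The sketch also assumes that some
per-partition usage estimate is available when the spill decision is made, for
example from statistics or sampling; this issue does not specify how "least
used" would be measured.

```java
import java.util.List;

/** Hypothetical illustration of "spill the least used partition"; not Drill code. */
class SpillVictimChooser {

    /** Minimal stand-in for one build-side hash partition. */
    static final class Partition {
        final int id;
        final long expectedProbeRows; // assumed usage estimate for this partition
        final boolean spilled;        // true if the partition is already on disk

        Partition(int id, long expectedProbeRows, boolean spilled) {
            this.id = id;
            this.expectedProbeRows = expectedProbeRows;
            this.spilled = spilled;
        }
    }

    /**
     * Returns the id of the in-memory partition with the lowest usage estimate,
     * or -1 if every partition has already been spilled.
     */
    static int chooseVictim(List<Partition> partitions) {
        Partition victim = null;
        for (Partition p : partitions) {
            if (p.spilled) {
                continue; // already on disk, cannot be chosen again
            }
            if (victim == null || p.expectedProbeRows < victim.expectedProbeRows) {
                victim = p;
            }
        }
        return victim == null ? -1 : victim.id;
    }
}
```

With such a policy, a partition that receives most of the skewed probe rows
would be the last candidate for spilling rather than an equally likely one.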