[
https://issues.apache.org/jira/browse/IMPALA-7635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amogh Margoor resolved IMPALA-7635.
-----------------------------------
Resolution: Fixed
> Reduce size of hash tables in-memory by packing buckets more densely
> --------------------------------------------------------------------
>
> Key: IMPALA-7635
> URL: https://issues.apache.org/jira/browse/IMPALA-7635
> Project: IMPALA
> Issue Type: Improvement
> Components: Backend
> Affects Versions: Impala 3.1.0
> Reporter: Tim Armstrong
> Assignee: Amogh Margoor
> Priority: Major
> Labels: perf
>
> Currently the hash tables used for hash join and aggregation use 16 bytes per
> bucket and 24 bytes per additional duplicate for a key:
> {code}
> /// Linked list of entries used for duplicates.
> struct DuplicateNode {
> /// Used for full outer and right {outer, anti, semi} joins. Indicates
> whether the
> /// row in the DuplicateNode has been matched.
> /// From an abstraction point of view, this is an awkward place to store
> this
> /// information.
> /// TODO: Fold this flag in the next pointer below.
> bool matched;
> /// Chain to next duplicate node, NULL when end of list.
> DuplicateNode* next;
> HtData htdata;
> };
> struct Bucket {
> /// Whether this bucket contains a vaild entry, or it is empty.
> bool filled;
> /// Used for full outer and right {outer, anti, semi} joins. Indicates
> whether the
> /// row in the bucket has been matched.
> /// From an abstraction point of view, this is an awkward place to store
> this
> /// information but it is efficient. This space is otherwise unused.
> bool matched;
> /// Used in case of duplicates. If true, then the bucketData union should
> be used as
> /// 'duplicates'.
> bool hasDuplicates;
> /// Cache of the hash for data.
> /// TODO: Do we even have to cache the hash value?
> uint32_t hash;
> /// Either the data for this bucket or the linked list of duplicates.
> union {
> HtData htdata;
> DuplicateNode* duplicates;
> } bucketData;
> };
> {code}
> There are some comments in the code that suggest folding the boolean values
> into the upper bits of the pointers (since on amd64 the address space is only
> 48 bits, but moving to 57 bits apparently - see
> https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf).
> That would reduce the bucket to 12 bytes of actual data.
> This would give us the opportunity to reduce memory requirements of joins and
> the pressure on caches significantly, provided we can work out the
> implementation issues and the cost of the bit manipulation doesn't exceed the
> benefit (my intuition is that cache effects are way more important but I
> could be wrong).
> Here's a rough idea of what we could do:
> # Implement folding of booleans into the pointer and mark struct Bucket as
> packed so that it doesn't just undo the work with additional padding.
> # Modifying Hashtable to work with the new bucket structure. This needs a
> little thought since the bucket allocations must be a power-of-two size in
> bytes, but we also need the hash table entries to be a power-of-two in order
> for masking the hash to get the bucket number to work. I think either we
> could just leave wasted space in the buffer or switch to a non-power-of-two
> number of buckets and using an alternative method of getting the bucket from
> the hash:
> https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
> # Run benchmarks to see if it's beneficial. The effect probably depends on
> the data set size.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]