Tim Armstrong created IMPALA-7635:
-------------------------------------

             Summary: Reduce size of hash tables in-memory by packing buckets 
more densely
                 Key: IMPALA-7635
                 URL: https://issues.apache.org/jira/browse/IMPALA-7635
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 3.1.0
            Reporter: Tim Armstrong


Currently the hash tables used for hash join and aggregation use 16 bytes per 
bucket and 24 bytes per additional duplicate for a key:

{code}
  /// Linked list of entries used for duplicates.
  struct DuplicateNode {
    /// Used for full outer and right {outer, anti, semi} joins. Indicates 
whether the
    /// row in the DuplicateNode has been matched.
    /// From an abstraction point of view, this is an awkward place to store 
this
    /// information.
    /// TODO: Fold this flag in the next pointer below.
    bool matched;

    /// Chain to next duplicate node, NULL when end of list.
    DuplicateNode* next;
    HtData htdata;
  };

  struct Bucket {
    /// Whether this bucket contains a vaild entry, or it is empty.
    bool filled;

    /// Used for full outer and right {outer, anti, semi} joins. Indicates 
whether the
    /// row in the bucket has been matched.
    /// From an abstraction point of view, this is an awkward place to store 
this
    /// information but it is efficient. This space is otherwise unused.
    bool matched;

    /// Used in case of duplicates. If true, then the bucketData union should 
be used as
    /// 'duplicates'.
    bool hasDuplicates;

    /// Cache of the hash for data.
    /// TODO: Do we even have to cache the hash value?
    uint32_t hash;

    /// Either the data for this bucket or the linked list of duplicates.
    union {
      HtData htdata;
      DuplicateNode* duplicates;
    } bucketData;
  };

{code}

There are some comments in the code that suggest folding the boolean values 
into the upper bits of the pointers (since on amd64 the address space is only 
48 bits, but moving to 57 bits apparently - see 
https://software.intel.com/sites/default/files/managed/2b/80/5-level_paging_white_paper.pdf).
 That would reduce the bucket to 12 bytes of actual data.

This would give us the opportunity to reduce memory requirements of joins and 
the pressure on caches significantly, provided we can work out the 
implementation issues and the cost of the bit manipulation doesn't exceed the 
benefit (my intuition is that cache effects are way more important but I could 
be wrong).

Here's a rough idea of what we could do:
# Implement folding of booleans into the pointer and mark struct Bucket as 
packed so that it doesn't just undo the work with additional padding.
# Modifying Hashtable to work with the new bucket structure. This needs a 
little thought since the bucket allocations must be a power-of-two size in 
bytes, but we also need the hash table entries to be a power-of-two in order 
for masking the hash to get the bucket number to work. I think either we could 
just leave wasted space in the buffer or switch to a non-power-of-two number of 
buckets and using an alternative method of getting the bucket from the hash: 
https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/
# Run benchmarks to see if it's beneficial. The effect probably depends on the 
data set size.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to