Joe McDonnell created IMPALA-9434:
-------------------------------------

             Summary: Implement / Evaluate Robin Hood Hashing for 
exec/hash-table.h
                 Key: IMPALA-9434
                 URL: https://issues.apache.org/jira/browse/IMPALA-9434
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 3.4.0
            Reporter: Joe McDonnell


exec/hash-table.h's HashTable currently implements an open addressing hash 
table with linear and quadratic probing. Since there are few guarantees about 
the probe length, we limit the load factor to 0.75 to avoid high probe lengths. 
In general, probe lengths increase severely as the load factor goes higher than 
that.

Robin Hood Hashing reduces the variances of probe lengths by continually 
rebalancing elements. It steals from the "rich" elements that have a very short 
probe length and gives to the "poor" elements with a long probe length. This 
tighter guarantee about the probe length allows for a higher load factor (0.9).

We should evaluate Robin Hood Hashing to see if it improves our hash table 
performance. Since this hash table does not support deletes, that reduces the 
complexity. The Bucket already has some unused bytes in case we need them.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to