Tim Armstrong created IMPALA-7636:
-------------------------------------

             Summary: Avoid storing hash in hash table bucket for hash tables in join
                 Key: IMPALA-7636
                 URL: https://issues.apache.org/jira/browse/IMPALA-7636
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 3.1.0
            Reporter: Tim Armstrong


Somewhat related to IMPALA-7635, I think storing the precomputed hash in the 
hash table buckets is of questionable benefit for joins. It's useful for 
aggregations since we frequently resize the hash tables, but in joins it's only 
used to short-circuit calling Equal(), which often isn't that expensive. It's 
unclear how many calls to Equal() are actually avoided. We should do some 
benchmarks to find out. As a sanity check for the idea, we could remove the 
(hash == bucket->hash) check in Probe() and see if performance is affected.
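
A minimal sketch of that sanity check, using a toy linear-probing table rather
than Impala's actual HashTable. All names here (ToyHashTable, CHECK_HASH,
equal_calls) are hypothetical and only illustrate the experiment: probe once
with the stored-hash comparison and once without, and count how many Equal()
calls the comparison saves.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical toy bucket; not Impala's actual bucket layout.
struct Bucket {
  bool filled = false;
  uint32_t hash = 0;  // Precomputed hash stored alongside the key.
  int64_t key = 0;
};

struct ToyHashTable {
  std::vector<Bucket> buckets;
  mutable size_t equal_calls = 0;  // Number of Equal() calls made by Probe().

  // num_buckets must be a power of two, and the table must never fill up.
  explicit ToyHashTable(size_t num_buckets) : buckets(num_buckets) {}

  // Illustrative multiplicative hash; any hash function would do here.
  static uint32_t Hash(int64_t key) {
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(key) * 0x9E3779B97F4A7C15ULL) >> 32);
  }

  bool Equal(const Bucket& b, int64_t key) const {
    ++equal_calls;
    return b.key == key;
  }

  void Insert(int64_t key) {
    const uint32_t hash = Hash(key);
    const size_t mask = buckets.size() - 1;
    size_t idx = hash & mask;
    while (buckets[idx].filled) idx = (idx + 1) & mask;  // Linear probing.
    buckets[idx].filled = true;
    buckets[idx].hash = hash;
    buckets[idx].key = key;
  }

  // CHECK_HASH = true keeps the stored-hash short-circuit;
  // CHECK_HASH = false always falls through to Equal().
  template <bool CHECK_HASH>
  bool Probe(int64_t key) const {
    const uint32_t hash = Hash(key);
    const size_t mask = buckets.size() - 1;
    for (size_t idx = hash & mask; buckets[idx].filled; idx = (idx + 1) & mask) {
      if (CHECK_HASH && buckets[idx].hash != hash) continue;
      if (Equal(buckets[idx], key)) return true;
    }
    return false;
  }
};
```

Running the same probe workload through Probe<true>() and Probe<false>() and
comparing equal_calls (and wall-clock time) gives a rough idea of how much the
stored hash buys. A real measurement would need Impala's own Equal() and
realistic join keys.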

The difficult part here is figuring out how to share the HashTable code between 
the agg and join while using different bucket representations. Templates, perhaps?
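
One way the template idea could look, as a rough sketch only. AggBucket,
JoinBucket, SharedHashTable, SetHash() and MayMatch() are hypothetical names,
not existing Impala code. The table is parameterized on the bucket type, and
each bucket type decides whether a hash is stored and compared.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bucket for aggregations: keeps the hash for cheap resizing.
struct AggBucket {
  bool filled = false;
  uint32_t hash = 0;
  int64_t key = 0;
  void SetHash(uint32_t h) { hash = h; }
  bool MayMatch(uint32_t h) const { return hash == h; }
};

// Hypothetical bucket for joins: no stored hash, so every filled bucket on
// the probe path has to be compared with a full key comparison.
struct JoinBucket {
  bool filled = false;
  int64_t key = 0;
  void SetHash(uint32_t) {}                       // Nothing to store.
  bool MayMatch(uint32_t) const { return true; }  // Cannot short-circuit.
};

// Shared table logic; only the bucket representation differs.
template <typename BucketT>
class SharedHashTable {
 public:
  // num_buckets must be a power of two, and the table must never fill up.
  explicit SharedHashTable(size_t num_buckets) : buckets_(num_buckets) {}

  void Insert(int64_t key) {
    const uint32_t hash = Hash(key);
    const size_t mask = buckets_.size() - 1;
    size_t idx = hash & mask;
    while (buckets_[idx].filled) idx = (idx + 1) & mask;  // Linear probing.
    buckets_[idx].filled = true;
    buckets_[idx].key = key;
    buckets_[idx].SetHash(hash);
  }

  bool Probe(int64_t key) const {
    const uint32_t hash = Hash(key);
    const size_t mask = buckets_.size() - 1;
    for (size_t idx = hash & mask; buckets_[idx].filled; idx = (idx + 1) & mask) {
      // For JoinBucket, MayMatch() is trivially true and can be inlined away.
      if (buckets_[idx].MayMatch(hash) && buckets_[idx].key == key) return true;
    }
    return false;
  }

 private:
  // Illustrative multiplicative hash; any hash function would do here.
  static uint32_t Hash(int64_t key) {
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(key) * 0x9E3779B97F4A7C15ULL) >> 32);
  }

  std::vector<BucketT> buckets_;
};
```

The agg would instantiate SharedHashTable<AggBucket> and the join
SharedHashTable<JoinBucket>. Whether dropping the field actually shrinks the
bucket depends on the real layout and padding, so that would need checking
against the actual bucket struct.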



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

