Tim Armstrong created IMPALA-7636:
-------------------------------------
Summary: Avoid storing hash in hash table bucket for hash tables in join
Key: IMPALA-7636
URL: https://issues.apache.org/jira/browse/IMPALA-7636
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 3.1.0
Reporter: Tim Armstrong
Somewhat related to IMPALA-7635, I think storing the precomputed hash in the
hash table buckets is of questionable benefit for joins. It's useful for
aggregations since we frequently resize the hash tables, but in joins it's only
used to short-circuit calling Equal(), which often isn't that expensive. It's
unclear how many calls to Equal() are actually avoided. We should do some
benchmarks to determine that. As a sanity check for the idea, we could remove the
(hash == bucket->hash) check in Probe() and see if performance is affected.
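
A minimal sketch of that sanity check on a toy linear-probing table, not
Impala's actual HashTable. The names ToyHashTable, Bucket, ProbeStats and
HashKey are hypothetical and exist only for illustration. The CHECK_HASH
template flag toggles the stored-hash comparison, and a counter records how
many Equal() calls each variant makes:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical bucket layout for illustration only.
struct Bucket {
  bool filled = false;
  uint32_t hash = 0;  // precomputed hash of the stored key
  int64_t key = 0;
};

// Counts how often the (potentially expensive) equality check runs.
struct ProbeStats {
  size_t equal_calls = 0;
};

// Stand-in hash function: multiplicative mixing, upper 32 bits.
inline uint32_t HashKey(int64_t key) {
  const uint64_t x = static_cast<uint64_t>(key) * 0x9E3779B97F4A7C15ULL;
  return static_cast<uint32_t>(x >> 32);
}

// Stand-in for Equal(): compares keys and records the call.
inline bool Equal(const Bucket& b, int64_t key, ProbeStats* stats) {
  ++stats->equal_calls;
  return b.key == key;
}

class ToyHashTable {
 public:
  // num_buckets must be a power of two so that masking works.
  explicit ToyHashTable(size_t num_buckets) : buckets_(num_buckets) {}

  // Linear-probing insert; returns false if the table is full.
  bool Insert(int64_t key) {
    const uint32_t hash = HashKey(key);
    const size_t mask = buckets_.size() - 1;
    for (size_t i = 0; i < buckets_.size(); ++i) {
      Bucket& b = buckets_[(hash + i) & mask];
      if (b.filled) continue;
      b.filled = true;
      b.hash = hash;
      b.key = key;
      return true;
    }
    return false;
  }

  // CHECK_HASH = true skips Equal() when the stored hash differs;
  // CHECK_HASH = false calls Equal() on every filled bucket visited.
  template <bool CHECK_HASH>
  bool Probe(int64_t key, ProbeStats* stats) const {
    const uint32_t hash = HashKey(key);
    const size_t mask = buckets_.size() - 1;
    for (size_t i = 0; i < buckets_.size(); ++i) {
      const Bucket& b = buckets_[(hash + i) & mask];
      if (!b.filled) return false;
      if (CHECK_HASH && hash != b.hash) continue;
      if (Equal(b, key, stats)) return true;
    }
    return false;
  }

 private:
  std::vector<Bucket> buckets_;
};
```

Probing the same keys with Probe<true> and Probe<false> and comparing the two
equal_calls counters gives a rough idea of how many Equal() calls the check
avoids. A real benchmark would also need to measure time on representative
join workloads.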
The difficult part here is figuring out how to share the HashTable code between
the agg and join while giving them different bucket representations. Templates
might be one option.
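
One way the template idea could look, as a rough sketch rather than a proposal
for the real class: the table is parameterized on the bucket type, and each
bucket type decides whether it stores the hash. AggBucket, JoinBucket,
SharedHashTable and MixHash are hypothetical names, and the toy table uses
linear probing over integer keys purely for illustration:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Stand-in hash function: multiplicative mixing, upper 32 bits.
inline uint32_t MixHash(int64_t key) {
  const uint64_t x = static_cast<uint64_t>(key) * 0x9E3779B97F4A7C15ULL;
  return static_cast<uint32_t>(x >> 32);
}

// Bucket that keeps the precomputed hash (the aggregation case).
struct AggBucket {
  bool filled = false;
  uint32_t hash = 0;
  int64_t key = 0;
  void Set(int64_t k, uint32_t h) { filled = true; key = k; hash = h; }
  // Cheap pre-filter: only compare keys when the hashes agree.
  bool MayMatch(uint32_t h) const { return hash == h; }
};

// Bucket without a stored hash (the join case).
struct JoinBucket {
  bool filled = false;
  int64_t key = 0;
  void Set(int64_t k, uint32_t /*hash*/) { filled = true; key = k; }
  // No hash to compare, so every filled bucket needs a key comparison.
  bool MayMatch(uint32_t /*hash*/) const { return true; }
};

// Probing logic written once and shared by both bucket representations.
template <typename BucketT>
class SharedHashTable {
 public:
  // num_buckets must be a power of two so that masking works.
  explicit SharedHashTable(size_t num_buckets) : buckets_(num_buckets) {}

  // Linear-probing insert; returns false if the table is full.
  bool Insert(int64_t key) {
    const uint32_t hash = MixHash(key);
    const size_t mask = buckets_.size() - 1;
    for (size_t i = 0; i < buckets_.size(); ++i) {
      BucketT& b = buckets_[(hash + i) & mask];
      if (b.filled) continue;
      b.Set(key, hash);
      return true;
    }
    return false;
  }

  bool Find(int64_t key) const {
    const uint32_t hash = MixHash(key);
    const size_t mask = buckets_.size() - 1;
    for (size_t i = 0; i < buckets_.size(); ++i) {
      const BucketT& b = buckets_[(hash + i) & mask];
      if (!b.filled) return false;
      if (b.MayMatch(hash) && b.key == key) return true;
    }
    return false;
  }

 private:
  std::vector<BucketT> buckets_;
};
```

Here SharedHashTable<AggBucket> and SharedHashTable<JoinBucket> share all of
the probing code, and the compiler can drop the always-true MayMatch() in the
join instantiation. Whether this scales to the real HashTable, with its
resizing and duplicate handling, is an open question.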