Hi team,

Recently I found a serious performance bug for the current hash
join/groupby implementation [1] and submitted a fix for it [2]. The damage
caused by this bug depends on the query itself and the underlying data, but
for a specific join query TPC-H Q5 whose initial query time was very
strange, *the query time was reduced from ~3500s to ~1000s on 16 nodes
after applying this fix*.

When we perform hash join/groupby, keys are first hash partitioned into
each NC partition (hash1(key)%P, where P is the number of NC partitions).
Within each NC partition, a hash table is built by hashing each key to a
slot (hash2(key)%N, where N is the number of slots). The key problem is
that we used the same hash function for both hash1 and hash2! In general,
this may lead to a lot of hash collisions. To see this problem, consider
what happens with NC partition 0. We know that all keys assigned to NC
partition 0 must satisfy hash(key) % P == 0. Now suppose we have 16 NC
partitions (P = 16) and N is a multiple of 4. Since hash(key)%16 == 0, we
know that hash(key)%N must be a multiple of 4! This means all slots that
are multiple of 1,2,3 will be empty, and all keys will be clustered into
slots that are multiples of 4. To fix this problem, we can simply use a
different hash function for hash join/groupby.

If you are running experiments related to join queries and have seen some
unexpected performance results, it'll be helpful to try this fix to see
what happens.

Best regards,
Chen Luo

[1] https://issues.apache.org/jira/browse/ASTERIXDB-2783
[2] https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/8123

Reply via email to