Chen:
Great!
Several yeas ago, I was comparing AsterixDB with Hive/Tez&Spark, and noticed 
threads within each
node were highly unbalanced. I think your answer would be the answer.
Best!

> -----原始邮件-----
> 发件人: "Chen Luo" <[email protected]>
> 发送时间: 2020-09-27 01:15:43 (星期日)
> 收件人: [email protected], [email protected]
> 抄送: 
> 主题: Serious Hash Collisions in Hash Join/Groupby
> 
> Hi team,
> 
> Recently I found a serious performance bug for the current hash
> join/groupby implementation [1] and submitted a fix for it [2]. The damage
> caused by this bug depends on the query itself and the underlying data, but
> for a specific join query TPC-H Q5 whose initial query time was very
> strange, *the query time was reduced from ~3500s to ~1000s on 16 nodes
> after applying this fix*.
> 
> When we perform hash join/groupby, keys are first hash partitioned into
> each NC partition (hash1(key)%P, where P is the number of NC partitions).
> Within each NC partition, a hash table is built by hashing each key to a
> slot (hash2(key)%N, where N is the number of slots). The key problem is
> that we used the same hash function for both hash1 and hash2! In general,
> this may lead to a lot of hash collisions. To see this problem, consider
> what happens with NC partition 0. We know that all keys assigned to NC
> partition 0 must satisfy hash(key) % P == 0. Now suppose we have 16 NC
> partitions (P = 16) and N is a multiple of 4. Since hash(key)%16 == 0, we
> know that hash(key)%N must be a multiple of 4! This means all slots that
> are multiple of 1,2,3 will be empty, and all keys will be clustered into
> slots that are multiples of 4. To fix this problem, we can simply use a
> different hash function for hash join/groupby.
> 
> If you are running experiments related to join queries and have seen some
> unexpected performance results, it'll be helpful to try this fix to see
> what happens.
> 
> Best regards,
> Chen Luo
> 
> [1] https://issues.apache.org/jira/browse/ASTERIXDB-2783
> [2] https://asterix-gerrit.ics.uci.edu/c/asterixdb/+/8123

Reply via email to