Michael Ho created IMPALA-8005:
----------------------------------
Summary: Randomize partitioning exchanges destinations
Key: IMPALA-8005
URL: https://issues.apache.org/jira/browse/IMPALA-8005
Project: IMPALA
Issue Type: Improvement
Components: Distributed Exec
Affects Versions: Impala 3.1.0
Reporter: Michael Ho
Assignee: Michael Ho
Currently, senders use the same hash seed for all partitioning exchanges.
For a table whose shuffling keys have a skewed distribution, multiple queries
that use the same shuffling keys for exchanges will end up hashing the hot
keys to the same destination fragments running on a particular host,
potentially overloading that host.
We should consider using the query id or other query-specific information to
seed the hash function, so that different queries send the same keys to
different destinations.
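
To illustrate the idea, here is a minimal Python sketch, not Impala's actual
sender code. The names FIXED_SEED, destination() and hot_destinations() are
hypothetical, the query id is modelled as a plain integer, and BLAKE2b stands
in for whatever hash function the exchange really uses. It compares where one
hot key lands across 64 queries with a fixed seed versus a per-query seed.

```python
import hashlib
from collections import Counter

FIXED_SEED = 0  # hypothetical stand-in for a seed shared by all queries


def destination(key: str, seed: int, num_destinations: int) -> int:
    """Map a shuffling key to a destination index using a seeded hash."""
    # The seed goes into the BLAKE2b salt (at most 16 bytes).
    salt = (seed % 2**128).to_bytes(16, "little")
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % num_destinations


def hot_destinations(seeds, hot_key="hot_customer", num_destinations=8):
    """Count how many queries send the hot key to each destination."""
    return Counter(destination(hot_key, s, num_destinations) for s in seeds)


# 64 queries sharing one seed: every query picks the same destination.
fixed = hot_destinations([FIXED_SEED] * 64)

# 64 queries, each seeded with its own (hypothetical) integer query id.
per_query = hot_destinations(range(1, 65))

print("fixed seed:    ", dict(fixed))
print("per-query seed:", dict(per_query))
```

Within a single query the seed is the same for every sender, so all rows with
a given key still reach the same destination, which a partitioned exchange
requires. Only the mapping across queries changes.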
Thanks to [~tlipcon] for pointing this problem out.