Michael Ho created IMPALA-8005:
----------------------------------

             Summary: Randomize partitioning exchanges destinations
                 Key: IMPALA-8005
                 URL: https://issues.apache.org/jira/browse/IMPALA-8005
             Project: IMPALA
          Issue Type: Improvement
          Components: Distributed Exec
    Affects Versions: Impala 3.1.0
            Reporter: Michael Ho
            Assignee: Michael Ho


Currently, the sender uses the same hash seed for every partitioning exchange.
When a table's shuffling keys are skewed, multiple queries that shuffle on the
same keys hash the hot keys to the same destination fragments. Those fragments
run on a particular host, so that host can end up overloaded.

We should consider using the query id or other query-specific information to
seed the hashing function, so that different queries send the same keys to
different destinations.
Thanks to [~tlipcon] for pointing this problem out.
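
To illustrate the idea, here is a minimal Python sketch, not Impala's actual
sender code. It assumes a seeded hash (BLAKE2b with the seed passed as salt)
stands in for the exchange's hash function. The names pick_destination,
seed_from_query_id, FIXED_SEED and NUM_DESTINATIONS, and the query id strings,
are hypothetical and chosen only for this example.

```python
import hashlib

NUM_DESTINATIONS = 8  # hypothetical number of destination fragments
FIXED_SEED = 0        # hypothetical constant seed shared by all queries


def seed_from_query_id(query_id: str) -> int:
    """Derive a 64-bit seed from a query id string (id format is assumed)."""
    digest = hashlib.blake2b(query_id.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


def pick_destination(key: bytes, seed: int,
                     num_dests: int = NUM_DESTINATIONS) -> int:
    """Map a shuffling key to a destination index using a seeded hash."""
    salt = seed.to_bytes(8, "little")  # the seed is mixed in as the hash salt
    digest = hashlib.blake2b(key, digest_size=8, salt=salt).digest()
    return int.from_bytes(digest, "little") % num_dests


hot_key = b"skewed_value"  # a key that dominates the skewed distribution
query_ids = [f"query-{i}" for i in range(64)]

# With one shared seed, every query sends the hot key to the same destination.
fixed_dests = {pick_destination(hot_key, FIXED_SEED) for _ in query_ids}

# With a per-query seed, the hot key lands on different destinations.
seeded_dests = {pick_destination(hot_key, seed_from_query_id(q))
                for q in query_ids}

print("fixed seed  ->", sorted(fixed_dests))
print("query seeds ->", sorted(seeded_dests))
```

Within a single query, every sender would still have to use the same seed, so
that equal keys from both sides of a partitioned join reach the same
destination. Only the mapping across queries changes.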



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)