carolinchen created IMPALA-10964:
------------------------------------
Summary: Add a query option that limits query skew at runtime
Key: IMPALA-10964
URL: https://issues.apache.org/jira/browse/IMPALA-10964
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 4.0.0
Reporter: carolinchen
Fix For: Impala 4.0.1
Reject queries whose skew value grows too large during execution.
Query skew is the situation in which some nodes fall significantly behind
the other nodes while a SQL query is executed concurrently.
There are two kinds of skew:
1. Row skew, which may be caused by poorly written SQL or uneven task
distribution.
2. Time skew, which may be caused by differences in capability among the
executing nodes.
Query skew has two effects:
1. The skewed node may execute slowly, which slows down the progress of the
whole query.
2. The skewed node may exhaust a large share of system resources (I/O,
memory, RPC), which affects other queries on the same host or in the same
query pool.
When the skew value reaches an unreasonable range, it affects the cluster
status and other running queries. This option is a mechanism, like mem_limit,
for protecting the cluster from potentially harmful queries.
In our environment, we added the SKEW_LIMIT query option to limit skewed
queries.
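To make the idea concrete, here is a minimal sketch of how such a check
could work. It is not the actual patch: the skew metric (largest per-node
value divided by the per-node mean), the names skew_ratio, check_skew and
QuerySkewError, and the convention that a limit of 0 disables the check are
all assumptions made for illustration. The same function can be fed rows
processed per node (row skew) or elapsed time per node (time skew).

```python
from statistics import mean


class QuerySkewError(Exception):
    """Hypothetical error raised when a query exceeds the skew limit."""


def skew_ratio(per_node_values):
    """Return max / mean of a per-node metric (assumed skew definition).

    A perfectly balanced query yields 1.0; larger values mean that one
    node carries more rows (or takes more time) than the average node.
    """
    values = list(per_node_values)
    if not values:
        return 1.0
    avg = mean(values)
    if avg == 0:
        return 1.0  # nothing processed yet, so there is no skew to report
    return max(values) / avg


def check_skew(per_node_values, skew_limit):
    """Raise QuerySkewError if the skew ratio exceeds skew_limit.

    skew_limit <= 0 is treated as "no limit" (an assumption in this sketch).
    """
    ratio = skew_ratio(per_node_values)
    if skew_limit > 0 and ratio > skew_limit:
        raise QuerySkewError(
            f"skew ratio {ratio:.2f} exceeds SKEW_LIMIT {skew_limit}"
        )
    return ratio
```

For example, with rows per node of [10, 10, 10, 370] the mean is 100 and the
ratio is 3.7, so check_skew(rows, 3.0) would reject the query, while
check_skew(rows, 0) would let it run. A real implementation would need to
sample these values periodically from the running fragments and decide how
long to wait before the numbers are meaningful.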
--
This message was sent by Atlassian Jira
(v8.3.4#803005)