Sahil Takiar created IMPALA-9254:
------------------------------------
Summary: Queries should only be retried if all fragments fail with
retryable errors
Key: IMPALA-9254
URL: https://issues.apache.org/jira/browse/IMPALA-9254
Project: IMPALA
Issue Type: Sub-task
Reporter: Sahil Takiar
Currently, Impala only propagates an {{overall_status}} from an executor to the
coordinator. The {{overall_status}} is set in the {{QueryState}}: "If multiple
fragments have errors, the first fragment to hit an error is given
preference."
The issue is that if multiple fragments fail, it is possible some of the errors
should trigger a retry, while other errors shouldn't. For example, one fragment
could fail due to faulty disks, but others could fail due to mem limit
exceptions. These types of queries shouldn't be retried because it is likely
the query will just fail again.
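The rule in the summary can be sketched as follows. This is a minimal illustration, not Impala code: {{FragmentError}}, its {{retryable}} flag, and {{should_retry}} are hypothetical names, and the sketch assumes the coordinator can see every fragment's error rather than only the first one.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FragmentError:
    """Hypothetical record of one fragment's failure (not an Impala type)."""
    fragment_id: str
    message: str
    retryable: bool


def should_retry(errors: Iterable[FragmentError]) -> bool:
    """Retry only if at least one fragment failed and every failure is retryable."""
    errors = list(errors)
    # With no errors there is nothing to retry; a single non-retryable
    # error vetoes the retry because the query would likely fail again.
    return bool(errors) and all(e.retryable for e in errors)


# Example failures taken from the description: a faulty disk (retryable)
# and a mem limit exception (non-retryable).
disk = FragmentError("F00", "disk I/O error", retryable=True)
oom = FragmentError("F01", "memory limit exceeded", retryable=False)
```

With these definitions, {{should_retry([disk])}} is true, while {{should_retry([disk, oom])}} is false even though the disk error was reported first.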
This can only happen in a narrow time window. Since any fragment failure causes
the entire query to be cancelled, the non-retryable error must occur after the
retryable error, but before the query is cancelled, i.e. within the window
[when the retryable error occurs, when the query is cancelled].
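The window condition can be written as a simple predicate. The function name and the timestamps below are hypothetical and only illustrate the ordering of the three events; they do not model how Impala actually delivers cancellation.

```python
def in_race_window(retryable_error_at: float,
                   non_retryable_error_at: float,
                   cancelled_at: float) -> bool:
    """True if the non-retryable error lands after the retryable error
    but before the query is cancelled."""
    return retryable_error_at < non_retryable_error_at < cancelled_at
```

For example, with a retryable error at t=10.0 and cancellation at t=15.0, a non-retryable error at t=12.5 falls inside the window, while one at t=16.0 does not. A non-retryable error at t=9.0 is also outside the window: it would be the first error reported, so the query would not be retried in the first place.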