[
https://issues.apache.org/jira/browse/IMPALA-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Tim Armstrong updated IMPALA-9254:
----------------------------------
Component/s: Distributed Exec
> Queries should only be retried if all fragments fail with retryable errors
> --------------------------------------------------------------------------
>
> Key: IMPALA-9254
> URL: https://issues.apache.org/jira/browse/IMPALA-9254
> Project: IMPALA
> Issue Type: Sub-task
> Components: Distributed Exec
> Reporter: Sahil Takiar
> Priority: Major
>
> Currently, Impala only propagates an {{overall_status}} from an executor to
> the coordinator. The {{overall_status}} is set in the {{QueryState}} and "If
> multiple fragments have errors, the first fragment to hit an error is given
> preference.".
> The issue is that if multiple fragments fail, it is possible some of the
> errors should trigger a retry, while other errors shouldn't. For example, one
> fragment could fail due to faulty disks, while others fail because they
> exceeded their memory limit. Such a query shouldn't be retried, because it
> is likely that the query will just fail again.
> This can only happen within a specific time window: between the moment the
> retryable error occurs and the moment the query is cancelled. Any fragment
> failure causes the entire query to be cancelled, so the non-retryable error
> must occur after the retryable error, but before the cancellation takes
> effect.
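
To make the difference concrete, here is a minimal sketch of the two retry policies described above. It is not Impala code: FragmentError, retry_on_first_error and retry_if_all_retryable are hypothetical names, and treating a disk failure as retryable and a memory limit error as non-retryable is an assumption taken from the example in the description.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FragmentError:
    """Hypothetical record of one fragment failure (not an Impala type)."""
    fragment_id: int
    message: str
    retryable: bool


def first_error(errors: List[FragmentError]) -> Optional[FragmentError]:
    """Mimic an overall status where the first fragment to fail wins.

    Assumes `errors` is ordered by the time each failure was observed.
    """
    return errors[0] if errors else None


def retry_on_first_error(errors: List[FragmentError]) -> bool:
    """Decide from the first error only; later errors are invisible."""
    first = first_error(errors)
    return first is not None and first.retryable


def retry_if_all_retryable(errors: List[FragmentError]) -> bool:
    """Proposed policy: retry only if every reported error is retryable."""
    return bool(errors) and all(e.retryable for e in errors)


# A retryable failure followed by a non-retryable one that lands before
# the query is cancelled (the time window described above).
errors = [
    FragmentError(1, "disk I/O failure", retryable=True),       # assumed retryable
    FragmentError(2, "memory limit exceeded", retryable=False),  # assumed non-retryable
]

print(retry_on_first_error(errors))    # first-error-wins policy
print(retry_if_all_retryable(errors))  # all-must-be-retryable policy
```

The second policy requires the coordinator to see every fragment error rather than a single overall status, which is the gap this issue describes.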
--
This message was sent by Atlassian Jira
(v8.3.4#803005)