[ 
https://issues.apache.org/jira/browse/IMPALA-9254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tim Armstrong updated IMPALA-9254:
----------------------------------
    Component/s: Distributed Exec

> Queries should only be retried if all fragments fail with retryable errors
> --------------------------------------------------------------------------
>
>                 Key: IMPALA-9254
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9254
>             Project: IMPALA
>          Issue Type: Sub-task
>          Components: Distributed Exec
>            Reporter: Sahil Takiar
>            Priority: Major
>
> Currently, Impala only propagates an {{overall_status}} from an executor to 
> the coordinator. The {{overall_status}} is set in the {{QueryState}} and "If 
> multiple fragments have errors, the first fragment to hit an error is given 
> preference."
> The issue is that if multiple fragments fail, some of the errors may warrant 
> a retry while others do not. For example, one fragment could fail due to 
> faulty disks while others fail because a memory limit was exceeded. Such 
> queries shouldn't be retried, because the retried query is likely to fail 
> again.
> This can only happen within a specific time window: between the moment the 
> retryable error occurs and the moment the query is cancelled. Since any 
> fragment failure causes the entire query to be cancelled, the non-retryable 
> error must occur after the retryable error but before the query is 
> cancelled.
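
The difference between the current "first error wins" behaviour and the proposed "retry only if every failure is retryable" rule can be sketched roughly as follows. This is an illustrative Python model only, not Impala code: the error code names, the `FragmentError` type, and both functions are hypothetical and exist just to make the rule concrete.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical set of error codes treated as retryable (an assumption for
# illustration; Impala's real classification is not described in this issue).
RETRYABLE_CODES = {"DISK_IO_ERROR"}


@dataclass(frozen=True)
class FragmentError:
    fragment_id: str  # hypothetical identifier of the failed fragment
    code: str         # hypothetical error code reported by that fragment


def first_error_is_retryable(errors: List[FragmentError]) -> bool:
    """Simplified model of the current behaviour: only the first error to
    occur is propagated, so it alone decides whether a retry happens."""
    return bool(errors) and errors[0].code in RETRYABLE_CODES


def all_errors_retryable(errors: List[FragmentError]) -> bool:
    """Proposed rule: retry only if at least one fragment failed and every
    reported failure is retryable."""
    return bool(errors) and all(e.code in RETRYABLE_CODES for e in errors)


# Errors listed in the order they occurred: the retryable disk error first,
# then a non-retryable memory error that lands before cancellation.
errors = [
    FragmentError("F01", "DISK_IO_ERROR"),
    FragmentError("F02", "MEM_LIMIT_EXCEEDED"),
]
```

With this input, `first_error_is_retryable(errors)` returns `True` while `all_errors_retryable(errors)` returns `False`, which is the case the description argues should not be retried.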



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
