[ 
https://issues.apache.org/jira/browse/IMPALA-9124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17102628#comment-17102628
 ] 

Sahil Takiar commented on IMPALA-9124:
--------------------------------------

Disclaimer: although this feature is referred to as *transparent* query retries, 
clients may see some unexpected behavior when a query is retried. The retry 
will not be 100% transparent to the end client; some differences require 
clients to be aware of query retries:
* When a query is retried, the retry is modeled as a brand new query with a new 
query id, distinct from the query id of the originally submitted query that 
ultimately failed
* Since a query retry is a brand new query, it has its own runtime profile as 
well; the runtime profiles of the failed and retried queries will be linked 
together
* When requesting a runtime profile from the ImpalaService, the 
GetRuntimeProfile() method will always return the profile of the latest query 
attempt. There are plans to add new options to the ImpalaService interface so 
that users can fetch the profiles of failed attempts as well. This limitation 
does not apply to the web UI: users can still get all query profiles (for both 
failed and retried queries) from the web UI
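
The behavior described in the bullets above can be modeled with a small toy sketch. This is not Impala's implementation or API: QueryAttempt, AttemptRegistry, and all of their methods are hypothetical names invented for illustration. The sketch only shows the idea that each retry gets a new query id, that the attempts are linked, and that a profile lookup resolves to the latest attempt.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class QueryAttempt:
    """One attempt at running a query (hypothetical model, not an Impala type)."""
    query_id: str
    status: str                         # "RUNNING", "FAILED", or "FINISHED"
    profile: str                        # stand-in for the runtime profile text
    retried_as: Optional[str] = None    # query id of the retry, if any
    original_id: Optional[str] = None   # query id of the attempt this one retried


class AttemptRegistry:
    """Toy in-memory bookkeeping of query attempts (assumption, for illustration)."""

    def __init__(self) -> None:
        self._attempts: Dict[str, QueryAttempt] = {}

    def submit(self, query_id: str, profile: str) -> None:
        self._attempts[query_id] = QueryAttempt(query_id, "RUNNING", profile)

    def retry(self, failed_id: str, new_id: str, profile: str) -> None:
        # The retry is a brand new query with its own id and profile,
        # linked in both directions to the failed attempt.
        failed = self._attempts[failed_id]
        failed.status = "FAILED"
        failed.retried_as = new_id
        self._attempts[new_id] = QueryAttempt(
            new_id, "RUNNING", profile, original_id=failed_id
        )

    def latest(self, query_id: str) -> QueryAttempt:
        # Follow the retry links forward until the most recent attempt.
        attempt = self._attempts[query_id]
        while attempt.retried_as is not None:
            attempt = self._attempts[attempt.retried_as]
        return attempt

    def get_runtime_profile(self, query_id: str) -> str:
        # Mirrors the described behavior: always the latest attempt's profile.
        return self.latest(query_id).profile

    def all_profiles(self, query_id: str) -> List[str]:
        # Walk back to the first attempt, then forward through every retry.
        attempt = self._attempts[query_id]
        while attempt.original_id is not None:
            attempt = self._attempts[attempt.original_id]
        profiles = [attempt.profile]
        while attempt.retried_as is not None:
            attempt = self._attempts[attempt.retried_as]
            profiles.append(attempt.profile)
        return profiles


registry = AttemptRegistry()
registry.submit("q1", "profile of first attempt")
registry.retry("q1", "q2", "profile of retried attempt")
print(registry.latest("q1").query_id)
print(registry.get_runtime_profile("q1"))
```

A client that cached the original query id would, in this model, have to follow the link to the new id before it could correlate logs or profiles, which is the kind of client-awareness the disclaimer refers to.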

> Transparently retry queries that fail due to cluster membership changes
> -----------------------------------------------------------------------
>
>                 Key: IMPALA-9124
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9124
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend, Clients
>            Reporter: Sahil Takiar
>            Assignee: Sahil Takiar
>            Priority: Critical
>         Attachments: Impala Transparent Query Retries.pdf
>
>
> Currently, if the Impala Coordinator or any Executors run into errors during 
> query execution, Impala will fail the entire query. It would improve user 
> experience to transparently retry the query for some transient, recoverable 
> errors.
> This JIRA focuses on retrying queries that would otherwise fail due to 
> cluster membership changes. Specifically, it covers node failures that cause 
> changes in the cluster membership (currently the Coordinator cancels all 
> queries running on a node if it detects that the node is no longer part of 
> the cluster) and node blacklisting (the Coordinator blacklists a node because 
> it detects a problem with that node - can’t execute RPCs against the node). 
> It is not focused on retrying general errors (e.g. any frontend errors, 
> MemLimitExceeded exceptions, etc.).
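
The scope described in the issue can be illustrated with a minimal sketch that separates failures caused by cluster membership changes from general errors. The FailureCause enum and is_retryable() function are hypothetical names made up for this example; they do not correspond to Impala code, and the real decision logic may differ.

```python
from enum import Enum, auto


class FailureCause(Enum):
    """Hypothetical failure categories, named after the cases in the issue text."""
    NODE_LEFT_CLUSTER = auto()    # node is no longer part of the cluster
    NODE_BLACKLISTED = auto()     # Coordinator blacklisted the node
    FRONTEND_ERROR = auto()       # general error, out of scope for this JIRA
    MEM_LIMIT_EXCEEDED = auto()   # general error, out of scope for this JIRA


# Only failures caused by cluster membership changes are in scope for a retry.
_RETRYABLE = {FailureCause.NODE_LEFT_CLUSTER, FailureCause.NODE_BLACKLISTED}


def is_retryable(cause: FailureCause) -> bool:
    """Return True if the failure is one this JIRA proposes to retry."""
    return cause in _RETRYABLE


for cause in FailureCause:
    print(cause.name, is_retryable(cause))
```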



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

