Sahil Takiar created IMPALA-9124:
------------------------------------

             Summary: Transparently retry queries that fail due to cluster membership changes
                 Key: IMPALA-9124
                 URL: https://issues.apache.org/jira/browse/IMPALA-9124
             Project: IMPALA
          Issue Type: New Feature
          Components: Backend, Clients
            Reporter: Sahil Takiar
            Assignee: Sahil Takiar

Currently, if the Impala Coordinator or any Executor runs into an error during query execution, Impala fails the entire query. Transparently retrying the query for some transient, recoverable errors would improve the user experience.

This JIRA focuses on retrying queries that would otherwise fail due to cluster membership changes. Specifically, it covers two cases: node failures that change the cluster membership (currently, the Coordinator cancels all queries running on a node if it detects that the node is no longer part of the cluster), and node blacklisting (the Coordinator blacklists a node because it detects a problem with that node: it cannot execute RPCs against the node).

It does not cover retrying general errors (e.g. frontend errors, MemLimitExceeded exceptions, etc.).
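
To make the proposed scope concrete, the following is a minimal sketch of the retry policy described above, written in Python for illustration only. It is not Impala code: `FailureCause`, `QueryFailed`, `run_with_retry`, and the `max_retries` parameter are all hypothetical names, and the single-retry default is an assumption rather than something this JIRA specifies.

```python
from enum import Enum, auto


class FailureCause(Enum):
    """Hypothetical failure categories; these are not Impala's real error codes."""
    NODE_LEFT_CLUSTER = auto()   # node is no longer part of the cluster membership
    NODE_BLACKLISTED = auto()    # Coordinator could not execute RPCs against the node
    FRONTEND_ERROR = auto()      # general error, out of scope for retries
    MEM_LIMIT_EXCEEDED = auto()  # general error, out of scope for retries


# Only failures caused by cluster membership changes are considered retryable.
RETRYABLE = frozenset({FailureCause.NODE_LEFT_CLUSTER, FailureCause.NODE_BLACKLISTED})


class QueryFailed(Exception):
    """Hypothetical exception carrying the reason a query attempt failed."""

    def __init__(self, cause):
        super().__init__(cause.name)
        self.cause = cause


def run_with_retry(execute, max_retries=1):
    """Call execute(attempt) and retry only on membership-related failures.

    Non-retryable failures, and retryable ones that exhaust max_retries,
    are re-raised to the caller unchanged.
    """
    attempt = 0
    while True:
        try:
            return execute(attempt)
        except QueryFailed as err:
            if err.cause not in RETRYABLE or attempt >= max_retries:
                raise
            attempt += 1
```

In this sketch the caller never sees a failure that a retry resolves, which is the "transparent" part of the proposal, while errors such as MemLimitExceeded still surface immediately.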