Michael Ho created IMPALA-8634:
----------------------------------
Summary: Catalog client should be resilient to temporary Catalog
outage
Key: IMPALA-8634
URL: https://issues.apache.org/jira/browse/IMPALA-8634
Project: IMPALA
Issue Type: Improvement
Components: Catalog
Affects Versions: Impala 3.2.0
Reporter: Michael Ho
Currently, when the catalog server is down, catalog clients will fail all RPCs
sent to it. As a result, DDL queries will fail and the Impala service becomes a
lot less functional. Catalog clients should consider retrying failed RPCs with
exponential backoff between attempts while the catalog server is being restarted
after a crash. We probably need to add [a test
|https://github.com/apache/impala/blob/master/tests/custom_cluster/test_restart_services.py]
to exercise the catalog restart paths and verify that coordinators are resilient
to it.
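
For illustration only, here is a minimal sketch of the retry-with-exponential-backoff idea in Python. It is not Impala's catalog client code: call_with_backoff, CatalogUnavailableError, and every parameter name and default below are hypothetical, chosen just to show the shape of the loop.

```python
import random
import time


class CatalogUnavailableError(Exception):
    """Hypothetical stand-in for an RPC failure caused by the catalog being down."""


def call_with_backoff(rpc, max_attempts=5, base_delay_s=0.1, max_delay_s=5.0,
                      sleep=time.sleep, jitter=random.random):
    """Call rpc() and retry on CatalogUnavailableError with exponential backoff.

    All names and defaults here are assumptions made for this sketch.
    """
    delay = base_delay_s
    for attempt in range(1, max_attempts + 1):
        try:
            return rpc()
        except CatalogUnavailableError:
            if attempt == max_attempts:
                # Out of attempts: surface the failure to the caller.
                raise
            # Wait for the capped delay plus up to 10% random jitter, so that
            # many clients do not all retry at the same instant.
            sleep(min(delay, max_delay_s) * (1 + 0.1 * jitter()))
            # Double the delay for the next attempt.
            delay *= 2
```

With the defaults above, a call that fails twice and then succeeds waits roughly 0.1s and then 0.2s before returning. One design point such a change would likely need to settle is which RPCs are safe to retry: if a request may already have been processed before the catalog server went down, blindly resending it could apply the operation twice.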
cc'ing [~stakiar], [~joemcdonnell], [~twm378]
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)