Michael Ho created IMPALA-8634:
----------------------------------

             Summary: Catalog client should be resilient to temporary Catalog outage
                 Key: IMPALA-8634
                 URL: https://issues.apache.org/jira/browse/IMPALA-8634
             Project: IMPALA
          Issue Type: Improvement
          Components: Catalog
    Affects Versions: Impala 3.2.0
            Reporter: Michael Ho

Currently, when the catalog server is down, every RPC that a catalog client sends to it fails. As a result, DDL queries fail and the Impala service becomes much less functional. Catalog clients should consider retrying failed RPCs, with exponential backoff between attempts, while the catalog server is being restarted after a crash.

We probably need to add [a test|https://github.com/apache/impala/blob/master/tests/custom_cluster/test_restart_services.py] that exercises the catalog restart paths to verify that coordinators are resilient to a restart.

cc'ing [~stakiar], [~joemcdonnell], [~twm378]
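
For illustration only, the retry behavior proposed above could look roughly like the following Python sketch. It is not Impala's catalog client code; `CatalogUnavailableError`, `call_with_backoff`, and every parameter and default value are hypothetical names chosen for this example.

```python
import random
import time


class CatalogUnavailableError(Exception):
    """Hypothetical stand-in for an RPC failure caused by the catalog being down."""


def call_with_backoff(rpc, max_attempts=5, base_delay_s=0.1, max_delay_s=5.0,
                      sleep=time.sleep, jitter=random.random):
    """Call rpc() until it succeeds or max_attempts calls have failed.

    Before retry number n (0-based) the wait is
    min(max_delay_s, base_delay_s * 2**n), scaled by a factor in [0.5, 1.0)
    so that many clients do not retry in lockstep. All defaults are
    assumptions, not values taken from Impala.
    """
    for attempt in range(max_attempts):
        try:
            return rpc()
        except CatalogUnavailableError:
            # Out of attempts: surface the original failure to the caller.
            if attempt == max_attempts - 1:
                raise
            delay = min(max_delay_s, base_delay_s * (2 ** attempt))
            sleep(delay * (0.5 + 0.5 * jitter()))
```

Only the connectivity error is retried here; any other exception propagates immediately. Whether a particular catalog RPC is safe to send more than once is a separate question that this sketch does not address.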