[ https://issues.apache.org/jira/browse/HDDS-14389?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gargi Jaiswal reassigned HDDS-14389:
------------------------------------
Assignee: Gargi Jaiswal
> [Website v2] [Docs] [Administrator Guide] OM HA, SCM HA failover behavior
> -------------------------------------------------------------------------
>
> Key: HDDS-14389
> URL: https://issues.apache.org/jira/browse/HDDS-14389
> Project: Apache Ozone
> Issue Type: Sub-task
> Components: documentation
> Reporter: Wei-Chiu Chuang
> Assignee: Gargi Jaiswal
> Priority: Major
>
> Our OM HA and SCM HA docs cover how multiple OMs reach consensus using Ratis.
> However, they miss another important part of the story: client failover behavior.
> HadoopRpcOMFailoverProxyProvider: used when the client talks to OM over the
> Hadoop RPC transport.
> A failover or retry may happen if the OM (1) is not reachable, (2) is not
> the leader, or (3) is the leader but not ready to accept requests.
> The client retries up to 500 times (ozone.client.failover.max.attempts),
> waiting 2 seconds between failover retries
> (ozone.client.wait.between.retries.millis). If the OM is not aware of the
> current leader, the client tries the next OM in round-robin fashion;
> otherwise, the client retries against the current leader.
> Additionally, it is crucial that clients and OMs have consistent node
> mapping configurations; otherwise, failover may not reach the leader OM.
> GrpcOMFailoverProxyProvider: if the client talks to OM over the gRPC
> transport, the behavior is largely the same. I don't have much experience
> with it, so I'll just leave it as is.
> Similarly, failover from a client (Ozone client, OM, or Datanode) to SCM is
> controlled by a series of configuration properties in SCMClientConfig:
> hdds.scmclient.rpc.timeout, hdds.scmclient.max.retry.timeout,
> hdds.scmclient.failover.max.retry, hdds.scmclient.failover.retry.interval.
> Having these behaviors documented will help users troubleshoot problems.
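
The OM failover rules described above can be sketched as a small simulation. This is only an illustrative model of the behavior stated in the description, not the actual HadoopRpcOMFailoverProxyProvider code: the names `Reply`, `leader_hint`, `call_with_failover`, and the `send` callback are hypothetical, and the two constants simply mirror the values quoted for ozone.client.failover.max.attempts and ozone.client.wait.between.retries.millis.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

# Values quoted in the issue description; treat them as assumptions here.
MAX_ATTEMPTS = 500               # ozone.client.failover.max.attempts
WAIT_BETWEEN_RETRIES_MS = 2000   # ozone.client.wait.between.retries.millis


@dataclass
class Reply:
    """Hypothetical reply from an OM. `send` returns None if the OM is unreachable."""
    ok: bool                           # True if the OM accepted the request
    leader_hint: Optional[str] = None  # node ID of the current leader, if known


def call_with_failover(
    om_ids: Sequence[str],
    send: Callable[[str], Optional[Reply]],
    max_attempts: int = MAX_ATTEMPTS,
    wait_ms: int = WAIT_BETWEEN_RETRIES_MS,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[str, List[str]]:
    """Return (OM that accepted the request, ordered list of OMs contacted)."""
    nodes = list(om_ids)
    index = 0
    tried: List[str] = []
    for attempt in range(max_attempts):
        target = nodes[index]
        tried.append(target)
        reply = send(target)
        if reply is not None and reply.ok:
            return target, tried
        if reply is not None and reply.leader_hint in nodes:
            # The OM knows the current leader: contact that leader next.
            # A leader that is not ready points at itself, so it is retried.
            index = nodes.index(reply.leader_hint)
        else:
            # Unreachable, or leader unknown: move to the next OM, round-robin.
            index = (index + 1) % len(nodes)
        if attempt < max_attempts - 1:
            sleep(wait_ms / 1000.0)
    raise RuntimeError(f"no OM accepted the request after {max_attempts} attempts")


if __name__ == "__main__":
    # om1 is unreachable, om2 is a follower that knows the leader, om3 is the leader.
    cluster = {
        "om1": None,
        "om2": Reply(ok=False, leader_hint="om3"),
        "om3": Reply(ok=True),
    }
    print(call_with_failover(["om1", "om2", "om3"], cluster.get, sleep=lambda s: None))
```

Under this simplified model with a fixed wait, a client that never finds a ready leader spends on the order of 500 x 2 s = 1000 s waiting before it gives up, which may be useful to keep in mind when troubleshooting a client that appears to hang.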