bshashikant opened a new pull request #2114: URL: https://github.com/apache/ozone/pull/2114
## What changes were proposed in this pull request?

In SCM HA, the primary node starts the Ratis server, and the bootstrapping nodes are then added to the Ratis group. If all the bootstrapped SCMs are stopped, the primary node steps down from leadership because it loses its majority. When those nodes are bootstrapped again, each one first tries to validate its persisted cluster ID against the cluster ID reported by the leader SCM. Because no leader exists, bootstrapping keeps failing and retrying until the node shuts down.

The issue is easy to reproduce in Kubernetes deployments, where the bootstrap and init commands are run again on every restart.

This Jira aims to bypass the cluster ID validation if a bootstrapping node already has a cluster ID.

## What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-5062

## How was this patch tested?

Added a unit test.
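The bypass described in this pull request can be illustrated with a minimal Java sketch. It is not the code from the patch: the class `BootstrapClusterIdCheck`, the `Outcome` enum, and the `bootstrap` method are hypothetical names chosen for illustration, and the leader lookup is modelled as a plain `Supplier` rather than a real Ratis or SCM call. The sketch only shows the decision order: a node that already has a persisted cluster ID skips the leader lookup, so it does not depend on a leader being present.

```java
import java.util.Optional;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names come from Ozone.
class BootstrapClusterIdCheck {

    enum Outcome {
        SKIPPED_VALIDATION,   // node already had a cluster ID; leader not contacted
        FETCHED_FROM_LEADER,  // fresh node obtained the cluster ID from the leader
        RETRY_NO_LEADER       // fresh node, but no leader is currently available
    }

    /**
     * @param persistedClusterId cluster ID found on local storage, or null if none
     * @param leaderClusterId    assumed stand-in for asking the leader SCM;
     *                           returns empty when no leader exists
     */
    static Outcome bootstrap(String persistedClusterId,
                             Supplier<Optional<String>> leaderClusterId) {
        // Bypass: a node that was bootstrapped before already knows its cluster,
        // so it does not need a live leader to confirm the ID.
        if (persistedClusterId != null && !persistedClusterId.isEmpty()) {
            return Outcome.SKIPPED_VALIDATION;
        }
        // Fresh node: the cluster ID can only come from the leader.
        Optional<String> fromLeader = leaderClusterId.get();
        return fromLeader.isPresent()
                ? Outcome.FETCHED_FROM_LEADER
                : Outcome.RETRY_NO_LEADER;
    }
}
```

In this sketch, calling `bootstrap("CID-1", () -> Optional.empty())` models the failing scenario from the description: a restarted node with a persisted ID and no leader. It returns `SKIPPED_VALIDATION` without invoking the supplier, whereas a node with no persisted ID would still have to wait for a leader.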
