slfan1989 commented on PR #7008: URL: https://github.com/apache/ozone/pull/7008#issuecomment-2485828541
> @nandakumar131 can you please review as well?

@adoroszlai Thank you very much for reviewing this PR! This improvement is very important to us.

Currently, when we restart the SCM, it cannot determine whether an EC container has finished reporting. As with a Ratis 3-replica container, the SCM considers an EC container ready as soon as a single replica reports successfully. As a result, we are unable to promote an SCM to leader when it has just restarted, even though it has already exited safe mode.

We have been running this PR internally for several months, and in my view it has met expectations. We have fully transitioned our internal Ozone cluster to the EC-6-3-1024K strategy: almost no 3-replica data remains in the cluster, apart from a small amount (less than 10PB) kept as exceptions. This decision was driven by cost, as we already store over 100PB of data.

I sincerely hope we can continue to push this PR forward. If there are any suggestions for improvement, I will keep making the necessary changes.

cc: @siddhantsangwan @sadanand48 @errose28
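
To illustrate the readiness problem described above, here is a minimal, self-contained sketch. It is not Ozone's code and not the code in this PR: `ContainerReportSketch`, `Tracker`, and all method names are hypothetical. It also assumes that an EC-6-3 container should count as reported only once at least 6 distinct replicas (the data block count) have reported. The threshold of 1 for a Ratis container follows the behavior described in the comment.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class ContainerReportSketch {

    /** Hypothetical tracker of which containers have reported enough replicas. */
    public static final class Tracker {
        // Minimum number of distinct replicas required per container (assumed policy).
        private final Map<Long, Integer> required = new HashMap<>();
        // Datanodes that have reported a replica of each container.
        private final Map<Long, Set<String>> reported = new HashMap<>();

        public void register(long containerId, int minReplicas) {
            required.put(containerId, minReplicas);
            reported.put(containerId, new HashSet<>());
        }

        public void onReplicaReport(long containerId, String datanodeId) {
            Set<String> nodes = reported.get(containerId);
            if (nodes != null) {
                nodes.add(datanodeId); // a repeated report from one node counts once
            }
        }

        public boolean isReady(long containerId) {
            Integer min = required.get(containerId);
            return min != null && reported.get(containerId).size() >= min;
        }
    }

    public static void main(String[] args) {
        Tracker tracker = new Tracker();
        tracker.register(1L, 1); // Ratis container: one replica is enough
        tracker.register(2L, 6); // EC-6-3 container: assumed to need 6 replicas

        tracker.onReplicaReport(1L, "dn-0");
        tracker.onReplicaReport(2L, "dn-0");
        System.out.println("Ratis ready: " + tracker.isReady(1L));
        System.out.println("EC ready after 1 report: " + tracker.isReady(2L));

        for (int i = 1; i < 6; i++) {
            tracker.onReplicaReport(2L, "dn-" + i);
        }
        System.out.println("EC ready after 6 reports: " + tracker.isReady(2L));
    }
}
```

With a single-replica threshold applied to both container types, the EC container above would be treated as ready after its first report, which is the situation the comment describes.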
