errose28 commented on PR #7541: URL: https://github.com/apache/ozone/pull/7541#issuecomment-2529823140
Hi @slfan1989, can you elaborate more on the use case for this feature?

> However, since clusters are frequently scaled and the number of active DataNodes is uncertain, relying on a fixed value to determine whether SafeMode conditions are met is not reliable.

I can see how the default value of 1 may not be too helpful, but what practical issues are the current safemode configurations causing in your cluster? The other rules that check for container replicas and pipelines should ensure that the cluster is sufficiently operational when safemode is exited.

I have a few concerns with adding this. The pipeline list is not a definitive list of cluster membership, especially if new nodes are added with the restart, so the rule may be more or less of an approximation depending on the situation. It also adds more configuration and complexity to safemode. FWIW, HDFS does not have such a feature because it does not have fixed cluster membership by design. Ozone does not really have fixed cluster membership either. Persistent pipelines actually add more complexity to the system than they are worth IMO, and I don't think we should tie more features into them.

I feel like a dashboard as described in HDDS-11525 would better address the concern about nodes not registering, and that the container and pipeline rules provide stronger guarantees about the cluster's readiness than node count or percentage.

-- 
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
