errose28 commented on PR #7541:
URL: https://github.com/apache/ozone/pull/7541#issuecomment-2529823140

   Hi @slfan1989 can you elaborate more on the use case for this feature?
   > However, since clusters are frequently scaled and the number of active 
   > DataNodes is uncertain, relying on a fixed value to determine whether SafeMode 
   > conditions are met is not reliable.
   
   I can see how the default value of 1 may not be too helpful, but what are 
the practical issues that the current safemode configurations are causing in 
your cluster? The other rules that check for container replicas and pipelines 
should ensure that the cluster is sufficiently operational when safemode is exited.
   
   I have a few concerns with adding this. The pipeline list is not a definitive 
list of cluster membership, especially if new nodes are added with the restart, 
so the rule would only approximate membership, more or less closely depending on the situation. It 
also adds more configuration and complexity to safemode. FWIW HDFS does not 
have such a feature because it does not have fixed cluster membership by 
design. Ozone does not really have fixed cluster membership either. Persistent 
pipelines actually add more complexity to the system than they are worth IMO 
and I don't think we should tie more features into them.
   
   I feel like a dashboard as described in HDDS-11525 would better address the 
concern about nodes not registering, and that the container and pipeline rules 
provide stronger guarantees about the cluster's readiness than node count or 
percentage.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

