bharatviswa504 edited a comment on issue #29: HDDS-2034. Async RATIS pipeline 
creation and destroy through heartbeat commands
URL: https://github.com/apache/hadoop-ozone/pull/29#issuecomment-555631118
 
 
   Hi,
   I have one comment on the changes in HealthyPipelineSafeModeRule.
   With this patch, the healthy pipeline threshold count on a freshly 
installed cluster with one datanode will be 1. So a single-node cluster will 
never come out of safe mode, because while processing pipeline reports we only 
count 3-node pipelines.
   
   The intention of this rule was that when we come out of safe mode, we have 
at least a few pipelines of type RATIS and factor THREE. With this patch, that 
logic has changed.
   
   **HealthyPipelineSafeModeRule.java**
   ```
   // L79:
   int pipelineCount = pipelineManager.getPipelines(
       HddsProtos.ReplicationType.RATIS, HddsProtos.ReplicationFactor.THREE,
       Pipeline.PipelineState.OPEN).size() +
       pipelineManager.getPipelines(HddsProtos.ReplicationType.RATIS,
           HddsProtos.ReplicationFactor.THREE,
           Pipeline.PipelineState.ALLOCATED).size();

   // This value will be zero when pipeline count is 0.
   // On a fresh installed cluster, there will be zero pipelines in the SCM
   // pipeline DB.
   healthyPipelineThresholdCount = Math.max(minHealthyPipelines,
       (int) Math.ceil(healthyPipelinesPercent * pipelineCount));

   // L125:
   if (pipeline.getType() == HddsProtos.ReplicationType.RATIS &&
       pipeline.getFactor() == HddsProtos.ReplicationFactor.THREE) {
     getSafeModeMetrics().incCurrentHealthyPipelinesCount();
     currentHealthyPipelineCount++;
   }
   ```
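
To make the single-datanode case concrete, here is a minimal stand-alone sketch of the threshold arithmetic quoted above. The class name `ThresholdSketch` is hypothetical, and `minHealthyPipelines = 1` and `healthyPipelinesPercent = 0.10` are assumed values for illustration, not taken from the patch.

```java
// Hypothetical stand-alone sketch; this is not the actual SCM code.
final class ThresholdSketch {

  // Same expression as the healthyPipelineThresholdCount assignment quoted above.
  static int threshold(int minHealthyPipelines, double healthyPipelinesPercent,
      int pipelineCount) {
    return Math.max(minHealthyPipelines,
        (int) Math.ceil(healthyPipelinesPercent * pipelineCount));
  }

  public static void main(String[] args) {
    int minHealthyPipelines = 1;            // assumed value
    double healthyPipelinesPercent = 0.10;  // assumed value

    // Fresh cluster with one datanode: no RATIS/THREE pipelines exist.
    int pipelineCount = 0;
    int threshold =
        threshold(minHealthyPipelines, healthyPipelinesPercent, pipelineCount);

    // Only RATIS/THREE pipeline reports increment the counter, and a single
    // datanode cannot form one, so the counter stays at its initial value.
    int currentHealthyPipelineCount = 0;

    System.out.println("threshold=" + threshold
        + " current=" + currentHealthyPipelineCount
        + " canExitSafeMode=" + (currentHealthyPipelineCount >= threshold));
  }
}
```

Under these assumptions the threshold is at least `minHealthyPipelines` even when no pipelines exist, while the counter can never move on a one-node cluster.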
   
   If we count both 3-node and 1-node pipelines, we still have a problem. 
Consider a cluster with 10 3-node pipelines. If both factor THREE and factor 
ONE pipelines increment currentHealthyPipelineCount, there can be a case where 
only one datanode from each pipeline has reported, we increment 
currentHealthyPipelineCount for each of them, and we reach the threshold. But 
after coming out of safe mode there will be no healthy 3-node pipelines in the 
cluster, and writes will fail.
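
As I read this scenario, it can be sketched roughly as follows. This is only an illustration: the class `MixedFactorSketch`, its helper methods, and the values `minHealthyPipelines = 1` and a 10% healthy pipeline percentage are all hypothetical, and the sketch assumes each reporting datanode contributes one factor ONE pipeline report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the scenario; this is not the actual SCM code.
final class MixedFactorSketch {
  enum Factor { ONE, THREE }

  // Counter value if pipelines of either factor incremented
  // currentHealthyPipelineCount.
  static int countEitherFactor(List<Factor> reportedPipelines) {
    return reportedPipelines.size();
  }

  // Counter value if only factor THREE pipelines increment it.
  static int countThreeOnly(List<Factor> reportedPipelines) {
    int count = 0;
    for (Factor factor : reportedPipelines) {
      if (factor == Factor.THREE) {
        count++;
      }
    }
    return count;
  }

  // Assumption: one datanode from each 3-node pipeline has reported, and each
  // such report only shows up as a factor ONE pipeline.
  static List<Factor> oneDatanodePerPipeline(int threeNodePipelines) {
    List<Factor> reports = new ArrayList<>();
    for (int i = 0; i < threeNodePipelines; i++) {
      reports.add(Factor.ONE);
    }
    return reports;
  }

  public static void main(String[] args) {
    int threeNodePipelines = 10;
    // Assumed: minHealthyPipelines = 1, healthyPipelinesPercent = 0.10.
    int threshold = Math.max(1, (int) Math.ceil(0.10 * threeNodePipelines));

    List<Factor> reports = oneDatanodePerPipeline(threeNodePipelines);
    System.out.println("threshold=" + threshold
        + " eitherFactor=" + countEitherFactor(reports)
        + " threeOnly=" + countThreeOnly(reports));
  }
}
```

Counting either factor lets the counter pass the threshold even though no factor THREE pipeline has been reported, which is the case where writes would fail after leaving safe mode.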
