Montura opened a new pull request, #5758:
URL: https://github.com/apache/ozone/pull/5758

   Sometimes the number of under-utilized nodes is not sufficient to 
satisfy the limit on the maximum percentage of datanodes participating in a 
balancing iteration (`datanodes.involved.max.percentage.per.iteration`). In 
that case, the collections of source and target datanodes are reset and 
balancing is skipped. 
   
   The issue is easy to reproduce when the cluster has few nodes (< 10), for 
example 4 or 5.
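
To illustrate why small clusters are affected, here is a minimal sketch of the limit arithmetic. It assumes the limit is the configured ratio multiplied by the node count and truncated to an integer; the class name, the helper name, and the ratio of `0.2` are hypothetical and chosen only for illustration.

```java
public class InvolvedLimitSketch {

    // Hypothetical helper: ratio (0.0 to 1.0) times node count, truncated to int.
    static int maxDatanodesToInvolve(double ratio, int totalNodesInCluster) {
        return (int) (ratio * totalNodesInCluster);
    }

    public static void main(String[] args) {
        double ratio = 0.2; // assumed example value, not necessarily the configured one
        for (int nodes : new int[] {4, 5, 10, 20}) {
            System.out.println(nodes + " nodes -> at most "
                + maxDatanodesToInvolve(ratio, nodes) + " datanodes involved");
        }
    }
}
```

Under these assumptions a 4-node cluster gets a limit of 0 and a 5-node cluster a limit of 1, while a single move needs at least one source and one target datanode.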
   
   ## What changes were proposed in this pull request?
   
   Two flags are introduced in 
`hdds.scm.container.balancer.ContainerBalancerConfiguration`: 
   * `adapt.balance.when.close.to.limit`
   * `adapt.balance.when.each.the.limit` 
    
   By default both flags are `true`, matching the values that were previously 
hard-coded as local variables in
    `hdds.scm.container.balancer.ContainerBalancerTask#doIteration`:
   ```
   boolean canAdaptWhenNearingLimits = true;
   boolean canAdaptOnReachingLimits = true;
   ```
   These variables are used to adapt the source and target datanodes during 
balancing in the following way:
   
   * If the balancer is one datanode away from the 
`datanodes.involved.max.percentage.per.iteration` limit:
   ```
   // ContainerBalancerTask#adaptWhenNearingIterationLimits
   // The ratio is a fraction, so the product is truncated to a whole node count.
   int maxDatanodesToInvolve = (int)
       (config.getMaxDatanodesRatioToInvolvePerIteration() * totalNodesInCluster);
   if (countDatanodesInvolvedPerIteration + 1 == maxDatanodesToInvolve) {
       // Restricts potential target datanodes to nodes that have already
       // been selected
   }
   ```
   
   * If the balancer has reached the 
`datanodes.involved.max.percentage.per.iteration` limit:
   ```
   // ContainerBalancerTask#adaptOnReachingIterationLimits
   // The ratio is a fraction, so the product is truncated to a whole node count.
   int maxDatanodesToInvolve = (int)
       (config.getMaxDatanodesRatioToInvolvePerIteration() * totalNodesInCluster);
   if (countDatanodesInvolvedPerIteration == maxDatanodesToInvolve) {
       // Restricts potential source and target datanodes to nodes that have
       // already been selected
   }
   ```
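
The two checks above, gated by the two flags, can be sketched as a single standalone decision function. This is only an illustration under stated assumptions: the class, enum, method, and parameter names are hypothetical and do not come from `ContainerBalancerTask`, and the real code restricts datanode collections rather than returning a value.

```java
public class AdaptLimitsSketch {

    // Hypothetical result type describing which restriction would apply.
    enum Adaptation { NONE, RESTRICT_TARGETS, RESTRICT_SOURCES_AND_TARGETS }

    static Adaptation decide(boolean adaptWhenNearing, boolean adaptOnReaching,
                             int involvedSoFar, double ratio, int totalNodes) {
        // Same arithmetic as in the snippets above: truncate to a whole node count.
        int maxDatanodesToInvolve = (int) (ratio * totalNodes);
        if (adaptOnReaching && involvedSoFar == maxDatanodesToInvolve) {
            // Limit reached: only already selected sources and targets remain eligible.
            return Adaptation.RESTRICT_SOURCES_AND_TARGETS;
        }
        if (adaptWhenNearing && involvedSoFar + 1 == maxDatanodesToInvolve) {
            // One datanode away: only already selected targets remain eligible.
            return Adaptation.RESTRICT_TARGETS;
        }
        return Adaptation.NONE;
    }

    public static void main(String[] args) {
        // 8 nodes with an assumed ratio of 0.5 gives a limit of 4.
        System.out.println(decide(true, true, 3, 0.5, 8));
        System.out.println(decide(true, true, 4, 0.5, 8));
        System.out.println(decide(false, false, 4, 0.5, 8));
    }
}
```

With both flags set to `false`, no restriction is ever applied in this sketch, which is the behaviour the new flags make selectable.
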
   ## What is the link to the Apache JIRA?
   
   https://issues.apache.org/jira/browse/HDDS-9889
   
   ## How was this patch tested?
   
   `hdds.scm.container.balancer.TestContainerBalancerTask` is reworked:
   
   1. Extracted two classes:
   * `hdds.scm.container.balancer.MockedSCM` for setting up a testable 
`hdds.scm.server.StorageContainerManager`
   * `hdds.scm.container.balancer.TestableCluster` for creating a test cluster 
with the required number of datanodes
   
   2. Parameterized all tests in `TestContainerBalancerTask` with a *cluster* 
of varying node count and a *boolean flag* (`useDatanodeLimit`) whose value is 
assigned to the newly introduced flags in `ContainerBalancerConfiguration`. 
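
The parameterization described in item 2 amounts to a cross product of cluster sizes and the boolean flag. A minimal plain-Java sketch of such a parameter matrix follows; the class names and the node counts are hypothetical, and the actual tests may supply their arguments through a test framework in a different way.

```java
import java.util.ArrayList;
import java.util.List;

public class ParameterMatrixSketch {

    // Hypothetical holder for one test case: a cluster size and the flag value.
    static final class Case {
        final int nodeCount;
        final boolean useDatanodeLimit;

        Case(int nodeCount, boolean useDatanodeLimit) {
            this.nodeCount = nodeCount;
            this.useDatanodeLimit = useDatanodeLimit;
        }
    }

    // Builds every combination of node count and flag value.
    static List<Case> cases(int... nodeCounts) {
        List<Case> result = new ArrayList<>();
        for (int nodeCount : nodeCounts) {
            for (boolean flag : new boolean[] {true, false}) {
                result.add(new Case(nodeCount, flag));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Case c : cases(4, 5, 10)) { // assumed example sizes
            System.out.println(c.nodeCount + " nodes, useDatanodeLimit=" + c.useDatanodeLimit);
        }
    }
}
```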


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

