Montura opened a new pull request, #5758:
URL: https://github.com/apache/ozone/pull/5758
Sometimes the number of under-utilized nodes is not sufficient to satisfy
the limit on the maximum percentage of datanodes participating in a
balancing iteration (`datanodes.involved.max.percentage.per.iteration`). In
that case the collections of source and target datanodes are reset and
balancing is skipped. The issue is easy to reproduce on a cluster with few
nodes (< 10), for example 4 or 5.
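To make the small-cluster case concrete, here is a minimal sketch of the arithmetic. It rests on two assumptions that are not stated above: that the limit is configured as 20 percent, and that the product of ratio and node count is truncated to an `int`. The class and method names are hypothetical and are not part of Ozone.

```java
final class IterationLimitSketch {
    // Hypothetical helper: converts the percentage to a ratio and truncates
    // the product with the node count (truncation is an assumption).
    static int maxDatanodesToInvolve(int percentage, int totalNodesInCluster) {
        double ratio = percentage / 100.0;
        return (int) (ratio * totalNodesInCluster);
    }

    public static void main(String[] args) {
        int percentage = 20; // assumed value, used only for illustration
        for (int nodes : new int[] {4, 5, 10}) {
            System.out.printf("nodes=%d -> max datanodes per iteration=%d%n",
                nodes, maxDatanodesToInvolve(percentage, nodes));
        }
    }
}
```

Under these assumptions a 4-node cluster gets a limit of 0 and a 5-node cluster a limit of 1, while a single container move needs at least one source and one target datanode, so the limit is hit almost immediately.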
## What changes were proposed in this pull request?
Two flags are introduced in
`hdds.scm.container.balancer.ContainerBalancerConfiguration`:
* `adapt.balance.when.close.to.limit`
* `adapt.balance.when.reach.the.limit`
By default both are `true`, the same value that the corresponding local
variables previously had in
`hdds.scm.container.balancer.ContainerBalancerTask#doIteration`:
```
boolean canAdaptWhenNearingLimits = true;
boolean canAdaptOnReachingLimits = true;
```
These variables are used to adapt the source and target datanodes during
balancing in the following way:
* If the balancer is one datanode away from the
`datanodes.involved.max.percentage.per.iteration` limit:
```
// ContainerBalancerTask#adaptWhenNearingIterationLimits
int maxDatanodesToInvolve = (int)
    (config.getMaxDatanodesRatioToInvolvePerIteration() * totalNodesInCluster);
if (countDatanodesInvolvedPerIteration + 1 == maxDatanodesToInvolve) {
  // Restricts potential target datanodes to nodes that have already been
  // selected
}
```
* If the balancer has reached the
`datanodes.involved.max.percentage.per.iteration` limit:
```
// ContainerBalancerTask#adaptOnReachingIterationLimits
int maxDatanodesToInvolve = (int)
    (config.getMaxDatanodesRatioToInvolvePerIteration() * totalNodesInCluster);
if (countDatanodesInvolvedPerIteration == maxDatanodesToInvolve) {
  // Restricts potential source and target datanodes to nodes that have
  // already been selected
}
```
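The two rules above, combined with the two new flags, can be sketched as a single decision function. This is only an illustration: how the flags gate each check is an assumption, and the class, enum, and parameter names are hypothetical rather than taken from `ContainerBalancerTask`.

```java
final class AdaptDecisionSketch {
    // Hypothetical outcome of the adaptation checks for one iteration step.
    enum Adaptation { NONE, RESTRICT_TARGETS, RESTRICT_SOURCES_AND_TARGETS }

    static Adaptation decide(boolean adaptWhenClose, boolean adaptWhenReached,
                             int involvedDatanodes, int maxDatanodesToInvolve) {
        // Limit reached: only already selected sources and targets stay eligible.
        if (adaptWhenReached && involvedDatanodes == maxDatanodesToInvolve) {
            return Adaptation.RESTRICT_SOURCES_AND_TARGETS;
        }
        // One datanode away from the limit: only already selected targets stay eligible.
        if (adaptWhenClose && involvedDatanodes + 1 == maxDatanodesToInvolve) {
            return Adaptation.RESTRICT_TARGETS;
        }
        // Far from the limit, or the relevant flag is disabled: no restriction.
        return Adaptation.NONE;
    }

    public static void main(String[] args) {
        System.out.println(decide(true, true, 3, 4));
        System.out.println(decide(true, true, 4, 4));
        System.out.println(decide(false, false, 4, 4));
    }
}
```

With both flags set to `false`, the sketch never restricts the candidate datanodes, which is the behaviour a small cluster would need in order to keep balancing.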
## What is the link to the Apache JIRA?
https://issues.apache.org/jira/browse/HDDS-9889
## How was this patch tested?
`hdds.scm.container.balancer.TestContainerBalancerTask` is reworked:
1. Extracted two classes:
* `hdds.scm.container.balancer.MockedSCM` for setting up a testable
`hdds.scm.server.StorageContainerManager`
* `hdds.scm.container.balancer.TestableCluster` for creating a test cluster
with the required number of datanodes
2. Parameterized all tests in `TestContainerBalancerTask` with a *cluster* of
varying node count and a *boolean flag* (`useDatanodeLimit`) whose value is
assigned to the flags introduced in `ContainerBalancerConfiguration`.
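The parameterization described above amounts to a cross product of cluster sizes and the boolean flag. The following plain-Java sketch shows that shape without depending on a test framework. The node counts and all names except `useDatanodeLimit` are hypothetical, and the actual tests may build their parameters differently.

```java
import java.util.ArrayList;
import java.util.List;

final class TestMatrixSketch {
    // One hypothetical test case: a cluster size plus the flag value.
    static final class Case {
        final int nodeCount;
        final boolean useDatanodeLimit;

        Case(int nodeCount, boolean useDatanodeLimit) {
            this.nodeCount = nodeCount;
            this.useDatanodeLimit = useDatanodeLimit;
        }
    }

    // Builds every combination of node count and flag value.
    static List<Case> cases(int[] nodeCounts) {
        List<Case> result = new ArrayList<>();
        for (int nodeCount : nodeCounts) {
            for (boolean flag : new boolean[] {true, false}) {
                result.add(new Case(nodeCount, flag));
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (Case c : cases(new int[] {4, 5, 10})) {
            System.out.printf("nodes=%d useDatanodeLimit=%b%n",
                c.nodeCount, c.useDatanodeLimit);
        }
    }
}
```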