ndimiduk commented on code in PR #6622:
URL: https://github.com/apache/hbase/pull/6622#discussion_r1930729131
##########
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/BalancerClusterState.java:
##########
@@ -446,6 +447,11 @@ private void registerRegion(RegionInfo region, int
regionIndex, int serverIndex,
: serversToIndex.get(loc.get(i).getAddress()));
}
}
+
+ int numReplicas = region.getReplicaId() + 1;
+ if (numReplicas > maxReplicas) {
+ maxReplicas = numReplicas;
+ }
Review Comment:
I'm pretty sure that you have insufficient information here. I'm looking at
the call hierarchy of `registerRegion` and I believe that there's no limit by
table. So you're approximating the maximum number of region replicas used by
any table in the cluster? I suspect that you need to introduce a new index by
table name to the number of replicas per table.
##########
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java:
##########
@@ -331,10 +331,33 @@ void updateMetricsSize(int size) {
}
}
- private boolean areSomeRegionReplicasColocated(BalancerClusterState c) {
- regionReplicaHostCostFunction.prepare(c);
- double cost = Math.abs(regionReplicaHostCostFunction.cost());
- return cost > CostFunction.getCostEpsilon(cost);
+ private boolean areSomeRegionReplicasColocatedOnHost(BalancerClusterState c)
{
+ if (c.numHosts >= c.maxReplicas) {
+ regionReplicaHostCostFunction.prepare(c);
+ double hostCost = Math.abs(regionReplicaHostCostFunction.cost());
+ boolean colocatedAtHost = hostCost >
CostFunction.getCostEpsilon(hostCost);
+ if (colocatedAtHost) {
+ return true;
+ }
+ LOG.trace("No host colocation detected with host cost={}", hostCost);
+ }
+ return false;
+ }
+
+ private boolean areSomeRegionReplicasColocatedOnRack(BalancerClusterState c)
{
+ if (c.numRacks >= c.maxReplicas) {
+ regionReplicaRackCostFunction.prepare(c);
+ double rackCost = Math.abs(regionReplicaRackCostFunction.cost());
+ boolean colocatedAtRack = rackCost >
CostFunction.getCostEpsilon(rackCost);
+ if (colocatedAtRack) {
+ return true;
+ }
+ LOG.trace("No rack colocation detected with rack cost={}", rackCost);
+ } else {
+ LOG.trace("Rack colocation is inevitable with fewer racks than replicas,
"
Review Comment:
If we get here, we're probably here permanently. Can probably save some GC
and logging subsystem pressure by gating the message on a `hasLoggedThisBefore`
variable. I cannot tell how frequently this method will be called
(once/balancer run, not once/region ?), so maybe it's not so bad.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]