Re: [PR] HBASE-30134: Improve CacheAwareLoadBalancer to consider low cache ratio when calculating imbalance [hbase]

via GitHub Fri, 08 May 2026 10:36:50 -0700


Copilot commented on code in PR #8197:
URL: https://github.com/apache/hbase/pull/8197#discussion_r3210335453



##########
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/CacheAwareLoadBalancer.java:
##########
@@ -149,8 +191,16 @@ private void updateRegionLoad() {
           if (!ServerName.isSameAddress(currentServer, sn)) {
             int regionSizeMB =
               
regionCacheRatioOnCurrentServerMap.get(regionEncodedName).getSecond();
+            // The coldDataSize accounts for data size classified as "cold" by 
DataTieringManager,
+            // which should be kept out of cache. We sum cold region size in 
the cache ratio, as we
+            // don't want to move regions with low cache ratio due to data 
classified as cold.
             float regionCacheRatioOnOldServer =
-              regionSizeMB == 0 ? 0.0f : (float) regionSizeInCache / 
regionSizeMB;
+              regionSizeMB
+                  == 0
+                    ? 0.0f
+                    : (float) (regionSizeInCache
+                      + 
sm.getRegionColdDataSize().getOrDefault(regionEncodedName, 0))
+                      / regionSizeMB;

Review Comment:
   `regionCacheRatioOnOldServer` can exceed 1.0 now that cold data MB is added 
to the numerator (and rounding can also push it over). Since downstream 
weighting multiplies this ratio by region size, values >1 inflate cost 
computations. Consider clamping the computed ratio to [0, 1].



##########
hbase-client/src/main/java/org/apache/hadoop/hbase/ServerMetricsBuilder.java:
##########
@@ -118,6 +121,7 @@ public static ClusterStatusProtos.ServerLoad 
toServerLoad(ServerMetrics metrics)
     if (metrics.getReplicationLoadSink() != null) {
       
builder.setReplLoadSink(ProtobufUtil.toReplicationLoadSink(metrics.getReplicationLoadSink()));
     }
+    builder.setCacheFreeSize(metrics.getCacheFreeSize());
     return builder.build();

Review Comment:
   `toServerLoad(ServerMetrics)` now sets `cacheFreeSize` but does not 
serialize the new `regionColdData` map. This makes 
`ServerMetricsBuilder.toServerMetrics` and `toServerLoad` asymmetric and drops 
cold-data info when converting ClusterMetrics to protobuf (e.g., via 
`ClusterMetricsBuilder`). Consider adding 
`putAllRegionColdData(metrics.getRegionColdDataSize())` here.



##########
hbase-protocol-shaded/src/main/protobuf/server/ClusterStatus.proto:
##########
@@ -332,6 +332,17 @@ message ServerLoad {
    * The metrics for region cached on this region server
    */
   map<string, uint32> regionCachedInfo = 16;
+
+  /**
+* Unallocated block cache capacity on this RegionServer, in bytes.
+* Used by the master for cache-aware load balancing (optional).
+*/

Review Comment:
   The newly added doc comment for `cacheFreeSize` is not formatted 
consistently with the surrounding protobuf comments (missing leading spaces 
before `*`). Please align the indentation to match the rest of the file to keep 
generated docs readable.
   



##########
hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/CacheAwareLoadBalancer.java:
##########
@@ -483,25 +536,87 @@ static class CacheAwareCostFunction extends CostFunction {
         !isPersistentCache ? 0.0f : conf.getFloat(CACHE_COST_KEY, 
DEFAULT_CACHE_COST));
       bestCacheRatio = 0.0;
       cacheRatio = 0.0;
+      lowCacheRatioThreshold =
+        conf.getFloat(LOW_CACHE_RATIO_FOR_RELOCATION_KEY, 
LOW_CACHE_RATIO_FOR_RELOCATION_DEFAULT);
+      potentialCacheRatioAfterMove = Math.min(1.0f, conf
+        .getFloat(POTENTIAL_CACHE_RATIO_AFTER_MOVE_KEY, 
POTENTIAL_CACHE_RATIO_AFTER_MOVE_DEFAULT));
+      minFreeCacheSpaceFactor =
+        conf.getFloat(MIN_FREE_CACHE_SPACE_FACTOR_KEY, 
MIN_FREE_CACHE_SPACE_FACTOR_DEFAULT);
     }
 
     @Override
     void prepare(BalancerClusterState cluster) {
       super.prepare(cluster);
-      cacheRatio = 0.0;
-      bestCacheRatio = 0.0;
+      recomputeCacheRatio(cluster);
+      if (LOG.isDebugEnabled()) {
+        LOG.debug("CacheAwareCostFunction: Cost: {}", 1 - cacheRatio);
+      }
+    }
 
+    private void recomputeCacheRatio(BalancerClusterState cluster) {
+      double[] currentWeighted = computeCurrentWeightedContributions(cluster);
+      double currentSum = 0.0;
+      double bestCacheSum = 0.0;
       for (int region = 0; region < cluster.numRegions; region++) {
-        cacheRatio += cluster.getOrComputeWeightedRegionCacheRatio(region,
-          cluster.regionIndexToServerIndex[region]);
-        bestCacheRatio += cluster.getOrComputeWeightedRegionCacheRatio(region,
-          getServerWithBestCacheRatioForRegion(region));
+        currentSum += currentWeighted[region];
+        // here we only get the server index where this region cache ratio is 
the highest
+        int serverIndexBestCache = 
cluster.getOrComputeServerWithBestRegionCachedRatio()[region];
+        double currentHighestCache =
+          cluster.getOrComputeWeightedRegionCacheRatio(region, 
serverIndexBestCache);
+        // Get a hypothetical best cache ratio for this region if any server 
has enough free cache
+        // to host it.
+        double potentialHighestCache =
+          potentialBestWeightedFromFreeCache(cluster, region, 
currentHighestCache);
+        double actualHighest = Math.max(currentHighestCache, 
potentialHighestCache);
+        bestCacheSum += actualHighest;
       }
+      bestCacheRatio = bestCacheSum;
+      if (bestCacheSum <= 0.0) {
+        cacheRatio = cluster.numRegions == 0 ? 1.0 : 0.0;
+      } else {
+        cacheRatio = Math.min(1.0, currentSum / bestCacheSum);
+      }
+    }
 
-      cacheRatio = bestCacheRatio == 0 ? 1.0 : cacheRatio / bestCacheRatio;
-      if (LOG.isDebugEnabled()) {
-        LOG.debug("CacheAwareCostFunction: Cost: {}", 1 - cacheRatio);
+    private double[] computeCurrentWeightedContributions(BalancerClusterState 
cluster) {
+      int totalRegions = cluster.numRegions;
+      double[] contrib = new double[totalRegions];
+      for (int r = 0; r < totalRegions; r++) {
+        int s = cluster.regionIndexToServerIndex[r];
+        int sizeMb = cluster.getTotalRegionHFileSizeMB(r);
+        if (sizeMb <= 0) {
+          contrib[r] = 0.0;
+          continue;
+        }
+        contrib[r] = cluster.getOrComputeWeightedRegionCacheRatio(r, s);
+      }
+      return contrib;
+    }
+
+    /*
+     * If this region is cold in metrics and at least one RS (including its 
current host) reports
+     * enough free block cache to hold it, return an optimistic weighted cache 
score ({@link
+     * #potentialCacheRatioAfterMove} * region MB) so placement is not 
considered optimal solely
+     * from low ratios when capacity exists somewhere in the cluster.

Review Comment:
   The block comment says “If this region is cold in metrics…”, but the actual 
condition is `observedRatio < lowCacheRatioThreshold` (i.e., low cache ratio), 
not cold-data classification. Please update the comment to match the 
implemented logic so future maintainers don’t misinterpret when the heuristic 
applies.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Re: [PR] HBASE-30134: Improve CacheAwareLoadBalancer to consider low cache ratio when calculating imbalance [hbase]

Reply via email to