Apache9 commented on a change in pull request #3723:
URL: https://github.com/apache/hbase/pull/3723#discussion_r739842907



##########
File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/LoadCandidateGenerator.java
##########
@@ -34,27 +35,51 @@ BalanceAction generate(BalancerClusterState cluster) {
   private int pickLeastLoadedServer(final BalancerClusterState cluster, int thisServer) {
     Integer[] servers = cluster.serverIndicesSortedByRegionCount;
 
-    int index = 0;
-    while (servers[index] == null || servers[index] == thisServer) {
-      index++;
-      if (index == servers.length) {
-        return -1;
+    int selectedIndex = -1;
+    double currentLargestRandom = -1;
+    for (int i = 0; i < servers.length; i++) {
+      if (servers[i] == null || servers[i] == thisServer) {
+        continue;
+      }
+      if (selectedIndex != -1
+        && cluster.getNumRegionsComparator().compare(servers[i], servers[selectedIndex]) != 0) {
+        // Exhausted servers of the same region count
+        break;
+      }
+      // we don't know how many servers have the same region count, we will randomly select one

Review comment:
       I checked the code again, and this does not appear to be typical 
Reservoir Sampling. It uses the simple solution from the article you linked: 
assign every element a random number and choose the element with the greatest 
one. I think we should add more detailed comments here, otherwise this will 
easily confuse later developers.
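
To make the idea concrete for later readers, here is a minimal standalone sketch of the technique as I understand it. It is not the code in this PR: the class `RandomTieBreak`, the method `pickAmongLeastLoaded`, and the plain `int[]` of region counts are hypothetical stand-ins for `BalancerClusterState` and its comparator. Because every candidate draws an independent random number and the largest draw wins, each candidate in the tied group should be chosen with equal probability, without counting the group first.

```java
import java.util.Random;

final class RandomTieBreak {
  /**
   * Picks, uniformly at random, one index from the leading run of entries that
   * share the smallest region count.
   *
   * Assumptions (hypothetical, for illustration only): sortedCounts is sorted in
   * ascending order, and skip is the index of the server that must not be chosen.
   * Returns -1 when there is no eligible entry.
   */
  static int pickAmongLeastLoaded(int[] sortedCounts, int skip, Random rnd) {
    int selectedIndex = -1;
    double currentLargestRandom = -1;
    for (int i = 0; i < sortedCounts.length; i++) {
      if (i == skip) {
        continue;
      }
      if (selectedIndex != -1 && sortedCounts[i] != sortedCounts[selectedIndex]) {
        // Exhausted the entries that share the smallest region count.
        break;
      }
      // Give this candidate a random number; the candidate holding the
      // largest number so far is the current selection.
      double r = rnd.nextDouble();
      if (r > currentLargestRandom) {
        currentLargestRandom = r;
        selectedIndex = i;
      }
    }
    return selectedIndex;
  }
}
```

A comment along these lines next to the loop would probably be enough: it says why a random number is drawn per candidate and why the loop stops at the first server with a different region count.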




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

