[GitHub] [hbase] clarax commented on a change in pull request #3415: HBASE-25739 TableSkewCostFunction need to use aggregated deviation
clarax commented on a change in pull request #3415:
URL: https://github.com/apache/hbase/pull/3415#discussion_r668248743

## File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/DoubleArrayCost.java ##
@@ -72,31 +72,13 @@ private static double computeCost(double[] stats) {
     double count = stats.length;
     double mean = total / count;

-    // Compute max as if all region servers had 0 and one had the sum of all costs. This must be
-    // a zero sum cost for this to make sense.
-    double max = ((count - 1) * mean) + (total - mean);
-
-    // It's possible that there aren't enough regions to go around
-    double min;
-    if (count > total) {
-      min = ((count - total) * mean) + ((1 - mean) * total);
-    } else {
-      // Some will have 1 more than everything else.
-      int numHigh = (int) (total - (Math.floor(mean) * count));
-      int numLow = (int) (count - numHigh);
-
-      min = (numHigh * (Math.ceil(mean) - mean)) + (numLow * (mean - Math.floor(mean)));
-
-    }
-    min = Math.max(0, min);

Review comment:
applied later by prior code change.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscr...@hbase.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [hbase] clarax commented on a change in pull request #3415: HBASE-25739 TableSkewCostFunction need to use aggregated deviation
clarax commented on a change in pull request #3415:
URL: https://github.com/apache/hbase/pull/3415#discussion_r668248743

## File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/DoubleArrayCost.java ##
@@ -72,31 +72,13 @@ private static double computeCost(double[] stats) {
     double count = stats.length;
     double mean = total / count;

-    // Compute max as if all region servers had 0 and one had the sum of all costs. This must be
-    // a zero sum cost for this to make sense.
-    double max = ((count - 1) * mean) + (total - mean);
-
-    // It's possible that there aren't enough regions to go around
-    double min;
-    if (count > total) {
-      min = ((count - total) * mean) + ((1 - mean) * total);
-    } else {
-      // Some will have 1 more than everything else.
-      int numHigh = (int) (total - (Math.floor(mean) * count));
-      int numLow = (int) (count - numHigh);
-
-      min = (numHigh * (Math.ceil(mean) - mean)) + (numLow * (mean - Math.floor(mean)));
-
-    }
-    min = Math.max(0, min);

Review comment:
used later.
[GitHub] [hbase] clarax commented on a change in pull request #3415: HBASE-25739 TableSkewCostFunction need to use aggregated deviation
clarax commented on a change in pull request #3415:
URL: https://github.com/apache/hbase/pull/3415#discussion_r658223242

## File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java ##
@@ -125,7 +125,7 @@
   private int stepsPerRegion = 800;
   private long maxRunningTime = 30 * 1000 * 1; // 30 seconds.
   private int numRegionLoadsToRemember = 15;
-  private float minCostNeedBalance = 0.05f;
+  private float minCostNeedBalance = 0.025f;

Review comment:
Yes, as explained in the Jira and the PR, we have to lower it for a consistent user experience. The old code almost always returned the max for table skew, which inflated the total cost.
[GitHub] [hbase] clarax commented on a change in pull request #3415: HBASE-25739 TableSkewCostFunction need to use aggregated deviation
clarax commented on a change in pull request #3415:
URL: https://github.com/apache/hbase/pull/3415#discussion_r658223242

## File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java ##
@@ -125,7 +125,7 @@
   private int stepsPerRegion = 800;
   private long maxRunningTime = 30 * 1000 * 1; // 30 seconds.
   private int numRegionLoadsToRemember = 15;
-  private float minCostNeedBalance = 0.05f;
+  private float minCostNeedBalance = 0.025f;

Review comment:
Yes, as explained in the Jira and the PR, we have to lower it for a consistent user experience. The old code almost always returned the max for table skew, which inflated the total cost.
[GitHub] [hbase] clarax commented on a change in pull request #3415: HBASE-25739 TableSkewCostFunction need to use aggregated deviation
clarax commented on a change in pull request #3415:
URL: https://github.com/apache/hbase/pull/3415#discussion_r658222382

## File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/CostFunction.java ##
@@ -89,13 +89,14 @@ protected void regionMoved(int region, int oldServer, int newServer) {
    * @return The scaled value.
    */
   protected static double scale(double min, double max, double value) {
-    if (max <= min || value <= min) {
+    if (max <= min || value <= min
+      || Math.abs(max - min) <= 0.01 || Math.abs(value - min) <= 0.01) {

Review comment:
Let me create a COST_EPSILON constant, because I have seen quite a wide range of precision.
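A minimal sketch of what the proposed constant could look like, not the code that was merged. The class name ScaleSketch is hypothetical, the value 0.01 is simply carried over from the literal in the diff above (the comment does not say which value COST_EPSILON will take), and the final clamped return is an assumption about the part of scale() that the diff hunk does not show.

```java
/** Hypothetical sketch; only scale() and the 0.01 literal come from the diff. */
final class ScaleSketch {
  // Assumed placeholder value, copied from the hard-coded 0.01 in the diff.
  static final double COST_EPSILON = 0.01;

  /** Scales value into [0, 1] relative to [min, max]. */
  static double scale(double min, double max, double value) {
    // Treat a degenerate range, or a value within epsilon of min, as zero cost.
    if (max <= min || value <= min
        || Math.abs(max - min) <= COST_EPSILON
        || Math.abs(value - min) <= COST_EPSILON) {
      return 0;
    }
    // Assumption: the untouched remainder of the method clamps the ratio to [0, 1].
    return Math.max(0d, Math.min(1d, (value - min) / (max - min)));
  }

  public static void main(String[] args) {
    System.out.println(scale(0, 10, 5));        // mid-range value
    System.out.println(scale(0, 0.005, 0.004)); // range narrower than epsilon
  }
}
```

Naming the threshold keeps the two comparisons in step and gives one place to tune it if the observed precision varies as widely as the comment suggests.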
[GitHub] [hbase] clarax commented on a change in pull request #3415: HBASE-25739 TableSkewCostFunction need to use aggregated deviation
clarax commented on a change in pull request #3415:
URL: https://github.com/apache/hbase/pull/3415#discussion_r658220514

## File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/DoubleArrayCost.java ##
@@ -106,4 +88,32 @@ private static double getSum(double[] stats) {
     }
     return total;
   }
+
+  /**
+   * Return the min skew of distribution
+   */
+  public static double getMinSkew(double total, double numServers) {

Review comment:
It is there to convert the input from integer to double for the computation in the function.
[GitHub] [hbase] clarax commented on a change in pull request #3415: HBASE-25739 TableSkewCostFunction need to use aggregated deviation
clarax commented on a change in pull request #3415:
URL: https://github.com/apache/hbase/pull/3415#discussion_r658219316

## File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/DoubleArrayCost.java ##
@@ -106,4 +88,32 @@ private static double getSum(double[] stats) {
     }
     return total;
   }
+
+  /**
+   * Return the min skew of distribution
+   */
+  public static double getMinSkew(double total, double numServers) {

Review comment:
Yes.
[GitHub] [hbase] clarax commented on a change in pull request #3415: HBASE-25739 TableSkewCostFunction need to use aggregated deviation
clarax commented on a change in pull request #3415:
URL: https://github.com/apache/hbase/pull/3415#discussion_r658207914

## File path: hbase-balancer/src/main/java/org/apache/hadoop/hbase/master/balancer/DoubleArrayCost.java ##
@@ -106,4 +88,32 @@ private static double getSum(double[] stats) {
     }
     return total;
   }
+
+  /**
+   * Return the min skew of distribution
+   */
+  public static double getMinSkew(double total, double numServers) {
+    double mean = total / numServers;
+    // It's possible that there aren't enough regions to go around
+    double min;
+    if (numServers > total) {

Review comment:
This is the case where we have more nodes than regions: some nodes will have no regions, and the distribution is still balanced.
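To make the two branches concrete, here is a sketch of getMinSkew, assuming the part of the body not shown in this hunk mirrors the lines removed from computeCost earlier in this thread (with count renamed to numServers). The class name MinSkewSketch and the main method are hypothetical and only there to make the example self-contained.

```java
/** Hypothetical sketch; the arithmetic follows the lines removed from computeCost. */
final class MinSkewSketch {
  /** Smallest achievable sum of absolute deviations from the mean. */
  static double getMinSkew(double total, double numServers) {
    double mean = total / numServers;
    double min;
    if (numServers > total) {
      // More servers than regions: `total` servers hold one region each,
      // and the remaining servers hold none.
      min = ((numServers - total) * mean) + ((1 - mean) * total);
    } else {
      // Some servers will have 1 more region than everything else.
      int numHigh = (int) (total - (Math.floor(mean) * numServers));
      int numLow = (int) (numServers - numHigh);
      min = (numHigh * (Math.ceil(mean) - mean)) + (numLow * (mean - Math.floor(mean)));
    }
    return Math.max(0, min);
  }

  public static void main(String[] args) {
    System.out.println(getMinSkew(3, 5));  // 3 regions on 5 servers
    System.out.println(getMinSkew(10, 4)); // 10 regions on 4 servers
    System.out.println(getMinSkew(8, 4));  // 8 regions on 4 servers, evenly divisible
  }
}
```

With 3 regions on 5 servers the mean is 0.6, so the best layout (three servers with one region, two with none) still deviates by 3 * 0.4 + 2 * 0.6, about 2.4. That nonzero floor is why the minimum has to be subtracted before scaling: a cluster in that state is already as balanced as it can be.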