huaxiangsun commented on a change in pull request #2003:
URL: https://github.com/apache/hbase/pull/2003#discussion_r454031142
##########
File path:
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
##########
@@ -1462,8 +1473,14 @@ protected double getCostFromRl(BalancerRegionLoad rl) {
}
@Override
- protected double getCostFromRl(BalancerRegionLoad rl) {
- return rl.getStorefileSizeMB();
+ protected double getCostFromRl(BalancerRegionLoad rl, boolean isPrimaryRegion) {
+ // Do not count replica region's file size, as replica regions serve very little
+ // read requests, this may be changed if there are enough data from production showing
Review comment:
As I wrote in the code comment, all of these factors really do impact system
performance. In the stats from one of our production clusters, < 0.01% of
requests go to replica regions, which means most of these regions are cold on
the region servers. That is why I want to remove this factor from the
balancer. I agree that things could be different for other deployments, so
making it configurable makes more sense. If it is OK with you, I will drop this
change from this patch and create a separate issue to track it, probably with
a test case as well.
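
For the follow-up issue, a minimal sketch of what "configurable" could look
like. This is not HBase code: the class name, the configuration key, and the
plain `Map` standing in for the HBase configuration are all hypothetical, and
the default of counting replicas is an assumption chosen to preserve the
behavior before this patch.

```java
import java.util.Map;

/** Hypothetical stand-in for the store file size cost function; not HBase code. */
class ReplicaAwareStoreFileCost {
  // Hypothetical configuration key, invented for this sketch.
  static final String COUNT_REPLICAS_KEY =
      "hbase.master.balancer.stochastic.storefileSizeCost.countReplicas";

  private final boolean countReplicas;

  ReplicaAwareStoreFileCost(Map<String, String> conf) {
    // Assumed default: true, so replica regions are still counted unless disabled.
    this.countReplicas =
        Boolean.parseBoolean(conf.getOrDefault(COUNT_REPLICAS_KEY, "true"));
  }

  /** Returns the store file size as the cost, or 0 for replicas when they are not counted. */
  double getCostFromRl(int storefileSizeMB, boolean isPrimaryRegion) {
    if (isPrimaryRegion || countReplicas) {
      return storefileSizeMB;
    }
    return 0.0;
  }
}
```

With the flag set to `false`, a cluster like the one described above would
ignore replica store file size, while other deployments would keep the
existing behavior.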
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]