[jira] Commented: (HBASE-920) Make region balancing sloppier

stack (JIRA) Tue, 14 Oct 2008 21:21:15 -0700

    [ https://issues.apache.org/jira/browse/HBASE-920?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639714#action_12639714 ]


stack commented on HBASE-920:
-----------------------------

Made 229 regions, then restarted a three-server cluster a couple of times.
Below is a distribution from those restarts:

Address                  Start Code     Load
13.powerset.com:60020    1224043343420  requests: 251 regions: 77
14.powerset.com:60020    1224043340404  requests: 0 regions: 76
15.u.powerset.com:60020  1224043340366  requests: 1 regions: 78
Total:  servers: 3                      requests: 252 regions: 229

Balancing seems to have run only once:

2008-10-15 04:02:24,580 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 1, Num Servers: 3, Avg Load: 1.0
2008-10-15 04:02:39,583 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 62, Num Servers: 3, Avg Load: 21.0
2008-10-15 04:03:03,936 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 229, Num Servers: 3, Avg Load: 77.0
2008-10-15 04:03:04,058 DEBUG org.apache.hadoop.hbase.master.RegionManager: Server XX.XX.XX:60020 is overloaded. Server load: 87 avg: 77.0, slop: 0.1
2008-10-15 04:03:18,966 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 229, Num Servers: 3, Avg Load: 77.0
2008-10-15 04:03:33,991 DEBUG org.apache.hadoop.hbase.master.ServerManager: Total Load: 229, Num Servers: 3, Avg Load: 77.0

Ran it again a few times.  Worst skew was two off the average.

Ran with four servers.  Some skew.  

Address                Start Code     Load
12.powerset.com:60020  1224044265656  requests: 0 regions: 63
13.powerset.com:60020  1224044265216  requests: 0 regions: 59
14.powerset.com:60020  1224044265160  requests: 0 regions: 53
15.powerset.com:60020  1224044265235  requests: 0 regions: 56

Average 58.





> Make region balancing sloppier
> ------------------------------
>
>                 Key: HBASE-920
>                 URL: https://issues.apache.org/jira/browse/HBASE-920
>             Project: Hadoop HBase
>          Issue Type: Improvement
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.18.1
>
>         Attachments: hbase-920.patch
>
>
> The region load balancer is exacting.  Here's the logic:
> {code}
>         if (avgLoad > 2.0 && thisServersLoad.getNumberOfRegions() > avgLoad) {
>           if (LOG.isDebugEnabled()) {
>             LOG.debug("Server " + serverName + " is overloaded. Server load: " +
>               thisServersLoad.getNumberOfRegions() + " avg: " + avgLoad);
>           }
> ...
> {code}
> On a cluster of thousands of regions, especially around startup or if there's 
> been a crash, the above makes for a bunch of churn as the load balancer closes 
> and opens regions to achieve an exact balance (every node must be at or below 
> the average).  I'd suggest that nodes should be left alone if they are within 
> some percentage of the average -- say 10% (should be configurable).
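
The 10% tolerance proposed in the description (the "slop: 0.1" that shows up in
the RegionManager debug line) could be sketched roughly as below.  This is a
minimal standalone illustration, not the code in hbase-920.patch: the class
SlopCheck, its method names, and the choice to compare against
avgLoad * (1 + slop) are all assumptions.  Only the avgLoad > 2.0 guard is taken
from the snippet quoted in the issue.

```java
/** Illustrative sketch only; these names are hypothetical, not HBase's. */
final class SlopCheck {
    private SlopCheck() {}

    /** Mean number of regions per server; 0.0 for an empty cluster. */
    static double average(int[] regionCounts) {
        if (regionCounts.length == 0) {
            return 0.0;
        }
        long total = 0;
        for (int count : regionCounts) {
            total += count;
        }
        return (double) total / regionCounts.length;
    }

    /**
     * True when a server holds more regions than the average plus the
     * allowed slop, where slop is a fraction of the average (0.1 = 10%).
     */
    static boolean isOverloaded(int regions, double avgLoad, double slop) {
        // Same guard as the quoted snippet: skip balancing on tiny loads.
        if (avgLoad <= 2.0) {
            return false;
        }
        // Assumed threshold: the average widened by the slop fraction.
        double upperBound = avgLoad * (1.0 + slop);
        return regions > upperBound;
    }

    public static void main(String[] args) {
        // Region counts from the four-server run reported in the comment.
        int[] loads = {63, 59, 53, 56};
        double avg = average(loads);
        for (int load : loads) {
            System.out.println(load + " regions, avg " + avg
                + ", overloaded: " + isOverloaded(load, avg, 0.1));
        }
    }
}
```

Under these assumptions, the four-server distribution above (63, 59, 53, 56;
mean 57.75) would have the servers at 63 and 59 flagged by the exact rule,
while a slop of 0.1 leaves all four alone.  A server at 87 against an average
of 77.0, as in the log line, is still treated as overloaded.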

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
