[ 
https://issues.apache.org/jira/browse/HBASE-71?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12572713#action_12572713
 ] 

Bryan Duxbury commented on HBASE-71:
------------------------------------

One problem is that when an overloaded region server checks in, we don't 
know which regions we should tell it to offline. The current flow requires 
us to tell it to close specific regions by name. However, unless they're 
meta regions, we won't know their names, as a region server only reports 
its aggregate load, not its individual regions and their loads. 

We have a few options here:
 * Have regionservers report in with N region names along with their normal 
reports. If a region server is overloaded, then the master can pick from those 
regions and choose some to offline. At first, the way that the region server 
picks the N regions could be essentially arbitrary, but in the future we could 
make it do something like send the 10 most requested regions, or use some other 
load-based factor. The downside of this approach is that it could bloat the 
reports, even when rebalancing isn't required.
 * Determine what regions to relocate during meta scans. This way we have 
access to all the region names we need as well as who has them. The downside 
is that it would substantially increase the work MetaScanner does, possibly 
complicating it significantly. We'd also have to figure out a way to do this 
without having to load all of the region->server assignments into memory at 
once, so it would be non-trivial.
 * Have a new message type that just tells a region server to close a number of 
regions, letting it choose which to close. It would then report what regions it 
chose to close. The downside is that we'd have to assume that any reported 
close we hadn't specifically requested means rebalancing. We also wouldn't know 
where those regions are in terms of the overall flow at that moment; they'd 
just be offlining.
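
To make the first option a bit more concrete, here is a rough Java sketch of 
what the master-side decision might look like. Everything in it is 
hypothetical: the class and method names are not existing HBase code, and it 
assumes the master can sum the aggregate loads it already receives to get a 
cluster-wide region count. The "overloaded" test (more regions than the 
rounded-up cluster average) is just one possible assignment function.

```java
import java.util.ArrayList;
import java.util.List;

/**
 * Hypothetical sketch of option 1: the region server reports up to N
 * candidate region names, and the master picks which of them to close.
 * None of these names are real HBase APIs.
 */
final class RebalanceSketch {

    private RebalanceSketch() {
    }

    /**
     * How many regions a server would need to shed to get down to the
     * cluster average, rounded up. Returns 0 if it is at or below average.
     * Assumption: load is measured as a simple region count.
     */
    static int regionsToShed(int serverLoad, int totalRegions, int serverCount) {
        if (serverCount <= 0) {
            throw new IllegalArgumentException("serverCount must be positive");
        }
        // Ceiling division so that a server holding the remainder is not
        // treated as overloaded.
        int ceilAverage = (totalRegions + serverCount - 1) / serverCount;
        return Math.max(0, serverLoad - ceilAverage);
    }

    /**
     * Picks regions to close from the candidates the server reported.
     * The choice is arbitrary here (first come, first closed). The master
     * can never pick more regions than were reported.
     */
    static List<String> pickRegionsToClose(List<String> reportedCandidates, int excess) {
        int count = Math.min(Math.max(excess, 0), reportedCandidates.size());
        return new ArrayList<>(reportedCandidates.subList(0, count));
    }
}
```

With 40 regions over 4 servers, a server reporting a load of 30 would be 
asked to shed 20, but the master could only name as many regions as that 
server listed in its report. That cap shows the trade-off in option 1: a 
small N keeps reports light but may need several check-ins to rebalance.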

Are there any alternatives that I missed?

> [hbase] Master should rebalance region assignments periodically
> ---------------------------------------------------------------
>
>                 Key: HBASE-71
>                 URL: https://issues.apache.org/jira/browse/HBASE-71
>             Project: Hadoop HBase
>          Issue Type: Improvement
>          Components: master
>            Reporter: Bryan Duxbury
>            Assignee: Bryan Duxbury
>             Fix For: 0.2.0
>
>
> The master currently only does region assignments at startup or when there 
> are splits or dead regionservers. This means that if you join a new 
> regionserver to the cluster after startup, it does not get assigned a fair 
> share of the already-served regions as you would expect. It only gets a share 
> of new regions being served.
> The master should periodically check the balance of regions, based on 
> whatever assignment function, instead of in reaction to the above listed 
> events.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
