[ 
https://issues.apache.org/jira/browse/HBASE-10498?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13901245#comment-13901245
 ] 

rajeshbabu commented on HBASE-10498:
------------------------------------

[~stack]
bq. When a region goes online/offline, we now update in two places? Why not 
have the one place read the other so we keep one list only, an authoritative 
one? 
Sometimes we have to update both places so that regions stay colocated. If we 
read from only one list, regions may sometimes end up not colocated.

For example, take two regions, regionA and regionB, that must be colocated and 
are currently assigned to the same region server, RS1. 
Now suppose they need to move to another region server.
1) First, regionA starts moving to RS2.
2) regionB starts moving. If we consult the regionAssignments list while 
selecting the destination, we still get RS1, so regionB will be assigned to RS1.
3) By the time the assignments complete, the two regions are open on 
different servers, which is not expected.
This kind of thing can happen during master startup, balancing, moves, or 
assignment failures.

To avoid these mismatches, we sometimes need to consult multiple data structures 
in AM, such as regionsInTransitions or regionPlans, as well as regionAssignments.
Even then, we sometimes cannot guarantee that the same region server is selected 
for the regions to be colocated.
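
The scenario above can be sketched roughly as follows. This is a minimal, 
hypothetical model and not the actual AM code: the class ColocationSketch, its 
method names, and the use of plain strings for regions and servers are all 
assumptions made for illustration. Only the map names regionAssignments and 
regionPlans echo the data structures mentioned above.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical, simplified stand-in for the master's bookkeeping; not HBase code. */
final class ColocationSketch {
    // Committed assignments: region -> server where it is currently open.
    final Map<String, String> regionAssignments = new HashMap<>();
    // In-flight moves: region -> planned destination server.
    final Map<String, String> regionPlans = new HashMap<>();

    /** Naive lookup: only sees where the peer region is currently open. */
    String destinationFromAssignmentsOnly(String peerRegion) {
        return regionAssignments.get(peerRegion);
    }

    /** Prefers the peer's pending destination, falls back to its current server. */
    String destinationFromPlansThenAssignments(String peerRegion) {
        String planned = regionPlans.get(peerRegion);
        return planned != null ? planned : regionAssignments.get(peerRegion);
    }

    public static void main(String[] args) {
        ColocationSketch am = new ColocationSketch();
        am.regionAssignments.put("regionA", "RS1");
        am.regionAssignments.put("regionB", "RS1");

        // Step 1: regionA starts moving to RS2; the move is not yet committed.
        am.regionPlans.put("regionA", "RS2");

        // Step 2: choose a destination for regionB based on its peer, regionA.
        System.out.println(am.destinationFromAssignmentsOnly("regionA"));      // RS1
        System.out.println(am.destinationFromPlansThenAssignments("regionA")); // RS2
    }
}
```

Consulting the pending plan first yields RS2 in this toy case, but as noted 
above, combining several structures still may not cover every case.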

bq. Inject into the balancer an Interface it can pull on when it needs to know 
what is online?
The balancer already has a reference to the master, so we need not add new APIs 
to get the online regions from AM.

Thanks.

> Add new APIs to load balancer interface
> ---------------------------------------
>
>                 Key: HBASE-10498
>                 URL: https://issues.apache.org/jira/browse/HBASE-10498
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>            Reporter: rajeshbabu
>            Assignee: rajeshbabu
>             Fix For: 0.98.1, 0.99.0
>
>         Attachments: HBASE-10498.patch
>
>
> If a custom load balancer is required to maintain regions and their 
> corresponding server locations,
> we can capture this information whenever we run a balancer algorithm before 
> assignment (like random, retain).
> But during master startup we do not call any balancer algorithm if a region 
> is already assigned.
> During a split, too, we open the child regions on the RS first and then 
> notify the master through ZooKeeper, 
> so split region information cannot be captured by the balancer.
> Since the balancer has access to the master, we can get the information from 
> the online regions or region plan data structures in AM.
> But in some use cases we cannot rely on this information (mainly when 
> maintaining colocation of two tables' regions). 
> So it is better to add some APIs to the load balancer to notify it when a 
> *region is online or offline*.
> These APIs help a lot in maintaining *region colocation through a custom load 
> balancer*, which is very important in secondary indexing. 
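
As a rough illustration of the notification idea in the description above, a 
custom balancer could keep its own region-to-server map that is updated through 
online/offline callbacks. The sketch below is hypothetical: the interface 
RegionStateListener, the class ColocationAwareBalancer, the method serverOf, 
and the String-based region and server names are assumptions for illustration 
and do not reproduce the signatures in the attached patch.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical callback interface; names and String-based types are illustrative only. */
interface RegionStateListener {
    void regionOnline(String regionName, String serverName);
    void regionOffline(String regionName);
}

/** Hypothetical balancer fragment that tracks region locations from notifications. */
final class ColocationAwareBalancer implements RegionStateListener {
    // The balancer's own view: region -> server it was last reported online on.
    private final Map<String, String> regionToServer = new ConcurrentHashMap<>();

    @Override
    public void regionOnline(String regionName, String serverName) {
        regionToServer.put(regionName, serverName);
    }

    @Override
    public void regionOffline(String regionName) {
        regionToServer.remove(regionName);
    }

    /** Returns the server the region was reported online on, or null if unknown. */
    String serverOf(String regionName) {
        return regionToServer.get(regionName);
    }

    public static void main(String[] args) {
        ColocationAwareBalancer balancer = new ColocationAwareBalancer();

        // A parent region splits on RS1: the parent goes offline and the
        // daughters are reported online without any balancer algorithm running.
        balancer.regionOnline("parent", "RS1");
        balancer.regionOffline("parent");
        balancer.regionOnline("daughterA", "RS1");
        balancer.regionOnline("daughterB", "RS1");

        System.out.println(balancer.serverOf("daughterA")); // RS1
        System.out.println(balancer.serverOf("parent"));    // null
    }
}
```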



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
