[jira] [Commented] (HBASE-18946) Stochastic load balancer assigns replica regions to the same RS

ramkrishna.s.vasudevan (JIRA) Tue, 21 Nov 2017 21:24:29 -0800

    [ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261986#comment-16261986
 ]


ramkrishna.s.vasudevan commented on HBASE-18946:
------------------------------------------------

Thanks for the detailed review.
bq.We only do this when it is a region with replicas or do we do it always (would 
be good if former, we want assignment to run fast).
Yes, only for the replica regions.
bq.Please remind me what is the rule for replica assign? Just that they need to 
be on different servers? Nothing about ordering? (Hmm... seems like replica has 
to go out first). How does the patch to the balancer ensure this ordering?
Our initial requirement is that replicas must be placed on different 
servers whenever there are enough servers. Ordering is not important. 
Coming to the balancer: in our code base only StochasticLB knows about replicas 
while actually balancing the cluster. We have tried FavoredStochasticLB; it 
does not know about replicas and in fact messes with the replica assignment 
itself (by corrupting the META entries for replicas). That is a big change 
which we need to do later. We confirmed this with [~enis] offline 
some time back.
Also, as I said in the previous comment, the balancer does not come into the 
picture during round-robin assignment of a new table's regions. It just tries 
to do round robin based on the available servers. 
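To make the placement rule concrete, here is a minimal sketch of a check for it. It is illustrative only: the class, the method names, and the server-to-region-names map are hypothetical and are not HBase classes or the code in the patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of HBase.
final class ReplicaSpreadCheck {

  /**
   * assignment maps each live server to the region names it hosts,
   * with one list entry per hosted replica of that region.
   * Returns the regions that have two or more replicas on one server.
   */
  static Set<String> colocatedRegions(Map<String, List<String>> assignment) {
    Set<String> colocated = new HashSet<>();
    for (List<String> hosted : assignment.values()) {
      Set<String> seenOnThisServer = new HashSet<>();
      for (String region : hosted) {
        // add() returns false when this server already hosts a replica of the region.
        if (!seenOnThisServer.add(region)) {
          colocated.add(region);
        }
      }
    }
    return colocated;
  }

  /**
   * The rule as stated above: colocation is only a problem when there are
   * at least as many servers as replicas per region.
   */
  static boolean placementAcceptable(Map<String, List<String>> assignment,
      int replicasPerRegion) {
    if (assignment.size() < replicasPerRegion) {
      return true; // not enough servers, sharing an RS is tolerated
    }
    return colocatedRegions(assignment).isEmpty();
  }

  public static void main(String[] args) {
    Map<String, List<String>> assignment = new HashMap<>();
    assignment.put("rs1", Arrays.asList("regionA", "regionA"));
    assignment.put("rs2", Arrays.asList("regionA"));
    assignment.put("rs3", new ArrayList<String>());
    System.out.println("colocated: " + colocatedRegions(assignment));
    System.out.println("acceptable: " + placementAcceptable(assignment, 3));
  }
}
```

Servers with no regions still need an (empty) entry in the map so that the "enough servers" test counts them.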
bq.is there a hole where you can't see an ongoing Assignment? It has been 
queued and is being worked on but you have no means of querying where a 
region is being assigned
Yes, exactly. We don't know about it. This applies not only to replica regions; 
the regions of any newly created table have the same issue. The queued 
assignment just uses the regions currently in the queue to do the assignments. 
For ordinary regions that is OK, we don't mind how they are distributed, but for 
replicas it is very important. When we have enough servers and the replicas are 
not distributed, we don't serve the purpose of replicas. If there are fewer 
servers than replicas then it is OK to assign the replicas to the same RS. In 
future we are planning to avoid even this and fail the assignment itself.
bq.If round robin, are we not moving through the list of servers? Is the issue 
only when cluster is small – three servers or so?
I hope you mean before this patch, right? We are moving through the list of 
servers, but all the regions (including replicas) do not go to the assignment 
queue together. So whatever is getting processed from the assignment queue 
is assigned round robin, but the next set of regions that is processed does 
its own round robin again, and we end up on the same RS.
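A small simulation of that effect, as a sketch only. It assumes each batch starts its round robin from the first server in the list, which is my simplification for illustration; the class and method names are hypothetical and this is not the AssignmentManager code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical simulation for illustration; not part of HBase.
final class BatchedRoundRobinDemo {

  /** Creates a plan with an empty region list for every server. */
  static Map<String, List<String>> emptyPlan(List<String> servers) {
    Map<String, List<String>> plan = new LinkedHashMap<>();
    for (String server : servers) {
      plan.put(server, new ArrayList<String>());
    }
    return plan;
  }

  /**
   * Assigns one batch round robin. Assumption: every batch starts again
   * at the first server and knows nothing about earlier batches.
   */
  static void assignBatch(List<String> batch, List<String> servers,
      Map<String, List<String>> plan) {
    for (int i = 0; i < batch.size(); i++) {
      String server = servers.get(i % servers.size());
      plan.get(server).add(batch.get(i));
    }
  }

  public static void main(String[] args) {
    List<String> servers = Arrays.asList("rs1", "rs2", "rs3");
    Map<String, List<String>> plan = emptyPlan(servers);
    // The primary and its replicas reach the queue in separate batches.
    assignBatch(Arrays.asList("regionA_replica0"), servers, plan);
    assignBatch(Arrays.asList("regionA_replica1"), servers, plan);
    assignBatch(Arrays.asList("regionA_replica2"), servers, plan);
    System.out.println(plan);
  }
}
```

Under that assumption all three replicas of regionA land on rs1 even though rs2 and rs3 are idle, whereas a single batch containing all three would have spread them over the three servers.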
bq.On patch, don't renumber protobuf fields.
Oh yes. I did that so that the steps are in order. Will change it and will try 
to remove some duplicate code.
bq.If NOT isDefaultReplica and NOT replicaAvailable, we just fall through?
Yes. If it is a normal region we just go with the old code, and if the 
replica is not available, the existing code already has a way to assign all such 
regions that don't find a suitable server to some servers randomly. That is 
fine for us too, because in that case there are more replicas than available 
servers.
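Roughly, the fall-through has the following shape. This is a sketch under my own assumptions and not the patch itself: the class, the method, and the map of already-hosted regions are hypothetical names, and choosing randomly among the suitable servers is a simplification.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Random;
import java.util.Set;

// Hypothetical helper for illustration; not part of HBase.
final class ReplicaAwarePick {

  /**
   * Picks a server for one replica of the given region.
   * hostedRegions maps a server to the region names it already hosts a replica of.
   */
  static String pickServer(String region, List<String> servers,
      Map<String, Set<String>> hostedRegions, Random random) {
    List<String> candidates = new ArrayList<>();
    for (String server : servers) {
      Set<String> hosted = hostedRegions.get(server);
      if (hosted == null || !hosted.contains(region)) {
        candidates.add(server); // no sibling replica here
      }
    }
    // Fall through: every server already hosts a sibling replica, which only
    // happens when replicas outnumber servers, so any server will do.
    List<String> pool = candidates.isEmpty() ? servers : candidates;
    return pool.get(random.nextInt(pool.size()));
  }

  public static void main(String[] args) {
    List<String> servers = Arrays.asList("rs1", "rs2");
    Map<String, Set<String>> hosted = new HashMap<>();
    hosted.put("rs1", new HashSet<>(Arrays.asList("regionA")));
    // rs1 already hosts a replica of regionA, so only rs2 is suitable.
    System.out.println(pickServer("regionA", servers, hosted, new Random()));
  }
}
```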
Actually there is more to do with AM and replicas. We know the issues but are 
not yet ready with patches. For example, in a rolling-restart-like case the AM 
will keep moving the replicas to the RSs that are still running. So finally, 
when the last one is closed, all the regions would have moved there and META 
will only have that entry. Now when new RSs are started it will try to do 
retain assignment, and again replica regions may get colocated, and only a 
balancer can solve it. We need to see how best we can do in these cases. But 
all that is for later (out of scope here).


> Stochastic load balancer assigns replica regions to the same RS
> ---------------------------------------------------------------
>
>                 Key: HBASE-18946
>                 URL: https://issues.apache.org/jira/browse/HBASE-18946
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha-3
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replicas and their assignment, I can see that sometimes the 
> default LB, the Stochastic load balancer, assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When an RS goes down, the replicas being assigned to the same RS is 
> acceptable, but when we have enough RS to assign, this behaviour is 
> undesirable and does not serve the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
