[
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261986#comment-16261986
]
ramkrishna.s.vasudevan commented on HBASE-18946:
------------------------------------------------
Thanks for the detailed review.
bq.We only do this when it is a region with replicas or do we do it always
(would be good if former, we want assignment to run fast).
Yes, only for the replica regions.
bq.Please remind me what is the rule for replica assign? Just that they need to
be on different servers? Nothing about ordering? (Hmm... seems like replica has
to go out first). How does the patch to the balancer ensure this ordering?
Our initial requirement is that replicas must be on different servers whenever
there are enough servers. Ordering is not important.
Coming to the balancer, in our code base only StochasticLB knows about replicas
while actually balancing the cluster. We have tried FavoredStochasticLB; it
does not know about replicas and in fact messes with the replica assignment
itself (by corrupting the META entries for replicas). Fixing that is a big
change which we need to do later. We confirmed this with [~enis] offline
some time back.
Also, as I said in the previous comment, the balancer does not come into the
picture during round robin assignment of a new table's regions. The assignment
just does round robin over the available servers.
bq.is there a hole where you can't see an ongoing Assignment? It has been
queued and is being worked on but you have no means of querying where a
region is being assigned
Yes, exactly. We don't know about it. This applies not only to replica regions;
the regions of any newly created table have the same issue. The queued
assignment only uses the regions currently in the queue to do the assignments.
For ordinary regions that is OK, since we don't mind how they are distributed,
but for replicas it is very important. When we have enough servers and the
replicas are not distributed across them, the replicas do not serve their
purpose. If there are fewer servers than replicas, then it is OK to assign
replicas to the same RS. In future we are planning to avoid even this and fail
the assignment itself.
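To illustrate the rule described above, here is a minimal sketch in plain Java. It is not the patch and not the AssignmentManager code; the class name, method name and the hostsByRegion map are hypothetical. It places each replica on a server that does not yet host a replica of the same region, and accepts colocation only when there are fewer servers than replicas. The fallback here is deterministic to keep the sketch simple.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReplicaAwarePlacement {
    /**
     * Picks a server for the next replica of regionName.
     * hostsByRegion (hypothetical bookkeeping) records, per region, the servers
     * already chosen for its replicas, in the order they were chosen.
     */
    public static String pickServer(String regionName, List<String> servers,
                                    Map<String, List<String>> hostsByRegion) {
        if (servers.isEmpty()) {
            throw new IllegalArgumentException("no servers available");
        }
        List<String> hosts = hostsByRegion.computeIfAbsent(regionName, k -> new ArrayList<>());
        String chosen = null;
        // Prefer the first server that holds no replica of this region yet.
        for (String server : servers) {
            if (!hosts.contains(server)) {
                chosen = server;
                break;
            }
        }
        if (chosen == null) {
            // Fewer servers than replicas: colocation is accepted for now.
            chosen = servers.get(hosts.size() % servers.size());
        }
        hosts.add(chosen);
        return chosen;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("rs1", "rs2", "rs3");
        Map<String, List<String>> hostsByRegion = new HashMap<>();
        for (int replicaId = 0; replicaId < 3; replicaId++) {
            System.out.println("regionA replica " + replicaId + " -> "
                + pickServer("regionA", servers, hostsByRegion));
        }
    }
}
```

With three servers and three replicas, each replica lands on a different server. With only two servers, the third replica falls back to a server that already hosts one.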
bq.If round robin, are we not moving through the list of servers? Is the issue
only when cluster is small – three servers or so?
I hope you mean before this patch? We do move through the list of servers, but
all the regions (including replicas) do not go into the assignment queue
together. Whatever batch is processed from the assignment queue is assigned
round robin, but the next batch is again assigned round robin on its own, so
replicas of the same region can end up on the same RS.
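The effect can be reproduced with a small standalone sketch. It is plain Java, not HBase code; the class and method names are hypothetical, and it assumes each batch starts its round robin from the first server. Two batches are assigned independently, and the two replicas of region A collide.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RoundRobinBatches {
    /** Assigns one batch round robin, with no memory of earlier batches. */
    public static Map<String, String> assignBatch(List<String> regions, List<String> servers) {
        Map<String, String> plan = new LinkedHashMap<>();
        for (int i = 0; i < regions.size(); i++) {
            plan.put(regions.get(i), servers.get(i % servers.size()));
        }
        return plan;
    }

    public static void main(String[] args) {
        List<String> servers = Arrays.asList("rs1", "rs2", "rs3");
        Map<String, String> plan = new LinkedHashMap<>();
        // The default replicas are processed in one batch ...
        plan.putAll(assignBatch(Arrays.asList("A_replica0", "B_replica0"), servers));
        // ... and the secondary replicas in a later batch.
        plan.putAll(assignBatch(Arrays.asList("A_replica1", "B_replica1"), servers));
        // Both replicas of A land on rs1 even though rs3 is unused.
        System.out.println(plan);
    }
}
```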
bq.On patch, don't renumber protobuf fields.
Oh yes. I did that so that the steps are in order. Will change it and will try
to remove some duplicate code.
bq.If NOT isDefaultReplica and NOT replicaAvailable, we just fall through?
Yes. If it is a normal region, we just go with the old code. If the replica is
not available, the existing code has a way to assign all such regions that
don't find a suitable server to some servers randomly. That is fine for us
too, because in that case there are more replicas than available servers.
Actually there is more to do with AM and replicas. We know the issues but are
not yet ready with patches. For example, in a rolling-restart-like case the AM
will keep moving the replicas to the RSs that are still running. So by the time
the last one is closed, all the regions would have moved there and META will
only have that entry. Now when new RSs are started, it will try to do retain
assignment and again replica regions may get colocated, and only a balancer
can solve it. We need to see how best we can handle these cases. But all that
comes later (out of scope here).
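The rolling restart concern can be pictured with a toy simulation. It is plain Java with hypothetical names. It assumes servers are stopped one after another with none restarted in between, and that regions on a stopped server move to the first server still running. It does not model the real AM or retain assignment logic.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RollingStopSimulation {
    /**
     * Stops every server in stopOrder except the last one and returns the
     * region-to-server map (a stand-in for META) just before the last stop.
     */
    public static Map<String, String> stopAllButLast(Map<String, String> meta,
                                                     List<String> stopOrder) {
        Map<String, String> result = new LinkedHashMap<>(meta);
        List<String> running = new ArrayList<>(stopOrder);
        for (int i = 0; i < stopOrder.size() - 1; i++) {
            String stopped = stopOrder.get(i);
            running.remove(stopped);
            String target = running.get(0);
            // Move every region hosted on the stopped server to a running one.
            for (Map.Entry<String, String> entry : result.entrySet()) {
                if (entry.getValue().equals(stopped)) {
                    entry.setValue(target);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> meta = new LinkedHashMap<>();
        meta.put("A_replica0", "rs1");
        meta.put("A_replica1", "rs2");
        meta.put("A_replica2", "rs3");
        // All three replicas end up recorded on the last server, so retaining
        // these locations after restart keeps them colocated.
        System.out.println(stopAllButLast(meta, Arrays.asList("rs1", "rs2", "rs3")));
    }
}
```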
> Stochastic load balancer assigns replica regions to the same RS
> ---------------------------------------------------------------
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha-3
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch,
> HBASE-18946_2.patch, HBASE-18946_2.patch,
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replicas and their assignment, I can see that sometimes
> the default LB, the Stochastic load balancer, assigns replica regions to the
> same RS. This happens when we have 3 RS checked in and we have a table with 3
> replicas. When an RS goes down, replicas being assigned to the same RS is
> acceptable, but when we have enough RS to assign to, this behaviour is
> undesirable and defeats the purpose of replicas.
> [~huaxiang] and [~enis].