[ 
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16261768#comment-16261768
 ] 

stack commented on HBASE-18946:
-------------------------------

bq. While doing roundrobinAssignment contact the AM to know the current state 
of replica regions and choose a server accordingly. 

Do we only do this when it is a region with replicas, or do we do it always? 
(The former would be good; we want assignment to run fast.)

Yeah, if round robin, it's round robin (smile).

Please remind me what the rule is for replica assignment? Just that they need to 
be on different servers? Nothing about ordering? (Hmm... seems like replica has 
to go out first). How does the patch to the balancer ensure this ordering?

Is there a hole where you can't see an ongoing assignment? It has been queued 
and is being worked on, but you have no means of querying where a region is 
being assigned (i.e. we are about to assign a replica and we want to avoid 
assigning to the same location as where we just assigned)?

If round robin, are we not moving through the list of servers? Is the issue 
only when the cluster is small -- three servers or so?
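
To make the question concrete, here is a minimal sketch of a round-robin pass that skips a server already holding a replica of the same region, including replicas placed earlier in the same pass. It is not the patch and not HBase's balancer code: the class `ReplicaRoundRobinSketch`, the `Replica` type, and the `assign` method are all hypothetical, and servers are plain strings rather than `ServerName`.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical illustration only; none of these names exist in HBase. */
final class ReplicaRoundRobinSketch {

  /** One replica of a region; replicaId 0 stands for the default replica. */
  static final class Replica {
    final String region;
    final int replicaId;

    Replica(String region, int replicaId) {
      this.region = region;
      this.replicaId = replicaId;
    }
  }

  /**
   * Walks the server list round robin. For each replica, starts at the
   * current index and takes the first server that does not yet hold a
   * replica of the same region. If every server already holds one
   * (more replicas than servers), falls back to the current index.
   */
  static Map<String, List<Replica>> assign(List<Replica> replicas, List<String> servers) {
    Map<String, List<Replica>> assignments = new LinkedHashMap<>();
    Map<String, Set<String>> regionsOnServer = new HashMap<>();
    for (String server : servers) {
      assignments.put(server, new ArrayList<>());
      regionsOnServer.put(server, new HashSet<>());
    }
    int numServers = servers.size();
    if (numServers == 0) {
      return assignments;
    }
    int serverIdx = 0;
    for (Replica replica : replicas) {
      int chosen = -1;
      for (int j = 0; j < numServers; j++) {
        int candidate = (serverIdx + j) % numServers;
        if (!regionsOnServer.get(servers.get(candidate)).contains(replica.region)) {
          chosen = candidate;
          break;
        }
      }
      if (chosen < 0) {
        chosen = serverIdx; // no collision-free server exists; accept co-location
      }
      String server = servers.get(chosen);
      assignments.get(server).add(replica);
      regionsOnServer.get(server).add(replica.region);
      serverIdx = (chosen + 1) % numServers; // resume from the next server
    }
    return assignments;
  }
}
```

With three servers and the input order r1/0, r2/0, r2/1, r1/1, a plain round robin would put r1/1 back on the first server next to r1/0; the skip above moves it to the next server instead. Whether the real balancer can see placements that are queued but not yet complete is exactly the open question above; this sketch only tracks what it placed itself.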


On patch, don't renumber protobuf fields.

What is happening here (BTW, repeats code):
{code}
List<RegionInfo> serverRegions =
    assignments.computeIfAbsent(serverName, k -> new ArrayList<>());
if (!RegionReplicaUtil.isDefaultReplica(region)) {
  if (!replicaAvailable(region, serverName)) {
    assignRegionToServer(cluster, serverName, serverRegions, region);
    serverIdx = (j + serverIdx + 1) % numServers;
    assigned = true;
    break;
  }
} else if (!cluster.wouldLowerAvailability(region, serverName)) {
  assignRegionToServer(cluster, serverName, serverRegions, region);
  serverIdx = (j + serverIdx + 1) % numServers; // remain from next server
...
{code}

If NOT isDefaultReplica and NOT replicaAvailable, we just fall through?


Good stuff.




> Stochastic load balancer assigns replica regions to the same RS
> ---------------------------------------------------------------
>
>                 Key: HBASE-18946
>                 URL: https://issues.apache.org/jira/browse/HBASE-18946
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.0.0-alpha-3
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 2.0.0-beta-1
>
>         Attachments: HBASE-18946.patch, HBASE-18946.patch, 
> HBASE-18946_2.patch, HBASE-18946_2.patch, 
> TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replicas and their assignment, I can see that sometimes the 
> default LB (Stochastic load balancer) assigns replica regions to the same RS. 
> This happens when we have 3 RS checked in and we have a table with 3 
> replicas. When an RS goes down, replicas being assigned to the same RS is 
> acceptable, but when we have enough RS to assign to, this behaviour is 
> undesirable and defeats the purpose of replicas. 
> [~huaxiang] and [~enis]. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
