[ 
https://issues.apache.org/jira/browse/HBASE-20741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-20741:
-------------------------------------------
        Fix Version/s: 3.0.0
    Affects Version/s: 3.0.0
               Status: Patch Available  (was: Open)

The latest patch uses the load balancer's round-robin assignment. The primary 
daughter regions are assigned to the target server, while the replica daughter 
regions are assigned round robin. When I ran the attached test case, round robin 
still did not work as expected. The cause is in the LB's 
roundRobinAssignment() code: 
{code}
    int serverIdx = 0;
    if (numServers > 1) {
      serverIdx = RANDOM.nextInt(numServers);
    }
    int regionIdx = 0;

    for (int j = 0; j < numServers; j++) {
      ServerName server = servers.get((j + serverIdx) % numServers);
      List<RegionInfo> serverRegions = new ArrayList<>(max);
      for (int i = regionIdx; i < numRegions; i += numServers) {
        RegionInfo region = regions.get(i % numRegions);
        if (cluster.wouldLowerAvailability(region, server)) {
          unassignedRegions.add(region);
        } else {
          serverRegions.add(region);
          cluster.doAssignRegion(region, server);
        }
      }
      assignments.put(server, serverRegions);
      regionIdx++;
    }
{code}
Here we iterate over the server list and, for each server, try to assign as many 
regions as possible, skipping any assignment that would lower availability. In 
our case the list consists of replica daughter regions, so assigning one of them 
to a given server may well lower availability; such regions are collected in 
unassignedRegions. The calling method then iterates over these unassigned 
regions and does its own round robin. That loop only checks whether availability 
would be lowered; it ignores the servers that were just picked for the replica 
regions and added to the assignments map. This patch adds that check: if a 
replica of the same region was already assigned to a server by the code above, 
we move on to the next server. 
With this change my test case passes consistently; without it, the test fails 
randomly.
[~huaxiang], [[email protected]], [[email protected]] - Please have a look. 
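
To illustrate the idea behind the change, here is a minimal, self-contained sketch of a second-pass round robin that consults the assignments map before placing a leftover replica. It is not the code from the attached patch and does not use HBase classes: ReplicaAwareRoundRobin, Replica, hostsSibling() and assignLeftovers() are hypothetical names, servers are plain strings, and "would lower availability" is simplified to "this server already holds another replica of the same region".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model; none of these names are HBase APIs.
class ReplicaAwareRoundRobin {

  /** A region replica; replicas of one region share the same regionName. */
  static final class Replica {
    final String regionName;
    final int replicaId;

    Replica(String regionName, int replicaId) {
      this.regionName = regionName;
      this.replicaId = replicaId;
    }

    @Override
    public String toString() {
      return regionName + "_" + replicaId;
    }
  }

  /** True if the server already holds another replica of the same region. */
  static boolean hostsSibling(Map<String, List<Replica>> assignments,
      String server, Replica replica) {
    List<Replica> hosted = assignments.get(server);
    if (hosted == null) {
      return false;
    }
    for (Replica other : hosted) {
      if (other.regionName.equals(replica.regionName)) {
        return true;
      }
    }
    return false;
  }

  /**
   * Second pass: place leftover replicas round robin. The assignments map is
   * consulted for every candidate server, so placements made earlier (including
   * earlier in this same call) are taken into account.
   * Returns the replicas that could not be placed without co-location.
   */
  static List<Replica> assignLeftovers(List<String> servers,
      List<Replica> leftovers, Map<String, List<Replica>> assignments) {
    List<Replica> stillUnassigned = new ArrayList<>();
    int serverIdx = 0;
    for (Replica replica : leftovers) {
      boolean placed = false;
      for (int j = 0; j < servers.size() && !placed; j++) {
        String server = servers.get((serverIdx + j) % servers.size());
        if (hostsSibling(assignments, server, replica)) {
          continue; // a sibling replica lives here, try the next server
        }
        assignments.computeIfAbsent(server, k -> new ArrayList<>()).add(replica);
        serverIdx = (serverIdx + j + 1) % servers.size();
        placed = true;
      }
      if (!placed) {
        stillUnassigned.add(replica);
      }
    }
    return stillUnassigned;
  }

  public static void main(String[] args) {
    List<String> servers = Arrays.asList("s1", "s2", "s3");
    Map<String, List<Replica>> assignments = new HashMap<>();
    // Primary daughters stay on the target server (s1 in this toy example).
    assignments.put("s1", new ArrayList<>(
        Arrays.asList(new Replica("A", 0), new Replica("B", 0))));
    List<Replica> leftovers = Arrays.asList(new Replica("A", 1),
        new Replica("A", 2), new Replica("B", 1), new Replica("B", 2));
    List<Replica> unplaced = assignLeftovers(servers, leftovers, assignments);
    for (String server : servers) {
      System.out.println(server + " -> " + assignments.get(server));
    }
    System.out.println("unplaced -> " + unplaced);
  }
}
```

In this toy run the primaries A_0 and B_0 stay on s1 and the remaining replicas are spread over s2 and s3, so no server ends up with two replicas of the same region. With fewer servers than replicas, the replicas that cannot be separated are returned to the caller instead of being co-located.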

> Split of a region with replicas creates all daughter regions and its replica 
> in same server
> -------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20741
>                 URL: https://issues.apache.org/jira/browse/HBASE-20741
>             Project: HBase
>          Issue Type: Bug
>          Components: read replicas
>    Affects Versions: 2.0.0, 3.0.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-20741.patch, HBASE-20741_1.patch
>
>
> Generally, when a parent region is split, it is better to create the daughter 
> regions on the same target server. 
> But we do the same for replicas, so all the replica regions are also created 
> on that target server. Ideally we should do a round robin, and only 
> the primary daughter region should be opened on the intended target server 
> (where the parent was previously open).
> [~huaxiang] FYI.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
