[
https://issues.apache.org/jira/browse/HBASE-20741?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
ramkrishna.s.vasudevan updated HBASE-20741:
-------------------------------------------
Fix Version/s: 3.0.0
Affects Version/s: 3.0.0
Status: Patch Available (was: Open)
The latest patch makes use of the load balancer's round-robin assignment. We
assign the primary daughter regions to the target server, while the replica
daughter regions are assigned in a round-robin manner. When I ran the attached
test case, I still found that round robin was not working as expected, because
of this code in the LB's roundRobinAssignment():
{code}
int serverIdx = 0;
if (numServers > 1) {
  serverIdx = RANDOM.nextInt(numServers);
}
int regionIdx = 0;
for (int j = 0; j < numServers; j++) {
  ServerName server = servers.get((j + serverIdx) % numServers);
  List<RegionInfo> serverRegions = new ArrayList<>(max);
  for (int i = regionIdx; i < numRegions; i += numServers) {
    RegionInfo region = regions.get(i % numRegions);
    if (cluster.wouldLowerAvailability(region, server)) {
      unassignedRegions.add(region);
    } else {
      serverRegions.add(region);
      cluster.doAssignRegion(region, server);
    }
  }
  assignments.put(server, serverRegions);
  regionIdx++;
}
{code}
Here we iterate over the server list and, for each server, try to assign as
many regions as possible to it, checking whether each assignment would lower
availability. In our case the list consists of replica daughter regions, so we
may find that assigning one of them to a server would lower availability; such
regions are collected in unassignedRegions. The calling method then iterates
over these unassigned regions and does a round robin of its own. But there we
only check whether availability would be lowered; the servers just picked for
the replica regions and recorded in the assignments map are ignored. So this
patch adds that check as well: if we find a replica region whose pair was
already assigned to a server by the code above, we skip that server and
continue to the next one.
With this change my test case passes consistently; without it, the test fails
randomly.
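To make the extra check concrete, here is a minimal sketch of the idea, not the
actual patch. Region, hostsSiblingReplica() and assignLeftovers() are
hypothetical names used only for illustration, servers are plain strings, and
the availability check is reduced to "no two replicas of the same daughter on
one server", which is much simpler than cluster.wouldLowerAvailability().

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReplicaAwareFallbackSketch {

  /** Hypothetical stand-in for RegionInfo: a daughter region name plus a replica id. */
  static final class Region {
    final String daughter;
    final int replicaId;

    Region(String daughter, int replicaId) {
      this.daughter = daughter;
      this.replicaId = replicaId;
    }

    @Override
    public String toString() {
      return daughter + "_r" + replicaId;
    }
  }

  /** True if the server's entry in the assignments map already holds another replica of the same daughter. */
  static boolean hostsSiblingReplica(List<Region> hosted, Region candidate) {
    for (Region r : hosted) {
      if (r.daughter.equals(candidate.daughter) && r.replicaId != candidate.replicaId) {
        return true;
      }
    }
    return false;
  }

  /**
   * Fallback pass over the regions the first pass could not place. Servers are
   * tried round robin; a server is skipped when the assignments map shows it
   * already hosts a sibling replica. Returns the regions that fit nowhere.
   */
  static List<Region> assignLeftovers(List<Region> unassigned, List<String> servers,
      Map<String, List<Region>> assignments) {
    List<Region> stillUnassigned = new ArrayList<>();
    int start = 0;
    for (Region region : unassigned) {
      boolean placed = false;
      for (int j = 0; j < servers.size() && !placed; j++) {
        String server = servers.get((start + j) % servers.size());
        List<Region> hosted = assignments.computeIfAbsent(server, s -> new ArrayList<>());
        if (hostsSiblingReplica(hosted, region)) {
          continue; // the region's pair is already here, try the next server
        }
        hosted.add(region);
        placed = true;
        start = (start + j + 1) % servers.size(); // next region starts after this server
      }
      if (!placed) {
        stillUnassigned.add(region);
      }
    }
    return stillUnassigned;
  }

  public static void main(String[] args) {
    List<String> servers = Arrays.asList("s1", "s2", "s3");
    Map<String, List<Region>> assignments = new LinkedHashMap<>();
    // Primaries of both daughters sit on the target server s1; one replica was already placed on s2.
    assignments.put("s1", new ArrayList<>(Arrays.asList(new Region("d1", 0), new Region("d2", 0))));
    assignments.put("s2", new ArrayList<>(Arrays.asList(new Region("d1", 1))));
    List<Region> leftovers = assignLeftovers(
        Arrays.asList(new Region("d1", 2), new Region("d2", 1), new Region("d2", 2)),
        servers, assignments);
    System.out.println(assignments + " leftovers=" + leftovers);
  }
}
```

In this toy setup the three leftover replicas each end up on a server that
does not already host a replica of the same daughter.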
[~huaxiang], [[email protected]], [[email protected]] - Please have a look.
> Split of a region with replicas creates all daughter regions and its replica
> in same server
> -------------------------------------------------------------------------------------------
>
> Key: HBASE-20741
> URL: https://issues.apache.org/jira/browse/HBASE-20741
> Project: HBase
> Issue Type: Bug
> Components: read replicas
> Affects Versions: 2.0.0, 3.0.0
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-20741.patch, HBASE-20741_1.patch
>
>
> Generally it is better for the daughter regions created by a split to be
> opened on the same target server as the parent region.
> But we do the same for replicas as well, so all the replica regions are
> created on the same target server. Ideally we should do a round robin, and
> only the primary daughter region should be opened on the intended target
> server (where the parent was previously opened).
> [~huaxiang] FYI.