[
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16193395#comment-16193395
]
huaxiang sun commented on HBASE-18946:
--------------------------------------
Thanks [~ram_krish] for the finding! I checked the code, and I think the cause
is that the replica regions are added to the list first, and all default
regions are then appended at the end of the list.
https://github.com/apache/hbase/blob/master/hbase-client/src/main/java/org/apache/hadoop/hbase/client/RegionReplicaUtil.java#L187
For example, take 3 RSes and two regions with 3 replicas each.
{code}
regionId-replicaId
0-1 0-2 1-1 1-2 0-0 1-0

R0   R1   R2
0-1  0-2  1-1
1-2  0-0  1-0
{code}
We can see that on R1 and R2, replicas of the same region are assigned to the
same RS.
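The layout above can be reproduced with a small standalone sketch. It assumes regions are dealt to servers strictly round-robin in list order, which is what the example implies but may not match what the balancer actually does in every case. The class and method names are hypothetical, and plain "regionId-replicaId" strings stand in for RegionInfo.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; not HBase code.
class ReplicaRoundRobinSketch {
  /** Deals entries to servers in list order: entry i goes to server i % numServers. */
  static List<List<String>> roundRobin(List<String> regions, int numServers) {
    List<List<String>> servers = new ArrayList<>();
    for (int s = 0; s < numServers; s++) {
      servers.add(new ArrayList<>());
    }
    for (int i = 0; i < regions.size(); i++) {
      servers.get(i % numServers).add(regions.get(i));
    }
    return servers;
  }

  /** Returns the indexes of servers hosting more than one replica of the same region. */
  static List<Integer> serversWithColocatedReplicas(List<List<String>> servers) {
    List<Integer> result = new ArrayList<>();
    for (int s = 0; s < servers.size(); s++) {
      Set<String> seenRegionIds = new HashSet<>();
      for (String entry : servers.get(s)) {
        // The label format is "regionId-replicaId"; keep only the regionId part.
        String regionId = entry.substring(0, entry.indexOf('-'));
        if (!seenRegionIds.add(regionId)) {
          result.add(s);
          break;
        }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    // Current ordering: new replicas first, default replicas at the end.
    List<String> order = Arrays.asList("0-1", "0-2", "1-1", "1-2", "0-0", "1-0");
    List<List<String>> servers = roundRobin(order, 3);
    System.out.println(servers);                               // [[0-1, 1-2], [0-2, 0-0], [1-1, 1-0]]
    System.out.println(serversWithColocatedReplicas(servers)); // [1, 2]
  }
}
```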
If the logic is changed a bit as follows, it should fix this issue. Other
places need to be checked as well.
{code}
public static List<RegionInfo> addReplicas(final TableDescriptor tableDescriptor,
    final List<RegionInfo> regions, int oldReplicaCount, int newReplicaCount) {
  if ((newReplicaCount - 1) <= 0) {
    return regions;
  }
  List<RegionInfo> hRegionInfos = new ArrayList<>((newReplicaCount) * regions.size());
  for (int i = 0; i < regions.size(); i++) {
    if (RegionReplicaUtil.isDefaultReplica(regions.get(i))) {
      // Add the default replica first so that its new replicas follow it directly.
      hRegionInfos.add(regions.get(i));
      // region level replica index starts from 0. So if oldReplicaCount was 2 then
      // the max replicaId for the existing regions would be 1
      for (int j = oldReplicaCount; j < newReplicaCount; j++) {
        hRegionInfos.add(RegionReplicaUtil.getRegionInfoForReplica(regions.get(i), j));
      }
    }
  }
  // hRegionInfos.addAll(regions);
  return hRegionInfos;
}
{code}
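To illustrate the effect of the proposed ordering, here is a minimal standalone sketch that builds the interleaved list for the same example (2 regions, going from 1 to 3 replicas) and deals it round-robin to 3 RSes. The class and method names are hypothetical, plain "regionId-replicaId" strings stand in for RegionInfo, and strict round-robin dealing is an assumption carried over from the example.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not HBase code.
class InterleavedReplicaOrderSketch {
  /** Builds labels with each default region immediately followed by its new replicas. */
  static List<String> interleavedOrder(int regionCount, int oldReplicaCount, int newReplicaCount) {
    List<String> order = new ArrayList<>(regionCount * newReplicaCount);
    for (int region = 0; region < regionCount; region++) {
      order.add(region + "-0"); // default replica first
      for (int replica = oldReplicaCount; replica < newReplicaCount; replica++) {
        order.add(region + "-" + replica);
      }
    }
    return order;
  }

  public static void main(String[] args) {
    List<String> order = interleavedOrder(2, 1, 3);
    System.out.println(order); // [0-0, 0-1, 0-2, 1-0, 1-1, 1-2]

    // Deal the list round-robin to 3 servers and print each server's regions.
    int numServers = 3;
    for (int s = 0; s < numServers; s++) {
      StringBuilder line = new StringBuilder("R" + s + ":");
      for (int i = s; i < order.size(); i += numServers) {
        line.append(' ').append(order.get(i));
      }
      System.out.println(line);
    }
  }
}
```

With this ordering, each RS in the example ends up with one replica of region 0 and one replica of region 1. This only holds when there are at least as many RSes as replicas.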
> Stochastic load balancer assigns replica regions to the same RS
> ---------------------------------------------------------------
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha-3
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: TestRegionReplicasWithRestartScenarios.java
>
>
> Trying out region replicas and their assignment, I can see that sometimes the
> default LB, the Stochastic load balancer, assigns replica regions to the same
> RS. This happens when we have 3 RSes checked in and a table with 3 replicas.
> When an RS goes down, replicas being assigned to the same RS is acceptable,
> but when we have enough RSes to spread them across, this behaviour is
> undesirable and defeats the purpose of replicas.
> [~huaxiang] and [~enis].