[
https://issues.apache.org/jira/browse/HBASE-18946?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16211449#comment-16211449
]
huaxiang sun commented on HBASE-18946:
--------------------------------------
Thanks [~ram_krish]. One possible slowdown with this approach: if
queueAll() queues more than assignDispatchWaitQueueMaxSize regions, the
current logic still waits before dispatching them, please see
https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java#L1639.
Previously, queueing the first region started a wait of
assignDispatchWaitMillis before the real work began. With the patch, the whole
batch is added at once, so the addFirstOne logic is skipped. I think
waitOnAssignQueue() can be changed to avoid the wait in this case.
{code}
private HashMap<RegionInfo, RegionStateNode> waitOnAssignQueue() {
  HashMap<RegionInfo, RegionStateNode> regions = null;
  assignQueueLock.lock();
  try {
    if (pendingAssignQueue.isEmpty() && isRunning()) {
      assignQueueFullCond.await();
    }
    if (!isRunning()) return null;
+   if (pendingAssignQueue.size() < assignDispatchWaitQueueMaxSize) {
+     assignQueueFullCond.await(assignDispatchWaitMillis, TimeUnit.MILLISECONDS);
+   }
-   assignQueueFullCond.await(assignDispatchWaitMillis, TimeUnit.MILLISECONDS);
    regions = new HashMap<RegionInfo, RegionStateNode>(pendingAssignQueue.size());
    for (RegionStateNode regionNode: pendingAssignQueue) {
      regions.put(regionNode.getRegionInfo(), regionNode);
    }
    pendingAssignQueue.clear();
  } catch (InterruptedException e) {
    LOG.warn("got interrupted ", e);
    Thread.currentThread().interrupt();
  } finally {
    assignQueueLock.unlock();
  }
  return regions;
}
{code}
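To make the effect of the added size check easier to try in isolation, here is a minimal standalone sketch. It is not the AssignmentManager code: the class BatchQueueSketch, its String "regions", and its field names are hypothetical stand-ins, and it assumes queueAll() adds the whole batch under the lock and signals once. Unlike the method above, it drops the isRunning() check and uses a while loop around the first await to guard against spurious wakeups.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for the pending-assign queue; not HBase code.
class BatchQueueSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition fullCond = lock.newCondition();
  private final List<String> pending = new ArrayList<>();
  private final int maxSize;      // plays the role of assignDispatchWaitQueueMaxSize
  private final long waitMillis;  // plays the role of assignDispatchWaitMillis

  BatchQueueSketch(int maxSize, long waitMillis) {
    this.maxSize = maxSize;
    this.waitMillis = waitMillis;
  }

  // Adds a whole batch at once (assumed behaviour of queueAll() in the patch).
  void queueAll(List<String> items) {
    lock.lock();
    try {
      pending.addAll(items);
      fullCond.signal();
    } finally {
      lock.unlock();
    }
  }

  // Drains the queue; returns null if the waiting thread is interrupted.
  List<String> waitOnQueue() {
    lock.lock();
    try {
      while (pending.isEmpty()) {
        fullCond.await();
      }
      // Only wait for more items when the batch is still below the threshold.
      if (pending.size() < maxSize) {
        fullCond.await(waitMillis, TimeUnit.MILLISECONDS);
      }
      List<String> batch = new ArrayList<>(pending);
      pending.clear();
      return batch;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return null;
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) {
    BatchQueueSketch q = new BatchQueueSketch(3, 5000);
    q.queueAll(Arrays.asList("r1", "r2", "r3", "r4"));
    long start = System.nanoTime();
    List<String> batch = q.waitOnQueue();
    long elapsedMs = (System.nanoTime() - start) / 1_000_000;
    System.out.println(batch.size() + " regions drained in " + elapsedMs + " ms");
  }
}
```

With four items queued against a threshold of three, the drain should return without sitting through the 5000 ms timed wait. A batch smaller than the threshold still goes through the timed wait, as in the proposed change.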
> Stochastic load balancer assigns replica regions to the same RS
> ---------------------------------------------------------------
>
> Key: HBASE-18946
> URL: https://issues.apache.org/jira/browse/HBASE-18946
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0-alpha-3
> Reporter: ramkrishna.s.vasudevan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0-beta-1
>
> Attachments: HBASE-18946.patch, HBASE-18946.patch,
> TestRegionReplicasWithRestartScenarios.java
>
>
> While trying out region replicas and their assignment, I can see that
> sometimes the default LB (Stochastic load balancer) assigns replica regions
> to the same RS. This happens when we have 3 RSs checked in and a table with 3
> replicas. When an RS goes down, assigning replicas to the same RS is
> acceptable, but when we have enough RSs to assign to, this behaviour is
> undesirable and defeats the purpose of replicas.
> [~huaxiang] and [~enis].