[
https://issues.apache.org/jira/browse/HBASE-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019219#comment-17019219
]
Bharath Vissapragada commented on HBASE-22472:
----------------------------------------------
[~zhangduo] I've started running into this again. I was looking at your PR, where
you mentioned:
{quote}The problem here is that we should only start one region server for the
restarted cluster...
{quote}
What is the reason for this? Shouldn't we start the original number of region
servers so that the recovered queues are split evenly at startup? Otherwise one
region server gets all the recovered queues, and there are asserts like the
following (in testReplicationStatusSourceStartedTargetStoppedNewOp). These
sleep-based tests are very racy.
{quote}testReplicationStatusSourceStartedTargetStoppedNewOp()
......
ClusterMetrics metrics = hbaseAdmin.getClusterMetrics(EnumSet.of(Option.LIVE_SERVERS));
List<ReplicationLoadSource> loadSources =
    metrics.getLiveServerMetrics().get(serverName).getReplicationLoadSourceList();
assertEquals(1, loadSources.size());
assertEquals(1, loadSources.size());
{quote}
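To illustrate the raciness point: an assertion like the one above is usually more stable when the test polls for the expected state with a timeout instead of sleeping for a fixed time and asserting once. Below is a minimal, self-contained sketch of that idea. The PollingWait class and its waitFor method are hypothetical names made up for this illustration (they are not the HBase test utilities), and the AtomicInteger is only a stand-in for loadSources.size().

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration only; not part of HBase.
public final class PollingWait {
  private PollingWait() {}

  /**
   * Re-evaluates the condition every intervalMs until it holds or timeoutMs
   * has elapsed. Returns true if the condition held before the deadline,
   * false on timeout or if the waiting thread was interrupted.
   */
  public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition) {
    final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        // Preserve the interrupt status for the caller and give up waiting.
        Thread.currentThread().interrupt();
        return false;
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for the number of replication load sources reported by a server.
    AtomicInteger sourceCount = new AtomicInteger(0);

    // Simulates the cluster reaching the expected state after some delay.
    Thread updater = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      sourceCount.set(1);
    });
    updater.start();

    // Poll for the state instead of sleeping a fixed time and asserting once.
    boolean reached = waitFor(5_000, 50, () -> sourceCount.get() == 1);
    updater.join();
    System.out.println("expected state reached: " + reached);
  }
}
```

In a real test the condition would re-fetch the cluster metrics on every poll, so the final assertEquals only runs once the expected number of sources has actually been observed.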
> The newly split TestReplicationStatus* tests are flaky
> ------------------------------------------------------
>
> Key: HBASE-22472
> URL: https://issues.apache.org/jira/browse/HBASE-22472
> Project: HBase
> Issue Type: Bug
> Components: Replication, test
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Major
> Fix For: 3.0.0, 2.3.0
>
>
> They were introduced by HBASE-22455, which split them out of the original
> TestReplicationStatus tests. Need to dig more.