[ 
https://issues.apache.org/jira/browse/HBASE-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17019219#comment-17019219
 ] 

Bharath Vissapragada commented on HBASE-22472:
----------------------------------------------

[~zhangduo] I've started running into this again. I was looking at your PR and 
you mentioned
{quote}The problem here is that we should only start one region server for the 
restarted cluster...
{quote}
What is the reason for this? Shouldn't it be the original number of region 
servers, so that the recovered queues are evenly split at startup? Otherwise one 
region server gets all the recovered queues, and asserts like the following (in 
testReplicationStatusSourceStartedTargetStoppedNewOp) can fail. These 
tests with sleeps are very racy.
{quote}testReplicationStatusSourceStartedTargetStoppedNewOp()
 ......
 ClusterMetrics metrics = hbaseAdmin.getClusterMetrics(EnumSet.of(Option.LIVE_SERVERS));
 List<ReplicationLoadSource> loadSources =
   metrics.getLiveServerMetrics().get(serverName).getReplicationLoadSourceList();
 assertEquals(1, loadSources.size());
 assertEquals(1, loadSources.size());
{quote}
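
To illustrate the point about sleeps being racy: one common alternative is to poll for the expected state with a timeout, rather than sleeping for a fixed time and asserting once. The following is only a minimal plain-Java sketch of that idea. The class name PollUntil and its waitFor method are hypothetical helpers written for this illustration and are not taken from the PR or from the HBase test code.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not part of HBase): poll a condition instead of using a fixed sleep. */
public final class PollUntil {
  private PollUntil() {}

  /**
   * Checks the condition every intervalMs milliseconds until it holds
   * or timeoutMs milliseconds have elapsed.
   *
   * @return true if the condition became true before the timeout, false otherwise
   */
  public static boolean waitFor(long timeoutMs, long intervalMs, BooleanSupplier condition)
      throws InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      // Compare via subtraction so the check stays correct if nanoTime wraps around.
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      Thread.sleep(intervalMs);
    }
  }
}
```

In a test like the one quoted above, the condition could re-fetch the cluster metrics on each poll and check the size of the load source list, so the assertion only runs once the state has had a chance to settle. This removes the timing race, but it would not by itself address how recovered queues are distributed across region servers.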

> The newly split TestReplicationStatus* tests are flaky
> ------------------------------------------------------
>
>                 Key: HBASE-22472
>                 URL: https://issues.apache.org/jira/browse/HBASE-22472
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication, test
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Major
>             Fix For: 3.0.0, 2.3.0
>
>
> They are introduced by HBASE-22455, from the original TestReplicationStatus 
> tests. Need to dig more.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
