[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465759#comment-16465759
 ] 

Zheng Hu commented on HBASE-20475:
----------------------------------

I checked the UT and the log again. The observed behavior is:
{code}
<!--- testEditsBehindDroppedTableTiming begin
1. add peer;
2. restart cluster to keep only one rs, and create table;
3. disable peer;
4. put a row in the test_dropped table;
5. put 1000 rows in the test table;
6. enable peer;
7. we expected that the last row (rowKey=999) would not exist in the peer cluster,
but the check failed...
<!--- testEditsBehindDroppedTableTiming end
{code}


I think the potential problems are:
1. In HBaseInterClusterReplicationEndpoint, we hash the encoded region name,
divide the entries into batches, and replicate the batches in order with one
thread. It's possible that the batches are grouped as:

{code}
batch-1 : [500,...,998, 999]
batch-2 : [row_in_test_dropped_table, 0, 1, 2, 3, ..., 499 ]
{code}

Batch-1 is replicated first, so row=999 gets replicated to the peer
cluster.

2. All UTs use the same rowkey range for the 1000 rows they put, so one UT may
affect another.
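
To illustrate point 1, here is a minimal standalone sketch of how grouping by a hash of the encoded region name can put later rows into an earlier batch. It is not the actual HBaseInterClusterReplicationEndpoint code: the Edit class, the region names "r1"/"r2"/"r3", the use of String.hashCode, and the assumption that the test table has two regions are all hypothetical and chosen only to reproduce the grouping shown above.

```java
import java.util.ArrayList;
import java.util.List;

final class BatchOrderSketch {

  /** Hypothetical stand-in for a WAL entry: encoded region name plus row key. */
  static final class Edit {
    final String encodedRegionName;
    final String row;

    Edit(String encodedRegionName, String row) {
      this.encodedRegionName = encodedRegionName;
      this.row = row;
    }

    @Override
    public String toString() {
      return row;
    }
  }

  /** Assigns each edit to a batch by hashing its encoded region name. */
  static List<List<Edit>> groupByRegion(List<Edit> edits, int numBatches) {
    List<List<Edit>> batches = new ArrayList<>();
    for (int i = 0; i < numBatches; i++) {
      batches.add(new ArrayList<>());
    }
    for (Edit e : edits) {
      // All edits of one region land in the same batch, in write order.
      int index = Math.floorMod(e.encodedRegionName.hashCode(), numBatches);
      batches.get(index).add(e);
    }
    return batches;
  }

  /** Write order of the test: one dropped-table row, then rows 0..n-1. */
  static List<Edit> writeOrder(int n) {
    List<Edit> edits = new ArrayList<>();
    edits.add(new Edit("r1", "row_in_test_dropped_table"));
    for (int i = 0; i < n; i++) {
      // Assumption: the test table has two regions, split at n / 2.
      edits.add(new Edit(i < n / 2 ? "r3" : "r2", String.valueOf(i)));
    }
    return edits;
  }

  public static void main(String[] args) {
    // Small n so the output stays readable; the UT uses 1000 rows.
    List<List<Edit>> batches = groupByRegion(writeOrder(6), 2);
    for (int i = 0; i < batches.size(); i++) {
      System.out.println("batch-" + (i + 1) + " : " + batches.get(i));
    }
  }
}
```

With these made-up region names, the upper half of the rows ends up in the first batch and the dropped-table row ends up at the head of the second batch, so a sender that walks the batches in index order ships the last row before it ever reaches the dropped-table edit.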

> Fix the flaky TestReplicationDroppedTables unit test.
> -----------------------------------------------------
>
>                 Key: HBASE-20475
>                 URL: https://issues.apache.org/jira/browse/HBASE-20475
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.1.0
>            Reporter: Zheng Hu
>            Assignee: Zheng Hu
>            Priority: Major
>             Fix For: 3.0.0, 2.1.0
>
>         Attachments: HBASE-20475-addendum-v2.patch, 
> HBASE-20475-addendum-v3.patch, HBASE-20475-addendum.patch, HBASE-20475.patch
>
>
> See 
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
