[
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16465759#comment-16465759
]
Zheng Hu commented on HBASE-20475:
----------------------------------
I checked the UT and the log again. The observed behavior is:
{code}
<!--- testEditsBehindDroppedTableTiming begin
1. add peer
2. restart cluster to keep only one rs, and create table;
3. disable peer;
4. put a row in the test_dropped table;
5. put 1000 rows in the test table;
6. enable peer;
7. we expected that the last row (rowKey=999) would not exist in the peer cluster,
but the check failed...
<!--- testEditsBehindDroppedTableTiming end
{code}
I think the potential problems are:
1. In HBaseInterClusterReplicationEndpoint, we hash the encoded region name,
divide the entries into batches, and replicate the batches in order with one
thread. It is possible that the batches are grouped as:
{code}
batch-1 : [500,...,998, 999]
batch-2 : [row_in_test_dropped_table, 0, 1, 2, 3, ..., 499 ]
{code}
batch-1 is replicated first, so row=999 reaches the peer cluster even though
the edit for the dropped table is in batch-2.
2. All UTs use the same rowkey range for the 1000 rows they put, so one UT may
affect another.
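
The grouping in point 1 can be reproduced with a small standalone Java sketch.
This is not the HBaseInterClusterReplicationEndpoint code. The Edit class, the
single-letter region names "a", "b" and "c", the two-batch setup, and the
partition/ship helpers are all assumptions made for illustration, and
String.hashCode() stands in for whatever hash the endpoint applies to the
encoded region name. The sketch only shows that hashing by region can place
rows 500..999 in a batch that ships before the batch holding the dropped-table
edit.

```java
import java.util.ArrayList;
import java.util.List;

public class BatchOrderSketch {
    /** Hypothetical stand-in for a WAL entry: region, table, and row key only. */
    public static final class Edit {
        final String region;
        final String table;
        final String row;

        Edit(String region, String table, String row) {
            this.region = region;
            this.table = table;
            this.row = row;
        }
    }

    /** Assign each edit to a batch by hashing its region name (assumed scheme). */
    public static List<List<Edit>> partition(List<Edit> edits, int n) {
        List<List<Edit>> batches = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            batches.add(new ArrayList<>());
        }
        for (Edit e : edits) {
            batches.get(Math.floorMod(e.region.hashCode(), n)).add(e);
        }
        return batches;
    }

    /** Ship batches in index order; stop at the first batch touching the dropped table. */
    public static List<String> ship(List<List<Edit>> batches, String droppedTable) {
        List<String> delivered = new ArrayList<>();
        for (List<Edit> batch : batches) {
            for (Edit e : batch) {
                if (e.table.equals(droppedTable)) {
                    return delivered; // replication is stuck on this batch
                }
            }
            for (Edit e : batch) {
                delivered.add(e.row);
            }
        }
        return delivered;
    }

    /** WAL order from the test: one dropped-table row, then rows 0..999. */
    public static List<Edit> scenario() {
        List<Edit> edits = new ArrayList<>();
        edits.add(new Edit("a", "test_dropped", "row_in_test_dropped_table"));
        for (int i = 0; i < 1000; i++) {
            // Assumed split: rows 0..499 live in region "c", rows 500..999 in region "b".
            edits.add(new Edit(i < 500 ? "c" : "b", "test", String.valueOf(i)));
        }
        return edits;
    }

    public static void main(String[] args) {
        List<String> delivered = ship(partition(scenario(), 2), "test_dropped");
        System.out.println("rows delivered: " + delivered.size());
        System.out.println("row 999 delivered: " + delivered.contains("999"));
    }
}
```

With these assumed region names, region "b" hashes to the first batch and
regions "a" and "c" hash to the second, so rows 500..999 are delivered before
the dropped-table edit is reached. That matches the failed expectation in
step 7.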
> Fix the flaky TestReplicationDroppedTables unit test.
> -----------------------------------------------------
>
> Key: HBASE-20475
> URL: https://issues.apache.org/jira/browse/HBASE-20475
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.1.0
> Reporter: Zheng Hu
> Assignee: Zheng Hu
> Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20475-addendum-v2.patch,
> HBASE-20475-addendum-v3.patch, HBASE-20475-addendum.patch, HBASE-20475.patch
>
>
> See
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)