[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469838#comment-16469838 ] Zheng Hu commented on HBASE-20475: -- Sure, filed HBASE-20560 for it . > Fix the flaky TestReplicationDroppedTables unit test. > - > > Key: HBASE-20475 > URL: https://issues.apache.org/jira/browse/HBASE-20475 > Project: HBase > Issue Type: Bug >Affects Versions: 2.1.0 >Reporter: Zheng Hu >Assignee: Zheng Hu >Priority: Major > Fix For: 3.0.0, 2.1.0 > > Attachments: > 0001-HBASE-20475-TestReplicationDroppedTables-refactor.patch, > HBASE-20475-addendum-v2.patch, HBASE-20475-addendum-v3.patch, > HBASE-20475-addendum.patch, HBASE-20475.patch > > > See > https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468832#comment-16468832 ] Duo Zhang commented on HBASE-20475: --- The patch is a bit big as an addendum, let's open a new issue for it? So we can use the review board...
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468727#comment-16468727 ] Zheng Hu commented on HBASE-20475: -- Ping [~Apache9]
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467225#comment-16467225 ] Zheng Hu commented on HBASE-20475: -- Let me explain the changes in the UT refactor:
1. Let TestReplicationDroppedTables#setUpBase override TestReplicationBase#setUpBase. When I read the log, I found that in testEditsBehindDroppedTableTiming some WALs were being replayed that had been stuck in the previous UT, testEditsDroppedWithDroppedTable. In the original version, TestReplicationBase#setUpBase ran first and TestReplicationDroppedTables#setUp ran after it, so the peer was created before the WAL was rolled, and the newly created peer in testEditsBehindDroppedTableTiming would still pick up the WALs from the previous UT.
2. In those UTs we shut down the source mini cluster to keep only one RS, but the following operations were still using the htable1 & admin that had been initialized for the previous mini cluster, so I fixed those.
3. verifyReplicationProceeded & verifyReplicationStuck should not only check the lastRowkey, as I said above, so I fixed those too.
4. Some minor changes, e.g. we used the deprecated HTableDescriptor; I changed those usages to the builder.
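The ordering problem described in point 1 can be illustrated with a toy model. This is only a sketch, not HBase code: `WalOrderingSketch` and its methods are hypothetical names, and it assumes (as a simplification of the explanation above) that a newly added peer replays whatever WAL file is current at the moment the peer is created.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model, not HBase code: a "WAL file" is a list of edits, and a peer replays
// everything in the WAL file that was current when the peer was created.
class WalOrderingSketch {
    private List<String> currentWal = new ArrayList<>();

    void append(String edit) {
        currentWal.add(edit);
    }

    // Rolling starts a fresh, empty WAL file; old edits stay in the old file.
    void rollWal() {
        currentWal = new ArrayList<>();
    }

    // The returned list is the WAL file the new peer will replay.
    List<String> addPeer() {
        return currentWal;
    }

    // Peer first, roll second: the peer still holds the old file with leftover edits.
    static List<String> peerThenRoll() {
        WalOrderingSketch cluster = new WalOrderingSketch();
        cluster.append("edit-from-previous-test");
        List<String> peerQueue = cluster.addPeer();
        cluster.rollWal();
        return peerQueue;
    }

    // Roll first, peer second: the peer starts from an empty file.
    static List<String> rollThenPeer() {
        WalOrderingSketch cluster = new WalOrderingSketch();
        cluster.append("edit-from-previous-test");
        cluster.rollWal();
        return cluster.addPeer();
    }
}
```

In this model `peerThenRoll()` leaves the stale edit in the peer's queue while `rollThenPeer()` does not, which is the behaviour the overridden setUpBase is meant to get.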
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465790#comment-16465790 ] Zheng Hu commented on HBASE-20475: -- I guess we were stopping a ReplicationSourceShipper that had not finished its initialization, so the NPE happened here: {code} worker.entryReader.interrupt(); {code}
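The suspected race can be reproduced in miniature with plain Java. The sketch below is an assumption-laden stand-in, not the HBase implementation: `ShipperSketch`, `startReader`, `stopUnsafe` and `stopSafe` are hypothetical names, and the null check is just one possible guard, not necessarily the fix that was eventually committed.

```java
// Hypothetical stand-in for a shipper whose reader thread is created lazily.
class ShipperSketch {
    // Stays null until startReader() has run, mirroring a half-initialized worker.
    private volatile Thread entryReader;

    void startReader() {
        Thread reader = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // pretend to read WAL entries
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // exit quietly when interrupted
            }
        });
        reader.setDaemon(true);
        reader.start();
        entryReader = reader;
    }

    // Unguarded stop: throws NullPointerException if called before startReader().
    void stopUnsafe() {
        entryReader.interrupt();
    }

    // Guarded stop: returns false when there was nothing to interrupt yet.
    boolean stopSafe() {
        Thread reader = entryReader; // read the volatile field once
        if (reader == null) {
            return false;
        }
        reader.interrupt();
        return true;
    }
}
```

Calling `stopUnsafe()` on a fresh `ShipperSketch` fails in the same way as the one-liner quoted above, while `stopSafe()` simply reports that no reader existed yet.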
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465766#comment-16465766 ] Duo Zhang commented on HBASE-20475: --- What about the NPE posted by me above? Is there a race?
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465759#comment-16465759 ] Zheng Hu commented on HBASE-20475: -- Checked the UT & log again, the phenomenon is: {code}
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463684#comment-16463684 ] Zheng Hu commented on HBASE-20475: -- Filed HBASE-20531 to address the above NPE.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463322#comment-16463322 ] Zheng Hu commented on HBASE-20475: -- Oh, the Hadoop QA results (https://builds.apache.org/job/HBASE-Flaky-Tests/30342/) have expired, and the flaky results in http://104.198.223.121:8080/job/HBASE-Flaky-Tests/ were caused by another NPE:
{code}
2018-05-03 17:05:59,008 ERROR [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=58476] master.MasterRpcServices(508): Region server instance-2.c.gcp-hbase.internal,52125,1525367143898 reported a fatal error:
* ABORTING region server instance-2.c.gcp-hbase.internal,52125,1525367143898: Unrecoverable exception while closing region hbase:meta,,1.1588230740, still finishing close *
Cause: java.io.IOException: java.lang.NullPointerException
  at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1637)
  at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1466)
  at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
  at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
  at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
  at org.apache.hadoop.hbase.regionserver.HRegionServer.reportFileArchivalForQuotas(HRegionServer.java:3709)
  at org.apache.hadoop.hbase.regionserver.HStore.reportArchivedFilesForQuota(HStore.java:2718)
  at org.apache.hadoop.hbase.regionserver.HStore.removeCompactedfiles(HStore.java:2649)
  at org.apache.hadoop.hbase.regionserver.HStore.close(HStore.java:929)
  at org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1615)
  at org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1612)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
  ... 3 more
{code}
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463285#comment-16463285 ] Zheng Hu commented on HBASE-20475: -- Let me dig into the NPE.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457702#comment-16457702 ] Hudson commented on HBASE-20475: Results for branch branch-2 [build #670 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/670/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/670//General_Nightly_Build_Report/] (/) {color:green}+1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/670//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/670//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457593#comment-16457593 ] Hudson commented on HBASE-20475: Results for branch master [build #314 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/314/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/314//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/314//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/314//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457584#comment-16457584 ] Duo Zhang commented on HBASE-20475: --- I think there is a race... Need to dig more, our locking still has problems...
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457582#comment-16457582 ] Duo Zhang commented on HBASE-20475: ---
{noformat}
2018-04-28 11:21:41,825 WARN [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=33073] replication.RefreshPeerProcedure(131): Refresh peer 2 for DISABLE on asf915.gq1.ygridcore.net,38380,1524914179604 failed
java.lang.NullPointerException via asf915.gq1.ygridcore.net,38380,1524914179604:java.lang.NullPointerException:
  at org.apache.hadoop.hbase.procedure2.RemoteProcedureException.fromProto(RemoteProcedureException.java:120)
  at org.apache.hadoop.hbase.master.MasterRpcServices.lambda$reportProcedureDone$4(MasterRpcServices.java:2248)
  at java.util.ArrayList.forEach(ArrayList.java:1257)
  at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
  at org.apache.hadoop.hbase.master.MasterRpcServices.reportProcedureDone(MasterRpcServices.java:2243)
  at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:15180)
  at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
  at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
  at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: java.lang.NullPointerException:
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.terminate(ReplicationSource.java:501)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.terminate(ReplicationSource.java:480)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.terminate(ReplicationSource.java:475)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.refreshSources(ReplicationSourceManager.java:397)
  at org.apache.hadoop.hbase.replication.regionserver.PeerProcedureHandlerImpl.refreshPeerState(PeerProcedureHandlerImpl.java:78)
  at org.apache.hadoop.hbase.replication.regionserver.PeerProcedureHandlerImpl.disablePeer(PeerProcedureHandlerImpl.java:97)
  at org.apache.hadoop.hbase.replication.regionserver.RefreshPeerCallable.call(RefreshPeerCallable.java:65)
  at org.apache.hadoop.hbase.replication.regionserver.RefreshPeerCallable.call(RefreshPeerCallable.java:34)
  at org.apache.hadoop.hbase.regionserver.handler.RSProcedureHandler.process(RSProcedureHandler.java:47)
  at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
  at java.lang.Thread.run(Thread.java:748)
{noformat}
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457512#comment-16457512 ] Duo Zhang commented on HBASE-20475: --- [~openinx] is probably on the train home. Let me help with committing...
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457456#comment-16457456 ] Duo Zhang commented on HBASE-20475: --- It works. Please merge the patches and commit it to branch-2. Thanks.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457281#comment-16457281 ] Zheng Hu commented on HBASE-20475: -- TestReplicationDroppedTables is more stable now; let's still wait a few days to see whether it becomes flaky again.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456682#comment-16456682 ] Hadoop QA commented on HBASE-20475: ---
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 24s | master passed |
| +1 | compile | 1m 51s | master passed |
| +1 | checkstyle | 1m 12s | master passed |
| +1 | shadedjars | 4m 20s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 3s | master passed |
| +1 | javadoc | 0m 40s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 15s | the patch passed |
| +1 | compile | 1m 50s | the patch passed |
| +1 | javac | 1m 50s | the patch passed |
| +1 | checkstyle | 1m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 18s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 13m 17s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 22s | the patch passed |
| +1 | javadoc | 0m 35s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 41s | hbase-replication in the patch passed. |
| +1 | unit | 163m 8s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 40s | The patch does not generate ASF License warnings. |
| | | 208m 34s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20475 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12921007/HBASE-20475-addendum-v3.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux b5e52e8f01e0 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 39cf42be9a |
| maven |
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456444#comment-16456444 ] Zheng Hu commented on HBASE-20475: -- Thanks [~Apache9] for reviewing. Pushed the addendum to the master branch; now waiting for the flaky dashboard...
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456321#comment-16456321 ] Duo Zhang commented on HBASE-20475: ---
{code}
peersSnapshot = new HashMap<>();
replicationPeers.getPeerCache().forEach(peersSnapshot::put);
{code}
I think HashMap has a constructor which takes a Map?
{code}
LOG.warn("Skipping failover for peer:" + actualPeerId + " of node " + deadRS + ", peer is null");
{code}
Use '{}' to rewrite the log.

Overall LGTM. +1. You can commit and leave the issue open for a while to see if it really works.
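The first review remark can be checked with a short stand-alone sketch. `PeerSnapshotSketch` and its method names are hypothetical, and a plain `Map<String, String>` stands in for the peer cache, whose real value type is not shown in this thread; the only library fact relied on is that `java.util.HashMap` has a constructor taking a `Map`.

```java
import java.util.HashMap;
import java.util.Map;

class PeerSnapshotSketch {
    // Copies the cache entry by entry, as in the snippet under review.
    static Map<String, String> snapshotWithForEach(Map<String, String> peerCache) {
        Map<String, String> peersSnapshot = new HashMap<>();
        peerCache.forEach(peersSnapshot::put);
        return peersSnapshot;
    }

    // Same result via the HashMap(Map) copy constructor.
    static Map<String, String> snapshotWithCopyConstructor(Map<String, String> peerCache) {
        return new HashMap<>(peerCache);
    }
}
```

Both methods return an independent shallow copy, so later changes to the cache do not show up in the snapshot; the copy constructor just says so in one line.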
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456255#comment-16456255 ] Hadoop QA commented on HBASE-20475: ---
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 27s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 25s | master passed |
| +1 | compile | 1m 48s | master passed |
| +1 | checkstyle | 1m 18s | master passed |
| +1 | shadedjars | 4m 21s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 11s | master passed |
| +1 | javadoc | 0m 37s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 37s | the patch passed |
| +1 | compile | 1m 55s | the patch passed |
| +1 | javac | 1m 55s | the patch passed |
| +1 | checkstyle | 1m 16s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 36s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 14m 48s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 56s | the patch passed |
| +1 | javadoc | 0m 39s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 25s | hbase-replication in the patch passed. |
| +1 | unit | 110m 11s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 31s | The patch does not generate ASF License warnings. |
| | | 158m 11s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20475 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920972/HBASE-20475-addendum-v2.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 4d433459f0e3 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 39cf42be9a |
| maven |
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454297#comment-16454297 ] Zheng Hu commented on HBASE-20475: -- Found why the test became flaky again. The failure happens randomly; see the following log:
{code}
2018-04-25 18:08:24,432 INFO [regionserver/asf915:0.logRoller] wal.AbstractFSWAL(671): Rolled WAL /user/jenkins/test-data/be2e424e-e2f6-4ac1-91f1-d33621a46da3/WALs/asf915.gq1.ygridcore.net,42474,1524679689830/asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679692092 with entries=13, filesize=3.55 KB; new WAL /user/jenkins/test-data/be2e424e-e2f6-4ac1-91f1-d33621a46da3/WALs/asf915.gq1.ygridcore.net,42474,1524679689830/asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418
..
2018-04-25 18:08:36,957 INFO [ReplicationExecutor-0] replication.ZKReplicationQueueStorage(387): Atomically moving asf915.gq1.ygridcore.net,42474,1524679689830/2's WALs to asf915.gq1.ygridcore.net,34222,1524679706894
2018-04-25 18:08:36,957 DEBUG [RpcServer.replication.FPBQ.Fifo.handler=2,queue=0,port=36265] ipc.AbstractRpcClient(200): Codec=org.apache.hadoop.hbase.codec.KeyValueCodecWithTags@396b2d58, compressor=null, tcpKeepAlive=true, tcpNoDelay=true, connectTO=1, readTO=2, writeTO=6, minIdleTimeBeforeClose=12, maxRetries=0, fallbackAllowed=true, bind address=null
2018-04-25 18:08:36,959 DEBUG [ReplicationExecutor-0] replication.ZKReplicationQueueStorage(414): Creating asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418 with data PBUF\x08\x9F\x1A
2018-04-25 18:08:36,961 INFO [ReplicationExecutor-0] replication.ZKReplicationQueueStorage(426): Atomically moved asf915.gq1.ygridcore.net,42474,1524679689830/2's WALs to asf915.gq1.ygridcore.net,34222,1524679706894
{code}
Step.1 "Atomically moving" means that we begin to claim the queue (claimQueue) from the dead RS to the destination RS.
Step.2 At 2018-04-25 18:08:24,432 the RS started to roll the WAL, and at 2018-04-25 18:08:36,959 it created the WAL.
Step.3 "Atomically moved" means that the NodeFailoverWorker finished claiming the queue from the dead RS to the destination RS, but it excluded the WAL asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418, because that WAL had no znode yet.
So when our RecoveredReplicationSourceShipper tried to ship the edits and call setWALPosition, it found that the znode did not exist, and the RS finally crashed. (It's strange here: the new RS had already been transferring the queue, but the dead RS was still creating the new WAL...)
{code}
2018-04-25 18:08:39,107 DEBUG [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830.replicationSource.wal-reader.asf915.gq1.ygridcore.net%2C42474%2C1524679689830,2-asf915.gq1.ygridcore.net,42474,1524679689830] regionserver.WALEntryStream(250): Reached the end of log hdfs://localhost:43322/user/jenkins/test-data/be2e424e-e2f6-4ac1-91f1-d33621a46da3/oldWALs/asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418
2018-04-25 18:08:39,109 DEBUG [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830.replicationSource.shipperasf915.gq1.ygridcore.net%2C42474%2C1524679689830,2-asf915.gq1.ygridcore.net,42474,1524679689830] replication.ReplicationQueueInfo(110): Found dead servers:[asf915.gq1.ygridcore.net,42474,1524679689830]
2018-04-25 18:08:39,118 ERROR [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830.replicationSource.shipperasf915.gq1.ygridcore.net%2C42474%2C1524679689830,2-asf915.gq1.ygridcore.net,42474,1524679689830] helpers.MarkerIgnoringBase(159): * ABORTING region server asf915.gq1.ygridcore.net,34222,1524679706894: Failed to operate on replication queue *
org.apache.hadoop.hbase.replication.ReplicationException: Failed to set log position (serverName=asf915.gq1.ygridcore.net,34222,1524679706894, queueId=2-asf915.gq1.ygridcore.net,42474,1524679689830, fileName=asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418, position=3632)
  at org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.setWALPosition(ZKReplicationQueueStorage.java:256)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$logPositionAndCleanOldLogs$7(ReplicationSourceManager.java:488)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:455)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:488)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:231)
  at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:133)
  at
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454240#comment-16454240 ] Duo Zhang commented on HBASE-20475: --- Please open an issue to fix it...
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454213#comment-16454213 ] Zheng Hu commented on HBASE-20475: -- I found the bug described in my previous comment because I noticed this log:
{code}
2018-04-25 18:08:39,091 DEBUG [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830] zookeeper.ZKUtil(614): regionserver:34222-0x162fdfda59e0021, quorum=localhost:60019, baseZNode=/1 Unable to get data of znode /1/replication/rs/asf915.gq1.ygridcore.net,42474,1524679689830/2-asf915.gq1.ygridcore.net,42474,1524679689830/asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418 because node does not exist (not an error)
2018-04-25 18:08:39,091 WARN [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830] replication.ZKReplicationQueueStorage(377): Failed parse log position (serverName=asf915.gq1.ygridcore.net,42474,1524679689830, queueId=2-asf915.gq1.ygridcore.net,42474,1524679689830, fileName=asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418)
{code}
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454202#comment-16454202 ] Zheng Hu commented on HBASE-20475: -- Found an unrelated bug in RecoveredReplicationSourceShipper#getRecoveredQueueStartPos():
{code}
private long getRecoveredQueueStartPos() {
  long startPosition = 0;
  String peerClusterZnode = source.getQueueId();
  try {
    startPosition = this.replicationQueues.getWALPosition(source.getServerWALsBelongTo(),
      peerClusterZnode, this.queue.peek().getName());
    if (LOG.isTraceEnabled()) {
      LOG.trace("Recovered queue started with log " + this.queue.peek() + " at position " +
        startPosition);
    }
  } catch (ReplicationException e) {
    terminate("Couldn't get the position of this recovered queue " + peerClusterZnode, e);
  }
  return startPosition;
}
{code}
By the time the RecoveredReplicationSourceShipper starts running, all WALs of the dead server have already been moved into the new RS's queue, so the following call always returns -1, because the path no longer exists:
{code}
startPosition = this.replicationQueues.getWALPosition(source.getServerWALsBelongTo(),
  peerClusterZnode, this.queue.peek().getName());
{code}
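The lookup problem in the comment above can be illustrated with a minimal sketch that models the queue storage as an in-memory map. Everything here is an assumption made for illustration: the class RecoveredQueuePositionSketch, its methods, and the key layout server/queueId/wal are hypothetical stand-ins, not the HBase API. The only behaviour taken from the thread is that a lookup on a path that no longer exists yields -1, and that a claimed queue is named peerId-deadServer.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical in-memory model of per-WAL replication positions (not HBase code). */
public class RecoveredQueuePositionSketch {
  // Key layout "server/queueId/wal" is an assumption that mimics a znode path.
  private final Map<String, Long> positions = new HashMap<>();

  private static String key(String server, String queueId, String wal) {
    return server + "/" + queueId + "/" + wal;
  }

  public void setPosition(String server, String queueId, String wal, long pos) {
    positions.put(key(server, queueId, wal), pos);
  }

  /** Moves all WAL positions of one queue from the dead server to the claiming server. */
  public void claimQueue(String deadServer, String queueId, String newServer) {
    String oldPrefix = deadServer + "/" + queueId + "/";
    String newQueueId = queueId + "-" + deadServer;
    Map<String, Long> moved = new HashMap<>();
    positions.entrySet().removeIf(e -> {
      if (!e.getKey().startsWith(oldPrefix)) {
        return false;
      }
      String wal = e.getKey().substring(oldPrefix.length());
      moved.put(key(newServer, newQueueId, wal), e.getValue());
      return true; // the old path is gone after the move
    });
    positions.putAll(moved);
  }

  /** Returns the stored position, or -1 when the path does not exist. */
  public long getPosition(String server, String queueId, String wal) {
    return positions.getOrDefault(key(server, queueId, wal), -1L);
  }

  public static void main(String[] args) {
    RecoveredQueuePositionSketch storage = new RecoveredQueuePositionSketch();
    storage.setPosition("deadRS", "2", "wal.1", 3632L);
    storage.claimQueue("deadRS", "2", "newRS");
    // Looking under the dead server (the owner of the WALs) finds nothing.
    System.out.println(storage.getPosition("deadRS", "2-deadRS", "wal.1"));
    // Looking under the server that now owns the queue finds the position.
    System.out.println(storage.getPosition("newRS", "2-deadRS", "wal.1"));
  }
}
```

In this model the start position is only found when the lookup uses the server that currently owns the queue. Whether the real fix should pass the claiming server name is a question for the follow-up issue, not something this sketch settles.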
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453788#comment-16453788 ] Hadoop QA commented on HBASE-20475: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 20s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 29s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 34s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 12s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 44s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 27s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 13s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 33s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 15s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 13m 7s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}149m 5s{color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 23s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black}190m 9s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f | | JIRA Issue | HBASE-20475 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920738/HBASE-20475-addendum.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux 3e67c23601c4 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 8a30acf46f | | maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC3 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/12660/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12660/testReport/ | | Max. process+thread count | 4309 (vs. ulimit of 1) | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12660/console | | Powered by | Apache Yetus
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453579#comment-16453579 ] Zheng Hu commented on HBASE-20475: -- I can think of two possible causes: 1. The host asf915.gq1.ygridcore.net had some problem when running this UT, although other UTs worked fine there, which is strange. 2. After the patch, we put 1000 rows into the test table and scan them without setCaching, so the scan can easily time out.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453565#comment-16453565 ] Zheng Hu commented on HBASE-20475: -- Checked the console log (https://builds.apache.org/job/HBASE-Flaky-Tests/30156/consoleFull). There is a different problem now: all errors come from the setUp() method.
{code}
[ERROR] Tests run: 4, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 117.161 s <<< FAILURE! - in org.apache.hadoop.hbase.replication.TestReplicationDroppedTables
[ERROR] testEditsStuckBehindDroppedTable(org.apache.hadoop.hbase.replication.TestReplicationDroppedTables) Time elapsed: 62.122 s <<< ERROR!
java.io.IOException: Failed to get result within timeout, timeout=6ms
  at org.apache.hadoop.hbase.replication.TestReplicationDroppedTables.setUp(TestReplicationDroppedTables.java:67)
[ERROR] testEditsBehindDroppedTableTiming(org.apache.hadoop.hbase.replication.TestReplicationDroppedTables) Time elapsed: 1.292 s <<< ERROR!
java.net.ConnectException: Call to asf915.gq1.ygridcore.net/67.195.81.159:34222 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: asf915.gq1.ygridcore.net/67.195.81.159:34222
Caused by: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: asf915.gq1.ygridcore.net/67.195.81.159:34222
Caused by: java.net.ConnectException: Connection refused
[ERROR] testEditsDroppedWithDroppedTable(org.apache.hadoop.hbase.replication.TestReplicationDroppedTables) Time elapsed: 1.292 s <<< ERROR!
java.io.IOException: Call to asf915.gq1.ygridcore.net/67.195.81.159:34222 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: asf915.gq1.ygridcore.net/67.195.81.159:34222
  at org.apache.hadoop.hbase.replication.TestReplicationDroppedTables.setUp(TestReplicationDroppedTables.java:65)
Caused by: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: asf915.gq1.ygridcore.net/67.195.81.159:34222
  at org.apache.hadoop.hbase.replication.TestReplicationDroppedTables.setUp(TestReplicationDroppedTables.java:65)
{code}
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453377#comment-16453377 ] Duo Zhang commented on HBASE-20475: --- See http://104.198.223.121:8080/job/HBASE-Flaky-Tests/36000/
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452183#comment-16452183 ] Hudson commented on HBASE-20475: Results for branch master [build #311 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/311/]: (x) *{color:red}-1 overall{color}* details (if available): (/) {color:green}+1 general checks{color} -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/311//General_Nightly_Build_Report/] (x) {color:red}-1 jdk8 hadoop2 checks{color} -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/311//JDK8_Nightly_Build_Report_(Hadoop2)/] (x) {color:red}-1 jdk8 hadoop3 checks{color} -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/311//JDK8_Nightly_Build_Report_(Hadoop3)/] (/) {color:green}+1 source release artifact{color} -- See build output for details.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451869#comment-16451869 ] Zheng Hu commented on HBASE-20475: -- Checked branch-2; the test is flaky there too. If the UT stops being flaky on the master branch, I'll port the patch to branch-2. [~stack] boss, FYI.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451853#comment-16451853 ] Zheng Hu commented on HBASE-20475: -- Thanks [~Apache9] for checking. Pushed to the master branch now. I'm not sure whether the test was broken in branch-2, but I think we should port this patch to branch-2. So let's keep this issue open and see what the flaky dashboard says.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451806#comment-16451806 ] Duo Zhang commented on HBASE-20475: --- Checked locally, it works for me. [~openinx] Please commit to master first and wait some time to see what the flakey dashboard says. If no problem then commit to branch-2. The pre commit will not run the test... Thanks.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451731#comment-16451731 ] Hadoop QA commented on HBASE-20475: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s{color} | {color:blue} Docker mode activated. {color} | || || || || {color:brown} Prechecks {color} || | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s{color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s{color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} | || || || || {color:brown} master Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 48s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 46s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 12s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 56s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. 
{color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 1s{color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s{color} | {color:green} master passed {color} | || || || || {color:brown} Patch Compile Tests {color} || | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 55s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 45s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 11s{color} | {color:green} hbase-server: The patch generated 0 new + 0 unchanged - 1 fixed = 0 total (was 1) {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s{color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 4m 54s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 15m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 57s{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 29s{color} | {color:green} the patch passed {color} | || || || || {color:brown} Other Tests {color} || | {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 50s{color} | {color:red} hbase-server in the patch failed. 
{color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 19s{color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black}158m 30s{color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f | | JIRA Issue | HBASE-20475 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920570/HBASE-20475.patch | | Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile | | uname | Linux e991b241f0ea 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh | | git revision | master / a8be3bb814 | | maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) | | Default Java | 1.8.0_162 | | findbugs | v3.1.0-RC3 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/12622/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12622/testReport/ | | Max. process+thread count | 4322 (vs. ulimit of 1) | | modules | C: hbase-server U: hbase-server | | Console output |
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449840#comment-16449840 ] Duo Zhang commented on HBASE-20475: --- It is known to be flakey, so the pre commit will not run it. But it did not fail 100% of the time in the past...
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449833#comment-16449833 ] Zheng Hu commented on HBASE-20475: -- I checked out some commits from before HBASE-20128 and ran the UT; it failed there as well. I think the UT has been broken for a long time. I also found that some successful Hadoop QA runs (such as [1]) did not run TestReplicationDroppedTables at all. [1] https://issues.apache.org/jira/browse/HBASE-20128?focusedCommentId=16440441&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16440441
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449822#comment-16449822 ] Duo Zhang commented on HBASE-20475: --- But we haven't touched the related code recently, so why did it not always fail in the past?
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449791#comment-16449791 ] Zheng Hu commented on HBASE-20475: -- Deleting the table on the sink side will eventually block the progress of replication, but, as described in my previous comment, it may not be the first batch that throws the TableNotFoundException. I think we need to rewrite TestReplicationDroppedTables; it is not quite correct now.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449775#comment-16449775 ] Zheng Hu commented on HBASE-20475: -- I think I finally found the root cause. The source replicates a batch that contains WAL entries of both a deleted table and the test table. We expected the WAL of the deleted table to block the WAL of the test table (by throwing a TableNotFoundException when replicated to the peer), but in ReplicationSink I found this:
{code}
if (!rowMap.isEmpty()) {
  LOG.debug("Started replicating mutations.");
  for (Entry<TableName, Map<List<UUID>, List<Row>>> entry : rowMap.entrySet()) {
    batch(entry.getKey(), entry.getValue().values());
  }
  LOG.debug("Finished replicating mutations.");
}
{code}
Within the same batch, the entries of one table are applied ahead of the entries of another because the hash code of its table name comes first. So the WAL of the test table may be applied to the sink cluster first, and the WAL of the deleted table only afterwards. That is to say, the WAL of the deleted table won't block the WAL of the test table, and the UT fails.
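The effect described in this comment can be illustrated with a small standalone sketch. It is not HBase code: the class `GroupingOrderSketch`, the method `applyOrder`, and the table names are hypothetical, and a `TreeMap` keyed by plain strings is assumed as a stand-in for `rowMap`. The point is only that once edits are grouped by table, the per-table calls follow the map's key order rather than the order in which the edits appeared in the WAL.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupingOrderSketch {
  /**
   * Groups edits by table name and returns the tables in the order a
   * per-table loop over the map would visit them.
   * Each edit is a hypothetical {tableName, row} pair, listed in WAL order.
   */
  static List<String> applyOrder(List<String[]> editsInWalOrder) {
    // Assumption: a sorted map stands in for the sink's rowMap.
    Map<String, List<String>> rowMap = new TreeMap<>();
    for (String[] edit : editsInWalOrder) {
      rowMap.computeIfAbsent(edit[0], k -> new ArrayList<>()).add(edit[1]);
    }
    // Iteration follows the map's key order, not the WAL order.
    return new ArrayList<>(rowMap.keySet());
  }

  public static void main(String[] args) {
    // The dropped table's edit comes first in the WAL ...
    List<String[]> wal = Arrays.asList(
        new String[] {"zz_dropped", "r1"},
        new String[] {"aa_test", "r2"});
    // ... but the test table is visited first: prints [aa_test, zz_dropped]
    System.out.println(applyOrder(wal));
  }
}
```

Under these assumptions, a failure on the dropped table's group can only surface after the test table's group has already been applied, which matches the behaviour the comment describes.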
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449563#comment-16449563 ] Zheng Hu commented on HBASE-20475: -- OK, the concurrent batch replication problem for serial replication is not related to this issue; I'll create a separate issue for it.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449555#comment-16449555 ] Duo Zhang commented on HBASE-20475: --- I think we can introduce a flag. If the peer is serial then we enable the flag for ReplicationEndpoint.
[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.
[ https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449514#comment-16449514 ] Zheng Hu commented on HBASE-20475: -- While digging into the UT, I found that HBaseInterClusterReplicationEndpoint separates the entries into batches and replicates each batch to the sink concurrently. This breaks the ordering guarantee of serial replication; maybe a serial peer needs a separate ReplicationEndpoint instead of HBaseInterClusterReplicationEndpoint. [~Apache9] FYI.
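The ordering concern raised here can be sketched outside HBase as follows. Everything in this block is hypothetical (`BatchShippingSketch`, `partition`, `shipConcurrently`, `shipSerially`), and an in-memory list stands in for the sink cluster. It only shows the general pattern: batches submitted to a thread pool may reach the sink in any order, while shipping them one at a time preserves the WAL order that a serial peer would rely on.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class BatchShippingSketch {
  /** Splits entries into consecutive batches of at most batchSize. */
  static <T> List<List<T>> partition(List<T> entries, int batchSize) {
    List<List<T>> batches = new ArrayList<>();
    for (int i = 0; i < entries.size(); i += batchSize) {
      int end = Math.min(i + batchSize, entries.size());
      batches.add(new ArrayList<>(entries.subList(i, end)));
    }
    return batches;
  }

  /** Ships all batches through a pool; arrival order at the sink is not guaranteed. */
  static <T> List<T> shipConcurrently(List<List<T>> batches, int threads) {
    List<T> sink = Collections.synchronizedList(new ArrayList<>());
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Future<?>> futures = new ArrayList<>();
      for (List<T> batch : batches) {
        futures.add(pool.submit(() -> { sink.addAll(batch); }));
      }
      for (Future<?> f : futures) {
        f.get(); // wait for every batch to be applied
      }
    } catch (InterruptedException | ExecutionException e) {
      throw new IllegalStateException(e);
    } finally {
      pool.shutdown();
    }
    return new ArrayList<>(sink);
  }

  /** Ships batches one after another, so the sink sees the original order. */
  static <T> List<T> shipSerially(List<List<T>> batches) {
    List<T> sink = new ArrayList<>();
    for (List<T> batch : batches) {
      sink.addAll(batch);
    }
    return sink;
  }
}
```

With the concurrent variant, only the set of delivered entries is predictable; a serial mode, whether selected by a flag or implemented in a separate endpoint, would have to behave like `shipSerially`.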