[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-09 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16469838#comment-16469838
 ] 

Zheng Hu commented on HBASE-20475:
--

Sure, filed HBASE-20560 for it.

> Fix the flaky TestReplicationDroppedTables unit test.
> -
>
> Key: HBASE-20475
> URL: https://issues.apache.org/jira/browse/HBASE-20475
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: 
> 0001-HBASE-20475-TestReplicationDroppedTables-refactor.patch, 
> HBASE-20475-addendum-v2.patch, HBASE-20475-addendum-v3.patch, 
> HBASE-20475-addendum.patch, HBASE-20475.patch
>
>
> See 
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-09 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468832#comment-16468832
 ] 

Duo Zhang commented on HBASE-20475:
---

The patch is a bit big for an addendum. Let's open a new issue for it, so we
can use the review board...



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-09 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16468727#comment-16468727
 ] 

Zheng Hu commented on HBASE-20475:
--

Ping [~Apache9]



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-08 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16467225#comment-16467225
 ] 

Zheng Hu commented on HBASE-20475:
--

Let me explain the changes in the UT refactor:
1.  Let TestReplicationDroppedTables#setUpBase override
TestReplicationBase#setUpBase. When I read the log, I found that in
testEditsBehindDroppedTableTiming some WALs were still being replayed that had
been stuck in the previous UT, testEditsDroppedWithDroppedTable. In the
original version we ran TestReplicationBase#setUpBase first and then
TestReplicationDroppedTables#setUp, so the peer was created before the WAL was
rolled, and the newly created peer in testEditsBehindDroppedTableTiming would
still pick up the WALs from the previous UT.
2.  In those UTs we shut down the source mini cluster to keep only one RS, but
the following operations still used the htable1 and admin that were
initialized for the previous mini cluster, so I fixed those.
3.  verifyReplicationProceeded and verifyReplicationStuck should not check only
the lastRowkey, as I said above, so I fixed those too.
4.  Some minor changes, e.g. we used the deprecated HTableDescriptor; I changed
those usages to the builder.
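
Point 3 can be illustrated with a small stand-alone model. This is only a sketch under assumptions: `ReplicationCheckSketch`, `lastRowReplicated` and `allRowsReplicated` are hypothetical names, not the real helpers in TestReplicationDroppedTables, and the scenario (only the last row reaching the peer) is one plausible reading of why a last-row-only check is too weak.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-alone model; not the real TestReplicationDroppedTables helpers.
class ReplicationCheckSketch {
    // Weak check: only looks for the last expected row key on the peer.
    static boolean lastRowReplicated(List<String> expectedRows, Set<String> rowsOnPeer) {
        return rowsOnPeer.contains(expectedRows.get(expectedRows.size() - 1));
    }

    // Stronger check: every expected row key must be present on the peer.
    static boolean allRowsReplicated(List<String> expectedRows, Set<String> rowsOnPeer) {
        return rowsOnPeer.containsAll(expectedRows);
    }

    public static void main(String[] args) {
        List<String> expected = Arrays.asList("row0", "row1", "row2");
        // Assumed situation: only the last row has arrived on the peer.
        Set<String> peer = new HashSet<>(Arrays.asList("row2"));
        System.out.println("last-row check: " + lastRowReplicated(expected, peer));
        System.out.println("all-rows check: " + allRowsReplicated(expected, peer));
    }
}
```

In this model the two checks disagree whenever earlier rows are missing, which is the kind of case a last-row-only assertion cannot see.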





[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-07 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465790#comment-16465790
 ] 

Zheng Hu commented on HBASE-20475:
--

I guess we were stopping a ReplicationSourceShipper that had not finished its
initialization, so the NPE happened at:
{code}
worker.entryReader.interrupt();
{code}
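
The suspected race can be modelled outside HBase. The sketch below is a minimal stand-alone illustration, not the actual ReplicationSourceShipper code: `ShipperSketch`, `startReader`, `stopUnsafely` and `stopSafely` are hypothetical names, and the null check is just one possible guard, not necessarily the fix that was eventually applied.

```java
// Hypothetical stand-alone model of the suspected race; not HBase code.
class ShipperSketch {
    // Stays null until startReader() runs, mirroring a worker whose
    // initialization has not finished yet.
    private volatile Thread entryReader;

    void startReader() {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE);
            } catch (InterruptedException e) {
                // Interrupted by a stop request: exit quietly.
            }
        });
        t.setDaemon(true);
        t.start();
        entryReader = t;
    }

    // Unguarded stop: throws NullPointerException if called before startReader().
    void stopUnsafely() {
        entryReader.interrupt();
    }

    // Guarded stop: returns whether a reader was actually interrupted.
    boolean stopSafely() {
        Thread reader = entryReader; // read the field once
        if (reader == null) {
            return false;
        }
        reader.interrupt();
        return true;
    }

    public static void main(String[] args) {
        ShipperSketch worker = new ShipperSketch();
        try {
            worker.stopUnsafely();
        } catch (NullPointerException e) {
            System.out.println("unguarded stop hit an NPE before initialization");
        }
        System.out.println("guarded stop before init: " + worker.stopSafely());
        worker.startReader();
        System.out.println("guarded stop after init: " + worker.stopSafely());
    }
}
```

A null check only hides the symptom; if stop and initialization can really interleave, the two paths probably also need a shared lock or an explicit state flag.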

> Fix the flaky TestReplicationDroppedTables unit test.
> -
>
> Key: HBASE-20475
> URL: https://issues.apache.org/jira/browse/HBASE-20475
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20475-addendum-v2.patch, 
> HBASE-20475-addendum-v3.patch, HBASE-20475-addendum.patch, HBASE-20475.patch
>
>
> See 
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-07 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465766#comment-16465766
 ] 

Duo Zhang commented on HBASE-20475:
---

What about the NPE posted by me above? Is there a race?



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-07 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16465759#comment-16465759
 ] 

Zheng Hu commented on HBASE-20475:
--

Checked the UT & log again, the phenomenon is:
{code}



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-04 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463684#comment-16463684
 ] 

Zheng Hu commented on HBASE-20475:
--

Filed HBASE-20531 to address the above NPE. 



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-03 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463322#comment-16463322
 ] 

Zheng Hu commented on HBASE-20475:
--

Oh, the Hadoop QA results
(https://builds.apache.org/job/HBASE-Flaky-Tests/30342/) have expired, and
the flaky results at http://104.198.223.121:8080/job/HBASE-Flaky-Tests/
were caused by another NPE:

{code}
2018-05-03 17:05:59,008 ERROR [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=58476] master.MasterRpcServices(508): Region server instance-2.c.gcp-hbase.internal,52125,1525367143898 reported a fatal error:
* ABORTING region server instance-2.c.gcp-hbase.internal,52125,1525367143898: Unrecoverable exception while closing region hbase:meta,,1.1588230740, still finishing close *
Cause:
java.io.IOException: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegion.doClose(HRegion.java:1637)
    at org.apache.hadoop.hbase.regionserver.HRegion.close(HRegion.java:1466)
    at org.apache.hadoop.hbase.regionserver.handler.CloseRegionHandler.process(CloseRegionHandler.java:104)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at org.apache.hadoop.hbase.regionserver.HRegionServer.reportFileArchivalForQuotas(HRegionServer.java:3709)
    at org.apache.hadoop.hbase.regionserver.HStore.reportArchivedFilesForQuota(HStore.java:2718)
    at org.apache.hadoop.hbase.regionserver.HStore.removeCompactedfiles(HStore.java:2649)
    at org.apache.hadoop.hbase.regionserver.HStore.close(HStore.java:929)
    at org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1615)
    at org.apache.hadoop.hbase.regionserver.HRegion$2.call(HRegion.java:1612)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    ... 3 more
{code}



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-05-03 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16463285#comment-16463285
 ] 

Zheng Hu commented on HBASE-20475:
--

Let me dig into the NPE.



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457702#comment-16457702
 ] 

Hudson commented on HBASE-20475:


Results for branch branch-2
[build #670 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/670/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/670//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/670//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/670//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.




[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-28 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457593#comment-16457593
 ] 

Hudson commented on HBASE-20475:


Results for branch master
[build #314 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/314/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/314//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/314//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/314//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.




[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-28 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457584#comment-16457584
 ] 

Duo Zhang commented on HBASE-20475:
---

I think there is a race... Need to dig more, our locking still has problems...



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-28 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457582#comment-16457582
 ] 

Duo Zhang commented on HBASE-20475:
---

{noformat}
2018-04-28 11:21:41,825 WARN  [RpcServer.default.FPBQ.Fifo.handler=3,queue=0,port=33073] replication.RefreshPeerProcedure(131): Refresh peer 2 for DISABLE on asf915.gq1.ygridcore.net,38380,1524914179604 failed
java.lang.NullPointerException via asf915.gq1.ygridcore.net,38380,1524914179604:java.lang.NullPointerException: 
    at org.apache.hadoop.hbase.procedure2.RemoteProcedureException.fromProto(RemoteProcedureException.java:120)
    at org.apache.hadoop.hbase.master.MasterRpcServices.lambda$reportProcedureDone$4(MasterRpcServices.java:2248)
    at java.util.ArrayList.forEach(ArrayList.java:1257)
    at java.util.Collections$UnmodifiableCollection.forEach(Collections.java:1080)
    at org.apache.hadoop.hbase.master.MasterRpcServices.reportProcedureDone(MasterRpcServices.java:2243)
    at org.apache.hadoop.hbase.shaded.protobuf.generated.RegionServerStatusProtos$RegionServerStatusService$2.callBlockingMethod(RegionServerStatusProtos.java:15180)
    at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:409)
    at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
    at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: java.lang.NullPointerException: 
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.terminate(ReplicationSource.java:501)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.terminate(ReplicationSource.java:480)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.terminate(ReplicationSource.java:475)
    at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.refreshSources(ReplicationSourceManager.java:397)
    at org.apache.hadoop.hbase.replication.regionserver.PeerProcedureHandlerImpl.refreshPeerState(PeerProcedureHandlerImpl.java:78)
    at org.apache.hadoop.hbase.replication.regionserver.PeerProcedureHandlerImpl.disablePeer(PeerProcedureHandlerImpl.java:97)
    at org.apache.hadoop.hbase.replication.regionserver.RefreshPeerCallable.call(RefreshPeerCallable.java:65)
    at org.apache.hadoop.hbase.replication.regionserver.RefreshPeerCallable.call(RefreshPeerCallable.java:34)
    at org.apache.hadoop.hbase.regionserver.handler.RSProcedureHandler.process(RSProcedureHandler.java:47)
    at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:104)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
{noformat}



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-28 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457512#comment-16457512
 ] 

Duo Zhang commented on HBASE-20475:
---

[~openinx] is probably on the train home. Let me help with committing...



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-28 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457456#comment-16457456
 ] 

Duo Zhang commented on HBASE-20475:
---

It works. Please merge the patches and commit it to branch-2.

Thanks.



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-27 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16457281#comment-16457281
 ] 

Zheng Hu commented on HBASE-20475:
--

TestReplicationDroppedTables is becoming more stable now. Let's still wait a
few days to see whether it turns flaky again.



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456682#comment-16456682
 ] 

Hadoop QA commented on HBASE-20475:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 24s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 20s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 40s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 15s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 18s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 13m 17s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 35s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 41s{color} | {color:green} hbase-replication in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}163m  8s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}208m 34s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20475 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12921007/HBASE-20475-addendum-v3.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux b5e52e8f01e0 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 39cf42be9a |
| maven | 

[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-27 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456444#comment-16456444
 ] 

Zheng Hu commented on HBASE-20475:
--

Thanks [~Apache9] for reviewing. Pushed the addendum to the master branch;
now waiting for the flaky dashboard...



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-27 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456321#comment-16456321
 ] 

Duo Zhang commented on HBASE-20475:
---

{code}
peersSnapshot = new HashMap<>();
replicationPeers.getPeerCache().forEach(peersSnapshot::put);
{code}
I think HashMap has a constructor which takes a Map?
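
For reference, `java.util.HashMap` does have a copy constructor, `HashMap(Map<? extends K, ? extends V> m)`. The sketch below shows it under assumptions: `PeerSnapshotSketch` and `snapshot` are hypothetical names, and the peer cache is assumed to be an ordinary `java.util.Map` (here a `ConcurrentHashMap` with `String` values stands in for it).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration; the map contents are placeholders, not HBase types.
class PeerSnapshotSketch {
    // Copies the cache with HashMap's copy constructor, replacing the pattern
    // of creating an empty map and calling forEach(snapshot::put).
    static <K, V> Map<K, V> snapshot(Map<K, V> peerCache) {
        return new HashMap<>(peerCache);
    }

    public static void main(String[] args) {
        Map<String, String> peerCache = new ConcurrentHashMap<>();
        peerCache.put("1", "peer-one");
        peerCache.put("2", "peer-two");

        Map<String, String> peersSnapshot = snapshot(peerCache);
        // Later changes to the cache do not affect the snapshot.
        peerCache.remove("2");

        System.out.println("snapshot size: " + peersSnapshot.size());
        System.out.println("cache size: " + peerCache.size());
    }
}
```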

{code}
LOG.warn("Skipping failover for peer:" + actualPeerId + " of node " + deadRS
+ ", peer is null");
{code}
Use '{}' placeholders to rewrite the log.

Overall LGTM. +1. You can commit and leave the issue open for a while to see if 
it really works.



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-27 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16456255#comment-16456255
 ] 

Hadoop QA commented on HBASE-20475:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  0s{color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 18s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 21s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 11s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 37s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 36s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 14m 48s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 56s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 25s{color} | {color:green} hbase-replication in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}110m 11s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 31s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}158m 11s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20475 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920972/HBASE-20475-addendum-v2.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 4d433459f0e3 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 39cf42be9a |
| maven | 

[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-26 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454297#comment-16454297
 ] 

Zheng Hu commented on HBASE-20475:
--

Found the reason the test became flaky again; it happens randomly. See the 
following log: 

{code}
2018-04-25 18:08:24,432 INFO  [regionserver/asf915:0.logRoller] wal.AbstractFSWAL(671): Rolled WAL /user/jenkins/test-data/be2e424e-e2f6-4ac1-91f1-d33621a46da3/WALs/asf915.gq1.ygridcore.net,42474,1524679689830/asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679692092 with entries=13, filesize=3.55 KB; new WAL /user/jenkins/test-data/be2e424e-e2f6-4ac1-91f1-d33621a46da3/WALs/asf915.gq1.ygridcore.net,42474,1524679689830/asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418

..

2018-04-25 18:08:36,957 INFO  [ReplicationExecutor-0] replication.ZKReplicationQueueStorage(387): Atomically moving asf915.gq1.ygridcore.net,42474,1524679689830/2's WALs to asf915.gq1.ygridcore.net,34222,1524679706894
2018-04-25 18:08:36,957 DEBUG [RpcServer.replication.FPBQ.Fifo.handler=2,queue=0,port=36265] ipc.AbstractRpcClient(200): Codec=org.apache.hadoop.hbase.codec.KeyValueCodecWithTags@396b2d58, compressor=null, tcpKeepAlive=true, tcpNoDelay=true, connectTO=1, readTO=2, writeTO=6, minIdleTimeBeforeClose=12, maxRetries=0, fallbackAllowed=true, bind address=null
2018-04-25 18:08:36,959 DEBUG [ReplicationExecutor-0] replication.ZKReplicationQueueStorage(414): Creating asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418 with data PBUF\x08\x9F\x1A
2018-04-25 18:08:36,961 INFO  [ReplicationExecutor-0] replication.ZKReplicationQueueStorage(426): Atomically moved asf915.gq1.ygridcore.net,42474,1524679689830/2's WALs to asf915.gq1.ygridcore.net,34222,1524679706894
{code}

Step 1: "Atomically moving" means that we begin to claimQueue from the dead RS 
to the destination RS.
Step 2: At 2018-04-25 18:08:24,432 the RS started to roll the WAL, and at 
2018-04-25 18:08:36,959 it created the WAL. 
Step 3: "Atomically moved" means that the NodeFailoverWorker finished 
claimQueue from the dead RS to the destination RS, but excluded the WAL 
asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418, because it had 
no znode yet.


So when our RecoveredReplicationSourceShipper tried to ship the edits and call 
setWALPosition, it found that the znode did not exist, and the RS finally 
crashed. (It's strange here: the new RS was already transferring the queue, but 
the dead RS was still creating the new WAL...)
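
The ordering described in the steps above can be modelled roughly as follows. This is a toy in-memory sketch of the scenario as the comment describes it, not HBase code: ClaimQueueRaceSketch and its methods are hypothetical names, and a plain HashMap stands in for the znodes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class ClaimQueueRaceSketch {
  // server name -> WAL names that currently have a node in that server's queue
  private final Map<String, List<String>> queues = new HashMap<>();

  void addWal(String server, String wal) {
    queues.computeIfAbsent(server, k -> new ArrayList<>()).add(wal);
  }

  // Moves only the WALs that are registered at the moment of the claim.
  List<String> claimQueue(String deadServer, String destServer) {
    List<String> moved = queues.remove(deadServer);
    if (moved == null) {
      moved = new ArrayList<>();
    }
    queues.computeIfAbsent(destServer, k -> new ArrayList<>()).addAll(moved);
    return moved;
  }

  // Fails when the WAL has no node under the server, like the missing znode in the log.
  void setWalPosition(String server, String wal, long position) {
    List<String> wals = queues.get(server);
    if (wals == null || !wals.contains(wal)) {
      throw new IllegalStateException("no node for " + wal + " under " + server);
    }
  }

  public static void main(String[] args) {
    ClaimQueueRaceSketch storage = new ClaimQueueRaceSketch();
    storage.addWal("deadRS", "wal.1524679692092");
    storage.claimQueue("deadRS", "destRS"); // the claim runs first
    // The rolled WAL gets no node before the claim, so it was never moved.
    try {
      storage.setWalPosition("destRS", "wal.1524679704418", 3632L);
    } catch (IllegalStateException e) {
      System.out.println("abort: " + e.getMessage());
    }
  }
}
```

In the sketch the failure is a plain exception; in the log above the same situation ends with the region server aborting.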

{code}
2018-04-25 18:08:39,107 DEBUG [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830.replicationSource.wal-reader.asf915.gq1.ygridcore.net%2C42474%2C1524679689830,2-asf915.gq1.ygridcore.net,42474,1524679689830] regionserver.WALEntryStream(250): Reached the end of log hdfs://localhost:43322/user/jenkins/test-data/be2e424e-e2f6-4ac1-91f1-d33621a46da3/oldWALs/asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418
2018-04-25 18:08:39,109 DEBUG [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830.replicationSource.shipperasf915.gq1.ygridcore.net%2C42474%2C1524679689830,2-asf915.gq1.ygridcore.net,42474,1524679689830] replication.ReplicationQueueInfo(110): Found dead servers:[asf915.gq1.ygridcore.net,42474,1524679689830]
2018-04-25 18:08:39,118 ERROR [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830.replicationSource.shipperasf915.gq1.ygridcore.net%2C42474%2C1524679689830,2-asf915.gq1.ygridcore.net,42474,1524679689830] helpers.MarkerIgnoringBase(159): * ABORTING region server asf915.gq1.ygridcore.net,34222,1524679706894: Failed to operate on replication queue *
org.apache.hadoop.hbase.replication.ReplicationException: Failed to set log position (serverName=asf915.gq1.ygridcore.net,34222,1524679706894, queueId=2-asf915.gq1.ygridcore.net,42474,1524679689830, fileName=asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418, position=3632)
at org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.setWALPosition(ZKReplicationQueueStorage.java:256)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$logPositionAndCleanOldLogs$7(ReplicationSourceManager.java:488)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:455)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:488)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:231)
at org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:133)
at 

[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-26 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454240#comment-16454240
 ] 

Duo Zhang commented on HBASE-20475:
---

Please open an issue to fix it...

> Fix the flaky TestReplicationDroppedTables unit test.
> -
>
> Key: HBASE-20475
> URL: https://issues.apache.org/jira/browse/HBASE-20475
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20475-addendum.patch, HBASE-20475.patch
>
>
> See 
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html





[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-26 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454213#comment-16454213
 ] 

Zheng Hu commented on HBASE-20475:
--

Found the above bug because I noticed this log:

{code}
2018-04-25 18:08:39,091 DEBUG [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830] zookeeper.ZKUtil(614): regionserver:34222-0x162fdfda59e0021, quorum=localhost:60019, baseZNode=/1 Unable to get data of znode /1/replication/rs/asf915.gq1.ygridcore.net,42474,1524679689830/2-asf915.gq1.ygridcore.net,42474,1524679689830/asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418 because node does not exist (not an error)
2018-04-25 18:08:39,091 WARN  [ReplicationExecutor-0.replicationSource,2-asf915.gq1.ygridcore.net,42474,1524679689830] replication.ZKReplicationQueueStorage(377): Failed parse log position (serverName=asf915.gq1.ygridcore.net,42474,1524679689830, queueId=2-asf915.gq1.ygridcore.net,42474,1524679689830, fileName=asf915.gq1.ygridcore.net%2C42474%2C1524679689830.1524679704418)
{code}



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-26 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16454202#comment-16454202
 ] 

Zheng Hu commented on HBASE-20475:
--

Found an unrelated bug in 
RecoveredReplicationSourceShipper#getRecoveredQueueStartPos()

{code}
  private long getRecoveredQueueStartPos() {
    long startPosition = 0;
    String peerClusterZnode = source.getQueueId();
    try {
      startPosition = this.replicationQueues.getWALPosition(source.getServerWALsBelongTo(),
        peerClusterZnode, this.queue.peek().getName());
      if (LOG.isTraceEnabled()) {
        LOG.trace("Recovered queue started with log " + this.queue.peek() + " at position " +
          startPosition);
      }
    } catch (ReplicationException e) {
      terminate("Couldn't get the position of this recovered queue " + peerClusterZnode, e);
    }
    return startPosition;
  }
{code}

By the time we start running RecoveredReplicationSourceShipper, all WALs of the 
dead server have been moved into the new RS's queue, so the following call 
will always return -1, because the path no longer exists.

{code}
      startPosition = this.replicationQueues.getWALPosition(source.getServerWALsBelongTo(),
        peerClusterZnode, this.queue.peek().getName());
{code}
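
A toy illustration of the lookup mismatch described above, assuming positions are keyed by server, queue id and WAL name. RecoveredStartPosSketch and its methods are hypothetical; the -1 for a missing entry simply follows the comment and is not taken from the HBase source.

```java
import java.util.HashMap;
import java.util.Map;

final class RecoveredStartPosSketch {
  // "server/queueId/wal" -> stored replication position
  private final Map<String, Long> positions = new HashMap<>();

  private static String key(String server, String queueId, String wal) {
    return server + "/" + queueId + "/" + wal;
  }

  void setPosition(String server, String queueId, String wal, long pos) {
    positions.put(key(server, queueId, wal), pos);
  }

  // Returns -1 when no entry exists under the given server.
  long getWalPosition(String server, String queueId, String wal) {
    return positions.getOrDefault(key(server, queueId, wal), -1L);
  }

  public static void main(String[] args) {
    RecoveredStartPosSketch storage = new RecoveredStartPosSketch();
    // After the claim, the entry lives under the new owner, not under the dead server.
    storage.setPosition("newRS", "2-deadRS", "wal.1", 3632L);

    System.out.println(storage.getWalPosition("deadRS", "2-deadRS", "wal.1")); // prints -1
    System.out.println(storage.getWalPosition("newRS", "2-deadRS", "wal.1"));  // prints 3632
  }
}
```

Querying under the server that now owns the queue is shown only to make the contrast visible; the thread does not say which fix was adopted.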



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453788#comment-16453788
 ] 

Hadoop QA commented on HBASE-20475:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 29s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 34s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 12s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 44s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 15s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 13m  7s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 28s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}149m  5s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 23s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}190m  9s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20475 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920738/HBASE-20475-addendum.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 3e67c23601c4 4.4.0-104-generic #127-Ubuntu SMP Mon Dec 11 12:16:42 UTC 2017 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 8a30acf46f |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/12660/artifact/patchprocess/patch-unit-hbase-server.txt |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12660/testReport/ |
| Max. process+thread count | 4309 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/12660/console |
| Powered by | Apache Yetus 

[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-26 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453579#comment-16453579
 ] 

Zheng Hu commented on HBASE-20475:
--

Two possible causes, I think:
1. The host asf915.gq1.ygridcore.net had some problem when running the UT, but 
other UTs work fine. It's strange.
2. After the patch, we put 1000 rows into the test table and scan them without 
setCaching, so it's easy to time out. 
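
As a rough back-of-the-envelope sketch for the second cause: if each scanner call returns at most `caching` rows (an assumption for illustration; the effective value in the failing test is not stated in this thread), the number of round trips for 1000 rows grows quickly as the batch size shrinks. ScanRoundTripSketch is a hypothetical helper, not part of HBase.

```java
final class ScanRoundTripSketch {
  // Number of scanner calls needed if each call returns at most `caching` rows.
  static int roundTrips(int rows, int caching) {
    if (rows < 0 || caching <= 0) {
      throw new IllegalArgumentException("rows must be >= 0 and caching > 0");
    }
    return (rows + caching - 1) / caching; // ceiling division
  }

  public static void main(String[] args) {
    System.out.println(roundTrips(1000, 1));    // prints 1000
    System.out.println(roundTrips(1000, 100));  // prints 10
    System.out.println(roundTrips(1000, 1000)); // prints 1
  }
}
```

On a slow or overloaded build host every extra round trip adds latency, which is consistent with the timeout theory but does not prove it.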

> Fix the flaky TestReplicationDroppedTables unit test.
> -
>
> Key: HBASE-20475
> URL: https://issues.apache.org/jira/browse/HBASE-20475
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
> Attachments: HBASE-20475.patch
>
>
> See 
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html





[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-26 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453565#comment-16453565
 ] 

Zheng Hu commented on HBASE-20475:
--

Checked the console log 
(https://builds.apache.org/job/HBASE-Flaky-Tests/30156/consoleFull). It's a 
different problem now; all errors come from the setUp() method.

{code}
[ERROR] Tests run: 4, Failures: 0, Errors: 3, Skipped: 0, Time elapsed: 117.161 s <<< FAILURE! - in org.apache.hadoop.hbase.replication.TestReplicationDroppedTables
[ERROR] testEditsStuckBehindDroppedTable(org.apache.hadoop.hbase.replication.TestReplicationDroppedTables)  Time elapsed: 62.122 s  <<< ERROR!
java.io.IOException: Failed to get result within timeout, timeout=6ms
at org.apache.hadoop.hbase.replication.TestReplicationDroppedTables.setUp(TestReplicationDroppedTables.java:67)

[ERROR] testEditsBehindDroppedTableTiming(org.apache.hadoop.hbase.replication.TestReplicationDroppedTables)  Time elapsed: 1.292 s  <<< ERROR!
java.net.ConnectException: Call to asf915.gq1.ygridcore.net/67.195.81.159:34222 failed on connection exception: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: asf915.gq1.ygridcore.net/67.195.81.159:34222
Caused by: org.apache.hbase.thirdparty.io.netty.channel.AbstractChannel$AnnotatedConnectException: Connection refused: asf915.gq1.ygridcore.net/67.195.81.159:34222
Caused by: java.net.ConnectException: Connection refused

[ERROR] testEditsDroppedWithDroppedTable(org.apache.hadoop.hbase.replication.TestReplicationDroppedTables)  Time elapsed: 1.292 s  <<< ERROR!
java.io.IOException: Call to asf915.gq1.ygridcore.net/67.195.81.159:34222 failed on local exception: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: asf915.gq1.ygridcore.net/67.195.81.159:34222
at org.apache.hadoop.hbase.replication.TestReplicationDroppedTables.setUp(TestReplicationDroppedTables.java:65)
Caused by: org.apache.hadoop.hbase.ipc.FailedServerException: This server is in the failed servers list: asf915.gq1.ygridcore.net/67.195.81.159:34222
at org.apache.hadoop.hbase.replication.TestReplicationDroppedTables.setUp(TestReplicationDroppedTables.java:65)
{code}



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-25 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16453377#comment-16453377
 ] 

Duo Zhang commented on HBASE-20475:
---

See http://104.198.223.121:8080/job/HBASE-Flaky-Tests/36000/



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16452183#comment-16452183
 ] 

Hudson commented on HBASE-20475:


Results for branch master
[build #311 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/311/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/311//General_Nightly_Build_Report/]

(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/311//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/311//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.




[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-25 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451869#comment-16451869
 ] 

Zheng Hu commented on HBASE-20475:
--

Checked branch-2, it's flaky too. If the UT is not flaky in the master branch, 
I'll port the patch to branch-2. [~stack] boss, FYI. 



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-25 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451853#comment-16451853
 ] 

Zheng Hu commented on HBASE-20475:
--

Thanks [~Apache9] for checking. Pushed to the master branch now. I'm not sure 
whether it was broken in branch-2, but I think we should port this patch to 
branch-2. So keep this issue open and see what the flaky dashboard says. 



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-25 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451806#comment-16451806
 ] 

Duo Zhang commented on HBASE-20475:
---

Checked locally, it works for me.

[~openinx] Please commit to master first and wait some time to see what the 
flaky dashboard says. If there is no problem, then commit to branch-2. The 
pre-commit will not run the test...

Thanks.



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16451731#comment-16451731
 ] 

Hadoop QA commented on HBASE-20475:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
46s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
12s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
56s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
30s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} hbase-server: The patch generated 0 new + 0 
unchanged - 1 fixed = 0 total (was 1) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
54s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
15m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  1m 
57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}111m 50s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
19s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}158m 30s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20475 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12920570/HBASE-20475.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux e991b241f0ea 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh |
| git revision | master / a8be3bb814 |
| maven | version: Apache Maven 3.5.3 (3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_162 |
| findbugs | v3.1.0-RC3 |
| unit | https://builds.apache.org/job/PreCommit-HBASE-Build/12622/artifact/patchprocess/patch-unit-hbase-server.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/12622/testReport/ |
| Max. process+thread count | 4322 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 

[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-24 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449840#comment-16449840
 ] 

Duo Zhang commented on HBASE-20475:
---

It is known to be flaky, so the pre-commit build does not run it. But it did 
not fail 100% of the time in the past...

> Fix the flaky TestReplicationDroppedTables unit test.
> -
>
> Key: HBASE-20475
> URL: https://issues.apache.org/jira/browse/HBASE-20475
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.1.0
>Reporter: Zheng Hu
>Assignee: Zheng Hu
>Priority: Major
> Fix For: 3.0.0, 2.1.0
>
>
> See 
> https://builds.apache.org/job/HBASE-Find-Flaky-Tests/lastSuccessfulBuild/artifact/dashboard.html



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-24 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449833#comment-16449833
 ] 

Zheng Hu commented on HBASE-20475:
--

I tried checking out some commits from before HBASE-20128 and running the UT. 
It failed there as well, so I think the UT has been broken for a long time. I 
also found that some successful Hadoop QA runs (such as [1]) did not run 
TestReplicationDroppedTables at all.

[1] 
https://issues.apache.org/jira/browse/HBASE-20128?focusedCommentId=16440441=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16440441



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-24 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449822#comment-16449822
 ] 

Duo Zhang commented on HBASE-20475:
---

But we haven't touched the related code recently, so why did it not always 
fail in the past?



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-24 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449791#comment-16449791
 ] 

Zheng Hu commented on HBASE-20475:
--

Replication will eventually be blocked if we delete the table on the sink 
side, but, as described in my previous comment, it may not be the first batch 
that throws the TableNotFoundException. I think we need to rewrite 
TestReplicationDroppedTables; it is not quite correct now.



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-24 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449775#comment-16449775
 ] 

Zheng Hu commented on HBASE-20475:
--

I think I finally found the root cause. The source replicates a batch that 
contains WAL entries for both a deleted table and a test table, and we 
expected the WAL entries of the deleted table (which throw a 
TableNotFoundException when replicated to the peer) to block the WAL entries 
of the test table. But in ReplicationSink, I found that:

{code}
if (!rowMap.isEmpty()) {
  LOG.debug("Started replicating mutations.");
  // Tables are applied one at a time, in rowMap's iteration order.
  for (Entry<TableName, Map<List<UUID>, List<Row>>> entry : rowMap.entrySet()) {
    batch(entry.getKey(), entry.getValue().values());
  }
  LOG.debug("Finished replicating mutations.");
}
{code}

Within the same batch, one table's entry is processed ahead of another's 
because the hash code of its table name comes first. So the WAL entries of the 
test table may be applied to the sink cluster first, followed by those of the 
deleted table. That is to say, the WAL entries of the deleted table won't 
block those of the test table, and so the UT fails.
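
To illustrate the ordering effect described above, here is a minimal sketch. It is not the ReplicationSink code: the class SinkOrderSketch, its apply method, and the table names are hypothetical, and the comment does not show the real type of rowMap. The sketch uses a TreeMap only so that the iteration order differs from WAL order in a deterministic way. It shows that whether the dropped table blocks the test table depends on which key the map visits first.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical stand-in for a sink that applies a batch table by table. */
final class SinkOrderSketch {

  /** Thrown when a batch touches a table that no longer exists on the sink. */
  static final class TableMissingException extends RuntimeException {
    TableMissingException(String table) {
      super("table not found: " + table);
    }
  }

  /**
   * Visits the tables in the map's iteration order and returns the tables that
   * were applied before the first missing table aborted the batch.
   */
  static List<String> apply(Map<String, List<String>> rowsByTable, Set<String> existing) {
    List<String> applied = new ArrayList<>();
    try {
      for (Map.Entry<String, List<String>> entry : rowsByTable.entrySet()) {
        if (!existing.contains(entry.getKey())) {
          throw new TableMissingException(entry.getKey());
        }
        applied.add(entry.getKey());
      }
    } catch (TableMissingException e) {
      // The batch is rejected here, but tables visited earlier were already applied.
    }
    return applied;
  }

  public static void main(String[] args) {
    Set<String> existing = Collections.singleton("a-test-table");

    // WAL order: the dropped table's edits come first.
    Map<String, List<String>> walOrder = new LinkedHashMap<>();
    walOrder.put("dropped-table", Collections.singletonList("row1"));
    walOrder.put("a-test-table", Collections.singletonList("row2"));

    // Same content, but iterated in key order instead of WAL order.
    Map<String, List<String>> keyOrder = new TreeMap<>(walOrder);

    System.out.println("WAL order applied: " + apply(walOrder, existing));
    System.out.println("Key order applied: " + apply(keyOrder, existing));
  }
}
```

In WAL order nothing is applied, which is what the UT expects. In key order the test table is applied before the missing table is reached, which matches the failure described above.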



 



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-24 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449563#comment-16449563
 ] 

Zheng Hu commented on HBASE-20475:
--

OK, the concurrent batch replication problem for serial replication is not 
related to this issue; I'll create a separate issue for it.



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-24 Thread Duo Zhang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449555#comment-16449555
 ] 

Duo Zhang commented on HBASE-20475:
---

I think we can introduce a flag. If the peer is serial then we enable the flag 
for ReplicationEndpoint.
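
One way such a flag could look is sketched below. This is only an illustration under assumptions: SerialFlagSketch, its serial field, and the replicate method are hypothetical names and do not come from the HBase ReplicationEndpoint API. When the flag is set, batches are shipped one after another on the calling thread; otherwise they are submitted to a pool and may complete in any order.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical endpoint sketch; not the HBase ReplicationEndpoint API. */
final class SerialFlagSketch {
  private final boolean serial;
  private final ExecutorService pool;

  SerialFlagSketch(boolean serial, ExecutorService pool) {
    this.serial = serial;
    this.pool = pool;
  }

  /**
   * Ships every batch through the given sink. In serial mode batch N+1 is not
   * started until batch N has finished; otherwise all batches run concurrently.
   */
  void replicate(List<List<String>> batches, Consumer<List<String>> sink) {
    if (serial) {
      for (List<String> batch : batches) {
        sink.accept(batch);
      }
      return;
    }
    List<Future<?>> pending = new ArrayList<>();
    for (List<String> batch : batches) {
      pending.add(pool.submit(() -> sink.accept(batch)));
    }
    for (Future<?> future : pending) {
      try {
        future.get();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while shipping batches", e);
      } catch (ExecutionException e) {
        throw new IllegalStateException("batch failed", e.getCause());
      }
    }
  }

  public static void main(String[] args) {
    ExecutorService pool = Executors.newFixedThreadPool(4);
    try {
      List<List<String>> batches = Arrays.asList(
          Arrays.asList("e1", "e2"), Arrays.asList("e3"), Arrays.asList("e4", "e5"));
      List<String> applied = Collections.synchronizedList(new ArrayList<>());
      new SerialFlagSketch(true, pool).replicate(batches, applied::addAll);
      System.out.println("serial peer applied: " + applied);
    } finally {
      pool.shutdown();
    }
  }
}
```

The trade-off in this sketch is throughput: a serial peer gives up the parallel shipping that non-serial peers keep.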



[jira] [Commented] (HBASE-20475) Fix the flaky TestReplicationDroppedTables unit test.

2018-04-24 Thread Zheng Hu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-20475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16449514#comment-16449514
 ] 

Zheng Hu commented on HBASE-20475:
--

While digging into the UT, I found that HBaseInterClusterReplicationEndpoint 
separates the entries into batches and replicates each batch to the sink 
concurrently. This breaks the ordering guarantee of serial replication; maybe 
a serial peer needs a separate ReplicationEndpoint instead of 
HBaseInterClusterReplicationEndpoint. [~Apache9] FYI.
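
The hazard can be reproduced in isolation with a small sketch. It does not use HBaseInterClusterReplicationEndpoint; ConcurrentBatchSketch and shipConcurrently are hypothetical names, and a latch stands in for a slow call to the sink. It shows that when two batches are submitted concurrently, the later batch can be applied before the earlier one.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical demo of out-of-order completion when batches ship concurrently. */
final class ConcurrentBatchSketch {

  /** Submits two batches at once and returns the order in which they were applied. */
  static List<String> shipConcurrently() {
    List<String> appliedOrder = Collections.synchronizedList(new ArrayList<>());
    CountDownLatch secondApplied = new CountDownLatch(1);
    ExecutorService pool = Executors.newFixedThreadPool(2);
    try {
      Future<?> first = pool.submit(() -> {
        // Simulate a slow sink call: the first batch stalls until the second is done.
        secondApplied.await();
        appliedOrder.add("batch-1");
        return null;
      });
      Future<?> second = pool.submit(() -> {
        appliedOrder.add("batch-2");
        secondApplied.countDown();
        return null;
      });
      first.get();
      second.get();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while shipping batches", e);
    } catch (ExecutionException e) {
      throw new IllegalStateException("batch failed", e.getCause());
    } finally {
      pool.shutdown();
    }
    return new ArrayList<>(appliedOrder);
  }

  public static void main(String[] args) {
    System.out.println("applied order: " + shipConcurrently());
  }
}
```

The latch forces the reordering here so the result is repeatable; in a real cluster the same reordering would depend on timing, which is why it would show up as a flaky failure.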

