[ 
https://issues.apache.org/jira/browse/HBASE-20561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16478904#comment-16478904
 ] 

Hadoop QA commented on HBASE-20561:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
1s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red}  0m  
0s{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
12s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
0s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
56s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
53s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
11s{color} | {color:green} hbase-replication: The patch generated 0 new + 2 
unchanged - 1 fixed = 2 total (was 3) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
11s{color} | {color:red} hbase-server: The patch generated 1 new + 4 unchanged 
- 0 fixed = 5 total (was 4) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
50s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
14m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
21s{color} | {color:green} hbase-replication in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}106m 
51s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
38s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}156m 35s{color} | 
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20561 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12923878/HBASE-20561.master.002.patch
 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  
hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 749921cc6653 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 60bdaf7846 |
| maven | version: Apache Maven 3.5.3 
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12862/artifact/patchprocess/diff-checkstyle-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12862/testReport/ |
| Max. process+thread count | 4014 (vs. ulimit of 10000) |
| modules | C: hbase-replication hbase-server U: . |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/12862/console |
| Powered by | Apache Yetus 0.7.0   http://yetus.apache.org |


This message was automatically generated.



> The way we stop a ReplicationSource may cause the RS down
> ---------------------------------------------------------
>
>                 Key: HBASE-20561
>                 URL: https://issues.apache.org/jira/browse/HBASE-20561
>             Project: HBase
>          Issue Type: Bug
>          Components: Replication
>            Reporter: Duo Zhang
>            Assignee: Guanghao Zhang
>            Priority: Major
>         Attachments: HBASE-20561.master.001.patch, 
> HBASE-20561.master.002.patch
>
>
> See this:
> https://builds.apache.org/job/HBASE-Flaky-Tests/31125/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.replication.multiwal.TestReplicationKillMasterRSCompressedWithMultipleAsyncWAL-output.txt
> {noformat}
> 2018-05-09 15:07:00,887 INFO  [RS_REFRESH_PEER-regionserver/asf916:0-1] 
> regionserver.RefreshPeerCallable(52): Received a peer change event, peerId=2, 
> type=REMOVE_PEER
> 2018-05-09 15:07:00,890 INFO  [RS_REFRESH_PEER-regionserver/asf916:0-1] 
> regionserver.ReplicationSource(485): Closing source 
> 2-asf916.gq1.ygridcore.net,36287,1525878368395 because: Replication stream 
> was removed by a user
> 2018-05-09 15:07:00,892 DEBUG 
> [ReplicationExecutor-0.replicationSource,2-asf916.gq1.ygridcore.net,36287,1525878368395.replicationSource.shipperasf916.gq1.ygridcore.net%2C36287%2C1525878368395.asf916.gq1.ygridcore.net%2C36287%2C1525878368395.regiongroup-0,2-asf916.gq1.ygridcore.net,36287,1525878368395]
>  zookeeper.ZKWatcher(617): regionserver:34308-0x163456ff2490004, 
> quorum=localhost:60149, baseZNode=/1 Received InterruptedException, will 
> interrupt current thread and rethrow a SystemErrorException
> java.lang.InterruptedException
>       at java.lang.Object.wait(Native Method)
>       at java.lang.Object.wait(Object.java:502)
>       at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1406)
>       at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:871)
>       at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:166)
>       at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1231)
>       at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1220)
>       at 
> org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.removeWAL(ZKReplicationQueueStorage.java:198)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$cleanOldLogs$8(ReplicationSourceManager.java:526)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:454)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:526)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:506)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:489)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:231)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:133)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:103)
> 2018-05-09 15:07:00,892 DEBUG 
> [ReplicationExecutor-0.replicationSource,2-asf916.gq1.ygridcore.net,36287,1525878368395.replicationSource.shipperasf916.gq1.ygridcore.net%2C36287%2C1525878368395.asf916.gq1.ygridcore.net%2C36287%2C1525878368395.regiongroup-1,2-asf916.gq1.ygridcore.net,36287,1525878368395]
>  zookeeper.ZKWatcher(617): regionserver:34308-0x163456ff2490004, 
> quorum=localhost:60149, baseZNode=/1 Received InterruptedException, will 
> interrupt current thread and rethrow a SystemErrorException
> java.lang.InterruptedException
>       at java.lang.Object.wait(Native Method)
>       at java.lang.Object.wait(Object.java:502)
>       at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1406)
>       at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:990)
>       at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:910)
>       at 
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.multi(RecoverableZooKeeper.java:663)
>       at 
> org.apache.hadoop.hbase.zookeeper.ZKUtil.multiOrSequential(ZKUtil.java:1690)
>       at 
> org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.setWALPosition(ZKReplicationQueueStorage.java:246)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$logPositionAndCleanOldLogs$7(ReplicationSourceManager.java:487)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:454)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:487)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:231)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:133)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:103)
> 2018-05-09 15:07:00,896 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=36902] 
> master.MasterRpcServices(1141): Checking to see if procedure is done pid=52
> 2018-05-09 15:07:00,898 ERROR 
> [ReplicationExecutor-0.replicationSource,2-asf916.gq1.ygridcore.net,36287,1525878368395.replicationSource.shipperasf916.gq1.ygridcore.net%2C36287%2C1525878368395.asf916.gq1.ygridcore.net%2C36287%2C1525878368395.regiongroup-0,2-asf916.gq1.ygridcore.net,36287,1525878368395]
>  helpers.MarkerIgnoringBase(159): ***** ABORTING region server 
> asf916.gq1.ygridcore.net,34308,1525878368287: Failed to operate on 
> replication queue *****
> org.apache.hadoop.hbase.replication.ReplicationException: Failed to remove 
> wal from queue (serverName=asf916.gq1.ygridcore.net,34308,1525878368287, 
> queueId=2-asf916.gq1.ygridcore.net,36287,1525878368395, 
> fileName=asf916.gq1.ygridcore.net%2C36287%2C1525878368395.asf916.gq1.ygridcore.net%2C36287%2C1525878368395.regiongroup-0.1525878371229)
>       at 
> org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.removeWAL(ZKReplicationQueueStorage.java:202)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$cleanOldLogs$8(ReplicationSourceManager.java:526)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:454)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:526)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:506)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:489)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:231)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:133)
>       at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:103)
> Caused by: org.apache.zookeeper.KeeperException$SystemErrorException: 
> KeeperErrorCode = SystemError
>       at 
> org.apache.hadoop.hbase.zookeeper.ZKWatcher.interruptedException(ZKWatcher.java:608)
>       at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1236)
>       at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1220)
>       at 
> org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.removeWAL(ZKReplicationQueueStorage.java:198)
>       ... 8 more
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to