[
https://issues.apache.org/jira/browse/HBASE-20561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477488#comment-16477488
]
Hadoop QA commented on HBASE-20561:
-----------------------------------
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m
0s{color} | {color:red} The patch doesn't appear to include any new or modified
tests. Please justify why no new tests are needed for this patch. Also please
list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 5m
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 2m
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 7m
30s{color} | {color:green} branch has no errors when building our shaded
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m
30s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m
17s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 3m
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 3m
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 2m
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m
0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green} 7m
36s{color} | {color:green} patch has no errors when building our shaded
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}
20m 52s{color} | {color:green} Patch does not cause any errors with Hadoop
2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m
7s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m
42s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}133m
36s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m
20s{color} | {color:green} The patch does not generate ASF License warnings.
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}203m 45s{color} |
{color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:d8b550f |
| JIRA Issue | HBASE-20561 |
| JIRA Patch URL |
https://issues.apache.org/jira/secure/attachment/12923650/HBASE-20561.master.001.patch
|
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars
hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 791bb4539274 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9
14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
|
| git revision | master / ab53329cb3 |
| maven | version: Apache Maven 3.5.3
(3383c37e1f9e9b3bc3df5050c29c8aff9f295297; 2018-02-24T19:49:05Z) |
| Default Java | 1.8.0_171 |
| findbugs | v3.1.0-RC3 |
| Test Results |
https://builds.apache.org/job/PreCommit-HBASE-Build/12837/testReport/ |
| Max. process+thread count | 4099 (vs. ulimit of 10000) |
| modules | C: hbase-server U: hbase-server |
| Console output |
https://builds.apache.org/job/PreCommit-HBASE-Build/12837/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
This message was automatically generated.
> The way we stop a ReplicationSource may cause the RS down
> ---------------------------------------------------------
>
> Key: HBASE-20561
> URL: https://issues.apache.org/jira/browse/HBASE-20561
> Project: HBase
> Issue Type: Bug
> Components: Replication
> Reporter: Duo Zhang
> Assignee: Guanghao Zhang
> Priority: Major
> Attachments: HBASE-20561.master.001.patch
>
>
> See this:
> https://builds.apache.org/job/HBASE-Flaky-Tests/31125/artifact/hbase-server/target/surefire-reports/org.apache.hadoop.hbase.replication.multiwal.TestReplicationKillMasterRSCompressedWithMultipleAsyncWAL-output.txt
> {noformat}
> 2018-05-09 15:07:00,887 INFO [RS_REFRESH_PEER-regionserver/asf916:0-1]
> regionserver.RefreshPeerCallable(52): Received a peer change event, peerId=2,
> type=REMOVE_PEER
> 2018-05-09 15:07:00,890 INFO [RS_REFRESH_PEER-regionserver/asf916:0-1]
> regionserver.ReplicationSource(485): Closing source
> 2-asf916.gq1.ygridcore.net,36287,1525878368395 because: Replication stream
> was removed by a user
> 2018-05-09 15:07:00,892 DEBUG
> [ReplicationExecutor-0.replicationSource,2-asf916.gq1.ygridcore.net,36287,1525878368395.replicationSource.shipperasf916.gq1.ygridcore.net%2C36287%2C1525878368395.asf916.gq1.ygridcore.net%2C36287%2C1525878368395.regiongroup-0,2-asf916.gq1.ygridcore.net,36287,1525878368395]
> zookeeper.ZKWatcher(617): regionserver:34308-0x163456ff2490004,
> quorum=localhost:60149, baseZNode=/1 Received InterruptedException, will
> interrupt current thread and rethrow a SystemErrorException
> java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1406)
> at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:871)
> at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:166)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1231)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1220)
> at
> org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.removeWAL(ZKReplicationQueueStorage.java:198)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$cleanOldLogs$8(ReplicationSourceManager.java:526)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:454)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:526)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:506)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:489)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:231)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:133)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:103)
> 2018-05-09 15:07:00,892 DEBUG
> [ReplicationExecutor-0.replicationSource,2-asf916.gq1.ygridcore.net,36287,1525878368395.replicationSource.shipperasf916.gq1.ygridcore.net%2C36287%2C1525878368395.asf916.gq1.ygridcore.net%2C36287%2C1525878368395.regiongroup-1,2-asf916.gq1.ygridcore.net,36287,1525878368395]
> zookeeper.ZKWatcher(617): regionserver:34308-0x163456ff2490004,
> quorum=localhost:60149, baseZNode=/1 Received InterruptedException, will
> interrupt current thread and rethrow a SystemErrorException
> java.lang.InterruptedException
> at java.lang.Object.wait(Native Method)
> at java.lang.Object.wait(Object.java:502)
> at org.apache.zookeeper.ClientCnxn.submitRequest(ClientCnxn.java:1406)
> at org.apache.zookeeper.ZooKeeper.multiInternal(ZooKeeper.java:990)
> at org.apache.zookeeper.ZooKeeper.multi(ZooKeeper.java:910)
> at
> org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.multi(RecoverableZooKeeper.java:663)
> at
> org.apache.hadoop.hbase.zookeeper.ZKUtil.multiOrSequential(ZKUtil.java:1690)
> at
> org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.setWALPosition(ZKReplicationQueueStorage.java:246)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$logPositionAndCleanOldLogs$7(ReplicationSourceManager.java:487)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:454)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:487)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:231)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:133)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:103)
> 2018-05-09 15:07:00,896 DEBUG
> [RpcServer.default.FPBQ.Fifo.handler=4,queue=0,port=36902]
> master.MasterRpcServices(1141): Checking to see if procedure is done pid=52
> 2018-05-09 15:07:00,898 ERROR
> [ReplicationExecutor-0.replicationSource,2-asf916.gq1.ygridcore.net,36287,1525878368395.replicationSource.shipperasf916.gq1.ygridcore.net%2C36287%2C1525878368395.asf916.gq1.ygridcore.net%2C36287%2C1525878368395.regiongroup-0,2-asf916.gq1.ygridcore.net,36287,1525878368395]
> helpers.MarkerIgnoringBase(159): ***** ABORTING region server
> asf916.gq1.ygridcore.net,34308,1525878368287: Failed to operate on
> replication queue *****
> org.apache.hadoop.hbase.replication.ReplicationException: Failed to remove
> wal from queue (serverName=asf916.gq1.ygridcore.net,34308,1525878368287,
> queueId=2-asf916.gq1.ygridcore.net,36287,1525878368395,
> fileName=asf916.gq1.ygridcore.net%2C36287%2C1525878368395.asf916.gq1.ygridcore.net%2C36287%2C1525878368395.regiongroup-0.1525878371229)
> at
> org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.removeWAL(ZKReplicationQueueStorage.java:202)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.lambda$cleanOldLogs$8(ReplicationSourceManager.java:526)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.abortWhenFail(ReplicationSourceManager.java:454)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:526)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.cleanOldLogs(ReplicationSourceManager.java:506)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager.logPositionAndCleanOldLogs(ReplicationSourceManager.java:489)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.updateLogPosition(ReplicationSourceShipper.java:231)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.shipEdits(ReplicationSourceShipper.java:133)
> at
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceShipper.run(ReplicationSourceShipper.java:103)
> Caused by: org.apache.zookeeper.KeeperException$SystemErrorException:
> KeeperErrorCode = SystemError
> at
> org.apache.hadoop.hbase.zookeeper.ZKWatcher.interruptedException(ZKWatcher.java:608)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1236)
> at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1220)
> at
> org.apache.hadoop.hbase.replication.ZKReplicationQueueStorage.removeWAL(ZKReplicationQueueStorage.java:198)
> ... 8 more
> {noformat}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)