[ 
https://issues.apache.org/jira/browse/HBASE-16350?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406988#comment-15406988
 ] 

Hadoop QA commented on HBASE-16350:
-----------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 
0s {color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s 
{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s 
{color} | {color:red} The patch doesn't appear to include any new or modified 
tests. Please justify why no new tests are needed for this patch. Also please 
list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 
49s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 2s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
3s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
26s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
30s {color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} master passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 41s 
{color} | {color:green} master passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 
58s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 0s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s 
{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 
2s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 
25s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 
0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
39m 52s {color} | {color:green} Patch does not cause any errors with Hadoop 
2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 
50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 39s 
{color} | {color:green} the patch passed with JDK v1.8.0 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 44s 
{color} | {color:green} the patch passed with JDK v1.7.0_80 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 123m 12s 
{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 
33s {color} | {color:green} Patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black} 184m 24s {color} 
| {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
|   | hadoop.hbase.security.visibility.TestVisibilityWithCheckAuths |
|   | hadoop.hbase.regionserver.TestRegionServerMetrics |
| Timed out junit tests | 
org.apache.hadoop.hbase.snapshot.TestMobSecureExportSnapshot |
|   | org.apache.hadoop.hbase.snapshot.TestSecureExportSnapshot |
|   | org.apache.hadoop.hbase.snapshot.TestMobExportSnapshot |
|   | org.apache.hadoop.hbase.snapshot.TestMobRestoreFlushSnapshotFromClient |
|   | org.apache.hadoop.hbase.snapshot.TestMobFlushSnapshotFromClient |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12821936/hbase-16350_v1.patch |
| JIRA Issue | HBASE-16350 |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  hadoopcheck  
hbaseanti  checkstyle  compile  |
| uname | Linux asf910.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP 
PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh
 |
| git revision | master / 9b3bc5f |
| Default Java | 1.7.0_80 |
| Multi-JDK versions |  /usr/local/jenkins/java/jdk1.8.0:1.8.0 
/usr/local/asfpackages/java/jdk1.7.0_80:1.7.0_80 |
| findbugs | v3.0.0 |
| unit | 
https://builds.apache.org/job/PreCommit-HBASE-Build/2917/artifact/patchprocess/patch-unit-hbase-server.txt
 |
| unit test logs |  
https://builds.apache.org/job/PreCommit-HBASE-Build/2917/artifact/patchprocess/patch-unit-hbase-server.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HBASE-Build/2917/testReport/ |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBASE-Build/2917/console |
| Powered by | Apache Yetus 0.2.1   http://yetus.apache.org |


This message was automatically generated.



> Undo server abort from HBASE-14968
> ----------------------------------
>
>                 Key: HBASE-16350
>                 URL: https://issues.apache.org/jira/browse/HBASE-16350
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Enis Soztutar
>            Assignee: Enis Soztutar
>             Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.6, 0.98.21, 1.2.3
>
>         Attachments: hbase-16350_v1.patch
>
>
> This came up during testing the ITBLL on a custom cluster. 
> Master aborts if WAL splitting fails: 
> {code}
> 2016-08-03 19:56:57,693 INFO  
> [B.priority.fifo.QRpcServer.handler=4,queue=0,port=20000] 
> master.ServerManager: Registering 
> server=hbase-bug58335-6.openstacklocal,16020,1470254212713
> 2016-08-03 19:57:00,062 INFO  [main-EventThread] 
> coordination.SplitLogManagerCoordination: task 
> /hbase-secure/splitWAL/WALs%2Fhbase-bug58335-6.openstacklocal%2C16020%2C1470253655579-splitting%2Fhbase-bug58335-6.openstacklocal%252C16020%252C1470253655579.default.1470253660327
>  entered state: ERR hbase-bug58335-4.openstacklocal,16020,1470253592920
> 2016-08-03 19:57:00,064 WARN  [main-EventThread] 
> coordination.SplitLogManagerCoordination: Error splitting 
> /hbase-secure/splitWAL/WALs%2Fhbase-bug58335-6.openstacklocal%2C16020%2C1470253655579-splitting%2Fhbase-bug58335-6.openstacklocal%252C16020%252C1470253655579.default.1470253660327
> 2016-08-03 19:57:00,064 WARN  
> [MASTER_SERVER_OPERATIONS-hbase-bug58335-7:20000-0] master.SplitLogManager: 
> error while splitting logs in 
> [hdfs://hbase-bug58335-7.openstacklocal:8020/apps/hbase/data/WALs/hbase-bug58335-6.openstacklocal,16020,1470253655579-splitting]
>  installed = 1 but only 0 done
> 2016-08-03 19:57:00,065 ERROR 
> [MASTER_SERVER_OPERATIONS-hbase-bug58335-7:20000-0] executor.EventHandler: 
> Caught throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: failed log splitting for 
> hbase-bug58335-6.openstacklocal,16020,1470253655579, will retry
>         at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:357)
>         at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:220)
>         at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: error or interrupted while splitting logs in 
> [hdfs://hbase-bug58335-7.openstacklocal:8020/apps/hbase/data/WALs/hbase-bug58335-6.openstacklocal,16020,1470253655579-splitting]
>  Task = installed = 1 done = 0 error = 1
>         at 
> org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:290)
>         at 
> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:393)
>         at 
> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:366)
>         at 
> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:288)
>         at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:213)
>         ... 4 more
> 2016-08-03 19:57:00,067 FATAL 
> [MASTER_SERVER_OPERATIONS-hbase-bug58335-7:20000-0] master.HMaster: Master 
> server abort: loaded coprocessors are: 
> [org.apache.ranger.authorization.hbase.RangerAuthorizationCoprocessor, 
> org.apache.hadoop.hbase.backup.master.BackupController]
> 2016-08-03 19:57:00,067 FATAL 
> [MASTER_SERVER_OPERATIONS-hbase-bug58335-7:20000-0] master.HMaster: Caught 
> throwable while processing event M_SERVER_SHUTDOWN
> java.io.IOException: failed log splitting for 
> hbase-bug58335-6.openstacklocal,16020,1470253655579, will retry
>         at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.resubmit(ServerShutdownHandler.java:357)
>         at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:220)
>         at 
> org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:129)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>         at java.lang.Thread.run(Thread.java:745)
> Caused by: java.io.IOException: error or interrupted while splitting logs in 
> [hdfs://hbase-bug58335-7.openstacklocal:8020/apps/hbase/data/WALs/hbase-bug58335-6.openstacklocal,16020,1470253655579-splitting]
>  Task = installed = 1 done = 0 error = 1
>         at 
> org.apache.hadoop.hbase.master.SplitLogManager.splitLogDistributed(SplitLogManager.java:290)
>         at 
> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:393)
>         at 
> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:366)
>         at 
> org.apache.hadoop.hbase.master.MasterFileSystem.splitLog(MasterFileSystem.java:288)
>         at 
> org.apache.hadoop.hbase.master.handler.ServerShutdownHandler.process(ServerShutdownHandler.java:213)
>         ... 4 more
> {code}
> If we fail to split the WAL, we normally retry, but it seems that it is doing 
> this by throwing an exception: 
> {code}
>   private void resubmit(final ServerName serverName, IOException ex) throws 
> IOException {
>     // typecast to SSH so that we make sure that it is the SSH instance that
>     // gets submitted as opposed to MSSH or some other derived instance of SSH
>     this.services.getExecutorService().submit((ServerShutdownHandler) this);
>     this.deadServers.add(serverName);
>     throw new IOException("failed log splitting for " + serverName + ", will 
> retry", ex);
>   }
> {code}
> In erie, this patch: HBASE-14968 made the change that if the handler is 
> throwing an uncaught exception, we abort the server: 
> {code}
>   protected void handleException(Throwable t) {
>     String msg = "Caught throwable while processing event " + eventType;
>     LOG.error(msg, t);
>     if (server != null) {
>       server.abort(msg, t);
>     }
>   }
> {code}
> It seems that SSH, and some other handlers are throwing exceptions in valid 
> cases to retry the operations. We can undo the server.abort() call which was 
> originally added in HBASE-14968 for cases where unchecked exceptions like 
> ConcurrentModificationException, etc are not ignored. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to