[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16131089#comment-16131089 ]

Hudson commented on HBASE-18261:

FAILURE: Integrated in Jenkins build HBASE-14070.HLC #233 (See [https://builds.apache.org/job/HBASE-14070.HLC/233/])
HBASE-18261 Created RecoverMetaProcedure and used it from (stack: rev a5db120e6090faecb680f3f1e297f78e567ba3a3)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockNoopMasterServices.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestServerCrashProcedure.java
* (edit) hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* (edit) hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/StateMachineProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterWalManager.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureTestingUtility.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
* (edit) hbase-protocol-shaded/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/generated/MasterProcedureProtos.java

> [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
> --
>
> Key: HBASE-18261
> URL: https://issues.apache.org/jira/browse/HBASE-18261
> Project: HBase
> Issue Type: Improvement
> Components: amv2
> Affects Versions: 2.0.0-alpha-1
> Reporter: Umesh Agashe
> Assignee: Umesh Agashe
> Fix For: 2.0.0-alpha-2
>
> Attachments: hbase-18261.master.001.patch, HBASE-18261.master.001.patch, hbase-18261.master.002.patch, hbase-18261.master.003.patch, hbase-18261.master.004.patch, hbase-18261.master.005.patch
>
>
> When the unit test hbase.master.procedure.TestServerCrashProcedure#testRecoveryAndDoubleExecutionOnRsWithMeta() is enabled and run several times, it fails intermittently. The cause is that meta recovery is done in two different places:
> * ServerCrashProcedure.processMeta()
> * HMaster.finishActiveMasterInitialization()
> and the two are not coordinated.
> When HMaster.finishActiveMasterInitialization() submits splitMetaLog() first, a call from ServerCrashProcedure.processMeta() made while that split is still running fails, causing the step to be retried in a loop.
> When ServerCrashProcedure.processMeta() submits splitMetaLog after the splitMetaLog from HMaster.finishActiveMasterInitialization() has finished, success is returned without doing any work.
> But if ServerCrashProcedure.processMeta() submits the splitMetaLog request and HMaster.finishActiveMasterInitialization() submits it while that request is still in progress, the test fails with an exception.
> [~stack] and I discussed a possible solution:
> Create RecoverMetaProcedure and call it where required. The procedure framework provides mutual exclusion and requires idempotence, which should fix the problem.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
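The issue description above argues that routing both callers through one procedure fixes the race because the procedure framework gives mutual exclusion and demands idempotence. The following is a minimal sketch of those two properties only. It is not HBase code: the class `MetaRecoverySketch`, its `recoverMeta()` method, and the counter standing in for the log split are all hypothetical names invented for illustration, and a plain `ReentrantLock` is swapped in for whatever locking the procedure framework actually uses.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for a single shared meta-recovery step; not HBase code. */
final class MetaRecoverySketch {
  private final ReentrantLock lock = new ReentrantLock(); // mutual exclusion between callers
  private boolean recovered = false;                      // only read or written under the lock
  private final AtomicInteger splitCount = new AtomicInteger();

  /** Safe to call from any number of callers; the placeholder split runs at most once. */
  void recoverMeta() {
    lock.lock();
    try {
      if (recovered) {
        return; // idempotent: a later caller succeeds without redoing the work
      }
      splitCount.incrementAndGet(); // placeholder for the real meta log split
      recovered = true;
    } finally {
      lock.unlock();
    }
  }

  int splitCount() {
    return splitCount.get();
  }

  public static void main(String[] args) throws InterruptedException {
    MetaRecoverySketch recovery = new MetaRecoverySketch();
    ExecutorService pool = Executors.newFixedThreadPool(2);
    pool.execute(recovery::recoverMeta); // stands in for the crash-handling caller
    pool.execute(recovery::recoverMeta); // stands in for the master-startup caller
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
    System.out.println("splits performed: " + recovery.splitCount());
  }
}
```

Whichever caller arrives first does the work, and the second one blocks until the first finishes and then returns success, so the run reports one split regardless of thread scheduling. Neither caller can fail or loop because the other is mid-flight, which is the behavior the three failure cases in the description lack.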
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108220#comment-16108220 ]

Hadoop QA commented on HBASE-18261:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 18s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 31s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 55s | master passed |
| +1 | compile | 1m 29s | master passed |
| +1 | checkstyle | 10m 11s | master passed |
| +1 | mvneclipse | 0m 29s | master passed |
| -1 | findbugs | 2m 4s | hbase-protocol-shaded in master has 27 extant Findbugs warnings. |
| +1 | javadoc | 0m 55s | master passed |
| 0 | mvndep | 0m 18s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 34s | the patch passed |
| +1 | compile | 1m 34s | the patch passed |
| +1 | cc | 1m 34s | the patch passed |
| +1 | javac | 1m 34s | the patch passed |
| +1 | checkstyle | 10m 30s | the patch passed |
| +1 | mvneclipse | 0m 32s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | hadoopcheck | 31m 2s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 1m 22s | the patch passed |
| +1 | findbugs | 7m 24s | the patch passed |
| +1 | javadoc | 0m 59s | the patch passed |
| +1 | unit | 0m 37s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 3m 17s | hbase-procedure in the patch passed. |
| +1 | unit | 123m 9s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 46s | The patch does not generate ASF License warnings. |
| | | 207m 27s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.13.1 Server=1.13.1 Image:yetus/hbase:bdc94b1 |
| JIRA Issue | HBASE-18261 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879718/hbase-18261.master.005.patch |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile cc hbaseprotoc |
| uname | Linux 89e9fa058dae 4.4.0-43-generic #63-Ubuntu SMP Wed Oct 12 13:48:03 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108208#comment-16108208 ]

Hudson commented on HBASE-18261:

FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #3470 (See [https://builds.apache.org/job/HBase-Trunk_matrix/3470/])
HBASE-18261 Created RecoverMetaProcedure and used it from (stack: rev a5db120e6090faecb680f3f1e297f78e567ba3a3)
* (edit) hbase-protocol-shaded/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/generated/MasterProcedureProtos.java
* (edit) hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterWalManager.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
* (edit) hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/StateMachineProcedure.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureTestingUtility.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestServerCrashProcedure.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockNoopMasterServices.java
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108189#comment-16108189 ]

Hudson commented on HBASE-18261:

FAILURE: Integrated in Jenkins build HBase-2.0 #269 (See [https://builds.apache.org/job/HBase-2.0/269/])
HBASE-18261 Created RecoverMetaProcedure and used it from (stack: rev 7bdabed275bfba3c215fdba8847cf61fe53abf96)
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterMetaBootstrap.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* (add) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/RecoverMetaProcedure.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/MasterProcedureTestingUtility.java
* (edit) hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/StateMachineProcedure.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestMasterNoCluster.java
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ServerCrashProcedure.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/procedure/TestServerCrashProcedure.java
* (edit) hbase-server/src/test/java/org/apache/hadoop/hbase/master/MockNoopMasterServices.java
* (edit) hbase-protocol-shaded/src/main/protobuf/MasterProcedure.proto
* (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterWalManager.java
* (edit) hbase-protocol-shaded/src/main/java/org/apache/hadoop/hbase/shaded/protobuf/generated/MasterProcedureProtos.java
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16108022#comment-16108022 ]

Umesh Agashe commented on HBASE-18261:
--

Thanks for reviewing and pushing the changes, [~stack]!
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102662#comment-16102662 ]

Hadoop QA commented on HBASE-18261:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 4m 9s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 32s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 28s | master passed |
| +1 | compile | 1m 24s | master passed |
| +1 | checkstyle | 12m 10s | master passed |
| +1 | mvneclipse | 0m 35s | master passed |
| -1 | findbugs | 2m 10s | hbase-protocol-shaded in master has 27 extant Findbugs warnings. |
| +1 | javadoc | 0m 45s | master passed |
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 1m 19s | the patch passed |
| +1 | cc | 1m 19s | the patch passed |
| +1 | javac | 1m 19s | the patch passed |
| +1 | checkstyle | 12m 16s | the patch passed |
| +1 | mvneclipse | 0m 33s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | hadoopcheck | 32m 40s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 1m 6s | the patch passed |
| +1 | findbugs | 6m 28s | the patch passed |
| +1 | javadoc | 0m 45s | the patch passed |
| +1 | unit | 0m 30s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 3m 2s | hbase-procedure in the patch passed. |
| +1 | unit | 115m 54s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 48s | The patch does not generate ASF License warnings. |
| | | 206m 22s | |

|| Subsystem || Report/Notes ||
| Docker | Client=1.12.3 Server=1.12.3 Image:yetus/hbase:bdc94b1 |
| JIRA Issue | HBASE-18261 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12879082/hbase-18261.master.004.patch |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile cc hbaseprotoc |
| uname | Linux 81f06248a8d4 3.13.0-119-generic #166-Ubuntu SMP Wed May 3 12:18:55 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102494#comment-16102494 ]

Hadoop QA commented on HBASE-18261:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 4m 29s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 14s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 13s | master passed |
| +1 | compile | 1m 20s | master passed |
| +1 | checkstyle | 11m 31s | master passed |
| +1 | mvneclipse | 0m 31s | master passed |
| -1 | findbugs | 2m 0s | hbase-protocol-shaded in master has 27 extant Findbugs warnings. |
| +1 | javadoc | 0m 47s | master passed |
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 35s | the patch passed |
| +1 | compile | 1m 23s | the patch passed |
| +1 | cc | 1m 23s | the patch passed |
| +1 | javac | 1m 23s | the patch passed |
| +1 | checkstyle | 12m 24s | the patch passed |
| +1 | mvneclipse | 0m 39s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | hadoopcheck | 31m 50s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 1m 11s | the patch passed |
| -1 | findbugs | 2m 54s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 45s | the patch passed |
| +1 | unit | 0m 29s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 2m 57s | hbase-procedure in the patch passed. |
| +1 | unit | 135m 25s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 50s | The patch does not generate ASF License warnings. |
| | | 224m 11s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | Exception is caught when Exception is not thrown in org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure.executeFromState(MasterProcedureEnv, MasterProcedureProtos$RecoverMetaState) At RecoverMetaProcedure.java:is not thrown in org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure.executeFromState(MasterProcedureEnv, MasterProcedureProtos$RecoverMetaState) At RecoverMetaProcedure.java:[line 144] |
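The new FindBugs warning in the report above, "Exception is caught when Exception is not thrown", is raised when a `try` block is wrapped in `catch (Exception e)` even though nothing inside it throws plain `Exception`. The report does not show the code at line 144 of RecoverMetaProcedure.java, so the sketch below is a generic illustration of that warning category only. The class `CatchNarrowingSketch` and its methods are hypothetical and do not reproduce the patch.

```java
import java.io.IOException;

/** Hypothetical illustration of the warning category; not taken from the patch. */
final class CatchNarrowingSketch {
  /** Placeholder step whose only declared failure mode is IOException. */
  static void splitLog(boolean fail) throws IOException {
    if (fail) {
      throw new IOException("simulated split failure");
    }
  }

  /** Shape that static analysis tends to flag: the catch is wider than what is thrown. */
  static String broadCatch(boolean fail) {
    try {
      splitLog(fail);
      return "done";
    } catch (Exception e) { // would also hide unrelated RuntimeExceptions
      return "retry";
    }
  }

  /** Narrowed shape: only the declared checked exception is handled. */
  static String narrowCatch(boolean fail) {
    try {
      splitLog(fail);
      return "done";
    } catch (IOException e) {
      return "retry";
    }
  }

  public static void main(String[] args) {
    System.out.println(narrowCatch(false)); // success path
    System.out.println(narrowCatch(true));  // failure path maps to a retry
  }
}
```

Both methods behave the same for the declared `IOException`. The difference is that the narrowed version lets an unexpected runtime failure propagate instead of silently turning it into a retry.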
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102281#comment-16102281 ]

Sean Busbey commented on HBASE-18261:
-

FYI, [~appy] and I have been trying to fix handling of the flaky test list in precommit and this JIRA happens to be the one mentioned in our test runs.
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16102233#comment-16102233 ]

Hadoop QA commented on HBASE-18261:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 16s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 48s | master passed |
| +1 | compile | 1m 32s | master passed |
| +1 | checkstyle | 11m 49s | master passed |
| +1 | mvneclipse | 0m 35s | master passed |
| -1 | findbugs | 2m 13s | hbase-protocol-shaded in master has 27 extant Findbugs warnings. |
| +1 | javadoc | 0m 45s | master passed |
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 1m 18s | the patch passed |
| +1 | cc | 1m 18s | the patch passed |
| +1 | javac | 1m 18s | the patch passed |
| +1 | checkstyle | 11m 58s | the patch passed |
| +1 | mvneclipse | 0m 32s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | hadoopcheck | 29m 46s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 1m 4s | the patch passed |
| -1 | findbugs | 3m 0s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 44s | the patch passed |
| +1 | unit | 0m 30s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 2m 53s | hbase-procedure in the patch passed. |
| +1 | unit | 120m 14s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 46s | The patch does not generate ASF License warnings. |
| | | 203m 2s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | Exception is caught when Exception is not thrown in org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure.executeFromState(MasterProcedureEnv, MasterProcedureProtos$RecoverMetaState) At RecoverMetaProcedure.java:[line 144] |
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101927#comment-16101927 ] Umesh Agashe commented on HBASE-18261: -- Fixed unit tests. trying with new patch.
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16101051#comment-16101051 ] Umesh Agashe commented on HBASE-18261: -- Looking... bunch of tests failed and/or timed out.
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100974#comment-16100974 ]

Hadoop QA commented on HBASE-18261:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 4 new or modified test files. |
| 0 | mvndep | 0m 29s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 15s | master passed |
| +1 | compile | 1m 20s | master passed |
| +1 | checkstyle | 11m 36s | master passed |
| +1 | mvneclipse | 0m 31s | master passed |
| -1 | findbugs | 1m 59s | hbase-protocol-shaded in master has 27 extant Findbugs warnings. |
| +1 | javadoc | 0m 45s | master passed |
| 0 | mvndep | 0m 16s | Maven dependency ordering for patch |
| +1 | mvninstall | 1m 26s | the patch passed |
| +1 | compile | 1m 18s | the patch passed |
| +1 | cc | 1m 18s | the patch passed |
| +1 | javac | 1m 18s | the patch passed |
| +1 | checkstyle | 11m 33s | the patch passed |
| +1 | mvneclipse | 0m 31s | the patch passed |
| -1 | whitespace | 0m 0s | The patch has 2 line(s) that end in whitespace. Use git apply --whitespace=fix <>. Refer https://git-scm.com/docs/git-apply |
| +1 | hadoopcheck | 29m 37s | Patch does not cause any errors with Hadoop 2.6.1 2.6.2 2.6.3 2.6.4 2.6.5 2.7.1 2.7.2 2.7.3 or 3.0.0-alpha4. |
| +1 | hbaseprotoc | 1m 5s | the patch passed |
| -1 | findbugs | 3m 12s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| +1 | javadoc | 0m 47s | the patch passed |
| +1 | unit | 0m 31s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 3m 0s | hbase-procedure in the patch passed. |
| -1 | unit | 75m 17s | hbase-server in the patch failed. |
| +1 | asflicense | 1m 39s | The patch does not generate ASF License warnings. |
| | | 157m 37s | |

|| Reason || Tests ||
| FindBugs | module:hbase-server |
| | Exception is caught when Exception is not thrown in org.apache.hadoop.hbase.master.procedure.RecoverMetaProcedure.executeFromState(MasterProcedureEnv, MasterProcedureProtos$RecoverMetaState) At RecoverMetaProcedure.java:[line 144] |
| Failed
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100804#comment-16100804 ]

Hadoop QA commented on HBASE-18261:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 5s | HBASE-18261 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.4.0/precommit-patchnames for help. |

|| Subsystem || Report/Notes ||
| JIRA Issue | HBASE-18261 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12878905/hbase-18261.master.001.patch |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/7785/console |
| Powered by | Apache Yetus 0.4.0 http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16100800#comment-16100800 ] Umesh Agashe commented on HBASE-18261: -- Submitted patch with new procedure 'RecoverMetaProcedure' that can be used by any code to initialize/ recover meta before accessing it. It avoids duplication and handles synchronization and thread-safety issues.
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16086469#comment-16086469 ]

Umesh Agashe commented on HBASE-18261:
--------------------------------------

Hi [~stack], [~yangzhe1991]:

FWICS, here is the root cause:

The UT tests ServerCrashProcedure when the RS carrying the meta region crashes. It also simulates a master crash after executing each step in the procedure. Initially all RSs are at the same version, i.e. 3.0.0-SNAPSHOT. HMaster.getRegionServerVersion() returns version 0.0.0 for the dead RS (the one carrying meta). This makes AssignmentManager.getExcludedServersForSystemTable() return a non-empty list, which triggers the logic in AssignmentManager.checkIfShouldMoveSystemRegionAsync(). That in turn submits a MoveRegionProcedure to move the meta region from the RS with version 0.0.0 to one of the other RSs with the latest version. As commented before, this causes a race condition between the scan and the MoveRegionProcedure.

AssignmentManager.getExcludedServersForSystemTable() uses master.getServerManager().getOnlineServersList() to get the list of online servers only. But on further scrutiny of the code and logs I found that a server can be online and dead at the same time!

IMO,
* Currently meta is re/assigned from ServerCrashProcedure, during master initialization from MasterMetaBootstrap, and then again in checkIfShouldMoveSystemRegionAsync().
* That means meta re/assignment may be attempted up to 3 times in certain conditions.
* I am working on HBASE-18261 to have the meta recovery/assignment logic in one place.
* I think we can pull the changes for assigning meta to the RS with the highest version number in there.
* As a result, the RS with the highest version number will be considered for meta region assignment:
# when the RS carrying the meta region crashes
# during master startup

Along with the above changes, we obviously need to fix ServerManager.isServerOnline() and ServerManager.isServerDead() returning true at the same time. This could be a result of the test code simulating a crash, but the class itself should not allow this case (IMHO).

I have the following fix ready (and tested). It fixes the test, but I don't consider it a long-term fix.

{code}
diff --git a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
index 046612a..1a2d53b 100644
--- a/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
+++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/master/assignment/AssignmentManager.java
@@ -1760,6 +1760,7 @@ public class AssignmentManager implements ServerListener {
   public List<ServerName> getExcludedServersForSystemTable() {
     List<Pair<ServerName, String>> serverList = master.getServerManager().getOnlineServersList()
         .stream()
+        .filter((s)->!master.getServerManager().isServerDead(s))
         .map((s)->new Pair<>(s, master.getRegionServerVersion(s)))
         .collect(Collectors.toList());
     if (serverList.isEmpty()) {
{code}

[~stack], as you have suggested, we can disable the test for now. When we agree on a fix, we can enable it. Let me know your thoughts.

Thanks!
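The interim fix in the comment above comes down to one extra stream filter: drop any server that is reported online but is also marked dead before looking at versions. Below is a minimal standalone Java sketch of that filtering step. It deliberately avoids HBase types; OnlineServerFilterSketch, liveServers, and the string server names are hypothetical, and the set lookup is an assumed stand-in for the ServerManager.isServerDead() check.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

/** Hypothetical illustration of filtering out servers that are both online and dead. */
final class OnlineServerFilterSketch {

  /** Returns the servers reported online that are not also marked dead, in input order. */
  static List<String> liveServers(List<String> online, Set<String> dead) {
    return online.stream()
        .filter(s -> !dead.contains(s)) // assumed stand-in for !isServerDead(s)
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<String> online = Arrays.asList("rs1", "rs2", "rs3");
    // rs2 crashed but is still present in the online list, as observed in the test.
    Set<String> dead = new HashSet<>(Collections.singletonList("rs2"));
    System.out.println(liveServers(online, dead)); // prints [rs1, rs3]
  }
}
```

As the comment itself notes, filtering only hides the inconsistency; a server being online and dead at the same time still needs to be ruled out at the source.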
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16065230#comment-16065230 ] Umesh Agashe commented on HBASE-18261: -- Thanks [~stack]! I will create a new issue to fix unit test and will continue working on RecoverMetaProcedure with this JIRA.
[jira] [Commented] (HBASE-18261) [AMv2] Create new RecoverMetaProcedure and use it from ServerCrashProcedure and HMaster.finishActiveMasterInitialization()
[ https://issues.apache.org/jira/browse/HBASE-18261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16063951#comment-16063951 ] stack commented on HBASE-18261: --- This patch is great. Belongs in a new issue, a subprocedure? Make new JIRA and I'll push it. Nice one [~uagashe]