[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510346#comment-16510346 ]

Hudson commented on HBASE-20700:

Results for branch branch-2.0
[build #421 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/421/]: (/) *+1 overall*

details (if available):

(/) +1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/421//General_Nightly_Build_Report/]

(/) +1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/421//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) +1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/421//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
> Issue Type: Sub-task
> Components: master, proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-branch-2.0-v1.patch,
> HBASE-20700-branch-2.0-v1.patch, HBASE-20700-branch-2.0.patch,
> HBASE-20700-v1.patch, HBASE-20700-v2.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509185#comment-16509185 ]

Hadoop QA commented on HBASE-20700:
---

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || branch-2.0 Compile Tests ||
| 0 | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 40s | branch-2.0 passed |
| +1 | compile | 2m 0s | branch-2.0 passed |
| +1 | checkstyle | 1m 27s | branch-2.0 passed |
| +1 | shadedjars | 4m 12s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 41s | branch-2.0 passed |
| +1 | javadoc | 0m 43s | branch-2.0 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 41s | the patch passed |
| +1 | compile | 2m 2s | the patch passed |
| +1 | javac | 2m 2s | the patch passed |
| +1 | checkstyle | 0m 15s | The patch hbase-procedure passed checkstyle |
| +1 | checkstyle | 1m 13s | hbase-server: The patch generated 0 new + 274 unchanged - 4 fixed = 274 total (was 278) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 22s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 11m 38s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 3m 4s | the patch passed |
| +1 | javadoc | 0m 44s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 45s | hbase-procedure in the patch passed. |
| +1 | unit | 124m 16s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 171m 0s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927415/HBASE-20700-branch-2.0-v1.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux c20df5b21f39 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/je
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508321#comment-16508321 ]

Hudson commented on HBASE-20700:

Results for branch branch-2
[build #850 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/850/]: (x) *-1 overall*

details (if available):

(/) +1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/850//General_Nightly_Build_Report/]

(/) +1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/850//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) -1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/850//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
> Issue Type: Sub-task
> Components: master, proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-branch-2.0-v1.patch,
> HBASE-20700-branch-2.0.patch, HBASE-20700-v1.patch, HBASE-20700-v2.patch,
> HBASE-20700.patch
>
>
> As said in HBASE-20682.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508255#comment-16508255 ]

Hadoop QA commented on HBASE-20700:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || branch-2.0 Compile Tests ||
| 0 | mvndep | 0m 21s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 16s | branch-2.0 passed |
| +1 | compile | 1m 47s | branch-2.0 passed |
| +1 | checkstyle | 1m 13s | branch-2.0 passed |
| +1 | shadedjars | 3m 40s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 14s | branch-2.0 passed |
| +1 | javadoc | 0m 39s | branch-2.0 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 9s | the patch passed |
| +1 | compile | 1m 48s | the patch passed |
| +1 | javac | 1m 48s | the patch passed |
| +1 | checkstyle | 0m 13s | The patch hbase-procedure passed checkstyle |
| +1 | checkstyle | 1m 2s | hbase-server: The patch generated 0 new + 274 unchanged - 4 fixed = 274 total (was 278) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 32s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 3s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 25s | the patch passed |
| +1 | javadoc | 0m 34s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 32s | hbase-procedure in the patch passed. |
| -1 | unit | 106m 29s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 27s | The patch does not generate ASF License warnings. |
| | | 146m 32s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.procedure.TestProcedurePriority |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927308/HBASE-20700-branch-2.0-v1.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 2e6d02e6ddaf 4.4.0-43-generic #63-Ubuntu SMP Wed
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508150#comment-16508150 ]

Hudson commented on HBASE-20700:

Results for branch master
[build #362 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/362/]: (x) *-1 overall*

details (if available):

(/) +1 general checks
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/362//General_Nightly_Build_Report/]

(/) +1 jdk8 hadoop2 checks
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/362//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) -1 jdk8 hadoop3 checks
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/362//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) +1 source release artifact
-- See build output for details.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
> Issue Type: Sub-task
> Components: master, proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-branch-2.0-v1.patch,
> HBASE-20700-branch-2.0.patch, HBASE-20700-v1.patch, HBASE-20700-v2.patch,
> HBASE-20700.patch
>
>
> As said in HBASE-20682.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507908#comment-16507908 ]

Hadoop QA commented on HBASE-20700:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || branch-2.0 Compile Tests ||
| 0 | mvndep | 0m 19s | Maven dependency ordering for branch |
| +1 | mvninstall | 3m 57s | branch-2.0 passed |
| +1 | compile | 2m 9s | branch-2.0 passed |
| +1 | checkstyle | 1m 27s | branch-2.0 passed |
| +1 | shadedjars | 4m 11s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 31s | branch-2.0 passed |
| +1 | javadoc | 0m 50s | branch-2.0 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 12s | Maven dependency ordering for patch |
| +1 | mvninstall | 3m 58s | the patch passed |
| +1 | compile | 2m 9s | the patch passed |
| -1 | javac | 1m 47s | hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) |
| +1 | checkstyle | 0m 14s | The patch hbase-procedure passed checkstyle |
| +1 | checkstyle | 1m 15s | hbase-server: The patch generated 0 new + 274 unchanged - 4 fixed = 274 total (was 278) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 11s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 11m 34s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 49s | the patch passed |
| +1 | javadoc | 0m 44s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 37s | hbase-procedure in the patch passed. |
| -1 | unit | 121m 2s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 40s | The patch does not generate ASF License warnings. |
| | | 168m 25s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927264/HBASE-20700-branch-2.0.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| u
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507731#comment-16507731 ]

Duo Zhang commented on HBASE-20700:
---

Thanks sir. Let me commit to master and branch-2 first. 2.0 needs a separate patch as it does not have a peer queue.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
> Issue Type: Sub-task
> Components: master, proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch,
> HBASE-20700-v2.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507650#comment-16507650 ]

stack commented on HBASE-20700:
---

None. You answered my concerns. Skimmed patch +1 (+1 for branch-2.0 too. Thanks).

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
> Issue Type: Sub-task
> Components: master, proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch,
> HBASE-20700-v2.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506885#comment-16506885 ]

Duo Zhang commented on HBASE-20700:
---

Any other concerns? [~stack] Thanks.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
> Issue Type: Sub-task
> Components: master, proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch,
> HBASE-20700-v2.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506829#comment-16506829 ]

Hadoop QA commented on HBASE-20700:
---

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 1s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 25s | Maven dependency ordering for branch |
| +1 | mvninstall | 5m 42s | master passed |
| +1 | compile | 2m 26s | master passed |
| +1 | checkstyle | 1m 41s | master passed |
| +1 | shadedjars | 5m 25s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 51s | master passed |
| +1 | javadoc | 0m 49s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 17s | Maven dependency ordering for patch |
| +1 | mvninstall | 5m 57s | the patch passed |
| +1 | compile | 2m 36s | the patch passed |
| +1 | javac | 2m 36s | the patch passed |
| +1 | checkstyle | 0m 14s | The patch hbase-procedure passed checkstyle |
| +1 | checkstyle | 1m 24s | hbase-server: The patch generated 0 new + 280 unchanged - 4 fixed = 280 total (was 284) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 5m 44s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 11m 10s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 3m 10s | the patch passed |
| +1 | javadoc | 0m 45s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 45s | hbase-procedure in the patch passed. |
| -1 | unit | 109m 41s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 164m 47s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927150/HBASE-20700-v2.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 9eac7ba0c1a2 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506795#comment-16506795 ]

Duo Zhang commented on HBASE-20700:
---

{quote}
I worry about the below check of ONLINE. Is it too specific?
971 if (serverNode.isInState(ServerState.SPLITTING, ServerState.OFFLINE)) {
972 if (!serverNode.isInState(ServerState.ONLINE)) {
We can see I suppose. Would be good if we could get away with it.
{quote}
I think this is the common case. If the server is not in state ONLINE, then there is an SCP for it, which means it has already crashed.

{quote}
I'm wary of calls to this method below setting server state inside setServerState because it will create the server node if it doesn't exist (it may not exist because it has been processed by SCP). If we call the below after SCP is done w/ it, the server comes back to life. You sure we will not do this?
{quote}
These methods will only be called in SCP, and at the end of SCP we call removeServer to remove the ServerStateNode. Let me add some comments.

{quote}
What is the lifecycle for a server node now? ONLINE => SPLITTING => OFFLINE is what it used to be. It can still do this? But it can also go ONLINE => META_SPLITTING => META_SPLITTING_DONE => SPLITTING => OFFLINE? We might want to note this somewhere. Not obvious.
{quote}
If the server is not carrying meta, then ONLINE => SPLITTING => OFFLINE; otherwise ONLINE => META_SPLITTING => META_SPLITTING_DONE => SPLITTING => OFFLINE. I've added comments in UnassignProcedure to say why we need these states. We can only fail an unassign after we make sure that the log splitting is finished; otherwise we may schedule an AssignProcedure, which will cause data loss. And for unassigning meta, the SCP will wait until the RMP is finished before splitting other logs, so if we do not introduce special states for meta splitting, we will be stuck there forever.

{quote}
Oh... this is interesting adding the synchronized
public synchronized void remoteCallFailed(final MasterProcedureEnv env, ...
Up to this we've been synchronizing on the objects whose state we change. What you thinking by adding the synchronize? I can't see anything wrong w/ it.
{quote}
It could be called from two places: one is the RemoteProcedureScheduler, where the remote call has failed, and the other is SCP or RMP's handleRIT. I think there is no strong guarantee that they will not happen at the same time, so it is better to make the method synchronized.

{quote}
If MoveRegionProcedure gets scheduled before RecoverMetaProcedure, what happens now?
{quote}
Now the RMP will not hold the same lock as the MRP, so it can break the execution of the UnassignProcedure scheduled by the MRP. Also, if the UnassignProcedure is scheduled after we call handleRIT, then the isLogSplittingDone call in remoteCallFailed will find that the meta log splitting has already been done and give up. So there will be no deadlock any more.

{quote}
s/MetaProcedureInterface/MetaProcedure/
{quote}
Just following the existing pattern: we have TableProcedureInterface, RegionProcedureInterface, ServerProcedureInterface, etc.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
> Issue Type: Sub-task
> Components: master, proc-v2, Region Assignment
> Reporter: Duo Zhang
> Assignee: Duo Zhang
> Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch,
> HBASE-20700.patch
>
>
> As said in HBASE-20682.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
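For readers following the thread, the server node lifecycle discussed in the comment above (ONLINE => SPLITTING => OFFLINE, with the extra META_SPLITTING and META_SPLITTING_DONE states when the crashed server carried meta) can be sketched roughly as below. This is only an illustration, not the code in the attached patches: the class names SketchServerState, SketchServerNode and ServerLifecycleSketch are made up, and the rule inside isLogSplittingDone (meta counts as done from META_SPLITTING_DONE onward, other regions only at OFFLINE) is an assumption inferred from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; a sketch of the lifecycle described in the
// comment, not HBase's actual ServerState / ServerStateNode classes.
enum SketchServerState { ONLINE, META_SPLITTING, META_SPLITTING_DONE, SPLITTING, OFFLINE }

final class SketchServerNode {
  private SketchServerState state = SketchServerState.ONLINE;

  // True if the current state is one of the given candidates.
  synchronized boolean isInState(SketchServerState... candidates) {
    for (SketchServerState candidate : candidates) {
      if (candidate == state) {
        return true;
      }
    }
    return false;
  }

  synchronized void setState(SketchServerState next) {
    state = next;
  }

  // Assumed rule: meta log splitting counts as finished from META_SPLITTING_DONE
  // onward, while other regions must wait until all log splitting is finished.
  synchronized boolean isLogSplittingDone(boolean meta) {
    return meta
        ? isInState(SketchServerState.META_SPLITTING_DONE, SketchServerState.SPLITTING,
            SketchServerState.OFFLINE)
        : isInState(SketchServerState.OFFLINE);
  }
}

final class ServerLifecycleSketch {
  // Returns the ordered states a crashed server passes through.
  static List<SketchServerState> lifecycle(boolean carryingMeta) {
    List<SketchServerState> path = new ArrayList<>();
    path.add(SketchServerState.ONLINE);
    if (carryingMeta) {
      path.add(SketchServerState.META_SPLITTING);
      path.add(SketchServerState.META_SPLITTING_DONE);
    }
    path.add(SketchServerState.SPLITTING);
    path.add(SketchServerState.OFFLINE);
    return path;
  }
}
```

Under these assumptions, an unassign of meta could be failed as soon as the node reaches META_SPLITTING_DONE instead of waiting for OFFLINE, which is the point of the extra states: without them the unassign of meta and the crash handling would wait on each other.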
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506240#comment-16506240 ] Hadoop QA commented on HBASE-20700: ---
| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 1s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 44s | master passed |
| +1 | compile | 2m 1s | master passed |
| +1 | checkstyle | 1m 25s | master passed |
| +1 | shadedjars | 4m 51s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 22s | master passed |
| +1 | javadoc | 0m 43s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 47s | the patch passed |
| +1 | compile | 2m 3s | the patch passed |
| +1 | javac | 2m 3s | the patch passed |
| +1 | checkstyle | 0m 12s | The patch hbase-procedure passed checkstyle |
| +1 | checkstyle | 1m 13s | hbase-server: The patch generated 0 new + 280 unchanged - 4 fixed = 280 total (was 284) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 46s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 55s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 31s | the patch passed |
| +1 | javadoc | 0m 41s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 41s | hbase-procedure in the patch passed. |
| +1 | unit | 115m 33s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 42s | The patch does not generate ASF License warnings. |
| | | 162m 48s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927050/HBASE-20700-v1.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 9578c0a80b1e 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HB
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506235#comment-16506235 ] stack commented on HBASE-20700: ---
Looking at the patch, I worry about the below check of ONLINE. Is it too specific?
971 if (serverNode.isInState(ServerState.SPLITTING, ServerState.OFFLINE)) {
972 if (!serverNode.isInState(ServerState.ONLINE)) {
We can see, I suppose. Would be good if we could get away with it.
I'm wary of calls to the method below, setting server state inside setServerState, because it will create the server node if it doesn't exist (it may not exist because it has been processed by SCP). If we call the below after SCP is done w/ it, the server comes back to life. You sure we will not do this?
ServerStateNode serverNode = getOrCreateServer(serverName);
What is the lifecycle for a server node now? ONLINE => SPLITTING => OFFLINE is what it used to be. Can it still do this? But can it also go ONLINE => META_SPLITTING => META_SPLITTING_DONE => SPLITTING => OFFLINE? We might want to note this somewhere. Not obvious.
Oh... this is interesting, adding the synchronized:
public synchronized void remoteCallFailed(final MasterProcedureEnv env, ...
Up to this we've been synchronizing on the objects whose state we change. What are you thinking by adding the synchronized? I can't see anything wrong w/ it.
If MoveRegionProcedure gets scheduled before RecoverMetaProcedure, what happens now?
No need of Evolving if Private:
23 @InterfaceAudience.Private
24 @InterfaceStability.Evolving
s/MetaProcedureInterface/MetaProcedure/
getMetaOperationType is not used? But makes sense I suppose. You are following pattern.
Otherwise, nice cleanup and appreciate the doc -- especially the edit by another.
> Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task > Components: master, proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch, > HBASE-20700.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
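The two server-node lifecycles asked about in the review comment above can be written down as a small transition table. This is only a sketch under stated assumptions: the state names are copied from the comment as written, the class ServerLifecycleSketch and its methods are hypothetical, and the allowed transitions are inferred from the two paths named there rather than taken from the patch.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the lifecycle; not the HBase ServerState enum.
class ServerLifecycleSketch {
    enum State { ONLINE, META_SPLITTING, META_SPLITTING_DONE, SPLITTING, OFFLINE }

    // Assumed transition table covering both paths named in the comment.
    private static final Map<State, Set<State>> NEXT = new EnumMap<>(State.class);
    static {
        NEXT.put(State.ONLINE, EnumSet.of(State.META_SPLITTING, State.SPLITTING));
        NEXT.put(State.META_SPLITTING, EnumSet.of(State.META_SPLITTING_DONE));
        NEXT.put(State.META_SPLITTING_DONE, EnumSet.of(State.SPLITTING));
        NEXT.put(State.SPLITTING, EnumSet.of(State.OFFLINE));
        NEXT.put(State.OFFLINE, EnumSet.noneOf(State.class));
    }

    static boolean canMove(State from, State to) {
        return NEXT.get(from).contains(to);
    }

    // True when every consecutive pair in the path is an allowed transition.
    static boolean isValidPath(State... path) {
        for (int i = 1; i < path.length; i++) {
            if (!canMove(path[i - 1], path[i])) {
                return false;
            }
        }
        return true;
    }
}
```

In this model OFFLINE has no outgoing transitions, which mirrors the worry above that a finished server node should not be brought back to life by a later state update.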
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506189#comment-16506189 ] stack commented on HBASE-20700: --- bq. Of course the logic here is not clear enough I'd say and there maybe races, we can file new issue to fix it. Yes. Ok. bq. And for the new lock, we need to prevent two RMPs run at the same time so I think we need it... This makes sense. > Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task > Components: master, proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch, > HBASE-20700.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506036#comment-16506036 ] Duo Zhang commented on HBASE-20700: --- Yes, the new 'meta' queue will always be served first. And in RecoverMetaProcedure we do the log splitting work and assign meta, so ideally there is no problem if meta was on a crashed RS before. Of course the logic here is not clear enough, I'd say, and there may be races; we can file a new issue to fix it. And for the new lock, we need to prevent two RMPs from running at the same time, so I think we need it... > Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task > Components: master, proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20700-UT.patch, HBASE-20700.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506012#comment-16506012 ] stack commented on HBASE-20700: --- bq. ... and after the meta region is online, we disable the flag and then everything back to normal. What happens on crash of a server that was carrying hbase:meta? On new queue, do we favor it? Does it get serviced before all others? Do we slow the scheduler? On the new type of lock, do we even need it if only RMP knows of it? Thanks D. > Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task > Components: master, proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20700-UT.patch, HBASE-20700.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505978#comment-16505978 ] Hadoop QA commented on HBASE-20700: ---
| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 36s | master passed |
| +1 | compile | 2m 1s | master passed |
| +1 | checkstyle | 1m 26s | master passed |
| +1 | shadedjars | 4m 49s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 24s | master passed |
| +1 | javadoc | 0m 41s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 40s | the patch passed |
| +1 | compile | 2m 2s | the patch passed |
| +1 | javac | 2m 2s | the patch passed |
| -1 | checkstyle | 1m 12s | hbase-server: The patch generated 1 new + 280 unchanged - 4 fixed = 281 total (was 284) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 47s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 49s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 45s | the patch passed |
| +1 | javadoc | 0m 42s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 37s | hbase-procedure in the patch passed. |
| -1 | unit | 110m 13s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 157m 3s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestServerCrashProcedureCarryingMetaStuck |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927035/HBASE-20700.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 179178fdc30f 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revis
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505904#comment-16505904 ] Duo Zhang commented on HBASE-20700: --- Review board link: https://reviews.apache.org/r/67500/ > Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task > Components: master, proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20700-UT.patch, HBASE-20700.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505842#comment-16505842 ] Duo Zhang commented on HBASE-20700: --- OK, #3 does not work because RecoverMetaProcedure schedules sub-procedures to assign the meta region. Since #1 and #2 are enough to make the above UT pass, let me post the patch here. Will open a new issue to make the startup clearer. > Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task > Components: master, proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20700-UT.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505792#comment-16505792 ] Duo Zhang commented on HBASE-20700: ---
OK, it seems that letting meta come online first can make life much easier... So I think we could do the following:
1. Introduce a new type of queue in MasterProcedureScheduler called meta; only RecoverMetaProcedure can be put into this queue.
2. Introduce a new type of lock for RMP and stop using the table lock, so that RMP will not be blocked by MRP.
3. Start MasterProcedureScheduler with a flag that only enables polling procedures out of the meta queue; after the meta region is online, we disable the flag and everything goes back to normal.
> Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task > Components: master, proc-v2, Region Assignment >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Critical > Fix For: 3.0.0, 2.1.0, 2.0.1 > > Attachments: HBASE-20700-UT.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
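Items 1 and 3 of the proposal above can be sketched as a scheduler with a dedicated meta queue that is always polled first, plus a startup flag that holds back everything else until meta is online. This is a minimal illustration, not MasterProcedureScheduler: the class MetaFirstSchedulerSketch, its method names, and the use of plain strings as procedures are all assumptions. Note that the follow-up comment in this thread reports that item 3 did not work out in practice; the flag is kept here only to show what was proposed.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical two-queue scheduler; procedures are plain strings here.
class MetaFirstSchedulerSketch {
    private final Queue<String> metaQueue = new ArrayDeque<>();
    private final Queue<String> otherQueue = new ArrayDeque<>();
    private boolean metaOnly;   // item 3: startup flag, cleared once meta is online

    MetaFirstSchedulerSketch(boolean metaOnly) {
        this.metaOnly = metaOnly;
    }

    synchronized void addMeta(String procedure) {
        metaQueue.add(procedure);
    }

    synchronized void addOther(String procedure) {
        otherQueue.add(procedure);
    }

    synchronized void metaOnline() {
        metaOnly = false;
    }

    // Item 1: the meta queue is always served first. Returns null when
    // nothing is currently allowed to run.
    synchronized String poll() {
        String next = metaQueue.poll();
        if (next != null) {
            return next;
        }
        return metaOnly ? null : otherQueue.poll();
    }
}
```

A design note on the sketch: serving the meta queue first costs one extra emptiness check per poll, so it should not slow the common case where the meta queue is empty.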
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505674#comment-16505674 ] stack commented on HBASE-20700: ---
Agree with your reasoning, even that RMP is strange. It has in it all that is needed to online the meta region, including recovery (meta recovery is not like other recovery -- it has its own dedicated WALs so it can be onlined before any other region). We always run RMP because it always looks for WALs to split... it's hard to discern a clean startup from a messy one, and this way there is just the one way of onlining meta. SCP fires off an RMP when it notices the crashed server was carrying meta.
bq. Oh for meta there is another problem... The RecoverMetaProcedure will hold the exclusive lock for the meta table, and since the MRP for meta will hold the shared lock on meta table so the RecoverMetaProcedure can not be executed...
This is a problem though. I need to run the unit test to manufacture the condition? MRP and hbase:meta need particular treatment? It's not like any other region. It has to be online for all other stuff to work... so RMP should have precedence.
> Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20700-UT.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505654#comment-16505654 ] Duo Zhang commented on HBASE-20700: ---
Oh for meta there is another problem... The RecoverMetaProcedure will hold the exclusive lock for the meta table, and since the MRP for meta will hold the shared lock on meta table so the RecoverMetaProcedure can not be executed... This is not correct, I believe. In SCP we do not hold any table/region lock, so we are free to execute, and then we can fail other RIT procedures to let our assign procedures go.
For me, the RecoverMetaProcedure is a bit strange. In general, if an RS crashes we will have an SCP for it, and if it was carrying meta we will eventually assign meta somewhere else. When the master starts up, we just need to wait until the RS is online and do not need to mess with the recovery processing...
> Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20700-UT.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
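The lock conflict described above (an exclusive lock cannot be granted while a shared lock on the same table is held) can be demonstrated with an analogy. HBase's procedure locks are not java.util.concurrent locks; ReentrantReadWriteLock is swapped in here only because it has the same shared/exclusive semantics, and the class MetaTableLockSketch and its method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Analogy only: read lock stands in for the MRP's shared table lock,
// write lock for the RMP's exclusive table lock.
class MetaTableLockSketch {

    // Tries to take the exclusive lock while the shared lock is held.
    static boolean exclusiveWhileSharedHeld() {
        ReentrantReadWriteLock metaTableLock = new ReentrantReadWriteLock();
        metaTableLock.readLock().lock();                 // "MRP" holds shared
        try {
            boolean acquired = metaTableLock.writeLock().tryLock();  // "RMP" wants exclusive
            if (acquired) {
                metaTableLock.writeLock().unlock();
            }
            return acquired;
        } finally {
            metaTableLock.readLock().unlock();
        }
    }

    // Tries to take the exclusive lock after the shared lock was released.
    static boolean exclusiveAfterSharedReleased() {
        ReentrantReadWriteLock metaTableLock = new ReentrantReadWriteLock();
        metaTableLock.readLock().lock();
        metaTableLock.readLock().unlock();
        boolean acquired = metaTableLock.writeLock().tryLock();
        if (acquired) {
            metaTableLock.writeLock().unlock();
        }
        return acquired;
    }
}
```

If the shared holder is itself waiting on something that only the exclusive holder can provide (here, meta coming back online), neither side can make progress, which is the stuck state this issue is about.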
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505537#comment-16505537 ] Duo Zhang commented on HBASE-20700: --- I could do it. And I plan to add more comments in MRP and SCP to say why the current approach works, for example which operation is the fencing point. Thanks. > Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20700-UT.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504891#comment-16504891 ] stack commented on HBASE-20700: --- Want me to take it [~Apache9]? > Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Assignee: Duo Zhang >Priority: Major > Attachments: HBASE-20700-UT.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck
[ https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504700#comment-16504700 ] Duo Zhang commented on HBASE-20700: --- Very similar to the one in HBASE-20634... Will prepare a patch to fix it tomorrow. I think we need to add a new state called SPLIT_META_DONE, and when unassigning the meta region we should check for this state instead of OFFLINE. [~stack] FYI. > Move meta region when server crash can cause the procedure to be stuck > -- > > Key: HBASE-20700 > URL: https://issues.apache.org/jira/browse/HBASE-20700 > Project: HBase > Issue Type: Sub-task >Reporter: Duo Zhang >Priority: Major > Attachments: HBASE-20700-UT.patch > > > As said in HBASE-20682. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
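The idea of checking a dedicated meta state instead of OFFLINE can be sketched as follows. The state name SPLIT_META_DONE comes from the comment above; everything else is an assumption made for illustration: the class LogSplittingCheckSketch, the reduced set of states, and in particular the choice to treat any state at or after SPLIT_META_DONE as "meta log splitting done".

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical check; not the HBase implementation.
class LogSplittingCheckSketch {
    enum State { ONLINE, SPLIT_META_DONE, SPLITTING, OFFLINE }

    // Assumption: once the meta WALs are split, later states also count.
    private static final Set<State> META_DONE =
        EnumSet.of(State.SPLIT_META_DONE, State.SPLITTING, State.OFFLINE);

    // For the meta region it is enough that the meta WALs are split;
    // any other region has to wait until the server is fully OFFLINE.
    static boolean isLogSplittingDone(State serverState, boolean metaRegion) {
        if (metaRegion) {
            return META_DONE.contains(serverState);
        }
        return serverState == State.OFFLINE;
    }
}
```

With a check of this shape, an unassign of the meta region does not have to wait for the whole crashed server to be processed, only for the part that meta depends on.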