[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-12 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16510346#comment-16510346
 ] 

Hudson commented on HBASE-20700:


Results for branch branch-2.0
[build #421 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/421/]: (/) *{color:green}+1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/421//General_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/421//JDK8_Nightly_Build_Report_(Hadoop2)/]

(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/421//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-branch-2.0-v1.patch, 
> HBASE-20700-branch-2.0-v1.patch, HBASE-20700-branch-2.0.patch, 
> HBASE-20700-v1.patch, HBASE-20700-v2.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16509185#comment-16509185
 ] 

Hadoop QA commented on HBASE-20700:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 24s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 40s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  0s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 27s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 12s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 41s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 43s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m  2s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 15s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 13s{color} | {color:green} hbase-server: The patch generated 0 new + 274 unchanged - 4 fixed = 274 total (was 278) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 22s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 38s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 45s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}124m 16s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}171m  0s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927415/HBASE-20700-branch-2.0-v1.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux c20df5b21f39 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/je

[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508321#comment-16508321
 ] 

Hudson commented on HBASE-20700:


Results for branch branch-2
[build #850 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/850/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/850//General_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/850//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/850//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.




[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508255#comment-16508255
 ] 

Hadoop QA commented on HBASE-20700:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 21s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 16s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 47s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 13s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 40s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 14s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 39s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m  9s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m  2s{color} | {color:green} hbase-server: The patch generated 0 new + 274 unchanged - 4 fixed = 274 total (was 278) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 32s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m  3s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 34s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 32s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}106m 29s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 27s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}146m 32s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.procedure.TestProcedurePriority |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927308/HBASE-20700-branch-2.0-v1.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2e6d02e6ddaf 4.4.0-43-generic #63-Ubuntu SMP Wed 

[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-11 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16508150#comment-16508150
 ] 

Hudson commented on HBASE-20700:


Results for branch master
[build #362 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/362/]: (x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/362//General_Nightly_Build_Report/]

(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/362//JDK8_Nightly_Build_Report_(Hadoop2)/]

(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/362//JDK8_Nightly_Build_Report_(Hadoop3)/]

(/) {color:green}+1 source release artifact{color}
-- See build output for details.




[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-11 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507908#comment-16507908
 ] 

Hadoop QA commented on HBASE-20700:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 15s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 19s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 57s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  9s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 27s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 11s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 31s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 50s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 12s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 58s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m  9s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  1m 47s{color} | {color:red} hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 15s{color} | {color:green} hbase-server: The patch generated 0 new + 274 unchanged - 4 fixed = 274 total (was 278) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 11s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 34s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 44s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 37s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}121m  2s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 40s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}168m 25s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927264/HBASE-20700-branch-2.0.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| u

[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-10 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507731#comment-16507731
 ] 

Duo Zhang commented on HBASE-20700:
---

Thanks sir. Let me commit to master and branch-2 first. 2.0 needs a separate 
patch as it does not have a peer queue.



[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-10 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16507650#comment-16507650
 ] 

stack commented on HBASE-20700:
---

None. You answered my concerns. Skimmed patch +1 (+1 for branch-2.0 too. 
Thanks).



[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-09 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506885#comment-16506885
 ] 

Duo Zhang commented on HBASE-20700:
---

Any other concerns? [~stack] Thanks.



[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506829#comment-16506829
 ] 

Hadoop QA commented on HBASE-20700:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  1s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 6 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 42s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 25s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 49s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 57s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 14s{color} | {color:green} The patch hbase-procedure passed checkstyle {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 24s{color} | {color:green} hbase-server: The patch generated 0 new + 280 unchanged - 4 fixed = 280 total (was 284) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 44s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m 10s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 45s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 45s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}109m 41s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 38s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}164m 47s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927150/HBASE-20700-v2.patch |
| Optional Tests |  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 9eac7ba0c1a2 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@

[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506795#comment-16506795
 ] 

Duo Zhang commented on HBASE-20700:
---

{quote}
I worry the below check of ONLINE. Is it too specific?

971 if (serverNode.isInState(ServerState.SPLITTING, ServerState.OFFLINE)) { 
972 if (!serverNode.isInState(ServerState.ONLINE)) {

We can see I suppose. Would be good if we could get away with it.
{quote}
I think this is the common case? If the server is not in state ONLINE, then 
there is an SCP for it, which means it has already crashed...

{quote}
I'm wary of calls to this method below setting server state inside 
setServerState because it will create the server node if it doesn't exist (It 
may not exist because it has been processed by SCP). If we call the below after 
SCP is done w/ it, the server comes back to life. You sure we will not do this?
{quote}
These methods will only be called in SCP, and at the end of SCP we will call 
removeServer to remove the ServerStateNode. Let me add some comments.

{quote}
What is the lifecycle for a server node now? ONLINE => SPLITTING => OFFLINE is 
what it used to be. It can still do this? But it can also go ONLINE => 
META_SPLITTING => META_SPLITTING_DONE => SPLITTING => OFFLINE? We might want to 
note this somewhere. Not obvious.
{quote}
If not carrying meta then ONLINE=>SPLITTING=>OFFLINE, otherwise 
ONLINE=>META_SPLITTING=>META_SPLITTING_DONE=>SPLITTING=>OFFLINE.
I've added comments in UnassignProcedure to say why we need these states. We can 
only fail an unassign after we make sure that the log splitting is finished, 
otherwise we may schedule an AssignProcedure which will cause data loss. And 
for unassign meta, the SCP will wait until the RMP is finished before splitting 
other logs, so if we do not introduce special states for meta splitting, we 
will be stuck there forever...
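
For readers following along, the two lifecycles described above can be written down as a small transition table. This is only a sketch: the state names come from the comment, but the class ServerStateLifecycle, the NEXT map, and the methods canTransition and isValidPath are hypothetical and are not taken from the patch.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only; not the HBase ServerStateNode implementation.
class ServerStateLifecycle {
  // State names as listed in the comment above.
  enum ServerState { ONLINE, META_SPLITTING, META_SPLITTING_DONE, SPLITTING, OFFLINE }

  // Allowed next states for each state (an assumption derived from the two paths described).
  private static final Map<ServerState, Set<ServerState>> NEXT = new EnumMap<>(ServerState.class);
  static {
    // A crashed server either carried meta (META_SPLITTING first) or did not (straight to SPLITTING).
    NEXT.put(ServerState.ONLINE, EnumSet.of(ServerState.META_SPLITTING, ServerState.SPLITTING));
    NEXT.put(ServerState.META_SPLITTING, EnumSet.of(ServerState.META_SPLITTING_DONE));
    NEXT.put(ServerState.META_SPLITTING_DONE, EnumSet.of(ServerState.SPLITTING));
    NEXT.put(ServerState.SPLITTING, EnumSet.of(ServerState.OFFLINE));
    NEXT.put(ServerState.OFFLINE, EnumSet.noneOf(ServerState.class));
  }

  // True if a single step from "from" to "to" is part of one of the two lifecycles.
  static boolean canTransition(ServerState from, ServerState to) {
    return NEXT.get(from).contains(to);
  }

  // True if every consecutive pair in the path is an allowed step.
  static boolean isValidPath(ServerState... path) {
    for (int i = 1; i < path.length; i++) {
      if (!canTransition(path[i - 1], path[i])) {
        return false;
      }
    }
    return true;
  }
}
```

In this sketch both documented paths validate, while a shortcut such as ONLINE=>OFFLINE does not.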

{quote}
Oh... this is interesting adding the synchronized

public synchronized void remoteCallFailed(final MasterProcedureEnv env,

... Up to this we've been synchronizing on the objects whose state we change. 
What you thinking by adding the synchronize? I can't see anything wrong w/ 
it.
{quote}
It could be called from two places: one is the RemoteProcedureScheduler, when 
the remote call fails, and the other is SCP or RMP's handleRIT. I think there 
is no strong guarantee that they will not happen at the same time, so it is 
better to add a synchronized on the method...
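
A minimal sketch of why a method-level synchronized helps when two independent callers may report the same failure. Everything here is hypothetical (the class SynchronizedFailureSketch, its fields, and raceOnce); the real remoteCallFailed takes a MasterProcedureEnv and does far more work. The only point illustrated is that the check and the update happen under one lock, so the failure is handled once.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a procedure whose failure callback has two possible callers.
class SynchronizedFailureSketch {
  private boolean failureHandled = false; // guarded by "this"
  private int handledCount = 0;           // guarded by "this"

  // Check-then-act under the object's monitor: only the first caller handles the failure.
  synchronized boolean remoteCallFailed(String caller) {
    if (failureHandled) {
      return false; // the other path already handled it
    }
    failureHandled = true;
    handledCount++;
    return true;
  }

  synchronized int handledCount() {
    return handledCount;
  }

  // Starts both callers at (nearly) the same moment and reports how often the failure was handled.
  static int raceOnce() {
    SynchronizedFailureSketch proc = new SynchronizedFailureSketch();
    CountDownLatch start = new CountDownLatch(1);
    ExecutorService pool = Executors.newFixedThreadPool(2);
    for (String caller : new String[] {"dispatcher", "handleRIT"}) {
      pool.execute(() -> {
        try {
          start.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        proc.remoteCallFailed(caller);
      });
    }
    start.countDown();
    pool.shutdown();
    try {
      if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("callers did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return proc.handledCount();
  }
}
```

Without the synchronized keyword, both threads could pass the failureHandled check before either sets it, which is the kind of overlap the comment above is guarding against.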

{quote}
If MoveRegionProcedure gets scheduled before RecoverMetaProcedure, what happens 
now?
{quote}
Now the RMP will not hold the same lock as the MRP, so it could break the 
execution of the UnassignProcedure scheduled by the MRP. Also, if the 
UnassignProcedure is scheduled after we call handleRIT, then when 
remoteCallFailed calls the isLogSplittingDone method, it will find that the 
meta log splitting has already been done and give up. So there will be no 
deadlock any more.
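
The decision described above can be sketched as follows, under stated assumptions. The names isLogSplittingDone and remoteCallFailed appear in the comment, but their signatures here, the Outcome enum, and the exact mapping from server state to "splitting done" are guesses made for illustration, not the code in the patch.

```java
// Hypothetical illustration of the "give up once log splitting is done" check.
class RemoteCallFailedSketch {
  enum ServerState { ONLINE, META_SPLITTING, META_SPLITTING_DONE, SPLITTING, OFFLINE }

  enum Outcome { WAIT_FOR_LOG_SPLIT, GIVE_UP }

  // Assumption: meta's log is considered split from META_SPLITTING_DONE onward,
  // while a non-meta region must wait until the server reaches OFFLINE.
  static boolean isLogSplittingDone(ServerState state, boolean meta) {
    switch (state) {
      case OFFLINE:
        return true;
      case META_SPLITTING_DONE:
      case SPLITTING:
        return meta;
      default:
        return false;
    }
  }

  // The unassign may only be failed (given up) after the relevant log splitting has finished.
  static Outcome remoteCallFailed(ServerState crashedServerState, boolean regionIsMeta) {
    return isLogSplittingDone(crashedServerState, regionIsMeta)
        ? Outcome.GIVE_UP
        : Outcome.WAIT_FOR_LOG_SPLIT;
  }
}
```

With the separate meta states, an unassign of meta that arrives late sees META_SPLITTING_DONE and gives up immediately instead of waiting on the SCP.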

{quote}
s/MetaProcedureInterface/MetaProcedure/
{quote}

This just follows the existing pattern: we have TableProcedureInterface, 
RegionProcedureInterface, ServerProcedureInterface, etc.




> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch, 
> HBASE-20700.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506240#comment-16506240
 ] 

Hadoop QA commented on HBASE-20700:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 14s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 1s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 24s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 44s | master passed |
| +1 | compile | 2m 1s | master passed |
| +1 | checkstyle | 1m 25s | master passed |
| +1 | shadedjars | 4m 51s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 22s | master passed |
| +1 | javadoc | 0m 43s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 47s | the patch passed |
| +1 | compile | 2m 3s | the patch passed |
| +1 | javac | 2m 3s | the patch passed |
| +1 | checkstyle | 0m 12s | The patch hbase-procedure passed checkstyle |
| +1 | checkstyle | 1m 13s | hbase-server: The patch generated 0 new + 280 unchanged - 4 fixed = 280 total (was 284) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 46s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 55s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 31s | the patch passed |
| +1 | javadoc | 0m 41s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 41s | hbase-procedure in the patch passed. |
| +1 | unit | 115m 33s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 42s | The patch does not generate ASF License warnings. |
| | | 162m 48s | |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927050/HBASE-20700-v1.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 9578c0a80b1e 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | 
/home/jenkins/jenkins-slave/workspace/PreCommit-HB

[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506235#comment-16506235
 ] 

stack commented on HBASE-20700:
---

Looking at patch

I worry about the below check of ONLINE. Is it too specific?

971   if (serverNode.isInState(ServerState.SPLITTING, ServerState.OFFLINE)) {
972   if (!serverNode.isInState(ServerState.ONLINE)) {

We can see I suppose. Would be good if we could get away with it.

I'm wary of calls to the method below, which sets server state inside 
setServerState, because it will create the server node if it doesn't exist (it 
may not exist because it has been processed by SCP). If we call the below after 
SCP is done w/ it, the server comes back to life. You sure we will not do this?

 ServerStateNode serverNode = getOrCreateServer(serverName);
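
To illustrate the concern, here is a minimal sketch of how a get-or-create lookup can bring back an entry that crash processing already removed. The class ServerNodeMapSketch, its methods, and the string states are hypothetical stand-ins, not the HBase server node bookkeeping.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical server-node registry; the state is a plain string to keep the sketch small.
final class ServerNodeMapSketch {
  private final ConcurrentMap<String, String> nodes = new ConcurrentHashMap<>();

  // Creates the node when it is missing, so calling this after removal resurrects the server.
  String getOrCreateServer(String serverName) {
    return nodes.computeIfAbsent(serverName, name -> "ONLINE");
  }

  // Lookup only; returns null when crash processing has already removed the node.
  String getServerOrNull(String serverName) {
    return nodes.get(serverName);
  }

  // Models crash processing finishing with the server.
  void removeServer(String serverName) {
    nodes.remove(serverName);
  }

  boolean contains(String serverName) {
    return nodes.containsKey(serverName);
  }
}
```

A state-setting path that must not revive a processed server would use the lookup-only variant and treat null as "already cleaned up".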

What is the lifecycle for a server node now? ONLINE => SPLITTING => OFFLINE is 
what it used to be. It can still do this? But it can also go ONLINE => 
META_SPLITTING => META_SPLITTING_DONE => SPLITTING => OFFLINE? We might want to 
note this somewhere. Not obvious.

Oh... this is interesting adding the synchronized

public synchronized void remoteCallFailed(final MasterProcedureEnv env,

... Up to this we've been synchronizing on the objects whose state we change. 
What are you thinking by adding the synchronized? I can't see anything wrong w/ 
it.

If MoveRegionProcedure gets scheduled before RecoverMetaProcedure, what happens 
now?

No need of evolving if private

  @InterfaceAudience.Private
  @InterfaceStability.Evolving

s/MetaProcedureInterface/MetaProcedure/

getMetaOperationType is not used? But it makes sense I suppose. You are 
following the pattern.

Otherwise, nice cleanup and appreciate the doc -- especially the edit by 
another.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch, 
> HBASE-20700.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506189#comment-16506189
 ] 

stack commented on HBASE-20700:
---

bq. Of course the logic here is not clear enough I'd say and there may be 
races; we can file a new issue to fix it.

Yes. Ok.

bq. And for the new lock, we need to prevent two RMPs from running at the same 
time so I think we need it...

This makes sense.



> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700-v1.patch, 
> HBASE-20700.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506036#comment-16506036
 ] 

Duo Zhang commented on HBASE-20700:
---

Yes, the new 'meta' queue will always be served first.

And in RecoverMetaProcedure we will do the log splitting work and assign meta, 
so ideally there is no problem if the meta was on a crashed RS before. Of course 
the logic here is not clear enough I'd say and there may be races; we can file 
a new issue to fix it.

And for the new lock, we need to prevent two RMPs from running at the same time 
so I think we need it...

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16506012#comment-16506012
 ] 

stack commented on HBASE-20700:
---

bq. ... and after the meta region is online, we disable the flag and then 
everything goes back to normal.

What happens on crash of a server that was carrying hbase:meta?

On new queue, do we favor it? Does it get serviced before all others? Do we 
slow the scheduler?

On the new type of lock, do we even need it if only RMP knows of it?

Thanks D.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505978#comment-16505978
 ] 

Hadoop QA commented on HBASE-20700:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 15s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
|| || || || master Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 36s | master passed |
| +1 | compile | 2m 1s | master passed |
| +1 | checkstyle | 1m 26s | master passed |
| +1 | shadedjars | 4m 49s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 24s | master passed |
| +1 | javadoc | 0m 41s | master passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 14s | Maven dependency ordering for patch |
| +1 | mvninstall | 4m 40s | the patch passed |
| +1 | compile | 2m 2s | the patch passed |
| +1 | javac | 2m 2s | the patch passed |
| -1 | checkstyle | 1m 12s | hbase-server: The patch generated 1 new + 280 unchanged - 4 fixed = 281 total (was 284) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 47s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 9m 49s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 45s | the patch passed |
| +1 | javadoc | 0m 42s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 2m 37s | hbase-procedure in the patch passed. |
| -1 | unit | 110m 13s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 38s | The patch does not generate ASF License warnings. |
| | | 157m 3s | |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.TestServerCrashProcedureCarryingMetaStuck |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-20700 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12927035/HBASE-20700.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 179178fdc30f 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revis

[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505904#comment-16505904
 ] 

Duo Zhang commented on HBASE-20700:
---

Review board link:

https://reviews.apache.org/r/67500/

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch, HBASE-20700.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505842#comment-16505842
 ] 

Duo Zhang commented on HBASE-20700:
---

OK, #3 does not work because RecoverMetaProcedure will schedule sub-procedures 
to assign the meta region. Since #1 and #2 are enough to make the above UT pass, 
let me post the patch here. Will open a new issue to make the startup clearer.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-08 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505792#comment-16505792
 ] 

Duo Zhang commented on HBASE-20700:
---

OK, it seems letting meta come online first can make life much easier...

So I think we could do the following:

1. Introduce a new type of queue in MasterProcedureScheduler called meta; only 
RecoverMetaProcedure can be put into this queue.
2. Introduce a new type of lock for RMP and stop using the table lock, so that 
RMP will not be blocked by MRP.
3. MasterProcedureScheduler will be started with a flag that only enables 
polling procedures from the meta queue; after the meta region is online, we 
disable the flag and then everything goes back to normal.
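
Point 1 of the plan above can be sketched as a scheduler that always drains a dedicated meta queue before any other queue. This is an illustration only: MetaFirstSchedulerSketch and its methods are hypothetical, procedures are modelled as plain strings, and the start-up flag from point 3 is deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical two-queue scheduler; not the MasterProcedureScheduler implementation.
final class MetaFirstSchedulerSketch {
  private final Deque<String> metaQueue = new ArrayDeque<>();
  private final Deque<String> otherQueue = new ArrayDeque<>();

  // In the plan, only the meta recovery procedure would be allowed in this queue.
  void addMetaProcedure(String name) {
    metaQueue.addLast(name);
  }

  void addOtherProcedure(String name) {
    otherQueue.addLast(name);
  }

  // Returns the next procedure to run, preferring the meta queue; null when both queues are empty.
  String poll() {
    String next = metaQueue.pollFirst();
    return next != null ? next : otherQueue.pollFirst();
  }
}
```

With this ordering, a meta recovery procedure queued after a region move is still picked up first.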

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>  Components: master, proc-v2, Region Assignment
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Critical
> Fix For: 3.0.0, 2.1.0, 2.0.1
>
> Attachments: HBASE-20700-UT.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-07 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505674#comment-16505674
 ] 

stack commented on HBASE-20700:
---

Agree with your reasoning, even that RMP is strange. It has in it all that is 
needed to online the meta region, including recovery (meta recovery is not like 
other recovery -- it has its own dedicated WALs so it can be onlined before any 
other region). We always run RMP because it always looks for WALs to split... 
it's hard to tell a clean startup from a messy one, so there is just the one 
way of onlining meta. SCP fires off an RMP when it notices the crashed 
server was carrying meta.

bq. Oh for meta there is another problem... The RecoverMetaProcedure will hold 
the exclusive lock for the meta table, and since the MRP for meta holds the 
shared lock on the meta table, the RecoverMetaProcedure cannot be executed...

This is a problem though. I need to run the unit test to manufacture the 
condition? Do MRP and hbase:meta need particular treatment? It's not like any 
other region. It has to be online for all other stuff to work... so RMP should 
have precedence.
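
The shared versus exclusive conflict quoted above can be illustrated with a standard read/write lock. This is an analogy under stated assumptions, not how the procedure scheduler implements its locks: java.util.concurrent's ReentrantReadWriteLock stands in for the meta table lock, and the class and method names are hypothetical.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in: read lock = shared table lock, write lock = exclusive table lock.
final class MetaTableLockSketch {
  private final ReentrantReadWriteLock tableLock = new ReentrantReadWriteLock();

  // Models the region move holding the shared lock on the meta table.
  void moveRegionTakesSharedLock() {
    tableLock.readLock().lock();
  }

  void moveRegionReleasesSharedLock() {
    tableLock.readLock().unlock();
  }

  // Models meta recovery asking for the exclusive lock without blocking.
  // Returns false while any shared holder remains.
  boolean recoverMetaTryExclusiveLock() {
    return tableLock.writeLock().tryLock();
  }
}
```

As long as the move holds the shared lock, the exclusive attempt fails; if the move can only finish after recovery runs, neither side makes progress, which is the stuck state this issue describes.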

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-20700-UT.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-07 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505654#comment-16505654
 ] 

Duo Zhang commented on HBASE-20700:
---

Oh for meta there is another problem... The RecoverMetaProcedure will hold the 
exclusive lock for the meta table, and since the MRP for meta holds the 
shared lock on the meta table, the RecoverMetaProcedure cannot be executed...

This is not correct I believe. In SCP we do not hold any table/region lock so 
that we are free to execute and then we can fail other RIT procedures to let 
our assign procedures go.

For me, the RecoverMetaProcedure is a bit strange. In general, if an RS 
crashes then we will have an SCP for it, and if it carries meta then we will 
eventually assign meta somewhere else. When master starts up, we just need to 
wait until the RS is online and do not need to mess up the recovery processing...

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-20700-UT.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-07 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16505537#comment-16505537
 ] 

Duo Zhang commented on HBASE-20700:
---

I could do it. And I plan to add more comments in MRP and SCP to say why the 
current approach works, for example which operation is the fencing point.

Thanks.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-20700-UT.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-07 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504891#comment-16504891
 ] 

stack commented on HBASE-20700:
---

Want me to take it [~Apache9]?


> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Assignee: Duo Zhang
>Priority: Major
> Attachments: HBASE-20700-UT.patch
>
>
> As said in HBASE-20682.





[jira] [Commented] (HBASE-20700) Move meta region when server crash can cause the procedure to be stuck

2018-06-07 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16504700#comment-16504700
 ] 

Duo Zhang commented on HBASE-20700:
---

Very similar to the one in HBASE-20634...

Will prepare a patch to fix it tomorrow. I think we need to add a new state 
called SPLIT_META_DONE, and when unassigning the meta region, we should check 
for this state instead of OFFLINE.
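
A minimal sketch of the proposed check, under assumptions: LogSplittingCheckSketch, its State enum, and the method signature are hypothetical; only SPLIT_META_DONE and OFFLINE are named in this comment, and the sketch assumes the enum is declared in lifecycle order so that later states also imply the meta WALs are split.

```java
// Hypothetical check; not the HBase isLogSplittingDone implementation.
final class LogSplittingCheckSketch {
  // Declared in assumed lifecycle order; ordinal comparison below relies on this order.
  enum State { ONLINE, SPLITTING_META, SPLIT_META_DONE, SPLITTING, OFFLINE }

  // For the meta region, splitting counts as done once the meta WALs are split.
  // For any other region, it is done only when the server is OFFLINE.
  static boolean isLogSplittingDone(State serverState, boolean regionIsMeta) {
    if (regionIsMeta) {
      return serverState.compareTo(State.SPLIT_META_DONE) >= 0;
    }
    return serverState == State.OFFLINE;
  }
}
```

The point of the separate state is that an unassign of meta can be failed as soon as the meta WALs are split, without waiting for the rest of the server's logs.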

[~stack] FYI.

> Move meta region when server crash can cause the procedure to be stuck
> --
>
> Key: HBASE-20700
> URL: https://issues.apache.org/jira/browse/HBASE-20700
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Duo Zhang
>Priority: Major
> Attachments: HBASE-20700-UT.patch
>
>
> As said in HBASE-20682.


