[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-08-05 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569515#comment-16569515
 ] 

stack commented on HBASE-20846:
---

bq. Duo Zhang, is there any problem for this fix before it can be back-ported 
branch-2.0?

What [~allan163] says... Yeah, what test is failing. I can take a look.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-08-05 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16569422#comment-16569422
 ] 

Hudson commented on HBASE-20846:


Results for branch branch-2.0
[build #634 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/634//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-30 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16561847#comment-16561847
 ] 

Allan Yang commented on HBASE-20846:


{quote}
Seems the flaky tests report can reproduce the problem of getting MERGED state 
when disabling. Let me dig more.
{quote}
[~Apache9], is there any problem for this fix before it can be back-ported 
branch-2.0?

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-26 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16558308#comment-16558308
 ] 

Duo Zhang commented on HBASE-20846:
---

Seems the flaky tests report can reproduce the problem of getting MERGED state 
when disabling. Let me dig more.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16556315#comment-16556315
 ] 

Hudson commented on HBASE-20846:


Results for branch branch-2.1
[build #103 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//console].




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/103//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.




> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555963#comment-16555963
 ] 

Hudson commented on HBASE-20846:


Results for branch branch-2
[build #1025 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1025//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-25 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16555616#comment-16555616
 ] 

Hudson commented on HBASE-20846:


Results for branch master
[build #408 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/408/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/408//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/408//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/master/408//console].


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-23 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16553175#comment-16553175
 ] 

stack commented on HBASE-20846:
---

Link to HBASE-20924... An issue to backport to branch-2.0.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.2.0, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-23 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16552366#comment-16552366
 ] 

Duo Zhang commented on HBASE-20846:
---

Let me commit to branch-2.1+.

[~stack] For 2.0 I think we could open a backport issue for it?

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846-v6.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-21 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551736#comment-16551736
 ] 

stack commented on HBASE-20846:
---

We thinking this goes into 2.1 and 2.0? Seems risky for 2.0.x, even though it 
bug fix. Maybe make a backport JIRA so can try it in long-running test. The 
risk might be worth it given this fix clears out a whole range of possible bug 
types.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-21 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16551688#comment-16551688
 ] 

Duo Zhang commented on HBASE-20846:
---

Ping [~stack] for reviewing. It is on the RB now.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-20 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16550321#comment-16550321
 ] 

Duo Zhang commented on HBASE-20846:
---

{quote}
Nice writeup. Include in lock section on top of the class?
{quote}

Sure. Any other concerns?

Thanks.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549669#comment-16549669
 ] 

Hadoop QA commented on HBASE-20846:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
13s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
18s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
52s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
38s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
16s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
12s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} hbase-procedure: The patch generated 0 new + 28 
unchanged - 16 fixed = 28 total (was 44) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
20s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
49s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m  5s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 25s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
26s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
41s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}132m  
0s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-19 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549564#comment-16549564
 ] 

stack commented on HBASE-20846:
---

Nice writeup. Include in lock section on top of the class?

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-19 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549364#comment-16549364
 ] 

Duo Zhang commented on HBASE-20846:
---

Review board link:

https://reviews.apache.org/r/67973/

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846-v4.patch, HBASE-20846-v5.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-19 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549287#comment-16549287
 ] 

Hadoop QA commented on HBASE-20846:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
40s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
30s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
38s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
51s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
45s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
0s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 9s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
14s{color} | {color:green} hbase-procedure: The patch generated 0 new + 28 
unchanged - 16 fixed = 28 total (was 44) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
16s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
31s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 59s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
40s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}116m 
41s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 1s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-19 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549096#comment-16549096
 ] 

Duo Zhang commented on HBASE-20846:
---

{quote}
Most of patch content is swapping Procedure for Procedure.
{quote}

Just want to eliminate the warnings. Will upload the patch to RB when the pre 
commit result is OK, but the jenkins is in a bad state... The infra team said 
that they will move the jenkins server to a new machine at Saturday, will try 
later.

The changes of this patch are:
1. Make hasLock method final, and add a locked field in Procedure to record 
whether we have the lock. We will set it to true in doAcquireLock and to false 
in doReleaseLock. The sub procedures do not need to manage it any more.
2. Also added a locked field in the proto message. When storing, the field will 
be set according to the return value of hasLock. And when loading, there is a 
new field in Procedure called lockedWhenLoading. We will set it to true if the 
locked field in proto message is true.
3. The reason why we can not set the locked field directly to true by calling 
doAcquireLock is that, during initialization, most procedures need to wait 
until master is initialized. So the solution here is that, we introduced a new 
method called waitInitialized in Procedure, and move the wait master 
initialized related code from acquireLock to this method. And we added a 
restoreLock method to Procedure, if lockedWhenLoading is true, we will call the 
acquireLock to get the lock, but do not set locked to true. And later when we 
call doAcquireLock and pass the waitInitialized check, we will test 
lockedWhenLoading, if it is true, when we just set the locked field to true and 
return, without actually calling the acquireLock method since we have already 
called it once.

{quote}
We release the lock even if we didn't have it? And maybe update store if 
though we didn't have lock.
{quote}
No, we will test hasLock before calling doReleaseLock. The only place where we 
do not test hasLock is in execProcedure, where we can make sure that we have 
the lock.

And I think there are some space for optimization, for example, if holdLock is 
false when maybe we do not need to store the lock operation, but as I said 
above, later we want to make holdLock depend on the state, for example, for 
ModifyTableProcedure, all the states before the last state which we reopen all 
the regions, the holdLock should be true as we do not want other one to jump in 
the middle, but after the last state where we schedule a lot of sub procedures 
to reopen all the regions, we need to release the lock, so holdLock should be 
false. So I'm not sure if it is safe to use holdLock to determine whether we 
should store the log operations. May be OK, but need to think more.

And also, for release lock, I think it could be merged with the 
updateStoreOnExec call, but the logic is a bit complicated as the 
updateStoreOnExec is in a loop, so in this patch I plan to implement it in a 
simple way. We can optimize it later.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-18 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548792#comment-16548792
 ] 

stack commented on HBASE-20846:
---

On the patch...

Non-change in AbstractProcedureScheduler.java ?

Most of patch content is swapping Procedure for Procedure.

Trying to understand... now Procedure has a *locked* data member that can be 
used by subprocedures. Before subprocedures would supply their own boolean... 
sometimes it was named lock... other times locked. A subprocedure that wants a 
lock for the life of the procedure returns holdLock as per before? If a 
procedure wants to be locked for the procedure step, they do same as before 
too? They just call up to the super class to manage the setting an unsetting of 
locked?

This lockedWhenLoading is not just for procedures that holdLock for life of the 
Procedure run it seems. We'll restore the lock even if we crashed mid-procedure 
step on load? That seems right/expected.

920   // persist that we have held the lock. This must be done before 
we actually execute the
921   // procedure, otherwise when restarting, we may consider the 
procedure does not have a lock,
922   // but it may have already done some changes as we have already 
executed it, and if another
923   // procedure gets the lock, then the semantic will be broken if 
the holdLock is true, as we do
924   // not expect that another procedure can be executed in the 
middle.
925   store.update(this);

Now we do double-store to ProcedureWALStore? Once for the lock and then once 
for end-state of the Procedure step? Each time? Slows down pv2 throughput? Any 
chance of  a reordering so we don't release lock until after we've persisted 
step and lock state?

Is this right?

190* The {@link #doAcquireLock(Object, ProcedureStore)} will be split 
into two steps, first, it will
191* call us to determine whether we need to wait for initialization, 
second, it will call
192* {@link #doAcquireLock(Object, ProcedureStore)} to actually handle 
the lock for this procedure.

Seems to say doAcquireLock calls itself.


You remove @InterfaceAudience.Private on methods because class is 
@InterfaceAudience.Private? Could remove the Evolving from the class too... 
Evolving and Private don't make sense together. Nit.

Nice translation: 94return 
subprocs.stream().mapToLong(Procedure::getProcId).toArray();

These will become annoying?

  LOG.debug("{} didn't hold the lock before restarting, skip acquiring 
lock.", this);

You probably need them for the moment debugging though? Could leave them in and 
then strip them later when known working?

We release the lock even if we didn't have it? And maybe update store if 
though we didn't have lock.

933   final void doReleaseLock(TEnvironment env, ProcedureStore store) {
934 locked = false;

Will be back with more. Its taking time to digest. Thanks.

This will change how we operate post-master crash but it should make stuff way 
better, easier to stand at least, and more predictable.



> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846-v4.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-18 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16548787#comment-16548787
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
59s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
58s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
39s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
36s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
9s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m 
37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
13s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
18s{color} | {color:green} hbase-procedure: The patch generated 0 new + 28 
unchanged - 16 fixed = 28 total (was 44) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
24s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 
18s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
11m  4s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 22s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 
32s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
36s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
53s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}174m 20s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
12s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}237m 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-17 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547284#comment-16547284
 ] 

Duo Zhang commented on HBASE-20846:
---

Tried the failed UT several times locally, all passed. The error on jenkins is 
that, the disable table is executed before we finish the merge table procedure, 
and we get a region in MERGED state. But this never happened for me locally, 
since merge table procedure will hold the shared lock of the table while 
disable table procedure needs the exclusive lock. And I added logs about 
acquire/release locks locally, it is fine, the disable table procedure does 
schedule before we finish the merge table procedure, but it can only be 
executed after the merge table procedure releases the shared lock on table.

Let's try again.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846-v4.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547213#comment-16547213
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  3m 
48s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
28s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
5s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
33s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
21s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
1s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} hbase-procedure: The patch generated 1 new + 28 
unchanged - 16 fixed = 29 total (was 44) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
22s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 34s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 24s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
47s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}196m 11s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 9s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}256m 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-17 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16547003#comment-16547003
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 8 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
29s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
31s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
43s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
26s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 4s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
25s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
50s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} hbase-procedure: The patch generated 1 new + 28 
unchanged - 16 fixed = 29 total (was 44) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 3s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 1s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m  4s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
33s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}192m  1s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 2s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}244m 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-17 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16546646#comment-16546646
 ] 

Duo Zhang commented on HBASE-20846:
---

Just ignore TestProcedureReplayOrder. Now we will restore locks when restarting 
master, so we do not rely on the loading order any more. We should use lock to 
obtain the correct execution order.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846-v3.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545969#comment-16545969
 ] 

Duo Zhang commented on HBASE-20846:
---

Thanks [~stack]. I checked the code, for a procedure in ROLLEDBACK state, we 
will call store.delete to remove it so we should not update the lock operation 
any more.

And we are getting closer. Let me check the failed UTs.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545861#comment-16545861
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
38s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
46s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  7m 
19s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m 
10s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
22s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
47s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
27s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
13s{color} | {color:red} hbase-procedure: The patch generated 1 new + 38 
unchanged - 14 fixed = 39 total (was 52) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 1s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 7s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
54s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
48s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 46s{color} 
| {color:red} hbase-procedure in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}197m 24s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 7s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}255m 33s{color} | 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545309#comment-16545309
 ] 

stack commented on HBASE-20846:
---

bq. Now the biggest problem is that, the original code does not allow storing 
ROLLEDBACK procedure into the procedure store.

You can't store ROLLBACK steps as we do forward steps; the framework does not 
currently support this. Shout if you want more detail.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846-v2.patch, 
> HBASE-20846.branch-2.0.002.patch, HBASE-20846.branch-2.0.patch, 
> HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16545192#comment-16545192
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
27s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
25s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
51s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
21s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 9s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
47s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
13s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
52s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
 8s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
12s{color} | {color:red} hbase-procedure: The patch generated 1 new + 38 
unchanged - 14 fixed = 39 total (was 52) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
59s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
 5s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
9m 27s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 14s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  
3s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
49s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m 42s{color} 
| {color:red} hbase-procedure in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}241m 18s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
50s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}294m 57s{color} | 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544914#comment-16544914
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
17s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 7 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
15s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
45s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
1s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
24s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
41s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
55s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
17s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:red}-1{color} | {color:red} mvninstall {color} | {color:red}  1m 
47s{color} | {color:red} root in the patch failed. {color} |
| {color:red}-1{color} | {color:red} compile {color} | {color:red}  0m 
45s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} cc {color} | {color:red}  0m 45s{color} | 
{color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 45s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
15s{color} | {color:red} hbase-procedure: The patch generated 1 new + 28 
unchanged - 14 fixed = 29 total (was 42) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
 7s{color} | {color:green} hbase-server: The patch generated 0 new + 316 
unchanged - 7 fixed = 316 total (was 323) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} shadedjars {color} | {color:red}  3m 
14s{color} | {color:red} patch has 14 errors when building our shaded 
downstream artifacts. {color} |
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  1m 
41s{color} | {color:red} The patch causes 15 errors with Hadoop v2.7.4. {color} 
|
| {color:red}-1{color} | {color:red} hadoopcheck {color} | {color:red}  3m 
26s{color} | {color:red} The patch causes 15 errors with Hadoop v3.0.0. {color} 
|
| {color:red}-1{color} | {color:red} hbaseprotoc {color} | {color:red}  0m 
25s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:red}-1{color} | {color:red} findbugs {color} | {color:red}  0m 
25s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
54s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
33s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  2m  1s{color} 
| {color:red} hbase-procedure in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}  0m 31s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-16 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544874#comment-16544874
 ] 

Duo Zhang commented on HBASE-20846:
---

Still have plenty of problems. Now the biggest problem is that, the original 
code does not allow storing ROLLEDBACK procedure into the procedure store. Need 
to understand more about the framework...

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846-v1.patch, HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-14 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544297#comment-16544297
 ] 

Hadoop QA commented on HBASE-20846:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
16s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
27s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 
 3s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
2s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
30s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
35s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 
53s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  3m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green}  3m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  3m  
4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
10s{color} | {color:green} The patch hbase-protocol-shaded passed checkstyle 
{color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 
14s{color} | {color:red} hbase-procedure: The patch generated 1 new + 34 
unchanged - 4 fixed = 35 total (was 38) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 
11s{color} | {color:green} hbase-server: The patch generated 0 new + 82 
unchanged - 7 fixed = 82 total (was 89) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 
30s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 
10m 14s{color} | {color:green} Patch does not cause any errors with Hadoop 
2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  
1m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  4m 
55s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 
13s{color} | {color:red} hbase-procedure generated 2 new + 0 unchanged - 0 
fixed = 2 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 
32s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 40m 15s{color} 
| {color:red} hbase-procedure in the patch failed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}134m 31s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 5s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | 

[jira] [Commented] (HBASE-20846) Restore procedure locks when master restarts

2018-07-14 Thread Duo Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16544228#comment-16544228
 ] 

Duo Zhang commented on HBASE-20846:
---

A first try.

TestMasterFailoverWithProcedures can not pass, I know. Just want to see if 
there are other problems.

And also need to add some UTs to confirm that the restore locks works as 
expected.

> Restore procedure locks when master restarts
> 
>
> Key: HBASE-20846
> URL: https://issues.apache.org/jira/browse/HBASE-20846
> Project: HBase
>  Issue Type: Sub-task
>Affects Versions: 2.1.0
>Reporter: Allan Yang
>Assignee: Duo Zhang
>Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-20846.branch-2.0.002.patch, 
> HBASE-20846.branch-2.0.patch, HBASE-20846.patch
>
>
> Found this one when investigating ModifyTableProcedure got stuck while there 
> was a MoveRegionProcedure going on after master restart.
> Though this issue can be solved by HBASE-20752. But I discovered something 
> else.
> Before a MoveRegionProcedure can execute, it will hold the table's shared 
> lock. so,, when a UnassignProcedure was spwaned, it will not check the 
> table's shared lock since it is sure that its parent(MoveRegionProcedure) has 
> aquired the table's lock.
> {code:java}
> // If there is parent procedure, it would have already taken xlock, so no 
> need to take
>   // shared lock here. Otherwise, take shared lock.
>   if (!procedure.hasParent()
>   && waitTableQueueSharedLock(procedure, table) == null) {
>   return true;
>   }
> {code}
> But, it is not the case when Master was restarted. The child 
> procedure(UnassignProcedure) will be executed first after restart. Though it 
> has a parent(MoveRegionProcedure), but apprently the parent didn't hold the 
> table's lock.
> So, since it began to execute without hold the table's shared lock. A 
> ModifyTableProcedure can aquire the table's exclusive lock and execute at the 
> same time. Which is not possible if the master was not restarted.
> This will cause a stuck before HBASE-20752. But since HBASE-20752 has fixed, 
> I wrote a simple UT to repo this case.
> I think we don't have to check the parent for table's shared lock. It is a 
> shared lock, right? I think we can acquire it every time we need it.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)