[
https://issues.apache.org/jira/browse/HBASE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16582444#comment-16582444
]
Hudson commented on HBASE-21050:
--------------------------------
Results for branch master
[build #436 on
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/436/]: (x)
*{color:red}-1 overall{color}*
----
details (if available):
(/) {color:green}+1 general checks{color}
-- For more information [see general
report|https://builds.apache.org/job/HBase%20Nightly/job/master/436//General_Nightly_Build_Report/]
(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2)
report|https://builds.apache.org/job/HBase%20Nightly/job/master/436//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3)
report|https://builds.apache.org/job/HBase%20Nightly/job/master/436//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) {color:green}+1 source release artifact{color}
-- See build output for details.
(/) {color:green}+1 client integration test{color}
> Exclusive lock may be held by a SUCCESS state procedure forever
> ---------------------------------------------------------------
>
> Key: HBASE-21050
> URL: https://issues.apache.org/jira/browse/HBASE-21050
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Affects Versions: 2.1.0, 2.0.1
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Fix For: 3.0.0, 2.0.2, 2.1.1
>
> Attachments: HBASE-21050.branch-2.0.001.patch
>
>
> After HBASE-20846, we restore lock info for procedures. But there is a case
> where the lock can be held by an already successful procedure. Since the
> procedure won't execute again, the lock will be held by it forever.
> 1. All children of pid=1208 had finished, but the master was killed before
> procedure 1208 woke up
> {code}
> 2018-08-05 02:20:14,465 INFO [PEWorker-8]
> procedure2.ProcedureExecutor(1659): Finished subprocedure(s) of pid=1208,
> ppid=1206, state=RUNNABLE, hasLock=true; MoveRegionProcedure
> hri=c2a23a735f16df57299dba6fd4599f2f,
> source=e010125050127.bja,60020,1533403109034,
> destination=e010125050127.bja,60020,1533403109034; resume parent processing.
> 2018-08-05 02:20:14,466 INFO [PEWorker-8]
> procedure2.ProcedureExecutor(1296): Finished pid=1232, ppid=1208,
> state=SUCCESS, hasLock=false; AssignProcedure
> table=IntegrationTestBigLinkedList, region=c2a23a735f16df57299dba6fd4599f2f,
> target=e010125050127.bja,60020,1533403109034 in 1.5060sec
> {code}
> 2. The master restarts. Since procedure 1208 held the lock before the restart,
> the lock was restored for it
> {code}
> 2018-08-05 02:20:30,803 DEBUG [Thread-15] procedure2.ProcedureExecutor(456):
> Loading pid=1208, ppid=1206, state=SUCCESS, hasLock=false;
> MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f,
> source=e010125050127.bja,60020,1533403109034,
> destination=e010125050127.bja,60020,1533403109034
> 2018-08-05 02:20:30,818 DEBUG [Thread-15] procedure2.Procedure(898):
> pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure
> hri=c2a23a735f16df57299dba6fd4599f2f,
> source=e010125050127.bja,60020,1533403109034,
> destination=e010125050127.bja,60020,1533403109034 held
> the lock before restarting, call acquireLock to restore it.
> 2018-08-05 02:20:30,818 INFO [Thread-15]
> procedure.MasterProcedureScheduler(631): pid=1208, ppid=1206, state=SUCCESS,
> hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f,
> source=e010125050127.bja,60020,1533403109034,
> destination=e010125050127.bja,60020,1533403109034 checking lock on
> c2a23a735f16df57299dba6fd4599f2f
> {code}
> 3. Since procedure 1208 has already succeeded, it won't execute again, so the
> lock will be held by it forever
> We need to check the state of the procedure before restoring locks. If the
> procedure is already finished (success or rollback), we do not need to
> acquire the lock for it.
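
The check proposed in the description can be illustrated with a minimal, self-contained sketch. This is not the HBase code or the attached patch: the class `LockRestoreSketch`, the `State` enum, the `Proc` type and the `restoreLocks` method are all hypothetical names, assumed here only to show the idea of skipping finished procedures when locks are restored after a restart.

```java
import java.util.ArrayList;
import java.util.List;

public class LockRestoreSketch {
  // Hypothetical, simplified states; not HBase's actual procedure state enum.
  enum State { RUNNABLE, WAITING, SUCCESS, ROLLEDBACK }

  // Hypothetical stand-in for a procedure loaded from the procedure store.
  static final class Proc {
    final long pid;
    final State state;
    final boolean heldLockBeforeRestart;

    Proc(long pid, State state, boolean heldLockBeforeRestart) {
      this.pid = pid;
      this.state = state;
      this.heldLockBeforeRestart = heldLockBeforeRestart;
    }

    // A finished procedure (success or rollback) will never execute again.
    boolean isFinished() {
      return state == State.SUCCESS || state == State.ROLLEDBACK;
    }
  }

  // Returns the pids whose locks would be restored after a restart.
  static List<Long> restoreLocks(List<Proc> loaded) {
    List<Long> restored = new ArrayList<>();
    for (Proc p : loaded) {
      if (!p.heldLockBeforeRestart) {
        continue; // nothing to restore
      }
      if (p.isFinished()) {
        continue; // the proposed check: never re-acquire for a finished procedure
      }
      restored.add(p.pid); // stands in for re-acquiring the lock
    }
    return restored;
  }

  public static void main(String[] args) {
    List<Proc> loaded = new ArrayList<>();
    loaded.add(new Proc(1208L, State.SUCCESS, true));  // the case from this issue
    loaded.add(new Proc(1300L, State.RUNNABLE, true)); // hypothetical unfinished procedure
    System.out.println(restoreLocks(loaded)); // prints [1300]
  }
}
```

In this sketch pid=1208 is skipped because it is already in SUCCESS state, so nothing is left holding the region lock, while the unfinished procedure still gets its lock back.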
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)