[ 
https://issues.apache.org/jira/browse/HBASE-20975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565123#comment-16565123
 ] 

Allan Yang edited comment on HBASE-20975 at 8/1/18 11:38 AM:
-------------------------------------------------------------

The failed UTs are related, upload a patch to fix it. My fix reveals a bug 
which exists all the time...

the acquire and release lock logic in procedure is a bit of mess here...  For 
different procedures, we have holdLock() set to true of false, for procedures 
which holdLock() == true, we need to override hasLock() method to test whether 
we have the lock...
when executing procedure, we acquire the lock and release it immediately.
{code}
// ownership of a procedure of an entity such as a region or table
      LockState lockState = acquireLock(proc);
      switch (lockState) {
        case LOCK_ACQUIRED:
          execProcedure(procStack, proc);
          releaseLock(proc, false);  // call release lock with force = false
{code}
And when the procedure is success, we release the lock again in 
execCompletionCleanup()...
{code}
if (proc.isSuccess()) {
        ......
        if (proc.getProcId() == rootProcId) {
          procedureFinished(proc);  // call release lock with force = true
        } else {
          execCompletionCleanup(proc);  // call release lock with force = true
        }
        break;
      }
{code}
It works, because, for procedure with holdLock = true, we call releaseLock() 
with force = false the first time, the lock won't release here, but in 
execCompletionCleanup, where we call releaseLock() with force = true, the lock 
will be released here.
For procedure with holdLock = false, it also works since when the first time we 
call releaseLock() with force = false, the lock will release, and the second 
time we call relseaseLock in execCompletionCleanup(), we only release it lock 
if procedure with holdLock=true. So releasing lock won't called twice

But when rolling back, we acquire the lock but only release the lock for 
procedure with holdLock = true...
*Rolling back for procedures with holdLock=false(e.g. most of the Table 
Procedures) will never release the lock...*
{code}
LockState lockState;
      if (!reuseLock && (lockState = acquireLock(proc)) != 
LockState.LOCK_ACQUIRED) {
        // can't take a lock on the procedure, add the root-proc back on the
        // queue waiting for the lock availability
        return lockState;
      }
......
execCompletionCleanup(proc); // only release lock for procedure with 
holdLock=true
{code}


was (Author: allan163):
The failed UTs are related, upload a patch to fix it. My fix reveals a bug 
which exists all the time...

the acquire and release lock logic in procedure is a bit of mess here...  For 
different procedures, we have holdLock() set to true of false, for procedures 
which holdLock() == true, we need to override hasLock() method to test whether 
we have the lock...
when executing procedure, we acquire the lock and release it immediately.
{code}
// ownership of a procedure of an entity such as a region or table
      LockState lockState = acquireLock(proc);
      switch (lockState) {
        case LOCK_ACQUIRED:
          execProcedure(procStack, proc);
          releaseLock(proc, false);  // call release lock with force = false
{code}
And when the procedure is success, we release the lock again in 
execCompletionCleanup()...
{code}
if (proc.isSuccess()) {
        ......
        if (proc.getProcId() == rootProcId) {
          procedureFinished(proc);  // call release lock with force = true
        } else {
          execCompletionCleanup(proc);  // call release lock with force = true
        }
        break;
      }
{code}
It works, because we call releaseLock() with force = false, for procedure with 
holdLock = true, the lock won't release here, but in execCompletionCleanup, 
where we call releaseLock() with force = true.
For procedure with holdLock = false, it also works since when the first time we 
call releaseLock() with force = false, the lock will release, and the second 
time we call relseaseLock in execCompletionCleanup(), we only release it lock 
if procedure with holdLock=true. So releasing lock won't called twice

But when rolling back, we acquire the lock but only release the lock for 
procedure with holdLock = true...
*Rolling back for procedures with holdLock=false(e.g. most of the Table 
Procedures) will never release the lock...*
{code}
LockState lockState;
      if (!reuseLock && (lockState = acquireLock(proc)) != 
LockState.LOCK_ACQUIRED) {
        // can't take a lock on the procedure, add the root-proc back on the
        // queue waiting for the lock availability
        return lockState;
      }
......
execCompletionCleanup(proc); // only release lock for procedure with 
holdLock=true
{code}

> Lock may not be taken while rolling back procedure
> --------------------------------------------------
>
>                 Key: HBASE-20975
>                 URL: https://issues.apache.org/jira/browse/HBASE-20975
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>    Affects Versions: 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>         Attachments: HBASE-20975.branch-2.0.001.patch, 
> HBASE-20975.branch-2.0.002.patch, HBASE-20975.branch-2.0.003.patch, 
> HBASE-20975.branch-2.0.004.patch
>
>
> Find this one when investigating HBASE-20921, too.
> Here is some code from executeRollback in ProcedureExecutor.java.
> {code}
>     boolean reuseLock = false;
>     while (stackTail --> 0) {
>       final Procedure proc = subprocStack.get(stackTail);
>       LockState lockState;
>       //If reuseLock, then don't acquire the lock
>       if (!reuseLock && (lockState = acquireLock(proc)) != 
> LockState.LOCK_ACQUIRED) {
>         return lockState;
>       }
>       lockState = executeRollback(proc);
>       boolean abortRollback = lockState != LockState.LOCK_ACQUIRED;
>       abortRollback |= !isRunning() || !store.isRunning();
>       //If the next procedure in the stack is the current one, then reuseLock 
> = true
>       reuseLock = stackTail > 0 && (subprocStack.get(stackTail - 1) == proc) 
> && !abortRollback;
>       //If reuseLock, don't releaseLock
>       if (!reuseLock) {
>         releaseLock(proc, false);
>       }
>       if (abortRollback) {
>         return lockState;
>       }
>       subprocStack.remove(stackTail);
>       if (proc.isYieldAfterExecutionStep(getEnvironment())) {
>         return LockState.LOCK_YIELD_WAIT;
>       }
>       //But, here, lock is released no matter reuseLock is true or false
>       if (proc != rootProc) {
>         execCompletionCleanup(proc);
>       }
>     }
> {code}
> You can see my comments in the code above, reuseLock can cause the procedure 
> executing(rollback) without a lock. Though I haven't found any bugs 
> introduced by this issue, it is indeed a potential bug need to fix.
> I think we can just remove the reuseLock logic. Acquire and release lock 
> every time. 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to