[
https://issues.apache.org/jira/browse/HBASE-20975?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16565123#comment-16565123
]
Allan Yang edited comment on HBASE-20975 at 8/1/18 11:35 AM:
-------------------------------------------------------------
The failed UTs are related, upload a patch to fix it. My fix reveals a bug
which exists all the time...
the acquire and release lock logic in procedure is a bit of mess here... For
different procedures, we have holdLock() set to true of false, for procedures
which holdLock() == true, we need to override hasLock() method to test whether
we have the lock...
when executing procedure, we acquire the lock and release it immediately.
{code}
// ownership of a procedure of an entity such as a region or table
LockState lockState = acquireLock(proc);
switch (lockState) {
case LOCK_ACQUIRED:
execProcedure(procStack, proc);
releaseLock(proc, false); // call release lock with force = false
{code}
And when the procedure is success, we release the lock again in
execCompletionCleanup()...
{code}
if (proc.isSuccess()) {
......
if (proc.getProcId() == rootProcId) {
procedureFinished(proc); // call release lock with force = true
} else {
execCompletionCleanup(proc); // call release lock with force = true
}
break;
}
{code}
It works, because we call releaseLock() with force = false, for procedure with
holdLock = true, the lock won't release here, but in execCompletionCleanup,
where we call releaseLock() with force = true.
For procedure with holdLock = false, it also works since when the first time we
call releaseLock() with force = false, the lock will release, and the second
time we call relseaseLock in execCompletionCleanup(), we only release it lock
if procedure with holdLock=true. So releasing lock won't called twice
But when rolling back, we acquire the lock but only release the lock for
procedure with holdLock = true...
*Rolling back for procedures with holdLock=false(e.g. most of the Table
Procedures) will never release the lock...*
{code}
LockState lockState;
if (!reuseLock && (lockState = acquireLock(proc)) !=
LockState.LOCK_ACQUIRED) {
// can't take a lock on the procedure, add the root-proc back on the
// queue waiting for the lock availability
return lockState;
}
......
execCompletionCleanup(proc); // only release lock for procedure with
holdLock=true
{code}
was (Author: allan163):
The failed UTs are related, upload a patch to fix it. My fix revealed a bug
exist all the time...
the acquire and release logic in procedure is a bit of mess here... For
different procedures, we have holdLock() set to true of false, for procedures
which holdLock() == true, we need to override hasLock() method to test whether
we have the lock...
when executing procedure, we acquire the lock and release it immediately.
{code}
// ownership of a procedure of an entity such as a region or table
LockState lockState = acquireLock(proc);
switch (lockState) {
case LOCK_ACQUIRED:
execProcedure(procStack, proc);
releaseLock(proc, false); // call release lock with force = false
{code}
And when the procedure is success, we release the lock gain in
execCompletionCleanup()...
{code}
if (proc.isSuccess()) {
......
if (proc.getProcId() == rootProcId) {
procedureFinished(proc); // call release lock with force = true
} else {
execCompletionCleanup(proc); // call release lock with force = true
}
break;
}
{code}
It works, because we call releaseLock() with force = false, for procedure with
holdLock = true, the lock won't release here, but in execCompletionCleanup,
where we call releaseLock() with force = true.
For procedure with holdLock = false, it also works since when the first time we
call releaseLock() with force = false, the lock will release, and the second
time we call relseaseLock in execCompletionCleanup(), we only release it lock
if procedure with holdLock=true. So releasing lock won't called twice
But when rolling back, we acquire the lock but only release the lock for
procedure with holdLock = true...
*Rolling back for procedures with holdLock=false(e.g. most of the Table
Procedures) will never release the lock...*
{code}
LockState lockState;
if (!reuseLock && (lockState = acquireLock(proc)) !=
LockState.LOCK_ACQUIRED) {
// can't take a lock on the procedure, add the root-proc back on the
// queue waiting for the lock availability
return lockState;
}
......
execCompletionCleanup(proc); // only release lock for procedure with
holdLock=true
{code}
> Lock may not be taken while rolling back procedure
> --------------------------------------------------
>
> Key: HBASE-20975
> URL: https://issues.apache.org/jira/browse/HBASE-20975
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Affects Versions: 2.1.0, 2.0.1
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Attachments: HBASE-20975.branch-2.0.001.patch,
> HBASE-20975.branch-2.0.002.patch, HBASE-20975.branch-2.0.003.patch,
> HBASE-20975.branch-2.0.004.patch
>
>
> Find this one when investigating HBASE-20921, too.
> Here is some code from executeRollback in ProcedureExecutor.java.
> {code}
> boolean reuseLock = false;
> while (stackTail --> 0) {
> final Procedure proc = subprocStack.get(stackTail);
> LockState lockState;
> //If reuseLock, then don't acquire the lock
> if (!reuseLock && (lockState = acquireLock(proc)) !=
> LockState.LOCK_ACQUIRED) {
> return lockState;
> }
> lockState = executeRollback(proc);
> boolean abortRollback = lockState != LockState.LOCK_ACQUIRED;
> abortRollback |= !isRunning() || !store.isRunning();
> //If the next procedure in the stack is the current one, then reuseLock
> = true
> reuseLock = stackTail > 0 && (subprocStack.get(stackTail - 1) == proc)
> && !abortRollback;
> //If reuseLock, don't releaseLock
> if (!reuseLock) {
> releaseLock(proc, false);
> }
> if (abortRollback) {
> return lockState;
> }
> subprocStack.remove(stackTail);
> if (proc.isYieldAfterExecutionStep(getEnvironment())) {
> return LockState.LOCK_YIELD_WAIT;
> }
> //But, here, lock is released no matter reuseLock is true or false
> if (proc != rootProc) {
> execCompletionCleanup(proc);
> }
> }
> {code}
> You can see my comments in the code above, reuseLock can cause the procedure
> executing(rollback) without a lock. Though I haven't found any bugs
> introduced by this issue, it is indeed a potential bug need to fix.
> I think we can just remove the reuseLock logic. Acquire and release lock
> every time.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)