Jingyun Tian created HBASE-21291:
------------------------------------
Summary: Bypass doesn't work for state-machine procedures
Key: HBASE-21291
URL: https://issues.apache.org/jira/browse/HBASE-21291
Project: HBase
Issue Type: Improvement
Affects Versions: 2.2.0
Reporter: Jingyun Tian
{code}
if (!procedure.isFailed()) {
  if (subprocs != null) {
    if (subprocs.length == 1 && subprocs[0] == procedure) {
      // Procedure returned itself. Quick-shortcut for a state machine-like procedure;
      // i.e. we go around this loop again rather than go back out on the scheduler queue.
      subprocs = null;
      reExecute = true;
      LOG.trace("Short-circuit to next step on pid={}", procedure.getProcId());
    } else {
      // Yield the current procedure, and make the subprocedure runnable
      // subprocs may come back 'null'.
      subprocs = initializeChildren(procStack, procedure, subprocs);
      LOG.info("Initialized subprocedures=" +
        (subprocs == null? null:
          Stream.of(subprocs).map(e -> "{" + e.toString() + "}").
          collect(Collectors.toList()).toString()));
    }
  } else if (procedure.getState() == ProcedureState.WAITING_TIMEOUT) {
    LOG.debug("Added to timeoutExecutor {}", procedure);
    timeoutExecutor.add(procedure);
  } else if (!suspended) {
    // No subtask, so we are done
    procedure.setState(ProcedureState.SUCCESS);
  }
}
{code}
The current implementation of ProcedureExecutor sets reExecute to true
for state-machine-like procedures. If such a procedure gets stuck in one
state, the executor loops on it forever.
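
The looping behaviour can be illustrated with a minimal, self-contained sketch. It is not the HBase code: StuckProcedure, executeStep and runLoop are hypothetical names, and the maxSteps cap exists only so that the demo terminates (the real loop has no such cap).

```java
public class ReExecuteLoopSketch {
  /** Hypothetical stand-in for a state-machine procedure; not the HBase class. */
  static final class StuckProcedure {
    int state = 0;
    final int stuckAt;

    StuckProcedure(int stuckAt) {
      this.stuckAt = stuckAt;
    }

    /** Returns true when the procedure "returned itself", i.e. asks for another step. */
    boolean executeStep() {
      if (state < stuckAt) {
        state++; // normal progress until the stuck state is reached
      }
      return true; // a stuck procedure keeps asking to be re-executed
    }
  }

  /** Mimics the reExecute loop; maxSteps is an artificial cap for the demo only. */
  static int runLoop(StuckProcedure proc, int maxSteps) {
    int steps = 0;
    boolean reExecute;
    do {
      reExecute = proc.executeStep();
      steps++;
    } while (reExecute && steps < maxSteps);
    return steps;
  }

  public static void main(String[] args) {
    StuckProcedure proc = new StuckProcedure(2);
    int steps = runLoop(proc, 1000);
    // Without the cap this loop would never exit.
    System.out.println("steps=" + steps + ", state=" + proc.state);
  }
}
```

In this sketch the loop exits only because of the cap: the state stops advancing at 2 while the step counter runs to the limit.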
{code}
IdLock.Entry lockEntry = procExecutionLock.getLockEntry(proc.getProcId());
try {
  executeProcedure(proc);
} catch (AssertionError e) {
  LOG.info("ASSERT pid=" + proc.getProcId(), e);
  throw e;
} finally {
  procExecutionLock.releaseLockEntry(lockEntry);
{code}
Since the worker acquires the IdLock before executing a procedure and releases
it only after execution is done, a state-machine procedure never releases the
IdLock until it is finished.
bypassProcedure therefore doesn't work, because it first tries to grab the same
IdLock:
{code}
IdLock.Entry lockEntry =
    procExecutionLock.tryLockEntry(procedure.getProcId(), lockWait);
{code}
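
The lock contention described above can be reproduced in isolation. The sketch below is an assumption-laden model, not HBase code: a java.util.concurrent ReentrantLock stands in for the per-pid IdLock entry, a latch stands in for the stuck re-execute loop, and tryBypassWhileWorkerRuns is a hypothetical helper name.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class BypassLockSketch {
  /** Returns whether a bypass attempt gets the lock while the worker still holds it. */
  static boolean tryBypassWhileWorkerRuns(long lockWaitMillis) {
    ReentrantLock procLock = new ReentrantLock(); // stand-in for the IdLock entry
    CountDownLatch lockHeld = new CountDownLatch(1);
    CountDownLatch stopWorker = new CountDownLatch(1);

    Thread worker = new Thread(() -> {
      procLock.lock(); // like getLockEntry(pid) before executeProcedure
      try {
        lockHeld.countDown();
        stopWorker.await(); // models the procedure stuck in the re-execute loop
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      } finally {
        procLock.unlock(); // like releaseLockEntry in the finally block
      }
    });
    worker.start();

    try {
      lockHeld.await();
      // Like tryLockEntry(pid, lockWait): give up after the timeout.
      boolean acquired = procLock.tryLock(lockWaitMillis, TimeUnit.MILLISECONDS);
      if (acquired) {
        procLock.unlock();
      }
      stopWorker.countDown(); // let the demo worker finish
      worker.join();
      return acquired;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException("interrupted while running the demo", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("bypass acquired lock: " + tryBypassWhileWorkerRuns(100));
  }
}
```

As long as the worker keeps the lock, the timed attempt fails regardless of how long lockWait is, which matches the behaviour reported for bypassProcedure.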