[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-21291:
--
    Release Note: bypass will now throw an Exception if passed a lockWait <= 0; i.e. bypass will prevent an operator from getting stuck waiting forever on an entity lock (lockWait == 0).

> Add a test for bypassing stuck state-machine procedures
> ---
>
>                 Key: HBASE-21291
>                 URL: https://issues.apache.org/jira/browse/HBASE-21291
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Jingyun Tian
>            Assignee: Jingyun Tian
>            Priority: Major
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-21291.master.001.patch,
> HBASE-21291.master.002.patch, HBASE-21291.master.003.patch,
> HBASE-21291.master.004.patch, HBASE-21291.master.005.patch
>
>
> {code}
> if (!procedure.isFailed()) {
>   if (subprocs != null) {
>     if (subprocs.length == 1 && subprocs[0] == procedure) {
>       // Procedure returned itself. Quick-shortcut for a state machine-like procedure;
>       // i.e. we go around this loop again rather than go back out on the scheduler queue.
>       subprocs = null;
>       reExecute = true;
>       LOG.trace("Short-circuit to next step on pid={}", procedure.getProcId());
>     } else {
>       // Yield the current procedure, and make the subprocedure runnable
>       // subprocs may come back 'null'.
>       subprocs = initializeChildren(procStack, procedure, subprocs);
>       LOG.info("Initialized subprocedures=" +
>         (subprocs == null? null:
>           Stream.of(subprocs).map(e -> "{" + e.toString() + "}").
>           collect(Collectors.toList()).toString()));
>     }
>   } else if (procedure.getState() == ProcedureState.WAITING_TIMEOUT) {
>     LOG.debug("Added to timeoutExecutor {}", procedure);
>     timeoutExecutor.add(procedure);
>   } else if (!suspended) {
>     // No subtask, so we are done
>     procedure.setState(ProcedureState.SUCCESS);
>   }
> }
> {code}
> The current implementation of ProcedureExecutor sets reExecute to true
> for state-machine-like procedures. If such a procedure gets stuck in one
> state, it loops forever.
> {code}
> IdLock.Entry lockEntry = procExecutionLock.getLockEntry(proc.getProcId());
> try {
>   executeProcedure(proc);
> } catch (AssertionError e) {
>   LOG.info("ASSERT pid=" + proc.getProcId(), e);
>   throw e;
> } finally {
>   procExecutionLock.releaseLockEntry(lockEntry);
> }
> {code}
> Because a procedure acquires the IdLock before it executes and releases it
> only after execution is done, a state-machine procedure never releases the
> IdLock until it finishes. bypassProcedure therefore does not work, because
> the first thing it does is try to grab the same IdLock:
> {code}
> IdLock.Entry lockEntry = procExecutionLock.tryLockEntry(procedure.getProcId(), lockWait);
> {code}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
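The interaction described in the issue can be reproduced in miniature. The sketch below is a toy model, not HBase code: `StuckProcedureDemo`, `startStuckWorker`, `tryBypass` and `runScenario` are hypothetical names, a plain `java.util.concurrent.locks.ReentrantLock` stands in for the per-procedure IdLock entry, and `IllegalArgumentException` is only an assumed choice for the "Exception" mentioned in the release note. It shows a worker that keeps re-executing while holding the lock, a bypass attempt that times out instead of blocking forever, and the rejection of lockWait <= 0.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

/** Toy model of the scenario; none of these names are HBase APIs. */
final class StuckProcedureDemo {
  // Stand-in for the IdLock entry held for the whole execution of a procedure.
  private final ReentrantLock executionLock = new ReentrantLock();
  private final CountDownLatch lockHeld = new CountDownLatch(1);
  private final AtomicBoolean stop = new AtomicBoolean(false);

  /** Worker: takes the lock, then keeps "re-executing" the same step until stopped. */
  Thread startStuckWorker() {
    Thread t = new Thread(() -> {
      executionLock.lock();
      try {
        lockHeld.countDown();
        while (!stop.get()) { // models reExecute == true on every pass
          Thread.yield();
        }
      } finally {
        executionLock.unlock(); // only reached once the loop ends
      }
    });
    t.setDaemon(true);
    t.start();
    return t;
  }

  /** Hypothetical bypass: true only if the lock was obtained within lockWaitMs. */
  boolean tryBypass(long lockWaitMs) {
    if (lockWaitMs <= 0) {
      // Assumed exception type; the release note only says "an Exception".
      throw new IllegalArgumentException("lockWait must be > 0, got " + lockWaitMs);
    }
    try {
      if (!executionLock.tryLock(lockWaitMs, TimeUnit.MILLISECONDS)) {
        return false; // timed out: the worker still holds the lock
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
    try {
      return true; // a real bypass would mark the procedure here
    } finally {
      executionLock.unlock();
    }
  }

  /** Returns {bypass result while the worker is stuck, bypass result after it stops}. */
  static boolean[] runScenario() {
    StuckProcedureDemo demo = new StuckProcedureDemo();
    Thread worker = demo.startStuckWorker();
    try {
      demo.lockHeld.await(); // make sure the worker owns the lock first
      boolean whileStuck = demo.tryBypass(50);
      demo.stop.set(true);
      worker.join();
      boolean afterRelease = demo.tryBypass(50);
      return new boolean[] {whileStuck, afterRelease};
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] r = runScenario();
    System.out.println("bypass while stuck: " + r[0] + ", after release: " + r[1]);
  }
}
```

In this model the bounded wait is what keeps the caller from hanging: with a timed `tryLock` the bypass attempt fails after the wait expires, whereas an unbounded wait would block for as long as the worker keeps looping.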
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Allan Yang updated HBASE-21291:
---
       Resolution: Fixed
    Fix Version/s: 2.2.0
                   3.0.0
           Status: Resolved  (was: Patch Available)
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Attachment: HBASE-21291.master.005.patch
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Attachment:     (was: HBASE-21291.master.005.patch)
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Attachment: HBASE-21291.master.005.patch
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Attachment: HBASE-21291.master.004.patch
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Attachment: HBASE-21291.master.003.patch
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Attachment: HBASE-21291.master.002.patch
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Attachment: HBASE-21291.master.001.patch
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Status: Patch Available  (was: Open)
[jira] [Updated] (HBASE-21291) Add a test for bypassing stuck state-machine procedures
[ https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jingyun Tian updated HBASE-21291:
-
    Summary: Add a test for bypassing stuck state-machine procedures  (was: Bypass doesn't work for state-machine procedures)