[
https://issues.apache.org/jira/browse/HBASE-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16654548#comment-16654548
]
stack commented on HBASE-21291:
-------------------------------
[~tianjingyun] With this patch applied, when I run bypass I now get the
below:
{code}
18/10/17 19:36:20 ERROR client.HBaseHbck: 2441732
org.apache.hbase.thirdparty.com.google.protobuf.ServiceException: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: lockWait should be positive
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:472)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: java.lang.IllegalArgumentException: lockWait should be positive
at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.bypassProcedure(ProcedureExecutor.java:1050)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.bypassProcedure(ProcedureExecutor.java:1043)
at org.apache.hadoop.hbase.master.MasterRpcServices.bypassProcedure(MasterRpcServices.java:2421)
at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$2.callBlockingMethod(MasterProtos.java)
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:413)
... 3 more
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.callBlockingMethod(AbstractRpcClient.java:336)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient.access$200(AbstractRpcClient.java:95)
at org.apache.hadoop.hbase.ipc.AbstractRpcClient$BlockingRpcChannelImplementation.callBlockingMethod(AbstractRpcClient.java:571)
at org.apache.hadoop.hbase.shaded.protobuf.generated.MasterProtos$HbckService$BlockingStub.bypassProcedure(MasterProtos.java)
at org.apache.hadoop.hbase.client.HBaseHbck$1.call(HBaseHbck.java:145)
at org.apache.hadoop.hbase.client.HBaseHbck$1.call(HBaseHbck.java:141)
at org.apache.hadoop.hbase.shaded.protobuf.ProtobufUtil.call(ProtobufUtil.java:2945)
at org.apache.hadoop.hbase.client.HBaseHbck.bypassProcedure(HBaseHbck.java:140)
at org.apache.hbase.HBCK2.bypass(HBCK2.java:183)
at org.apache.hbase.HBCK2.run(HBCK2.java:342)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:76)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:90)
at org.apache.hbase.HBCK2.main(HBCK2.java:389)
Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: lockWait should be positive
at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:472)
at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:130)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:324)
at org.apache.hadoop.hbase.ipc.RpcExecutor$Handler.run(RpcExecutor.java:304)
Caused by: java.lang.IllegalArgumentException: lockWait should be positive
at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.bypassProcedure(ProcedureExecutor.java:1050)
at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.bypassProcedure(ProcedureExecutor.java:1043)
at org.apache.hadoop.hbase.master.MasterRpcServices.bypassProcedure(MasterRpcServices.java:2421)
...
{code}
Is that what you expect, sir?
> Add a test for bypassing stuck state-machine procedures
> -------------------------------------------------------
>
> Key: HBASE-21291
> URL: https://issues.apache.org/jira/browse/HBASE-21291
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21291.master.001.patch,
> HBASE-21291.master.002.patch, HBASE-21291.master.003.patch,
> HBASE-21291.master.004.patch, HBASE-21291.master.005.patch
>
>
> {code}
> if (!procedure.isFailed()) {
>   if (subprocs != null) {
>     if (subprocs.length == 1 && subprocs[0] == procedure) {
>       // Procedure returned itself. Quick-shortcut for a state machine-like procedure;
>       // i.e. we go around this loop again rather than go back out on the scheduler queue.
>       subprocs = null;
>       reExecute = true;
>       LOG.trace("Short-circuit to next step on pid={}", procedure.getProcId());
>     } else {
>       // Yield the current procedure, and make the subprocedure runnable
>       // subprocs may come back 'null'.
>       subprocs = initializeChildren(procStack, procedure, subprocs);
>       LOG.info("Initialized subprocedures=" +
>         (subprocs == null? null:
>           Stream.of(subprocs).map(e -> "{" + e.toString() + "}").
>           collect(Collectors.toList()).toString()));
>     }
>   } else if (procedure.getState() == ProcedureState.WAITING_TIMEOUT) {
>     LOG.debug("Added to timeoutExecutor {}", procedure);
>     timeoutExecutor.add(procedure);
>   } else if (!suspended) {
>     // No subtask, so we are done
>     procedure.setState(ProcedureState.SUCCESS);
>   }
> }
> {code}
> The current implementation of ProcedureExecutor sets reExecute to true
> for state-machine-like procedures. If such a procedure is stuck in one
> state, it will loop forever.
> {code}
> IdLock.Entry lockEntry = procExecutionLock.getLockEntry(proc.getProcId());
> try {
>   executeProcedure(proc);
> } catch (AssertionError e) {
>   LOG.info("ASSERT pid=" + proc.getProcId(), e);
>   throw e;
> } finally {
>   procExecutionLock.releaseLockEntry(lockEntry);
> }
> {code}
> Because a procedure takes the IdLock and releases it only after execution
> is done, a state-machine procedure never releases the IdLock until it
> finishes. bypassProcedure then doesn't work, because it first tries to grab
> the same IdLock.
> {code}
> IdLock.Entry lockEntry = procExecutionLock.tryLockEntry(procedure.getProcId(), lockWait);
> {code}
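The lock contention described above can be reproduced in isolation. What follows is only a minimal sketch, not HBase code: a plain java.util.concurrent ReentrantLock stands in for the per-procedure IdLock entry, a worker thread stands in for the executor that keeps re-executing a stuck procedure, and the names StuckProcedureSketch, tryBypass, workerStuck and lockWaitMs are hypothetical. It shows a timed lock attempt, analogous to tryLockEntry(procId, lockWait), giving up while the worker never leaves its loop.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.locks.ReentrantLock;

public class StuckProcedureSketch {

  /**
   * Starts a worker that holds the execution lock for as long as it is
   * "stuck", then tries to take the same lock within lockWaitMs.
   * Returns true if the timed attempt got the lock.
   */
  static boolean tryBypass(boolean workerStuck, long lockWaitMs)
      throws InterruptedException {
    // Assumption: ReentrantLock is a stand-in for HBase's IdLock entry.
    ReentrantLock executionLock = new ReentrantLock();
    AtomicBoolean stuck = new AtomicBoolean(workerStuck);
    CountDownLatch lockHeld = new CountDownLatch(1);

    Thread worker = new Thread(() -> {
      executionLock.lock();
      try {
        lockHeld.countDown();
        // Mimics reExecute staying true: the loop never exits on its own,
        // so the finally block that releases the lock is never reached.
        while (stuck.get()) {
          try {
            Thread.sleep(10);
          } catch (InterruptedException e) {
            return;
          }
        }
      } finally {
        executionLock.unlock();
      }
    });
    worker.setDaemon(true);
    worker.start();
    lockHeld.await(); // make sure the worker owns the lock first

    // Timed attempt, analogous to the bypass path grabbing the lock.
    boolean acquired = executionLock.tryLock(lockWaitMs, TimeUnit.MILLISECONDS);
    if (acquired) {
      executionLock.unlock();
    }

    // Clean up: let the worker leave its loop and release the lock.
    stuck.set(false);
    worker.join();
    return acquired;
  }

  public static void main(String[] args) throws InterruptedException {
    System.out.println("stuck worker, lock acquired: " + tryBypass(true, 200));
    System.out.println("finished worker, lock acquired: " + tryBypass(false, 200));
  }
}
```

With a stuck worker the timed attempt returns false once lockWaitMs elapses. With a worker that finishes, the lock is released and the attempt succeeds.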