[
https://issues.apache.org/jira/browse/HBASE-21437?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16680740#comment-16680740
]
Allan Yang commented on HBASE-21437:
------------------------------------
You need to remove the procedure from timeoutExecutor first, and call
setTimeoutFailure only if the remove succeeds. Otherwise there is a race
condition in which setTimeoutFailure could be called twice.
Another thing to discuss: do you want to call setTimeoutFailure directly?
What if it returns true? In the previous design, if it returns true,
timeoutExecutor handles the abort. Do we need to handle the abort here?
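
To make the ordering concrete, here is a minimal sketch of the "remove first,
then fail" pattern. It is not the HBase code: Proc, TimeoutQueue and
claimAndFail are hypothetical stand-ins for the procedure, the timeoutExecutor
queue and the bypass/timeout path, and the sketch assumes that remove is atomic
and reports whether the caller actually removed the entry. The return value of
setTimeoutFailure is deliberately ignored, since how to handle a true result is
the open question above.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class BypassSketch {
    /** Hypothetical stand-in for a procedure in WAITING_TIMEOUT. */
    static final class Proc {
        final long pid;
        final AtomicInteger timeoutFailureCalls = new AtomicInteger();

        Proc(long pid) { this.pid = pid; }

        /** Stand-in for setTimeoutFailure; only counts how often it is called. */
        boolean setTimeoutFailure() {
            timeoutFailureCalls.incrementAndGet();
            return false; // the "returns true" (abort) case is not modelled here
        }
    }

    /** Hypothetical stand-in for the timeoutExecutor queue. */
    static final class TimeoutQueue {
        private final Set<Proc> waiting = ConcurrentHashMap.newKeySet();

        void add(Proc p) { waiting.add(p); }

        /** Atomic: returns true for exactly one caller per queued procedure. */
        boolean remove(Proc p) { return waiting.remove(p); }
    }

    /** Whoever wins the remove owns the timeout; everyone else backs off. */
    static boolean claimAndFail(TimeoutQueue queue, Proc p) {
        if (!queue.remove(p)) {
            return false; // another thread already claimed this timeout
        }
        p.setTimeoutFailure();
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        TimeoutQueue queue = new TimeoutQueue();
        Proc proc = new Proc(3L);
        queue.add(proc);

        // One thread plays the bypass path, the other the timeout path.
        Thread bypass = new Thread(() -> claimAndFail(queue, proc));
        Thread timeout = new Thread(() -> claimAndFail(queue, proc));
        bypass.start();
        timeout.start();
        bypass.join();
        timeout.join();

        System.out.println("setTimeoutFailure calls: " + proc.timeoutFailureCalls.get());
    }
}
```

Whichever thread loses the remove returns without touching the procedure, so
the counter ends at 1 regardless of scheduling.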
> Bypassed procedure throw IllegalArgumentException when its state is
> WAITING_TIMEOUT
> -----------------------------------------------------------------------------------
>
> Key: HBASE-21437
> URL: https://issues.apache.org/jira/browse/HBASE-21437
> Project: HBase
> Issue Type: Bug
> Reporter: Jingyun Tian
> Assignee: Jingyun Tian
> Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21437.master.001.patch
>
>
> {code}
> 2018-11-05,18:25:52,735 WARN org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Worker terminating UNNATURALLY null
> java.lang.IllegalArgumentException: NOT RUNNABLE! pid=3,
> state=WAITING_TIMEOUT:REGION_STATE_TRANSITION_CLOSE, hasLock=true,
> bypass=true; TransitRegionStateProcedure table=test_fail
> over, region=1bb029ba4ec03b92061be5c4329d2096, UNASSIGN
>     at org.apache.hbase.thirdparty.com.google.common.base.Preconditions.checkArgument(Preconditions.java:134)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.execProcedure(ProcedureExecutor.java:1620)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.executeProcedure(ProcedureExecutor.java:1384)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.access$1100(ProcedureExecutor.java:78)
>     at org.apache.hadoop.hbase.procedure2.ProcedureExecutor$WorkerThread.run(ProcedureExecutor.java:1948)
> 2018-11-05,18:25:52,736 TRACE org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Worker terminated.
> {code}
> When we bypass a WAITING_TIMEOUT procedure and resubmit it, its state is
> still WAITING_TIMEOUT, so when the executor runs the procedure it throws
> this exception and the worker terminates.