[
https://issues.apache.org/jira/browse/TEZ-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14525521#comment-14525521
]
Bikas Saha commented on TEZ-2379:
---------------------------------
If you still think that ignoring the attempt killed event in the task killed
state is the right fix, then please go ahead and make the change. This race
condition does not matter much, since the job is going to be killed anyway.
However, the attempt killed event would need to be ignored in all task terminal
states (succeeded, failed and killed), since the race could happen for any of
these transitions if the attempt actually succeeded while the task was trying
to kill it.
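
To illustrate the suggestion, here is a minimal sketch of a task state machine
that treats a late attempt killed event as a no-op in every terminal state.
This is not the actual TaskImpl code: the class name TaskStateSketch, the
reduced set of states and events, and the simplified transitions are all
assumptions made for illustration only.

```java
import java.util.EnumSet;
import java.util.Set;

/** Hypothetical, simplified model of a task state machine; not the real TaskImpl. */
final class TaskStateSketch {
    enum State { RUNNING, KILL_WAIT, SUCCEEDED, FAILED, KILLED }
    enum Event { T_TERMINATE, T_ATTEMPT_SUCCEEDED, T_ATTEMPT_FAILED, T_ATTEMPT_KILLED }

    // The three terminal states named in the comment above.
    private static final Set<State> TERMINAL =
        EnumSet.of(State.SUCCEEDED, State.FAILED, State.KILLED);

    private State state = State.RUNNING;

    State getState() {
        return state;
    }

    void handle(Event event) {
        if (TERMINAL.contains(state)) {
            if (event == Event.T_ATTEMPT_KILLED) {
                // A kill confirmation that lost the race with the terminal
                // transition: ignore it instead of failing.
                return;
            }
            throw new IllegalStateException("Invalid event: " + event + " at " + state);
        }
        // Assumed simplification: RUNNING and KILL_WAIT react identically here.
        switch (event) {
            case T_TERMINATE:
                state = State.KILL_WAIT;
                break;
            case T_ATTEMPT_SUCCEEDED:
                state = State.SUCCEEDED;
                break;
            case T_ATTEMPT_FAILED:
                state = State.FAILED;
                break;
            case T_ATTEMPT_KILLED:
                state = State.KILLED;
                break;
        }
    }
}
```

In this sketch, a task that receives T_TERMINATE, then T_ATTEMPT_SUCCEEDED, and
then a late T_ATTEMPT_KILLED stays in SUCCEEDED rather than throwing, which is
the behaviour the comment asks for in all three terminal states.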
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event:
> T_ATTEMPT_KILLED at KILLED
> ------------------------------------------------------------------------------------------------------
>
> Key: TEZ-2379
> URL: https://issues.apache.org/jira/browse/TEZ-2379
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Hitesh Shah
> Priority: Blocker
> Attachments: TEZ-2379.1.patch
>
>
> {noformat}
> 2015-04-28 04:49:32,455 ERROR [Dispatcher thread: Central] impl.TaskImpl: Can't handle this event at current state for task_1429683757595_0479_1_03_000013
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_KILLED at KILLED
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>     at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:853)
>     at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:106)
>     at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:1874)
>     at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:1860)
>     at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
>     at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:113)
>     at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Additional notes:
> ============
> Hive - latest build
> Tez - master
> tpch-200 gb scale q_17 (kill the job in the middle of execution)