[
https://issues.apache.org/jira/browse/TEZ-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523666#comment-14523666
]
Bikas Saha commented on TEZ-2379:
---------------------------------
Please see the analysis above. It should actually be invalid for a
T_ATTEMPT_KILLED event to arrive after the task has reached the KILLED state,
because the task is supposed to wait for its attempts to complete before
entering a final state. That's the whole point of the kill_wait state, right?
IMO, the fix is to have the attempt ignore a kill request if it's already done.
Thoughts?
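To make the proposal concrete, here is a minimal sketch of an attempt that ignores a kill request once it is already in a final state. This is not Tez code: the class AttemptKillSketch, its AttemptState enum, and the recorded event list are hypothetical stand-ins for the real attempt state machine, and T_ATTEMPT_SUCCEEDED is used only as an illustrative event name alongside the T_ATTEMPT_KILLED from the trace below.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a task attempt. It does NOT use
// the real Tez/YARN state machine classes; it only illustrates the idea
// of dropping a kill request once the attempt is already done.
class AttemptKillSketch {

    // Assumed, reduced set of attempt states for illustration only.
    enum AttemptState {
        RUNNING, SUCCEEDED, KILLED;

        boolean isFinal() {
            return this != RUNNING;
        }
    }

    private AttemptState state = AttemptState.RUNNING;

    // Records the events this attempt would have sent to its parent task.
    private final List<String> eventsSentToTask = new ArrayList<>();

    void succeed() {
        if (state.isFinal()) {
            return; // already done, nothing more to report
        }
        state = AttemptState.SUCCEEDED;
        eventsSentToTask.add("T_ATTEMPT_SUCCEEDED");
    }

    void kill() {
        if (state.isFinal()) {
            // Proposed behaviour: the attempt is already done, so the kill
            // request is ignored and no T_ATTEMPT_KILLED reaches the task.
            return;
        }
        state = AttemptState.KILLED;
        eventsSentToTask.add("T_ATTEMPT_KILLED");
    }

    AttemptState getState() {
        return state;
    }

    List<String> getEventsSentToTask() {
        return eventsSentToTask;
    }

    public static void main(String[] args) {
        AttemptKillSketch attempt = new AttemptKillSketch();
        attempt.succeed();
        attempt.kill(); // late kill request, ignored
        System.out.println(attempt.getState() + " " + attempt.getEventsSentToTask());
    }
}
```

With a guard like this, a late kill request produces no second terminal event from the attempt, so the task would not see T_ATTEMPT_KILLED after it has already left kill_wait.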
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event:
> T_ATTEMPT_KILLED at KILLED
> ------------------------------------------------------------------------------------------------------
>
> Key: TEZ-2379
> URL: https://issues.apache.org/jira/browse/TEZ-2379
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Assignee: Hitesh Shah
> Priority: Blocker
> Attachments: TEZ-2379.1.patch
>
>
> {noformat}
> 2015-04-28 04:49:32,455 ERROR [Dispatcher thread: Central] impl.TaskImpl: Can't handle this event at current state for task_1429683757595_0479_1_03_000013
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_KILLED at KILLED
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
> at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:853)
> at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:106)
> at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:1874)
> at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:1860)
> at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
> at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:113)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Additional notes:
> ============
> Hive - latest build
> Tez - master
> tpch-200 gb scale q_17 (kill the job in the middle of execution)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)