[ 
https://issues.apache.org/jira/browse/TEZ-2379?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14527033#comment-14527033
 ] 

Siddharth Seth commented on TEZ-2379:
-------------------------------------

One thing to consider here is that the individual state machines should be 
complete in themselves, and should not make assumptions about other state 
machines. This makes them a lot easier to reason about (we aren't there yet, 
though).

TaskImpl
- Already knows how to handle ATTEMPT_KILLED and ATTEMPT_FAILED in the SUCCESS 
state. It will, however, error out in the FAILED or KILLED state, but there is 
nothing to be done in those states if these events are received.

TaskAttemptImpl
- If moving from one 'external' state to another, it should inform the Task 
and let it deal with the state change.
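
To make the "complete in itself" idea concrete, here is a minimal sketch of a 
task state machine that lists late attempt events explicitly in its terminal 
states, so a duplicate ATTEMPT_KILLED at KILLED is absorbed instead of raising 
an invalid-transition error. Everything in it (the TaskStateSketch class, its 
State and Event enums, the on() helper) is hypothetical and much simpler than 
the real TaskImpl or the YARN StateMachineFactory. In particular, sending a 
SUCCEEDED task back to RUNNING on a late attempt failure is an assumption made 
for illustration only.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, simplified sketch; not the actual Tez TaskImpl.
final class TaskStateSketch {
    enum State { RUNNING, SUCCEEDED, FAILED, KILLED }
    enum Event { ATTEMPT_SUCCEEDED, ATTEMPT_FAILED, ATTEMPT_KILLED }

    // Transition table: (current state, event) -> next state.
    private static final Map<State, Map<Event, State>> TABLE =
        new EnumMap<>(State.class);

    private static void on(State from, Event event, State to) {
        TABLE.computeIfAbsent(from, s -> new EnumMap<Event, State>(Event.class))
             .put(event, to);
    }

    static {
        on(State.RUNNING, Event.ATTEMPT_SUCCEEDED, State.SUCCEEDED);
        on(State.RUNNING, Event.ATTEMPT_FAILED, State.FAILED);
        on(State.RUNNING, Event.ATTEMPT_KILLED, State.KILLED);

        // Assumption for illustration: a late failure or kill of the
        // successful attempt sends the task back to RUNNING for a retry.
        on(State.SUCCEEDED, Event.ATTEMPT_FAILED, State.RUNNING);
        on(State.SUCCEEDED, Event.ATTEMPT_KILLED, State.RUNNING);

        // Terminal states list late attempt events as self-transitions,
        // so the task does not depend on attempts staying quiet.
        on(State.FAILED, Event.ATTEMPT_FAILED, State.FAILED);
        on(State.FAILED, Event.ATTEMPT_KILLED, State.FAILED);
        on(State.KILLED, Event.ATTEMPT_FAILED, State.KILLED);
        on(State.KILLED, Event.ATTEMPT_KILLED, State.KILLED);
    }

    private State state = State.RUNNING;

    State getState() {
        return state;
    }

    // Applies one event; pairs missing from the table are rejected.
    void handle(Event event) {
        Map<Event, State> row = TABLE.get(state);
        State next = (row == null) ? null : row.get(event);
        if (next == null) {
            throw new IllegalStateException(
                "Invalid event: " + event + " at " + state);
        }
        state = next;
    }

    public static void main(String[] args) {
        TaskStateSketch task = new TaskStateSketch();
        task.handle(Event.ATTEMPT_KILLED); // RUNNING -> KILLED
        task.handle(Event.ATTEMPT_KILLED); // late duplicate, absorbed
        System.out.println(task.getState()); // prints KILLED
    }
}
```

Pairs that are not listed, such as ATTEMPT_SUCCEEDED at KILLED in this sketch, 
still throw, so the table itself documents which late events the task expects.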


> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: 
> T_ATTEMPT_KILLED at KILLED
> ------------------------------------------------------------------------------------------------------
>
>                 Key: TEZ-2379
>                 URL: https://issues.apache.org/jira/browse/TEZ-2379
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Rajesh Balamohan
>            Assignee: Hitesh Shah
>            Priority: Blocker
>         Attachments: TEZ-2379.1.patch, TEZ-2379.2.patch, TEZ-2379.3.patch
>
>
> {noformat}
> 2015-04-28 04:49:32,455 ERROR [Dispatcher thread: Central] impl.TaskImpl: Can't handle this event at current state for task_1429683757595_0479_1_03_000013
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: T_ATTEMPT_KILLED at KILLED
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>         at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:853)
>         at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:106)
>         at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:1874)
>         at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:1860)
>         at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:182)
>         at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:113)
>         at java.lang.Thread.run(Thread.java:745)
> {noformat}
> Additional notes:
> ============
> Hive - latest build 
> Tez - master
> tpch-200 gb scale q_17 (kill the job in the middle of execution)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
