[ https://issues.apache.org/jira/browse/TEZ-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeff Zhang resolved TEZ-2429.
-----------------------------
Resolution: Cannot Reproduce
> Tez AM does not die after hitting internal error
> -------------------------------------------------
>
> Key: TEZ-2429
> URL: https://issues.apache.org/jira/browse/TEZ-2429
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Hitesh Shah
> Priority: Blocker
> Attachments: syslog_dag_1430956448478_0001_16_post,
> syslog_dag_1430956448478_0001_17
>
>
> From https://builds.apache.org/job/Tez-Build/1055/:
> 2015-05-06 23:55:54,421 ERROR [Dispatcher thread: Central] impl.DAGImpl: Can't handle this event at current state
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: DAG_VERTEX_RERUNNING at SUCCEEDED
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:1079)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:143)
>     at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1871)
>     at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1862)
>     at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>     at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>     at java.lang.Thread.run(Thread.java:662)
> 2015-05-06 23:55:54,423 INFO [Dispatcher thread: Central] app.DAGAppMaster: Cleaning up DAG: name=testRandomFailingInputs, with id=dag_1430956448478_0001_16
> 2015-05-06 23:55:54,423 INFO [Dispatcher thread: Central] app.DAGAppMaster: Completed cleanup for DAG: name=testRandomFailingInputs, with id=dag_1430956448478_0001_16
> 2015-05-06 23:55:54,424 INFO [Dispatcher thread: Central] impl.DAGImpl: dag_1430956448478_0001_16 terminating due to internal error
> 2015-05-06 23:55:54,433 INFO [IPC Server handler 0 on 47432] app.DAGAppMaster: Starting DAG submitted via RPC: testBasicInputFailureWithExit
> 2015-05-06 23:55:54,455 ERROR [Dispatcher thread: Central] common.AsyncDispatcher: Error in dispatcher thread
> java.lang.NullPointerException
>     at org.apache.tez.dag.history.recovery.RecoveryService.doFlush(RecoveryService.java:458)
>     at org.apache.tez.dag.history.recovery.RecoveryService.handle(RecoveryService.java:289)
>     at org.apache.tez.dag.history.HistoryEventHandler.handleCriticalEvent(HistoryEventHandler.java:102)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl.logJobHistoryUnsuccesfulEvent(DAGImpl.java:1161)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl.finished(DAGImpl.java:1275)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl.access$2600(DAGImpl.java:144)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl$InternalErrorTransition.transition(DAGImpl.java:2151)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl$InternalErrorTransition.transition(DAGImpl.java:2140)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:1079)
>     at org.apache.tez.dag.app.dag.impl.DAGImpl.handle(DAGImpl.java:143)
>     at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1871)
>     at org.apache.tez.dag.app.DAGAppMaster$DagEventDispatcher.handle(DAGAppMaster.java:1862)
>     at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>     at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>     at java.lang.Thread.run(Thread.java:662)
> 2015-05-06 23:55:54,456 INFO [Dispatcher thread: Central] impl.VertexImpl: Killing tasks in vertex: vertex_1430956448478_0001_16_10 [l4v1] due to trigger: INTERNAL_ERROR
> 2015-05-06 23:55:54,456 INFO [Dispatcher thread: Central] impl.VertexImpl: vertex_1430956448478_0001_16_10 [l4v1] transitioned from RUNNING to TERMINATING due to event V_TERMINATE
> 2015-05-06 23:55:54,456 INFO [AsyncDispatcher ShutDown handler] common.AsyncDispatcher: Exiting, bbye..