[
https://issues.apache.org/jira/browse/TEZ-3213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15239706#comment-15239706
]
Jason Lowe commented on TEZ-3213:
---------------------------------
Sample log showing the initial error and the subsequent loop:
{noformat}
2016-04-12 08:46:23,002 [ERROR] [Dispatcher thread {Central}] |impl.VertexImpl|: Uncaught Exception when handling event V_SOURCE_VERTEX_RECOVERED on vertex scope-4784 with vertexId vertex_1459233834927_3098531_1_14 at current state RECOVERING
java.lang.RuntimeException: Invalid Vertex state, found non-zero recovered events in invalid state, recoveredState=KILLED, recoveredEvents=3840
at org.apache.tez.dag.app.dag.impl.VertexImpl$RecoverTransition.transition(VertexImpl.java:3298)
at org.apache.tez.dag.app.dag.impl.VertexImpl$RecoverTransition.transition(VertexImpl.java:3004)
at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1875)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:202)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2115)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2101)
at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
at java.lang.Thread.run(Thread.java:745)
[...]
2016-04-12 08:46:23,062 [ERROR] [Dispatcher thread {Central}] |impl.VertexImpl|: Can't handle Invalid event V_INTERNAL_ERROR on vertex scope-4784 with vertexId vertex_1459233834927_3098531_1_14 at current state RECOVERING
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: V_INTERNAL_ERROR at RECOVERING
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1875)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:202)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2115)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2101)
at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
at java.lang.Thread.run(Thread.java:745)
[...]
2016-04-12 08:46:23,086 [ERROR] [Dispatcher thread {Central}] |impl.VertexImpl|: Can't handle Invalid event V_INTERNAL_ERROR on vertex scope-4784 with vertexId vertex_1459233834927_3098531_1_14 at current state RECOVERING
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: V_INTERNAL_ERROR at RECOVERING
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1875)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:202)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2115)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2101)
at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
at java.lang.Thread.run(Thread.java:745)
[...]
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: V_INTERNAL_ERROR at RECOVERING
at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1875)
at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:202)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2115)
at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:2101)
at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
at java.lang.Thread.run(Thread.java:745)
{noformat}
> Uncaught exception during vertex recovery leads to invalid state transition
> loop
> --------------------------------------------------------------------------------
>
> Key: TEZ-3213
> URL: https://issues.apache.org/jira/browse/TEZ-3213
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.0
> Reporter: Jason Lowe
>
> If an uncaught exception occurs during a state transition from the RECOVERING
> vertex state, V_INTERNAL_ERROR is delivered to the state machine, but that
> event is not handled in the RECOVERING state. The invalid transition in turn
> causes another V_INTERNAL_ERROR event to be delivered to the state machine,
> so it loops, logging the invalid transitions.
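
For illustration, the loop described above can be reproduced with a small toy model. This is a sketch only, not Tez or YARN code: the class `RecoveryLoopSketch`, the `run` method, the `ERROR` state and the `maxEvents` cap are all hypothetical names introduced here, and the "handled" variant shows one assumed remedy (letting RECOVERING accept V_INTERNAL_ERROR), not necessarily the fix applied in Tez.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy model only: these types are hypothetical and are not Tez or YARN classes.
class RecoveryLoopSketch {
    enum State { RECOVERING, ERROR }
    enum Event { V_SOURCE_VERTEX_RECOVERED, V_INTERNAL_ERROR }

    /**
     * Dispatches events until the queue drains or maxEvents have been processed.
     * Returns the number of invalid-transition errors that would be logged.
     */
    static int run(boolean handleInternalError, int maxEvents) {
        State state = State.RECOVERING;
        Deque<Event> queue = new ArrayDeque<>();
        queue.add(Event.V_SOURCE_VERTEX_RECOVERED);
        int invalidTransitions = 0;

        for (int i = 0; i < maxEvents && !queue.isEmpty(); i++) {
            Event event = queue.poll();
            if (state == State.ERROR) {
                // Terminal state in this sketch: later events are ignored.
                continue;
            }
            if (event == Event.V_SOURCE_VERTEX_RECOVERED) {
                // Stands in for the transition that throws an uncaught exception;
                // the handler reacts by enqueueing an internal error event.
                queue.add(Event.V_INTERNAL_ERROR);
            } else if (handleInternalError) {
                // Assumed remedy: RECOVERING accepts V_INTERNAL_ERROR.
                state = State.ERROR;
            } else {
                // No transition registered: count the error and enqueue
                // another internal error, which then fails the same way.
                invalidTransitions++;
                queue.add(Event.V_INTERNAL_ERROR);
            }
        }
        return invalidTransitions;
    }

    public static void main(String[] args) {
        System.out.println("unhandled: " + run(false, 5));
        System.out.println("handled:   " + run(true, 5));
    }
}
```

In the unhandled variant every processed V_INTERNAL_ERROR produces another one, so the queue never drains and only the artificial `maxEvents` cap stops the run. In the handled variant the first V_INTERNAL_ERROR moves the toy vertex to a terminal state and the queue empties.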