[
https://issues.apache.org/jira/browse/TEZ-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594349#comment-14594349
]
Bikas Saha commented on TEZ-2568:
---------------------------------
I think I found the issue. The V_INPUT_DATA_INFORMATION event is supposed to
arrive before the vertex moves to the INITED state, because it carries events
that must go to the data source inputs of this vertex's tasks. So the vertex
cannot be inited without it.
The bug is here in InputDataInformationTransition.
{code}
// done. check if we need to do the initialization
if (vertex.getState() == VertexState.INITIALIZING &&
    vertex.initWaitsForRootInitializers) {
  if (vertex.numInitializedInputs ==
      vertex.inputsWithInitializers.size()) { // <<======== HERE
    // set the wait flag to false if all initializers are done
    vertex.initWaitsForRootInitializers = false;
  }
{code}
vertex.numInitializedInputs is incremented in RootInputInitializedTransition,
which is fine because it counts how many initializers have completed. For
InputDataInformationTransition we need a new counter, say
inputDataInformationEventsReceived, that counts the number of
V_INPUT_DATA_INFORMATION events received, and the check should use that one:
"if (vertex.inputDataInformationEventsReceived ==
vertex.inputsWithInitializers.size())". We can add this check alongside the
current checks in the same if statement. This ensures that all root input
initializers have completed and all input data information events have been
handled (i.e. all the root input initializer results have been taken care of
by the VM) before "vertex.initWaitsForRootInitializers = false" is executed.
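To make the proposal concrete, here is a minimal, self-contained sketch of the
two-counter check. It is not the actual VertexImpl code: the class
VertexInitSketch, its methods, and the direct move to INITED are hypothetical
simplifications, and it assumes exactly one V_INPUT_DATA_INFORMATION event
arrives per root input that has an initializer. Only the field names mirror
the ones discussed above.

```java
// Simplified stand-in for the vertex state discussed above; NOT the real VertexImpl.
class VertexInitSketch {
    enum VertexState { INITIALIZING, INITED }

    VertexState state = VertexState.INITIALIZING;
    boolean initWaitsForRootInitializers = true;

    // Stand-in for vertex.inputsWithInitializers.size().
    final int inputsWithInitializers;
    // Existing counter: how many root input initializers have completed.
    int numInitializedInputs = 0;
    // Proposed new counter: how many V_INPUT_DATA_INFORMATION events were handled.
    int inputDataInformationEventsReceived = 0;

    VertexInitSketch(int inputsWithInitializers) {
        this.inputsWithInitializers = inputsWithInitializers;
    }

    // Models RootInputInitializedTransition: only counts completed initializers.
    void onRootInputInitialized() {
        numInitializedInputs++;
    }

    // Models InputDataInformationTransition with the extra check added.
    void onInputDataInformation() {
        if (state != VertexState.INITIALIZING) {
            // This is the failure from the issue: the event arrived too late.
            throw new IllegalStateException(
                "Invalid event: V_INPUT_DATA_INFORMATION at " + state);
        }
        inputDataInformationEventsReceived++;
        if (initWaitsForRootInitializers
                && numInitializedInputs == inputsWithInitializers
                && inputDataInformationEventsReceived == inputsWithInitializers) {
            // All initializers are done AND all of their results were handled.
            initWaitsForRootInitializers = false;
            // Simplification: the real vertex checks further conditions
            // (parallelism, edges) before it moves to INITED.
            state = VertexState.INITED;
        }
    }

    public static void main(String[] args) {
        VertexInitSketch vertex = new VertexInitSketch(2);
        vertex.onRootInputInitialized();
        vertex.onRootInputInitialized();
        vertex.onInputDataInformation(); // still waiting for the second event
        System.out.println("after first event:  " + vertex.state);
        vertex.onInputDataInformation(); // both counters now match
        System.out.println("after second event: " + vertex.state);
    }
}
```

With only the original numInitializedInputs check, the wait flag in this model
would be cleared as soon as the first V_INPUT_DATA_INFORMATION event is
handled, so the second event could find the vertex already in INITED.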
> V_INPUT_DATA_INFORMATION may happen after vertex is initialized
> ---------------------------------------------------------------
>
> Key: TEZ-2568
> URL: https://issues.apache.org/jira/browse/TEZ-2568
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Jeff Zhang
> Assignee: Jeff Zhang
> Attachments: a.log
>
>
> {code}
> 2015-06-19 15:57:28,462 ERROR [Dispatcher thread: Central] impl.VertexImpl: Can't handle Invalid event V_INPUT_DATA_INFORMATION on vertex Map 2 with vertexId vertex_1434754502979_0002_2_00 at current state INITED
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: V_INPUT_DATA_INFORMATION at INITED
>     at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>     at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>     at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>     at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1799)
>     at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:198)
>     at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1963)
>     at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1949)
>     at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>     at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>     at java.lang.Thread.run(Thread.java:722)
> {code}
> A vertex moves to INITED as soon as its parallelism is determined, it has no
> null edges, and its root inputs are initialized. Handling
> RootInputDataInformation is not a precondition for the vertex moving to
> INITED. We can't wait for all the V_INPUT_DATA_INFORMATION events in the
> INITIALIZING state, because it is not known how many
> V_INPUT_DATA_INFORMATION events we may receive; that is determined by the VM.
> So we will allow V_INPUT_DATA_INFORMATION to arrive after the vertex is
> initialized.