[ 
https://issues.apache.org/jira/browse/TEZ-2568?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14594349#comment-14594349
 ] 

Bikas Saha commented on TEZ-2568:
---------------------------------

I think I found the issue. The V_INPUT_DATA_INFORMATION event is supposed to 
arrive before the vertex moves to the INITED state, because it carries events 
that must go to the data source inputs of this vertex's tasks. So the vertex 
cannot be inited without it.

The bug is here in InputDataInformationTransition:
{code}
      // done. check if we need to do the initialization
      if (vertex.getState() == VertexState.INITIALIZING && vertex.initWaitsForRootInitializers) {
        if (vertex.numInitializedInputs == vertex.inputsWithInitializers.size()) { // <<======== HERE
          // set the wait flag to false if all initializers are done
          vertex.initWaitsForRootInitializers = false;
        }
{code}
vertex.numInitializedInputs is incremented in RootInputInitializedTransition, 
which is fine because it counts how many initializers have completed. For 
InputDataInformationTransition we need a new counter, say 
inputDataInformationEventsReceived, that counts the number of 
V_INPUT_DATA_INFORMATION events received, and we should use it for the check 
"if (vertex.inputDataInformationEventsReceived == 
vertex.inputsWithInitializers.size())". We can add this check to the existing 
conditions in the same if statement. This ensures that all root input 
initializers have completed and all input data information events have been 
handled (i.e., all the root input initializer results have been taken care of 
by the VM) before "vertex.initWaitsForRootInitializers = false" is executed.
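
To make the proposal concrete, here is a rough, standalone sketch of the 
two-counter check. It is not the actual VertexImpl code: the class 
VertexInitSketch and its two on...() methods are hypothetical stand-ins for 
the transitions, the INITIALIZING state check is left out for brevity, and 
only the field names are taken from the snippet and proposal above.

```java
import java.util.Set;

// Hypothetical stand-in for the relevant slice of vertex state; NOT the real Tez VertexImpl.
class VertexInitSketch {
    final Set<String> inputsWithInitializers;
    int numInitializedInputs = 0;               // existing counter: initializers that have completed
    int inputDataInformationEventsReceived = 0; // proposed counter: V_INPUT_DATA_INFORMATION events seen
    boolean initWaitsForRootInitializers = true;

    VertexInitSketch(Set<String> inputsWithInitializers) {
        this.inputsWithInitializers = inputsWithInitializers;
    }

    // Models what RootInputInitializedTransition does to the existing counter.
    void onRootInputInitialized() {
        numInitializedInputs++;
    }

    // Models InputDataInformationTransition with the extra condition added.
    void onInputDataInformation() {
        inputDataInformationEventsReceived++;
        if (initWaitsForRootInitializers
                && numInitializedInputs == inputsWithInitializers.size()
                && inputDataInformationEventsReceived == inputsWithInitializers.size()) {
            // clear the wait flag only when all initializers are done
            // AND all of their data information events have been handled
            initWaitsForRootInitializers = false;
        }
    }
}
```

With two inputs, calling onRootInputInitialized() twice and then 
onInputDataInformation() once leaves initWaitsForRootInitializers set to true 
in this sketch; it is cleared only after the second onInputDataInformation() 
call. With the single-counter check, the flag would already be cleared on the 
first call.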


> V_INPUT_DATA_INFORMATION may happen after vertex is initialized
> ---------------------------------------------------------------
>
>                 Key: TEZ-2568
>                 URL: https://issues.apache.org/jira/browse/TEZ-2568
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Jeff Zhang
>            Assignee: Jeff Zhang
>         Attachments: a.log
>
>
> {code}
> 2015-06-19 15:57:28,462 ERROR [Dispatcher thread: Central] impl.VertexImpl: Can't handle Invalid event V_INPUT_DATA_INFORMATION on vertex Map 2 with vertexId vertex_1434754502979_0002_2_00 at current state INITED
> org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: V_INPUT_DATA_INFORMATION at INITED
>         at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
>         at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
>         at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
>         at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
>         at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1799)
>         at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:198)
>         at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1963)
>         at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1949)
>         at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
>         at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
>         at java.lang.Thread.run(Thread.java:722)
> {code}
> A vertex moves to INITED as soon as its parallelism is determined, it has no 
> null edges, and its root inputs are initialized. RootInputDataInformation 
> handling is not a precondition for the vertex moving to INITED. We can't wait 
> in the INITIALIZING state for all the V_INPUT_DATA_INFORMATION events, 
> because it is not known how many V_INPUT_DATA_INFORMATION events we may 
> receive; that is determined by the VM. So we will allow 
> V_INPUT_DATA_INFORMATION to happen once the vertex is initialized.



