[
https://issues.apache.org/jira/browse/TEZ-2022?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301713#comment-14301713
]
Bikas Saha commented on TEZ-2022:
---------------------------------
This Precondition is not valid. A vertex can be in the INITIALIZING state if
its edges are not defined or its parallelism is not set. This can happen at any
vertex in the graph, not only at a root vertex reading primary inputs. Even a
root vertex can have another vertex as a source.
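To illustrate the point above, here is a minimal, hypothetical sketch of a vertex that stays in INITIALIZING until both its edges and its parallelism are known, and that records an early start event instead of asserting on it. The class `VertexSketch`, its `State` enum, and all method names are made up for illustration. This is not the actual `VertexImpl` code and not necessarily how the fix should look.

```java
/** Hypothetical sketch only; names and logic are NOT taken from Tez's VertexImpl. */
final class VertexSketch {
    enum State { INITIALIZING, INITED, RUNNING }

    private State state = State.INITIALIZING;
    private boolean edgesDefined;
    private boolean parallelismSet;
    // Remembers a start event that arrived before initialization finished.
    private boolean startPending;

    VertexSketch(boolean edgesDefined, boolean parallelismSet) {
        this.edgesDefined = edgesDefined;
        this.parallelismSet = parallelismSet;
        maybeFinishInit();
    }

    void onEdgesDefined() {
        edgesDefined = true;
        maybeFinishInit();
    }

    void onParallelismSet() {
        parallelismSet = true;
        maybeFinishInit();
    }

    /** A start event is legal in INITIALIZING for any vertex, so defer it rather than fail. */
    void onStart() {
        if (state == State.INITIALIZING) {
            startPending = true;
        } else if (state == State.INITED) {
            state = State.RUNNING;
        }
    }

    /** Leaves INITIALIZING only once both the edges and the parallelism are known. */
    private void maybeFinishInit() {
        if (state == State.INITIALIZING && edgesDefined && parallelismSet) {
            state = startPending ? State.RUNNING : State.INITED;
            startPending = false;
        }
    }

    State state() {
        return state;
    }

    public static void main(String[] args) {
        VertexSketch v = new VertexSketch(false, false);
        v.onStart();           // arrives early; the vertex is still INITIALIZING
        v.onEdgesDefined();
        v.onParallelismSet();  // initialization completes, deferred start is applied
        System.out.println(v.state()); // prints RUNNING
    }
}
```

In this sketch nothing depends on whether the vertex is a root vertex; only the two missing pieces of information keep it in INITIALIZING.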
In this specific case, the attached app log suggests that recovery kicked in
and the vertex received a start event before the recovery code moved it out of
INITIALIZING. Are the logs from the first attempt available?
> java.lang.IllegalStateException: Vertex: got invalid start event
> ----------------------------------------------------------------
>
> Key: TEZ-2022
> URL: https://issues.apache.org/jira/browse/TEZ-2022
> Project: Apache Tez
> Issue Type: Bug
> Reporter: Rajesh Balamohan
> Attachments: TEZ-2022-DAG.png, TEZ-2022.applog.txt
>
>
> Latest Tez (with the TEZ-2020 patch) + Pig
> When running Pig with the RANK function, the following exception occurs
> consistently:
> {code}
> java.lang.IllegalStateException: Vertex: vertex_1422270854961_0113_1_03 [scope-47] got invalid start event
> at com.google.common.base.Preconditions.checkState(Preconditions.java:145)
> at org.apache.tez.dag.app.dag.impl.VertexImpl$StartWhileInitializingTransition.transition(VertexImpl.java:3178)
> at org.apache.tez.dag.app.dag.impl.VertexImpl$StartWhileInitializingTransition.transition(VertexImpl.java:3170)
> at org.apache.hadoop.yarn.state.StateMachineFactory$SingleInternalArc.doTransition(StateMachineFactory.java:362)
> at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
> at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
> at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
> at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
> at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1547)
> at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:181)
> at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1768)
> at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1754)
> at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
> at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:116)
> at java.lang.Thread.run(Thread.java:744)
> {code}
> {code}
> rawLogs = load '/tmp/logs/root/logs/application_1422270854961_0093/' using org.apache.tez.tools.TFileLoader() as (machine:chararray, key:chararray, line:chararray);
> raw = FOREACH rawLogs GENERATE TRIM(REGEX_EXTRACT(machine, '(.*)_(\\d+)', 1)) as machine, key, line;
> machines = FOREACH raw GENERATE machine;
> distinctMachines = DISTINCT machines;
> sortByMachines = ORDER distinctMachines BY machine;
> ranked = RANK sortByMachines;
> dump ranked;
> {code}
> Will attach the DAG and applog asap.