[ https://issues.apache.org/jira/browse/TEZ-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373680#comment-15373680 ]
Jason Lowe commented on TEZ-3336:
---------------------------------
One example of the failure:
{noformat}
Vertex failed, vertexName=Map 1, vertexId=vertex_1467094199147_3081640_1_01,
diagnostics=[Vertex vertex_1467094199147_3081640_1_01 [Map 1] killed/failed due
to:ROOT_INPUT_INIT_FAILURE, Vertex Input: input initializer failed,
vertex=vertex_1467094199147_3081640_1_01 [Map 1],
java.lang.UnsupportedOperationException: Not expecting to handle any events
at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.handleInputInitializerEvent(MRInputAMSplitGenerator.java:170)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InitializerWrapper.sendEvents(RootInputInitializerManager.java:501)
at org.apache.tez.dag.app.dag.RootInputInitializerManager$InitializerWrapper.onTaskSucceeded(RootInputInitializerManager.java:451)
at org.apache.tez.dag.app.dag.StateChangeNotifier.taskSucceeded(StateChangeNotifier.java:290)
at org.apache.tez.dag.app.dag.impl.TaskImpl$TaskStateChangedCallback.onStateChanged(TaskImpl.java:1524)
at org.apache.tez.dag.app.dag.impl.TaskImpl$TaskStateChangedCallback.onStateChanged(TaskImpl.java:1508)
at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:61)
at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:918)
at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:112)
at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:2068)
at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:2054)
at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
at java.lang.Thread.run(Thread.java:745)
]
{noformat}
RootInputInitializerManager hands the input initializers off to a thread pool
and listens for vertex/task events while those initializers are running. Once
they complete, it unregisters from those events. If the initializer completes
before an upstream task succeeds we're OK, but if a task succeeds first, the
manager ends up sending events to an initializer that doesn't expect any, and
the initializer throws the UnsupportedOperationException shown above.
Looks like MRInputSplitDistributor could have the same issue, and a fix for
TEZ-3274 would aggravate the issue further.
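
To make the ordering dependence concrete, here is a minimal, single-threaded
sketch of the two interleavings. It is not the Tez code: NoEventInitializer,
Manager, and fails() are hypothetical stand-ins, and the "registered" flag is
only an assumed simplification of how the manager listens for task events
while an initializer is still running.

```java
// Hypothetical stand-ins for illustration only; these are not the real Tez classes.
class InitializerRaceSketch {

    /** Mimics an initializer that, like the one in the trace, rejects every event. */
    static class NoEventInitializer {
        void handleEvent(String event) {
            throw new UnsupportedOperationException("Not expecting to handle any events");
        }
    }

    /** Mimics a manager that forwards task-success events while it is still registered. */
    static class Manager {
        private final NoEventInitializer initializer = new NoEventInitializer();
        // Assumption: the manager listens for as long as the initializer is running.
        private boolean registered = true;

        void onInitializerComplete() {
            registered = false; // unregister once the initializer has finished
        }

        void onTaskSucceeded(String taskId) {
            if (registered) {
                initializer.handleEvent("task succeeded: " + taskId);
            }
        }
    }

    /** Replays one ordering and reports whether it ends in the exception. */
    static boolean fails(boolean taskSucceedsFirst) {
        Manager manager = new Manager();
        try {
            if (taskSucceedsFirst) {
                manager.onTaskSucceeded("task_0");
                manager.onInitializerComplete();
            } else {
                manager.onInitializerComplete();
                manager.onTaskSucceeded("task_0");
            }
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("initializer completes first, fails: " + fails(false));
        System.out.println("task succeeds first, fails: " + fails(true));
    }
}
```

Running main prints false for the first ordering and true for the second, which
matches the intermittent failures: the outcome depends only on whether split
generation finishes before the first upstream task does.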
> Hive map-side join job sometimes fails with ROOT_INPUT_INIT_FAILURE
> -------------------------------------------------------------------
>
> Key: TEZ-3336
> URL: https://issues.apache.org/jira/browse/TEZ-3336
> Project: Apache Tez
> Issue Type: Bug
> Affects Versions: 0.7.1
> Reporter: Jason Lowe
>
> When Hive does a map-side join it can generate a DAG where a vertex has two
> inputs, one from an upstream task and another using MRInputAMSplitGenerator.
> If MRInputAMSplitGenerator takes a while to compute the splits and one of the
> tasks for the other upstream vertex completes in the meantime, the job can
> fail with an error, since MRInputAMSplitGenerator does not expect to receive
> any events.