[ 
https://issues.apache.org/jira/browse/TEZ-3336?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373680#comment-15373680
 ] 

Jason Lowe commented on TEZ-3336:
---------------------------------

One example of the failure:
{noformat}
Vertex failed, vertexName=Map 1, vertexId=vertex_1467094199147_3081640_1_01, diagnostics=[Vertex vertex_1467094199147_3081640_1_01 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: input initializer failed, vertex=vertex_1467094199147_3081640_1_01 [Map 1], java.lang.UnsupportedOperationException: Not expecting to handle any events
        at org.apache.tez.mapreduce.common.MRInputAMSplitGenerator.handleInputInitializerEvent(MRInputAMSplitGenerator.java:170)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InitializerWrapper.sendEvents(RootInputInitializerManager.java:501)
        at org.apache.tez.dag.app.dag.RootInputInitializerManager$InitializerWrapper.onTaskSucceeded(RootInputInitializerManager.java:451)
        at org.apache.tez.dag.app.dag.StateChangeNotifier.taskSucceeded(StateChangeNotifier.java:290)
        at org.apache.tez.dag.app.dag.impl.TaskImpl$TaskStateChangedCallback.onStateChanged(TaskImpl.java:1524)
        at org.apache.tez.dag.app.dag.impl.TaskImpl$TaskStateChangedCallback.onStateChanged(TaskImpl.java:1508)
        at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:61)
        at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:918)
        at org.apache.tez.dag.app.dag.impl.TaskImpl.handle(TaskImpl.java:112)
        at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:2068)
        at org.apache.tez.dag.app.DAGAppMaster$TaskEventDispatcher.handle(DAGAppMaster.java:2054)
        at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:183)
        at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:114)
        at java.lang.Thread.run(Thread.java:745)
]
{noformat}

RootInputInitializerManager hands the input initializers off to a thread pool 
and listens for vertex/task events while those initializers are running.  Once 
an initializer completes, the manager unregisters from those events.  If the 
initializer completes before an upstream task succeeds we're OK, but if a task 
succeeds first, the manager forwards the resulting events to the initializer, 
and MRInputAMSplitGenerator throws because it doesn't expect any events.
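
To make the ordering dependence concrete, here is a minimal sketch of the two 
orderings.  The classes below are simplified, hypothetical stand-ins and not 
the actual Tez code: they only model "registered while the initializer runs" 
and "initializer rejects every event", and they run on a single thread so the 
ordering is explicit rather than timing-dependent.

```java
import java.util.ArrayList;
import java.util.List;

public class InitializerRaceSketch {

    /** Hypothetical stand-in for an initializer that rejects all events. */
    static class SplitGenerator {
        void handleEvents(List<String> events) {
            throw new UnsupportedOperationException(
                "Not expecting to handle any events");
        }
    }

    /** Hypothetical stand-in for the manager-side wrapper around one initializer. */
    static class InitializerWrapper {
        private final SplitGenerator initializer = new SplitGenerator();
        // The wrapper listens for task events while the initializer is running.
        private boolean registered = true;

        /** Called when the initializer finishes: stop listening for events. */
        void onInitializerComplete() {
            registered = false;
        }

        /** Called when an upstream task succeeds. */
        void onTaskSucceeded(String taskId) {
            if (!registered) {
                return; // initializer already done, nothing is forwarded
            }
            List<String> events = new ArrayList<>();
            events.add("TASK_SUCCEEDED:" + taskId);
            initializer.handleEvents(events); // throws for this initializer
        }
    }

    /** Returns true if the given ordering ends in the failure. */
    static boolean fails(boolean taskSucceedsFirst) {
        InitializerWrapper wrapper = new InitializerWrapper();
        try {
            if (taskSucceedsFirst) {
                wrapper.onTaskSucceeded("task_0");
                wrapper.onInitializerComplete();
            } else {
                wrapper.onInitializerComplete();
                wrapper.onTaskSucceeded("task_0");
            }
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("initializer completes first, fails: " + fails(false));
        System.out.println("task succeeds first, fails: " + fails(true));
    }
}
```

In this sketch only the "task succeeds first" ordering hits the exception, 
which matches the intermittent nature of the failure.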

Looks like MRInputSplitDistributor could have the same issue, and a fix for 
TEZ-3274 would aggravate the issue further.


> Hive map-side join job sometimes fails with ROOT_INPUT_INIT_FAILURE
> -------------------------------------------------------------------
>
>                 Key: TEZ-3336
>                 URL: https://issues.apache.org/jira/browse/TEZ-3336
>             Project: Apache Tez
>          Issue Type: Bug
>    Affects Versions: 0.7.1
>            Reporter: Jason Lowe
>
> When Hive does a map-side join it can generate a DAG where a vertex has two 
> inputs, one from an upstream task and another using MRInputAMSplitGenerator.  
> If it takes a while for MRInputAMSplitGenerator to compute the splits and one 
> of the tasks for the other upstream vertex completes then the job can fail 
> with an error since MRInputAMSplitGenerator does not expect to receive any 
> events.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
