Jeff Zhang created TEZ-2107:
-------------------------------

             Summary: Recovery failure in the case of Auto-reduce parallelism
                 Key: TEZ-2107
                 URL: https://issues.apache.org/jira/browse/TEZ-2107
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Jeff Zhang
            Assignee: Jeff Zhang


The following error occurs during recovery when auto-reduce parallelism has 
been applied. The number of tasks is reduced from 2 to 1, but the upstream 
vertex's DataMovementEvent is still routed to task 2, which was removed by 
auto-reduce parallelism.
{code}
2015-02-16 09:11:54,587 FATAL [Dispatcher thread: Central] common.AsyncDispatcher: Error in dispatcher thread
org.apache.tez.dag.api.TezUncheckedException: Unexpected null task. sourceVertex=vertex_1424048826974_0002_1_00 [scope-47] srcTaskIndex = 0 destVertex=vertex_1424048826974_0002_1_01 [scope-50] destTaskIndex=1 destNumTasks=1 edgeManager=org.apache.tez.dag.app.dag.impl.ScatterGatherEdgeManager
    at org.apache.tez.dag.app.dag.impl.Edge.sendDmEventOrIfEventToTasks(Edge.java:358)
    at org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:422)
    at org.apache.tez.dag.app.dag.impl.Edge.handleCompositeDataMovementEvent(Edge.java:310)
    at org.apache.tez.dag.app.dag.impl.Edge.sendTezEventToDestinationTasks(Edge.java:378)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.handleRoutedTezEvents(VertexImpl.java:3795)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.access$3600(VertexImpl.java:187)
    at org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:3708)
    at org.apache.tez.dag.app.dag.impl.VertexImpl$RouteEventTransition.transition(VertexImpl.java:3700)
    at org.apache.hadoop.yarn.state.StateMachineFactory$MultipleInternalArc.doTransition(StateMachineFactory.java:385)
    at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:302)
    at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
    at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
    at org.apache.tez.state.StateMachineTez.doTransition(StateMachineTez.java:57)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1575)
    at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:186)
    at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1802)
    at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1788)
    at org.apache.tez.common.AsyncDispatcher.dispatch(AsyncDispatcher.java:184)
    at org.apache.tez.common.AsyncDispatcher$1.run(AsyncDispatcher.java:115)
    at java.lang.Thread.run(Thread.java:745)
{code}
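
The mismatch in the log (destTaskIndex=1 with destNumTasks=1) can be 
illustrated with a minimal, self-contained sketch. This is not Tez code: 
RoutingSketch, lookupTask and isRoutable are hypothetical names used only to 
show how an index computed under the old parallelism (2 tasks) becomes 
invalid once the task list has shrunk to 1, assuming tasks are addressed by 
a zero-based index.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names are Tez classes or APIs.
final class RoutingSketch {

    /** True when a destination index is valid for the current task count. */
    static boolean isRoutable(int destTaskIndex, int destNumTasks) {
        return destTaskIndex >= 0 && destTaskIndex < destNumTasks;
    }

    /** Looks up the destination task, failing loudly on a stale index. */
    static String lookupTask(List<String> destTasks, int destTaskIndex) {
        if (!isRoutable(destTaskIndex, destTasks.size())) {
            throw new IllegalStateException("Unexpected null task. destTaskIndex="
                + destTaskIndex + " destNumTasks=" + destTasks.size());
        }
        return destTasks.get(destTaskIndex);
    }

    public static void main(String[] args) {
        // Original parallelism: two downstream tasks (indices 0 and 1).
        List<String> destTasks = new ArrayList<>();
        destTasks.add("task_0");
        destTasks.add("task_1");

        // An event was routed to index 1 before parallelism was reduced.
        int staleDestTaskIndex = 1;

        // Auto-reduce parallelism drops the second task (index 1).
        destTasks.remove(1);

        // Replaying the old routing decision now targets a missing task.
        try {
            lookupTask(destTasks, staleDestTaskIndex);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the failure only goes away if the replayed event is re-routed 
against the reduced task count rather than the original one; whether that is 
the appropriate fix for the recovery path in Tez is not established here.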



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
