Daniel Dai created TEZ-1143:
-------------------------------

             Summary: One-one edge fail when change source edge dynamically
                 Key: TEZ-1143
                 URL: https://issues.apache.org/jira/browse/TEZ-1143
             Project: Apache Tez
          Issue Type: Bug
            Reporter: Daniel Dai
            Assignee: Bikas Saha

One-one edge fails when the parallelism of the source vertex changes dynamically (through a ShuffleVertexManager). Here is the stack:

{code}
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Vertex vertex_1400646157236_0012_1_03 parallelism set to 1 from 20
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000001
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000002
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000003
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000004
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000005
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000006
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000007
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000008
2014-05-21 00:05:55,284 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000009
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000010
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000011
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000012
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000013
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000014
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000015
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000016
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000017
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000018
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Removing task: task_1400646157236_0012_1_03_000019
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Replacing edge manager for source:scope-41 destination: vertex_1400646157236_0012_1_03
2014-05-21 00:05:55,285 INFO [AsyncDispatcher event handler] org.apache.tez.dag.history.HistoryEventHandler: [HISTORY][DAG:dag_1400646157236_0012_1][Event:VERTEX_PARALLELISM_UPDATED]: vertexId=vertex_1400646157236_0012_1_03, numTasks=1, vertexLocationHint=null, edgeManagersCount=1
2014-05-21 00:05:55,286 INFO [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.DAGImpl: Vertex vertex_1400646157236_0012_1_02 completed., numCompletedVertices=3, numSuccessfulVertices=3, numFailedVertices=0, numKilledVertices=0, numVertices=7
2014-05-21 00:05:55,287 ERROR [AsyncDispatcher event handler] org.apache.tez.dag.app.dag.impl.VertexImpl: Can't handle Invalid event V_ONE_TO_ONE_SOURCE_SPLIT on vertex scope-61 with vertexId vertex_1400646157236_0012_1_05 at current state RUNNING
org.apache.hadoop.yarn.state.InvalidStateTransitonException: Invalid event: V_ONE_TO_ONE_SOURCE_SPLIT at RUNNING
	at org.apache.hadoop.yarn.state.StateMachineFactory.doTransition(StateMachineFactory.java:305)
	at org.apache.hadoop.yarn.state.StateMachineFactory.access$300(StateMachineFactory.java:46)
	at org.apache.hadoop.yarn.state.StateMachineFactory$InternalStateMachine.doTransition(StateMachineFactory.java:448)
	at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:1263)
	at org.apache.tez.dag.app.dag.impl.VertexImpl.handle(VertexImpl.java:158)
	at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1716)
	at org.apache.tez.dag.app.DAGAppMaster$VertexEventDispatcher.handle(DAGAppMaster.java:1702)
	at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:134)
	at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:81)
	at java.lang.Thread.run(Thread.java:695)
{code}

The complete AM log is attached. scope-42 is the source vertex and scope-61 is the destination vertex.

--
This message was sent by Atlassian JIRA
(v6.2#6252)
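
For readers unfamiliar with this class of error, the failure in the trace (an event delivered to a vertex whose current state has no transition registered for it) can be reproduced with a toy state machine. This is only a sketch in Python, not Tez or YARN code: `ToyVertex`, `TRANSITIONS`, `InvalidStateTransition`, and `replay` are hypothetical names, and the transition table is an assumption made for illustration. The trace itself shows only that `V_ONE_TO_ONE_SOURCE_SPLIT` is rejected in `RUNNING`; it does not show in which states the real `VertexImpl` accepts it.

```python
from enum import Enum, auto


class State(Enum):
    INITIALIZING = auto()
    RUNNING = auto()


class Event(Enum):
    V_START = auto()
    V_ONE_TO_ONE_SOURCE_SPLIT = auto()


class InvalidStateTransition(Exception):
    """Raised when no transition is registered for a (state, event) pair."""


# Assumed table for illustration only: the split event is registered while the
# vertex is still initializing, but not once it is running.
TRANSITIONS = {
    (State.INITIALIZING, Event.V_ONE_TO_ONE_SOURCE_SPLIT): State.INITIALIZING,
    (State.INITIALIZING, Event.V_START): State.RUNNING,
}


class ToyVertex:
    def __init__(self, name):
        self.name = name
        self.state = State.INITIALIZING

    def handle(self, event):
        # Look up the transition; an unregistered pair is an error.
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise InvalidStateTransition(
                f"Invalid event: {event.name} at {self.state.name}"
            )
        self.state = TRANSITIONS[key]


def replay():
    """Start the destination vertex, then deliver a late source-split event."""
    vertex = ToyVertex("scope-61")
    vertex.handle(Event.V_START)
    try:
        vertex.handle(Event.V_ONE_TO_ONE_SOURCE_SPLIT)
    except InvalidStateTransition as exc:
        return str(exc)
    return None
```

In this sketch, the same event is accepted if it arrives before `V_START`, so the error depends on the order in which the source's parallelism change and the destination's start are processed.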