yinqiang created SPARK-14890:
--------------------------------
Summary: DAGScheduler should not accept the result of a previous
task attempt, since its stage has been completed.
Key: SPARK-14890
URL: https://issues.apache.org/jira/browse/SPARK-14890
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 1.6.1
Environment: spark1.6.1 hadoop-2.6.0-cdh5.4.2
Reporter: yinqiang
......
16/04/14 17:07:28 INFO TaskSetManager: Starting task 109.0 in stage 46.0 (TID
18023, cnsz033569.app.paic.com.cn, partition 109,RACK_LOCAL, 2316 bytes)
......
16/04/14 17:08:32 WARN TaskSetManager: Lost task 109.0 in stage 46.0 (TID
18023, cnsz033569.app.paic.com.cn): ExecutorLostFailure (executor 23 exited
caused by one of the running tasks) Reason: Container marked as failed:
container_1460459369308_5903_01_000035 on host: cnsz033569.app.paic.com.cn. Exit status: 143.
Diagnostics: Container killed on request. Exit code is 143
......
16/04/14 17:08:37 INFO TaskSetManager: Starting task 109.1 in stage 46.0 (TID
20237, cnsz033561.app.paic.com.cn, partition 109,RACK_LOCAL, 2316 bytes)
......
16/04/14 17:08:54 WARN TaskSetManager: Lost task 109.1 in stage 46.0 (TID
20237, cnsz033561.app.paic.com.cn): ExecutorLostFailure (executor 6 exited
caused by one of the running tasks) Reason: Container marked as failed:
container_1460459369308_5903_01_000007 on host: cnsz033561.app.paic.com.cn. Exit status: 143.
Diagnostics: Container killed on request. Exit code is 143
......
16/04/14 17:09:38 INFO TaskSetManager: Starting task 109.2 in stage 46.0 (TID
21034, cnsz033580.app.paic.com.cn, partition 109,RACK_LOCAL, 2316 bytes)
......
16/04/14 17:10:41 INFO YarnScheduler: Removed TaskSet 46.0, whose tasks have
all completed, from pool
......
16/04/14 17:10:41 INFO DAGScheduler: Ignoring possibly bogus ShuffleMapTask(46,
109) completion from executor 23
......
16/04/14 17:10:46 INFO TaskSetManager: Ignoring task-finished event for 109.1
in stage 46.0 because task 109 has already completed successfully
16/04/14 17:10:46 INFO DAGScheduler: Ignoring possibly bogus ShuffleMapTask(46,
109) completion from executor 6
......
16/04/14 17:10:47 INFO TaskSetManager: Ignoring task-finished event for 109.2
in stage 46.0 because task 109 has already completed successfully
......
16/04/14 17:10:47 ERROR DAGSchedulerEventProcessLoop:
DAGSchedulerEventProcessLoop failed; shutting down SparkContext
java.lang.IllegalStateException: more than one active taskSet for stage 46: 46.2,46.1
at org.apache.spark.scheduler.TaskSchedulerImpl.submitTasks(TaskSchedulerImpl.scala:173)
at org.apache.spark.scheduler.DAGScheduler.submitMissingTasks(DAGScheduler.scala:1052)
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$submitStage(DAGScheduler.scala:921)
at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:1214)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1637)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1599)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1588)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
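
The failure above can be illustrated with a toy model. This is only a sketch in Python, not Spark's scheduler code: the class ToyScheduler and its methods submit_tasks, mark_finished and accept_completion are hypothetical names made up for illustration. It assumes that the scheduler rejects a second non-zombie task set for the same stage (which is what the exception message suggests), and it shows one possible reading of the summary: once a stage has completed, late task results for it are ignored instead of being processed.

```python
# Toy model only: hypothetical names, not Spark's actual scheduler code.
class TaskSetConflict(RuntimeError):
    """Raised when a stage ends up with two active (non-zombie) task sets."""


class ToyScheduler:
    def __init__(self):
        self.task_sets = {}    # stage id -> {attempt id: is_zombie}
        self.finished = set()  # stage ids that have already completed

    def submit_tasks(self, stage, attempt):
        """Register a new task set and fail if another one is still active."""
        attempts = self.task_sets.setdefault(stage, {})
        attempts[attempt] = False
        active = sorted(a for a, zombie in attempts.items() if not zombie)
        if len(active) > 1:
            ids = ",".join(f"{stage}.{a}" for a in active)
            raise TaskSetConflict(
                f"more than one active taskSet for stage {stage}: {ids}")

    def mark_finished(self, stage):
        """Mark the stage complete and turn all of its task sets into zombies."""
        self.finished.add(stage)
        for attempt in self.task_sets.get(stage, {}):
            self.task_sets[stage][attempt] = True

    def accept_completion(self, stage):
        """Assumed guard: ignore late task results for a completed stage."""
        return stage not in self.finished


if __name__ == "__main__":
    scheduler = ToyScheduler()
    scheduler.submit_tasks(46, 1)
    try:
        scheduler.submit_tasks(46, 2)  # second active task set for stage 46
    except TaskSetConflict as err:
        print("rejected:", err)
```

In this model, a completion event for a finished stage is dropped by accept_completion before it can lead to another submit_tasks call, so the conflicting resubmission never happens. Whether the real fix belongs in DAGScheduler.handleTaskCompletion or elsewhere is not something this sketch decides.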