[ https://issues.apache.org/jira/browse/SPARK-14649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15242376#comment-15242376 ]
Sital Kedia commented on SPARK-14649:
-------------------------------------

[~kayousterhout] - Any idea how to handle this?

> DagScheduler runs duplicate tasks on fetch failure
> --------------------------------------------------
>
>                 Key: SPARK-14649
>                 URL: https://issues.apache.org/jira/browse/SPARK-14649
>             Project: Spark
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Sital Kedia
>
> While running a job, we found many duplicate tasks running after a fetch
> failure in a stage. The issue is that when resubmitting a stage, the
> DAGScheduler submits all of the stage's pending tasks (tasks whose output
> is not available), even though some of those tasks may already be running
> on the cluster. The DAGScheduler should submit only the non-running tasks
> for the stage.
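For illustration only, here is a minimal Scala sketch of the behavior the report asks for; the names and types are hypothetical and do not correspond to the real DAGScheduler or TaskSetManager internals. The idea is that, when resubmitting a stage after a fetch failure, the pending partitions are filtered against the set of partitions that already have a live task attempt, so only idle work is rescheduled.

{code:scala}
// Hypothetical sketch of the proposed fix; names are illustrative,
// not the actual Spark scheduler APIs.
object ResubmitOnlyNonRunning {

  // Partitions whose output is not yet available for the stage.
  def pendingPartitions(outputAvailable: Array[Boolean]): Seq[Int] =
    outputAvailable.indices.filterNot(i => outputAvailable(i))

  // Submit only partitions with no live task attempt; partitions that
  // are already running on the cluster are skipped, avoiding duplicates.
  def partitionsToSubmit(outputAvailable: Array[Boolean],
                         runningPartitions: Set[Int]): Seq[Int] =
    pendingPartitions(outputAvailable).filterNot(runningPartitions.contains)

  def main(args: Array[String]): Unit = {
    // Partitions 0 and 2 already have map output; 1, 3 and 4 are pending.
    val outputAvailable = Array(true, false, true, false, false)
    // Partition 3 still has a running attempt from the earlier submission.
    val running = Set(3)
    // Prints Vector(1, 4): partition 3 is pending but not resubmitted.
    println(partitionsToSubmit(outputAvailable, running))
  }
}
{code}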