[GitHub] squito commented on a change in pull request #22806: [SPARK-25250][CORE] : Late zombie task completions handled correctly even before new taskset launched

GitBox Thu, 17 Jan 2019 09:47:11 -0800

squito commented on a change in pull request #22806: [SPARK-25250][CORE] : Late 
zombie task completions handled correctly even before new taskset launched
URL: https://github.com/apache/spark/pull/22806#discussion_r248768327


 ##########
 File path: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala
 ##########
 @@ -286,6 +286,33 @@ private[spark] class TaskSchedulerImpl(
     }
   }
 
+  override def completeTasks(
+    partitionId: Int, stageId: Int, taskInfo: TaskInfo, killTasks: Boolean): Unit = {
+    taskSetsByStageIdAndAttempt.getOrElse(stageId, Map()).values.foreach { tsm =>
+      tsm.partitionToIndex.get(partitionId) match {
+        case Some(index) =>
+          tsm.markPartitionCompleted(index, taskInfo)
+          if (killTasks) {
+            val taskInfoList = tsm.taskAttempts(index)
+            taskInfoList.filter(_.running).foreach { tInfo =>
+              try {
+                killTaskAttempt(tInfo.taskId, false,
+                  s"Partition $partitionId is already completed")
+              } catch {
+                case e: Exception =>
+                  logWarning(s"Unable to kill Task ID ${tInfo.taskId}.")
+              }
+            }
+          }
+
+        case None =>
+          throw new SparkException(s"No corresponding index found for" +
+            s" partition ID $partitionId in TaskSet ${tsm.name}. This is likely a bug" +
 
 Review comment:
   This should not be an exception; it should just be a no-op. You might
have taskset 1 with partitions 1-100; then taskset 2 gets launched after some
partitions from taskset 1 have completed, so it only runs partitions 10-100;
and then taskset 3 gets launched with partitions 1-50 after a different
failure. A completed partition can therefore legitimately be missing from some
of the tasksets for the stage.
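
   A minimal sketch of the no-op behavior, assuming simplified stand-ins
rather than the real scheduler classes: `SimpleTaskSet` and
`CompleteTasksSketch` are hypothetical names used only for illustration, and
the kill-task handling from the diff is left out. It only shows skipping a
taskset that does not contain the partition instead of throwing:

```scala
import scala.collection.mutable

// Hypothetical stand-in for a TaskSetManager: it only tracks which
// partitions it owns and which indices have been marked completed.
final class SimpleTaskSet(val name: String, partitions: Seq[Int]) {
  // Maps a partition ID to its index within this task set.
  val partitionToIndex: Map[Int, Int] = partitions.zipWithIndex.toMap

  // Indices (not partition IDs) that have been marked completed.
  val completed: mutable.Set[Int] = mutable.Set.empty[Int]

  def markPartitionCompleted(index: Int): Unit = {
    completed += index
  }
}

object CompleteTasksSketch {
  // Marks the partition completed in every task set that contains it.
  // A task set without the partition is skipped silently (a no-op)
  // instead of raising an exception.
  def completeTasks(partitionId: Int, taskSets: Seq[SimpleTaskSet]): Unit = {
    taskSets.foreach { ts =>
      ts.partitionToIndex.get(partitionId) match {
        case Some(index) => ts.markPartitionCompleted(index)
        case None => () // partition not part of this task set: nothing to do
      }
    }
  }
}
```

   With tasksets covering partitions 1-100, 10-100, and 1-50, calling
`CompleteTasksSketch.completeTasks(5, ...)` would mark index 4 in the first and
third tasksets and leave the second untouched.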

