Github user viirya commented on a diff in the pull request:

    https://github.com/apache/spark/pull/21494#discussion_r193648185
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -1310,6 +1311,44 @@ class DAGScheduler(
                 }
             }
     
    +      case failure: TaskFailedReason if task.isBarrier =>
    +        // Always fail the current stage and retry all the tasks when a barrier task fails.
    +        val failedStage = stageIdToStage(task.stageId)
    +        logInfo(s"Marking $failedStage (${failedStage.name}) as failed " +
    +          "due to a barrier task failure.")
    +        val message = "Stage failed because a barrier task finished unsuccessfully. " +
    +          s"${failure.toErrorString}"
    +        try { // cancelTasks will fail if a SchedulerBackend does not implement killTask
    +          taskScheduler.cancelTasks(stageId, interruptThread = false)
    +        } catch {
    +          case e: UnsupportedOperationException =>
    +            logInfo(s"Could not cancel tasks for stage $stageId", e)
    --- End diff --
    
    Under barrier execution, will it be a problem if we cannot cancel the tasks?
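    For context, the fallback the diff relies on can be sketched in isolation. This is a minimal, hypothetical model (the trait and object names below are invented for illustration, not Spark's real classes): a backend without task-kill support throws `UnsupportedOperationException` from `cancelTasks`, and the caller only logs and reports that the cancellation did not happen, which is exactly the situation the question above is about.
    
    ```scala
    // Hypothetical stand-in for the TaskScheduler.cancelTasks contract.
    trait StubTaskScheduler {
      def cancelTasks(stageId: Int, interruptThread: Boolean): Unit
    }
    
    // Models a SchedulerBackend that does not implement killTask:
    // cancellation is simply unsupported.
    object NoKillScheduler extends StubTaskScheduler {
      def cancelTasks(stageId: Int, interruptThread: Boolean): Unit =
        throw new UnsupportedOperationException("killTask is not supported")
    }
    
    // Mirrors the try/catch in the diff: attempt to cancel, and on
    // UnsupportedOperationException log and return false, so the old
    // tasks of the failed barrier stage may keep running as zombies.
    def tryCancel(scheduler: StubTaskScheduler, stageId: Int): Boolean =
      try {
        scheduler.cancelTasks(stageId, interruptThread = false)
        true
      } catch {
        case e: UnsupportedOperationException =>
          println(s"Could not cancel tasks for stage $stageId: ${e.getMessage}")
          false
      }
    ```
    
    With such a backend, `tryCancel(NoKillScheduler, 1)` returns `false`, i.e. the stage is retried while the previous attempt's tasks are never actually killed.
    
    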


---
