Github user ankuriitg commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22209#discussion_r212773791
  
    --- Diff: 
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
    @@ -350,11 +350,16 @@ private[spark] class AppStatusListener(
             val e = it.next()
             if (job.stageIds.contains(e.getKey()._1)) {
               val stage = e.getValue()
    -          stage.status = v1.StageStatus.SKIPPED
    -          job.skippedStages += stage.info.stageId
    -          job.skippedTasks += stage.info.numTasks
    -          it.remove()
    -          update(stage, now)
    +          // Only update the stage if it has not finished already
    +          if (v1.StageStatus.ACTIVE.equals(stage.status) ||
    +              v1.StageStatus.PENDING.equals(stage.status)) {
    +            stage.status = v1.StageStatus.SKIPPED
    +            job.skippedStages += stage.info.stageId
    +            job.skippedTasks += stage.info.numTasks
    +            job.activeStages -= 1
    +            it.remove()
    --- End diff --
    
    This code now only handles the scenario when onStageCompleted event is 
dropped (not received). If we don't want to handle that scenario, then we can 
remove this part of the code altogether.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to