Github user ankuriitg commented on a diff in the pull request:
https://github.com/apache/spark/pull/22209#discussion_r214372932
--- Diff:
core/src/main/scala/org/apache/spark/status/AppStatusListener.scala ---
@@ -350,11 +350,20 @@ private[spark] class AppStatusListener(
val e = it.next()
if (job.stageIds.contains(e.getKey()._1)) {
val stage = e.getValue()
- stage.status = v1.StageStatus.SKIPPED
- job.skippedStages += stage.info.stageId
- job.skippedTasks += stage.info.numTasks
- it.remove()
- update(stage, now)
+ if (v1.StageStatus.PENDING.equals(stage.status)) {
--- End diff --
In this PR, I am only trying to fix any issues which are caused by
out-of-order events leaving the ones caused by dropped events.
So, with that information, onJobEnd event does not do anything for active
stages. Active stages are updated in onTaskEnd event or onStageCompleted event.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]