GitHub user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/8559#discussion_r38451212
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/DAGScheduler.scala ---
    @@ -1123,8 +1133,15 @@ class DAGScheduler(
                 // TODO: Cancel running tasks in the stage
                 logInfo(s"Resubmitting $mapStage (${mapStage.name}) and " +
                   s"$failedStage (${failedStage.name}) due to fetch failure")
    +            // We might get lots of fetch failed for this stage, from lots of executors.
    +            // Its better if we can resubmit for all the failed executors at one time, so lets
    +            // just wait a *bit* before we resubmit.
    --- End diff --
    
    I have no idea if this comment is accurate or not -- I just found this
    pretty confusing and felt it deserved some comment, so I took my best guess.
    The tests in `DAGSchedulerSuite` pass if you just post this event directly
    (with other corresponding changes to go along with it ...)

