GitHub user kayousterhout commented on a diff in the pull request:

    https://github.com/apache/spark/pull/16270#discussion_r92276753
  
    --- Diff: core/src/test/scala/org/apache/spark/scheduler/SchedulerIntegrationSuite.scala ---
    @@ -157,8 +160,16 @@ abstract class SchedulerIntegrationSuite[T <: MockBackend: ClassTag] extends Spa
           }
          // When a job fails, we terminate before waiting for all the task end events to come in,
          // so there might still be a running task set.  So we only check these conditions
    -      // when the job succeeds
    -      assert(taskScheduler.runningTaskSets.isEmpty)
    +      // when the job succeeds.
    +      // When the final task of a taskset completes, we post
    +      // the event to the DAGScheduler event loop before we finish processing in the taskscheduler
    +      // thread.  It's possible the DAGScheduler thread processes the event, finishes the job,
    +      // and notifies the job waiter before our original thread in the task scheduler finishes
    +      // handling the event and marks the taskset as complete.  So it's ok if we need to wait a
    +      // *little* bit longer for the original taskscheduler thread to finish up to deal with the race.
    +      eventually(timeout(1 second), interval(100 millis)) {
    --- End diff --
    
    I might do 10 millis here, since there doesn't seem to be much cost to checking more frequently, and 100 millis is somewhat long to wait.
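
    A minimal sketch of that tweak, assuming ScalaTest's `Eventually` and
    `SpanSugar` imports (the suite may already have equivalents in scope),
    with the assertion moved inside the polled block as in the diff above:

        import scala.language.postfixOps
        import org.scalatest.concurrent.Eventually._
        import org.scalatest.time.SpanSugar._

        // Poll every 10 millis instead of 100: the assertion is cheap, so a
        // tighter interval just lets the test finish sooner once the task
        // scheduler thread has marked the taskset complete.
        eventually(timeout(1 second), interval(10 millis)) {
          assert(taskScheduler.runningTaskSets.isEmpty)
        }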

