GitHub user squito commented on a diff in the pull request:

    https://github.com/apache/spark/pull/6750#discussion_r33720045
  
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/TaskSchedulerImpl.scala ---
    @@ -163,6 +163,15 @@ private[spark] class TaskSchedulerImpl(
         this.synchronized {
           val manager = createTaskSetManager(taskSet, maxTaskFailures)
           activeTaskSets(taskSet.id) = manager
    +      val stage = taskSet.stageId
    +      val conflictingTaskSet = activeTaskSets.exists { case (id, ts) =>
    +        // if the id matches, it really should be the same taskSet, but in some unit tests
    +        // we add new taskSets with the same id
    +        id != taskSet.id && !ts.isZombie && ts.stageId == stage
    +      }
    +      if (conflictingTaskSet) {
    +        throw new SparkIllegalStateException(s"more than one active taskSet for stage $stage")
    +      }
    --- End diff --
    
    Restoring the comments from the old diff because they are still relevant:
    
    from mark:
    > @kayousterhout How much of a concern should the extra overhead be here? Just wondering whether this (let's hope rare) condition might better be handled only in a non-production environment and behind an if(debug) kind of flag.
    
    from marcelo:
    > Perhaps the code could just look for an existing task set that matches the stage ID of the task set being added? That should be a little better than the filter / groupBy.
    
    Good point: there isn't any need for the `groupBy`, so I've simplified the check to a single `exists`.
    
    I'd really rather leave the check in place.  In fact, I think this fail-fast behavior is especially important in a *production* environment -- that's much better than an infinite loop of failures.
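
To make the check in the diff above easier to follow outside of Spark, here is a minimal standalone sketch. It is not the actual `TaskSchedulerImpl` code: `TaskSetInfo`, `ConflictCheck`, `hasConflict`, and `register` are hypothetical names introduced for illustration, and the standard `IllegalStateException` stands in for the `SparkIllegalStateException` used in the diff. The point it shows is that `exists` stops at the first conflicting entry, so the check is a single short-circuiting pass over the active task sets.

```scala
// Hypothetical stand-in for a task set manager; it carries only the
// fields that the conflict check in the diff reads.
final case class TaskSetInfo(id: String, stageId: Int, isZombie: Boolean)

object ConflictCheck {
  /** True if a different, non-zombie task set is already active for the same stage. */
  def hasConflict(active: Map[String, TaskSetInfo], added: TaskSetInfo): Boolean =
    active.exists { case (id, ts) =>
      // Same predicate as in the diff: skip the entry with the same id,
      // ignore zombies, and match on the stage id.
      id != added.id && !ts.isZombie && ts.stageId == added.stageId
    }

  /** Registers the task set, failing fast instead of letting a conflict go unnoticed. */
  def register(active: Map[String, TaskSetInfo], added: TaskSetInfo): Map[String, TaskSetInfo] = {
    if (hasConflict(active, added)) {
      // Assumption: a plain IllegalStateException replaces Spark's own exception type.
      throw new IllegalStateException(
        s"more than one active taskSet for stage ${added.stageId}")
    }
    active + (added.id -> added)
  }
}
```

With this sketch, registering a second live task set for a stage throws immediately, while a zombie task set for the same stage does not block the new one.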

