Github user markhamstra commented on the issue:

    https://github.com/apache/spark/pull/15326
  
    I've got some issues with this PR.
    
    - You've unnecessarily changed the prior behavior where the last added pool 
wins.  We should preserve that behavior in case anyone was relying upon it to 
handle their broken configuration.
    
    - Adding duplicate TaskSetManagers (or other Schedulable types in the 
future) would also be an error, so I'd rather do the checking and 
de-duplication in `addSchedulable` instead of in `buildFairSchedulerPool`.  
Like this in `Pool`:
    ```scala
      override def addSchedulable(schedulable: Schedulable) {
        require(schedulable != null)
        val name = schedulable.name
        if (null == schedulableNameToSchedulable.put(name, schedulable)) {
          schedulableQueue.add(schedulable)
        } else {
          logWarning(s"Duplicate Schedulable added: $name")
          // remove previously enqueued schedulable with same name
          schedulableQueue.remove(getSchedulableByName(name))
          if (!schedulableQueue.contains(schedulable)) {
            schedulableQueue.add(schedulable)
          }
        }
        schedulable.parent = this
      }
    ```
    The only routes to this code are from the one-time initialization of the 
backend on the creation of a SparkContext and from 
`schedulableBuilder.addTaskSetManager` in `TaskSchedulerImpl#submitTasks`, 
which is synchronized -- so the above should be fine in terms of thread safety.
    
    This is also the only route by which a `Schedulable` can be added to 
`schedulableQueue`, so with this code there should never be more than one prior 
instance of a queued `Schedulable` with the same name that needs to be removed 
before replacing it with the duplicate (which matches the 
`schedulableNameToSchedulable` entry).
    
    This should pass all the existing tests; but if you do something more 
severe than just logging the warning (such as adding a `System.exit` to that 
branch), you'll find that we are actually doing some abusive adding of 
duplicate schedulables within some tests.
    
    @kayousterhout 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to