Github user markhamstra commented on the issue:
https://github.com/apache/spark/pull/15326
I've got some issues with this PR.
- You've unnecessarily changed the prior behavior where the last added pool
wins. We should preserve that behavior in case anyone was relying upon it to
handle their broken configuration.
- Adding duplicate TaskSetManagers (or other Schedulable types in the
future) would also be an error, so I'd rather do the checking and
de-duplication in `addSchedulable` instead of in `buildFairSchedulerPool`.
Like this in `Pool`:
```scala
override def addSchedulable(schedulable: Schedulable) {
require(schedulable != null)
val name = schedulable.name
if (null == schedulableNameToSchedulable.put(name, schedulable)) {
schedulableQueue.add(schedulable)
} else {
logWarning(s"Duplicate Schedulable added: $name")
// remove previously enqueued schedulable with same name
schedulableQueue.remove(getSchedulableByName(name))
if (!schedulableQueue.contains(schedulable)) {
schedulableQueue.add(schedulable)
}
}
schedulable.parent = this
}
```
The only routes to this code are from the one-time initialization of the
backend on the creation of a SparkContext and from
`schedulableBuilder.addTaskSetManager` in `TaskSchedulerImpl#submitTasks`,
which is synchronized -- so the above should be fine in terms of thread safety.
This is also the only route by which a `Schedulable` can be added to
`schedulableQueue`, so with this code there should never be more than one prior
instance of a queued `Schedulable` with the same name that needs to be removed
before replacing it with the duplicate (which matches the
`schedulableNameToSchedulable` entry).
This should pass all the existing tests; but if you do something more
severe than just logging the warning (such as adding a `System.exit` to that
branch), you'll find that we are actually doing some abusive adding of
duplicate schedulables within some tests.
@kayousterhout
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]