Github user kayousterhout commented on the pull request:
https://github.com/apache/spark/pull/9154#issuecomment-149289335
To give a specific example, suppose task t1 has preferred locations on
executor e1 (on host h1), e2 (also on host h1) and e3 (on host h2).
The data structures will look like:
pendingTasksForExecutor: {e1: t1, e2: t1, e3: t1}
pendingTasksForHost: {h1: t1, h2: t1}
We've agreed that the "addPendingTask" call is irrelevant for tasks that
are currently running (because addPendingTask is called for any running tasks
in handleFailedTask), so let's say t1 hasn't been run yet.
Now suppose executor e2 dies. We never remove any entries from
pendingTasksForExecutor or pendingTasksForHost (not in addPendingTask, nor
anywhere else, as far as I can tell; we still won't schedule tasks on the dead
executor, because the TaskSetManager will never get a resource offer for it).
addPendingTask will then "re-add" entries for each of t1's preferred locations
(including the lost executor; we don't check whether the executor is alive
when updating the map entries). However, all of these locations were already
added above, so the call has no effect.
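To make the scenario concrete, here is a toy model in plain Python (hypothetical names; this is a sketch of the bookkeeping described above, not the actual Scala code in TaskSetManager):

```python
from collections import defaultdict

# Simulated pending-task maps, mirroring the fields discussed above.
pending_tasks_for_executor = defaultdict(list)
pending_tasks_for_host = defaultdict(list)

def add_pending_task(task, locations):
    """Add `task` under each preferred location.

    `locations` is a list of (executor, host) pairs. As argued above,
    we never check whether an executor is still alive, and re-adding
    an already-present task is a no-op.
    """
    for executor, host in locations:
        if task not in pending_tasks_for_executor[executor]:
            pending_tasks_for_executor[executor].append(task)
        if task not in pending_tasks_for_host[host]:
            pending_tasks_for_host[host].append(task)

# t1 prefers e1 (on h1), e2 (on h1), and e3 (on h2).
locations = [("e1", "h1"), ("e2", "h1"), ("e3", "h2")]
add_pending_task("t1", locations)

# e2 dies: nothing removes its entries from either map.
# The subsequent "re-add" leaves both maps exactly as they were,
# stale e2 entry included.
add_pending_task("t1", locations)
```

Running the model, both maps are identical before and after the re-add, and e2 still appears as a key even though it is gone.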
Which part of this reasoning do you think is incorrect?