Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/892#issuecomment-45378774
@pwendell pendingTasksWithNoPrefs handles case of executor failures, tasks
which have no preferences, etc. Or maybe I misunderstood the question ?
As I mentioned before, the motivation for the PR is orthogonal to the fix
proposed.
If nodes are not available for locality for the initial (few) stages the
better approach would be to wait for adequate nodes to appear.
The more general problem of how to handle this when churn rate of executors
is high is different imo and probably what seems to be attempted to be
addressed in the PR (minus the wait time, etc).
To tackle that, we would need to redefine what noPrefs does (which is what
@pwendell is alluding to I think ?) and probably do something along lines of
what Matei suggested : add to rack/host list even if the host/rack is not
available right now. Only problem with that is the number of lists we end up
maintaining can be high : though not too high given 4k or so current upper
bound to hadoop cluster.
Ofcourse, this would be per active stage - which gives us some notion of
how expensive it could be in worst case.
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---