[
https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551677#comment-14551677
]
Sandy Ryza commented on SPARK-4352:
-----------------------------------
Thanks for posting this, Saisai. Can you export and attach it as a PDF so that
we've got an immutable copy for posterity?
This mostly looks good to me. Regarding a couple of your open questions:
1. I think it would be best to modify requestTotalExecutors (which is private)
and avoid any public API additions for now.
2. Task locality preferences are computed already, right? Can we put any
computation beyond that in ExecutorAllocationManager so that it only happens
when dynamic allocation is turned on?
I think another big question is how we translate task locality preferences into
requests to YARN. This impacts what the preferredNodeLocalityData should look
like. At any moment, we have a set of pending tasks, each with a set of node
preferences and rack preferences, as well as a number of desired executors.
How do we account for the fact that some nodes have more pending tasks than
others? What happens when we're working with executors with 5 cores each, but
none of the pending tasks share preferred nodes? What happens when the number
of pending tasks exceeds the total capacity of the desired number of executors?
One approach would be to request executors at every location where we have
pending tasks, and then return executors once we've reached the number that we
need. Another would be to condense our preferences down into an optimal number
of executor requests.
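To make the second ("condense down") approach concrete, here is a minimal,
hypothetical sketch of one way it could work, greedily grouping pending tasks
by preferred host and dividing by executor slots. All names
(condense_locality_requests, pending_task_prefs, etc.) are illustrative
assumptions, not the eventual SPARK-4352 implementation:

```python
from collections import Counter


def condense_locality_requests(pending_task_prefs, cores_per_executor,
                               max_executors):
    """Condense per-task node preferences into a list of hosts to request
    executors on. Assumes each executor runs cores_per_executor tasks
    concurrently; each task may list several preferred hosts."""
    # Count how many pending tasks prefer each host.
    host_counts = Counter(h for prefs in pending_task_prefs for h in prefs)
    requests = []
    # Serve the busiest hosts first, rounding up to whole executors.
    for host, count in host_counts.most_common():
        executors_needed = -(-count // cores_per_executor)  # ceil division
        for _ in range(executors_needed):
            if len(requests) >= max_executors:
                return requests
            requests.append(host)
    return requests
```

Under this sketch, three tasks preferring ["node1"], ["node1"], ["node2"]
with 2-core executors would condense to one request on node1 and one on
node2, capped at max_executors; it does not yet address rack preferences or
tasks whose preferred hosts overlap only partially.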
> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
> Key: SPARK-4352
> URL: https://issues.apache.org/jira/browse/SPARK-4352
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, YARN
> Affects Versions: 1.2.0
> Reporter: Sandy Ryza
> Priority: Critical
>
> Currently, achieving data locality in Spark is difficult unless an
> application takes resources on every node in the cluster.
> preferredNodeLocalityData provides a sort of hacky workaround that has been
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to
> demand from the application. When this occurs, it would be useful to look at
> the pending tasks and communicate their location preferences to the cluster
> resource manager.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]