[
https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567755#comment-14567755
]
Sandy Ryza commented on SPARK-4352:
-----------------------------------
[~jerryshao] I wouldn't say that the goal is necessarily to get as close as
possible to the ratio of requests (3 : 3 : 2 : 1 in the example). My idea was
to get as close as possible to: sum(cores across all executor requests that
include that node in their preferred list) = number of tasks that prefer that
node.
Why? Consider the situation where we're requesting 18 executors, and say we
request 6 executors with a preference for <a, b, c, d> as you suggested. YARN
would be perfectly happy giving us all 6 executors on node d. But only 10 tasks
need to run on node d (with 2-core executors, that's 5 executors' worth). So
we'd really prefer that the 6th executor be scheduled on a, b, or c, because
placing it on d confers no additional advantage.
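To make the arithmetic concrete, here's a minimal Python sketch of that cap. The task counts are hypothetical, chosen to match the 3 : 3 : 2 : 1 ratio and the "10 tasks on node d" figure from the example; this is an illustration of the idea, not Spark's actual allocation code.

```python
from math import ceil

# Illustrative numbers: 2-core executors, and per-node counts of pending
# tasks that prefer each node (chosen to match the 3 : 3 : 2 : 1 example,
# with 10 tasks preferring node d).
cores_per_executor = 2
tasks_preferring = {"a": 30, "b": 30, "c": 20, "d": 10}

# Executors worth requesting with a node on their preferred list are capped
# by the tasks that actually prefer that node: ceil(tasks / cores).
executors_needed = {
    node: ceil(tasks / cores_per_executor)
    for node, tasks in tasks_preferring.items()
}

print(executors_needed)  # node d needs only 5 executors, so a 6th is wasted
```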
For the situation where we're requesting 7 executors, I have less of an
argument for why my 5 : 2 is better than your 2 : 2 : 3. Thinking about it more now, it
seems like your approach could be closer to optimal because getting executors
on a or b means more of our tasks get to run on local data. So I would
certainly be open to something that tries to preserve the ratio when the number
of executors we're allowed to request is under the maximum number of tasks
targeted for any particular node.
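One way to preserve the ratio when the allowed request count is below the per-node need is proportional scaling with largest-remainder rounding. The sketch below is an assumption about how such a scheme could look, not the approach from the attached design doc; `executor_need` carries the per-node executor counts from the example above.

```python
def proportional_requests(executor_need, total):
    """Scale per-node executor needs down to `total` requests, preserving
    the ratio via largest-remainder rounding (an illustrative sketch)."""
    total_need = sum(executor_need.values())
    # Fractional fair share for each node.
    shares = {n: total * v / total_need for n, v in executor_need.items()}
    alloc = {n: int(s) for n, s in shares.items()}
    # Hand out the leftover executors to the largest fractional remainders.
    leftover = total - sum(alloc.values())
    by_remainder = sorted(shares, key=lambda n: shares[n] - alloc[n], reverse=True)
    for n in by_remainder[:leftover]:
        alloc[n] += 1
    return alloc

# Per-node executor needs from the 18-executor example (15 : 15 : 10 : 5).
need = {"a": 15, "b": 15, "c": 10, "d": 5}
print(proportional_requests(need, 18))  # full request keeps the exact ratio
print(proportional_requests(need, 7))   # scaled-down request approximates it
```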
> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
> Key: SPARK-4352
> URL: https://issues.apache.org/jira/browse/SPARK-4352
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, YARN
> Affects Versions: 1.2.0
> Reporter: Sandy Ryza
> Assignee: Saisai Shao
> Priority: Critical
> Attachments: Supportpreferrednodelocationindynamicallocation.pdf
>
>
> Currently, achieving data locality in Spark is difficult unless an
> application takes resources on every node in the cluster.
> preferredNodeLocalityData provides a sort of hacky workaround that has been
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to
> demand from the application. When this occurs, it would be useful to look at
> the pending tasks and communicate their location preferences to the cluster
> resource manager.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)