[jira] [Comment Edited] (SPARK-4352) Incorporate locality preferences in dynamic allocation requests

Saisai Shao (JIRA) Mon, 01 Jun 2015 23:03:08 -0700

    [ https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568410#comment-14568410 ]


Saisai Shao edited comment on SPARK-4352 at 6/2/15 6:01 AM:
------------------------------------------------------------

Hi [~sandyr], I am starting to understand your algorithm: it tries to 
fulfill the request with the smallest task count first. In your example with 7 
executors, you first assign 5 on <a, b, c, d> to fulfill the request of node d, 
and leave 2 for <a, b, c> because the requested number of executors is not 
enough. What is the purpose of fulfilling the smallest request first? Since 
node d accounts for only 1/9 of the requests, why not satisfy a and b first? 
They will run more tasks and so need locality more.

Also, for the 18-executor case, your solution of:

requests for 5 executors with nodes = <a, b, c, d>
requests for 5 executors with nodes = <a, b, c>
requests for 5 executors with nodes = <a, b>
requests for 3 executors with no locality preferences

could instead be:

requests for 6 executors with nodes = <a, b, c, d>
requests for 6 executors with nodes = <a, b, c>
requests for 6 executors with nodes = <a, b>
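
To make the two splits concrete, here is a minimal sketch that reproduces them. It assumes, based only on the numbers quoted in this thread, that each nested node set is capped at 5 executors. The function names and the `cap` parameter are hypothetical and are not taken from Spark or from the design document.

```python
from typing import List, Optional, Tuple

# (executor count, preferred nodes); None means no locality preference.
Request = Tuple[int, Optional[Tuple[str, ...]]]


def fill_in_order(total: int, tiers: List[Tuple[str, ...]], cap: int) -> List[Request]:
    """Fill each node set in order up to `cap`; leftover executors get no locality."""
    requests: List[Request] = []
    remaining = total
    for nodes in tiers:
        if remaining == 0:
            break
        count = min(cap, remaining)
        requests.append((count, nodes))
        remaining -= count
    if remaining > 0:
        requests.append((remaining, None))
    return requests


def spread_leftover(total: int, tiers: List[Tuple[str, ...]], cap: int) -> List[Request]:
    """Same as fill_in_order, but hand leftover executors back to the node sets round-robin."""
    requests = fill_in_order(total, tiers, cap)
    # Nothing to spread if there is no trailing no-locality request,
    # or if that request is the only one.
    if len(requests) < 2 or requests[-1][1] is not None:
        return requests
    leftover = requests.pop()[0]
    counts = [count for count, _ in requests]
    for i in range(leftover):
        counts[i % len(counts)] += 1
    return [(count, nodes) for count, (_, nodes) in zip(counts, requests)]


tiers = [("a", "b", "c", "d"), ("a", "b", "c"), ("a", "b")]
print(fill_in_order(7, tiers, cap=5))     # the 7-executor example
print(fill_in_order(18, tiers, cap=5))    # 5 / 5 / 5 plus 3 without locality
print(spread_leftover(18, tiers, cap=5))  # 6 / 6 / 6
```

The only difference between the two functions is what happens to the executors left over once every node set has reached its cap.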

Besides, I think the algorithm is limited to the case where {{task number <= 
executor number * cores}}. If we have 1000 tasks but only 30 cores in total, 
how would they be distributed under the current algorithm?
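
As an illustration of that case, one possible approach is to scale each node's task count down to the available cores by ratio. The sketch below is only that, a sketch: the per-node task counts are invented (the thread gives only the totals), and `scale_by_ratio` is a hypothetical helper, not part of Spark or of the proposed algorithm.

```python
from typing import Dict


def scale_by_ratio(tasks_per_node: Dict[str, int], total_cores: int) -> Dict[str, int]:
    """Shrink per-node task counts proportionally so they sum to total_cores."""
    total_tasks = sum(tasks_per_node.values())
    if total_tasks <= total_cores:
        # Everything fits, so no scaling is needed.
        return dict(tasks_per_node)
    # Integer share for each node, rounded down.
    scaled = {n: t * total_cores // total_tasks for n, t in tasks_per_node.items()}
    # Give the cores lost to rounding to the nodes with the largest remainders.
    leftover = total_cores - sum(scaled.values())
    by_remainder = sorted(
        tasks_per_node,
        key=lambda n: (tasks_per_node[n] * total_cores) % total_tasks,
        reverse=True,
    )
    for node in by_remainder[:leftover]:
        scaled[node] += 1
    return scaled


# Made-up distribution of 1000 tasks over four nodes, with 30 cores in total.
demand = {"a": 520, "b": 300, "c": 140, "d": 40}
print(scale_by_ratio(demand, total_cores=30))
```

The scaled counts keep the relative weight of each node while never asking for more slots than the cluster can provide.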



was (Author: jerryshao):
Hi [~sandyr], I am starting to understand your algorithm: it tries to 
fulfill the request with the smallest task count first. In your example with 7 
executors, you first assign 5 on <a, b, c, d> to fulfill the request of node d, 
and leave 2 for <a, b, c> because the requested number of executors is not 
enough. What is the purpose of fulfilling the smallest request first? Since 
node d accounts for only 1/9 of the requests, why not satisfy a and b first? 
They will run more tasks and so need locality more.

Also, in several cases the task number is larger than the total cores 
available, say 1000 tasks with only 15 executor requests. How does your 
algorithm handle this without taking ratios into account?

> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
>                 Key: SPARK-4352
>                 URL: https://issues.apache.org/jira/browse/SPARK-4352
>             Project: Spark
>          Issue Type: Improvement
>          Components: Spark Core, YARN
>    Affects Versions: 1.2.0
>            Reporter: Sandy Ryza
>            Assignee: Saisai Shao
>            Priority: Critical
>         Attachments: Supportpreferrednodelocationindynamicallocation.pdf
>
>
> Currently, achieving data locality in Spark is difficult unless an 
> application takes resources on every node in the cluster.  
> preferredNodeLocalityData provides a sort of hacky workaround that has been 
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to 
> demand from the application.  When this occurs, it would be useful to look at 
> the pending tasks and communicate their location preferences to the cluster 
> resource manager. 


