[
https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14551713#comment-14551713
]
Saisai Shao commented on SPARK-4352:
------------------------------------
Hi Sandy, thanks a lot for your comments. I will post a PDF version as an
attachment.
Currently, task locality preferences are computed when the SparkListener posts
the {{SparkListenerStageSubmitted}} event. Let me think of a way, though it may
be a little tricky.
Do we need to consider the distribution details of the pending tasks, such as an
imbalance in task counts between nodes, or pending tasks exceeding the capacity
of the executors? We just get a list of preferred node locations, computed
from all pending tasks, and try to request the target containers using this
hint. Do we need to care which container will get more tasks than others? Also,
if the current number of containers already meets the target we wanted, but the
preferred node locations are not matched, do we still need to kill the old
containers and acquire new ones?
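
To make the question concrete, here is a rough Python sketch of the kind of hint described above. It is not Spark or YARN code: the function names, the input shape (one list of preferred hosts per pending task), and the ranking rule are all assumptions made only for illustration.

```python
from collections import Counter


def preferred_host_counts(pending_tasks):
    """Count how many pending tasks prefer each host.

    pending_tasks: iterable of lists of host names, one list per task
    (hypothetical input shape, not a Spark data structure).
    """
    counts = Counter()
    for hosts in pending_tasks:
        # A task may list several hosts (e.g. replicated input);
        # count each host at most once per task.
        counts.update(set(hosts))
    return counts


def locality_hint(pending_tasks, num_containers):
    """Return up to num_containers hosts, most-preferred first.

    Ties are broken by host name so the result is deterministic.
    """
    counts = preferred_host_counts(pending_tasks)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [host for host, _ in ranked[:num_containers]]


# Example: four pending tasks, two containers to request.
pending = [["node1", "node2"], ["node1"], ["node3", "node1"], ["node2"]]
hint = locality_hint(pending, 2)
```

Note that a flat list like `hint` discards the per-node task counts, which is exactly the distribution detail asked about above; keeping the `Counter` around would be one way to preserve it if it turns out to matter.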
> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
> Key: SPARK-4352
> URL: https://issues.apache.org/jira/browse/SPARK-4352
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, YARN
> Affects Versions: 1.2.0
> Reporter: Sandy Ryza
> Priority: Critical
>
> Currently, achieving data locality in Spark is difficult unless an
> application takes resources on every node in the cluster.
> preferredNodeLocalityData provides a sort of hacky workaround that has been
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to
> demand from the application. When this occurs, it would be useful to look at
> the pending tasks and communicate their location preferences to the cluster
> resource manager.