[
https://issues.apache.org/jira/browse/SPARK-4352?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14558018#comment-14558018
]
Saisai Shao commented on SPARK-4352:
------------------------------------
Hi [~sandyr], my current approach to launching executors is to cover the
preferred locations of pending tasks as broadly as possible, which differs
slightly from the old code.
The locality preferences are maintained in the ApplicationMaster. Each
preferred host is initialized to *false*, meaning no executor currently runs on
that node; once an executor is launched there, the host is set to *true*. This
flag hints the ContainerRequest logic to avoid assigning another container to
the same node and to place containers on still-uncovered hosts where possible.
Newly submitted tasks' node-locality preferences are also sent to the
ApplicationMaster at runtime, which updates and merges them into the existing
map (a node that already has executors is not reset to *false* just to request
a new container). When a container is killed or completes, its locality entry
is removed, meaning that locality is no longer needed.
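To make the bookkeeping above concrete, here is a minimal sketch in Python (a model of the idea, not Spark's actual ApplicationMaster code; all class and method names are hypothetical):

```python
class LocalityPreferenceTracker:
    """Models the per-host locality map described above.

    False = locality wanted by pending tasks but not yet covered;
    True  = an executor is already running on that host.
    """

    def __init__(self):
        self.covered = {}  # host -> bool

    def merge_task_localities(self, hosts):
        # Newly submitted tasks' preferred hosts start as uncovered,
        # but a host that already has an executor stays *true* --
        # it is never reset just to request another container there.
        for host in hosts:
            if host not in self.covered:
                self.covered[host] = False

    def on_executor_launched(self, host):
        # An executor now covers this locality.
        self.covered[host] = True

    def on_container_finished(self, host):
        # Container killed or completed: this locality is no longer needed.
        self.covered.pop(host, None)

    def hosts_for_new_request(self):
        # Prefer hosts whose locality is not yet covered, so new container
        # requests widen coverage instead of stacking on one node.
        return [h for h, has_exec in self.covered.items() if not has_exec]
```

For example, if pending tasks prefer hosts a and b and an executor is then launched on a, a later request would target only the uncovered hosts (b, plus any newly merged preferences).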
In short, the algorithm tries to cover the preferred node localities as widely
as possible, and for dynamically requested containers it avoids allocating on
a node that already has one. It has some pros and cons:
Pros:
1. In the default YARN mode (dynamic allocation disabled, executor number
specified, no preferred node locations), this algorithm has no impact on the
old logic; the behavior is exactly the same.
2. With dynamic allocation enabled and no preferred node locations, it also
keeps the same semantics as the previous code.
3. With dynamic allocation enabled and preferred node locations present, it
tries to cover the range of localities as much as possible.
Cons:
1. The algorithm does not consider the distribution of task counts: if some
nodes have more tasks than others, it will not assign more containers to those
nodes (preferred-locality coverage is the first priority).
2. At runtime with dynamic allocation, it will not kill an existing container
in order to reassign it for better locality, so in the extreme case locality
may not be matched at all once the current number of containers is satisfied.
So what is your suggestion?
> Incorporate locality preferences in dynamic allocation requests
> ---------------------------------------------------------------
>
> Key: SPARK-4352
> URL: https://issues.apache.org/jira/browse/SPARK-4352
> Project: Spark
> Issue Type: Improvement
> Components: Spark Core, YARN
> Affects Versions: 1.2.0
> Reporter: Sandy Ryza
> Priority: Critical
>
> Currently, achieving data locality in Spark is difficult unless an
> application takes resources on every node in the cluster.
> preferredNodeLocalityData provides a sort of hacky workaround that has been
> broken since 1.0.
> With dynamic executor allocation, Spark requests executors in response to
> demand from the application. When this occurs, it would be useful to look at
> the pending tasks and communicate their location preferences to the cluster
> resource manager.