GitHub user sryza opened a pull request:
https://github.com/apache/spark/pull/655
SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnAllocationHandler
This patch does a few things:
* Uses AMRMClient APIs for matching containers to requests.
* Calls AMRMClient.removeContainerRequest so that, when we use a container,
we don't end up requesting it again.
* Removes YarnAllocationHandler's host->rack cache. YARN's RackResolver
already does this caching, so this is redundant.
* Adds tests for basic YarnAllocationHandler functionality.
* Breaks up allocateResources.
* A little bit of stylistic cleanup.
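To illustrate the first two points, here is a minimal sketch of the matching and removal bookkeeping. It is a toy model in Python, not the Spark or YARN code: RequestTable, match_container, and rack_of are hypothetical names, and the host, then rack, then "*" lookup order is an assumption about how a locality-aware match would proceed.

```python
from collections import defaultdict


class RequestTable:
    """Toy stand-in for the outstanding-request bookkeeping a client library keeps."""

    def __init__(self):
        # location (host, rack, or "*") -> ids of requests filed under it
        self._by_location = defaultdict(list)
        # request id -> the locations it was filed under
        self._locations = {}

    def add_request(self, req_id, host, rack):
        locations = (host, rack, "*")
        self._locations[req_id] = locations
        for loc in locations:
            self._by_location[loc].append(req_id)

    def matching_requests(self, location):
        return list(self._by_location[location])

    def remove_request(self, req_id):
        # Remove the request from every location it was filed under.
        for loc in self._locations.pop(req_id):
            self._by_location[loc].remove(req_id)


def match_container(table, container_host, rack_of):
    """Match an allocated container to a request: host first, then rack, then any."""
    for location in (container_host, rack_of(container_host), "*"):
        candidates = table.matching_requests(location)
        if candidates:
            req_id = candidates[0]
            # Drop the satisfied request so it is not asked for again.
            table.remove_request(req_id)
            return req_id
    return None
```

The point of the removal step is that a request that stays in the table after its container is used would be sent to the RM again on the next allocate call.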
The patch is lossy. In particular, it loses the logic for trying to avoid
containers bunching up on nodes. As I understand it, the logic that's gone is
two-part:
* If, in a single response from the RM, we receive a set of containers on a
node, and prefer some number of containers on that node greater than 0 but less
than the number we received, give back the delta between what we preferred and
what we received.
This seems like a weird way to avoid bunching. E.g., it does nothing to
avoid bunching when we don't request containers on particular nodes. I think
we can come up with something better.
* If we receive more containers than the number of executors we desire,
make sure that the containers we use are distributed as evenly as possible
among the available nodes.
It's rare for YARN to allocate more containers than requested, and when it
does, it's only a small number. In fact, with the current code, because we
call allocate() and handle allocations in the same thread, it should never
happen. Having a bunch of logic to deal with this seems unnecessary.
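For reference, the two rules described above can be modelled roughly as follows. This is a sketch of the behavior as I describe it here, written in Python rather than taken from the removed code; split_by_preference and pick_evenly are hypothetical names, and the round-robin order is an assumption about what "as evenly as possible" means.

```python
from collections import Counter
from itertools import zip_longest


def split_by_preference(allocated_hosts, preferred_counts):
    """Rule 1: on a host where 0 < preferred < received, keep `preferred`
    containers and give back the rest; otherwise keep everything."""
    received = Counter(allocated_hosts)
    keep, release = Counter(), Counter()
    for host, got in received.items():
        wanted = preferred_counts.get(host, 0)
        if 0 < wanted < got:
            keep[host] = wanted
            release[host] = got - wanted
        else:
            keep[host] = got
    return keep, release


def pick_evenly(containers_by_host, needed):
    """Rule 2: when more containers arrive than needed, take them
    round-robin across hosts so that no single host fills up first."""
    picked = []
    # Each round holds at most one container per host (None once a host runs out).
    for one_round in zip_longest(*containers_by_host.values()):
        for container in one_round:
            if container is not None and len(picked) < needed:
                picked.append(container)
    return picked
```

Note that split_by_preference releases nothing when preferred_counts is empty, which is the gap pointed out above: without node-level preferences the first rule does not prevent bunching at all.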
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sryza/spark sandy-spark-1714
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/655.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #655
----
commit 6bf2b4854cad7e4e6d9f42db7d61fc17a5a3cec9
Author: Sandy Ryza <[email protected]>
Date: 2014-04-25T17:00:35Z
SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in
YarnAllocationHandler
----