GitHub user sryza opened a pull request:

    https://github.com/apache/spark/pull/3765

    SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnAllocator
    
    The goal of this PR is to simplify YarnAllocator as much as possible and
    get it up to the level of code quality we see in the rest of Spark.
    
    In service of this, it does a few things:
    * Uses AMRMClient APIs for matching containers to requests.
    * Adds calls to AMRMClient.removeContainerRequest so that, when we use a
      container, we don't end up requesting it again.
    * Removes YarnAllocator's host->rack cache. YARN's RackResolver already
      does this caching, so the cache is redundant.
    * Adds tests for basic YarnAllocator functionality.
    * Breaks up the allocateResources method, which was previously nearly 300
      lines long.
    * A little bit of stylistic cleanup.
    * Fixes a bug that causes three times as many requests to be filed when
      preferred host locations are given.
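
The first two bullets can be illustrated with a small self-contained model. This is only a sketch, not the code in the patch: `PendingRequests`, `match_container`, and the host, then rack, then any-location lookup order are assumptions made for illustration, standing in for the request table that AMRMClient maintains and for its `removeContainerRequest` call.

```python
from dataclasses import dataclass, field

ANY = "*"  # stand-in for YARN's "any location" wildcard


@dataclass
class PendingRequests:
    """Hypothetical stand-in for the request table kept by AMRMClient."""
    by_location: dict = field(default_factory=dict)

    def add(self, request_id, locations):
        # Register one request under every location it would accept.
        for loc in locations:
            self.by_location.setdefault(loc, []).append(request_id)

    def matching(self, location):
        # Return a copy of the requests outstanding for this location.
        return list(self.by_location.get(location, []))

    def remove(self, request_id):
        # Drop the request everywhere so it is not asked for again.
        for ids in self.by_location.values():
            if request_id in ids:
                ids.remove(request_id)


def match_container(pending, host, rack):
    """Match an allocated container to a request: host, then rack, then any."""
    for location in (host, rack, ANY):
        candidates = pending.matching(location)
        if candidates:
            request_id = candidates[0]
            pending.remove(request_id)  # the step the second bullet adds
            return request_id
    return None  # no outstanding request wants this container


pending = PendingRequests()
pending.add("req-1", ["host1", "rack1", ANY])
pending.add("req-2", [ANY])

first = match_container(pending, "host1", "rack1")
second = match_container(pending, "host1", "rack1")
third = match_container(pending, "host1", "rack1")
```

In this model the first container satisfies the host-specific request, the second falls through to the any-location request, and the third matches nothing. Without the `remove` step, the first request would stay in the table and keep matching.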
    
    The patch is lossy. In particular, it loses the logic for trying to avoid
    containers bunching up on nodes. As I understand it, the logic that's gone is:
    
    * If, in a single response from the RM, we receive a set of containers on a
      node, and we prefer some number of containers on that node that is greater
      than 0 but less than the number we received, give back the difference
      between what we received and what we preferred.
    
    This seems like a weird way to avoid bunching. For example, it does nothing
    to avoid bunching when we don't request containers on particular nodes.
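
The removed rule, as described above, reduces to a small piece of arithmetic. The following is a minimal sketch of that description, not the original YarnAllocator code; the function name `containers_to_release` and its parameters are hypothetical.

```python
def containers_to_release(received, preferred):
    """Number of containers on one node to hand back to the RM.

    received:  containers on this node in a single RM response
    preferred: containers we wanted on this node
    """
    # The rule only fires when we asked for this node specifically
    # (preferred > 0) and got more than we asked for.
    if 0 < preferred < received:
        return received - preferred
    return 0


over_allocated = containers_to_release(received=5, preferred=2)
no_preference = containers_to_release(received=5, preferred=0)
under_allocated = containers_to_release(received=2, preferred=5)
```

The `preferred=0` case shows the gap noted above: when no containers were requested on a particular node, nothing is given back, however many containers land there.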


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sryza/spark sandy-spark-1714

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3765.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3765
    
----
commit 1becc3794b12000b2ad32c8cf4593652543641c6
Author: Sandy Ryza <[email protected]>
Date:   2014-12-22T05:34:39Z

    SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in 
YarnAllocator

----

