GitHub user sryza opened a pull request:

    https://github.com/apache/spark/pull/655

    SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnAllocationHandler
    
    This patch does a few things:
    * Uses AMRMClient APIs for matching containers to requests.
    * Calls AMRMClient.removeContainerRequest so that, when we use a container, we don't end up requesting it again.
    * Removes YarnAllocationHandler's host->rack cache.  YARN's RackResolver already does this caching, so the cache is redundant.
    * Adds tests for basic YarnAllocationHandler functionality.
    * Breaks up allocateResources.
    * A little bit of stylistic cleanup.
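
As a rough illustration of the first two bullets, here is a minimal sketch in plain Python. It is not the YARN API and not the code in this patch: `PendingRequests`, `handle_allocated`, and the `"*"` wildcard are hypothetical names, and the model is simplified so that each request is recorded under a single location. The idea it shows is that each allocated container is matched against the most specific outstanding request (node, then rack, then any), and the matched request is removed so it is not asked for again, which is the role AMRMClient.removeContainerRequest plays in the patch.

```python
from collections import defaultdict

ANY_HOST = "*"  # hypothetical wildcard meaning "no locality preference"


class PendingRequests:
    """Toy stand-in for a client-side request table (not the YARN API)."""

    def __init__(self):
        self._by_location = defaultdict(int)  # location -> outstanding count

    def add(self, location=ANY_HOST):
        self._by_location[location] += 1

    def remove(self, location):
        # Forget a satisfied request so it is not requested again.
        self._by_location[location] -= 1
        if self._by_location[location] == 0:
            del self._by_location[location]

    def outstanding(self):
        return sum(self._by_location.values())

    def match(self, host, rack):
        """Return the most specific location with a pending request, or None."""
        for location in (host, rack, ANY_HOST):
            if self._by_location.get(location, 0) > 0:
                return location
        return None


def handle_allocated(pending, containers):
    """Split (host, rack) containers into ones to use and ones to release."""
    to_use, to_release = [], []
    for host, rack in containers:
        location = pending.match(host, rack)
        if location is None:
            # Nothing outstanding matches this container: give it back.
            to_release.append((host, rack))
        else:
            pending.remove(location)
            to_use.append((host, rack))
    return to_use, to_release
```

With one request for `node1` and one for `/rack2`, two containers arriving on `node1` would result in one being used and one released, since the first match removes the only `node1` request.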
    
    
    The patch is lossy.  In particular, it drops the logic for trying to avoid containers bunching up on nodes.  As I understand it, the logic that's gone has two parts:
    * If, in a single response from the RM, we receive a set of containers on a node, and we prefer some number of containers on that node that is greater than 0 but less than the number we received, give back the delta between what we preferred and what we received.
    
    This seems like a weird way to avoid bunching.  E.g. it does nothing to avoid bunching when we don't request containers on particular nodes.  I think we can come up with something better.
    * If we receive more containers than the number of executors we desire, make sure that the containers we use are distributed as evenly as possible among the available nodes.
    
    It's rare for YARN to allocate more containers than requested, and when it does, it's only a small number.  In fact, with the current code, because we call allocate() and handle allocations in the same thread, it should never happen.  Having a bunch of logic to deal with this seems unnecessary.
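
For reference, the two removed behaviors described above can be sketched roughly as follows. This is plain Python written only from the description in this message, not a port of the removed code: the function names and data shapes are hypothetical. `excess_to_release` models the first part (release the delta on a node where 0 < preferred < received), and `pick_evenly` models the second part by taking containers round-robin across hosts.

```python
from collections import Counter, defaultdict, deque


def excess_to_release(received_hosts, preferred_counts):
    """Per host, how many received containers the old rule would give back.

    received_hosts: one host name per container in a single RM response.
    preferred_counts: host -> number of containers we preferred there.
    """
    excess = {}
    for host, got in Counter(received_hosts).items():
        wanted = preferred_counts.get(host, 0)
        # Only fires when we asked for this host specifically and got more.
        if 0 < wanted < got:
            excess[host] = got - wanted
    return excess


def pick_evenly(containers, num_wanted):
    """Choose up to num_wanted (container_id, host) pairs, round-robin by host."""
    by_host = defaultdict(deque)
    for container_id, host in containers:
        by_host[host].append(container_id)
    hosts = deque(by_host)  # hosts in first-seen order
    chosen = []
    while hosts and len(chosen) < num_wanted:
        host = hosts.popleft()
        chosen.append((by_host[host].popleft(), host))
        if by_host[host]:
            hosts.append(host)  # host still has containers, revisit later
    return chosen
```

Note that in the first function a host with no stated preference never releases anything, however many containers land on it, which is the gap pointed out above.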

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sryza/spark sandy-spark-1714

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/655.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #655
    
----
commit 6bf2b4854cad7e4e6d9f42db7d61fc17a5a3cec9
Author: Sandy Ryza <[email protected]>
Date:   2014-04-25T17:00:35Z

    SPARK-1714. Take advantage of AMRMClient APIs to simplify logic in YarnAllocationHandler

----


