GitHub user vanzin opened a pull request:

    https://github.com/apache/spark/pull/9021

    [SPARK-10987] [yarn] Work around for netty rpc disconnection event.

    In YARN client mode, when the AM connects to the driver, it may be the case
    that the driver never needs to send a message back to the AM (i.e., no
    dynamic allocation or preemption). This triggers an issue in the netty rpc
    backend where no disconnection event is sent to endpoints, and the AM never
    exits after the driver shuts down.
    
    The real fix is too complicated, so this is a quick hack to unblock YARN
    client mode until we can work on the real fix. It forces the driver to
    send a message to the AM when the AM registers, thus establishing that
    connection and enabling the disconnection event when the driver goes
    away.
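
As a rough illustration of the workaround described above, here is a toy model in Python. It is not Spark code: `ToyDriver`, `ToyAM`, and the rule that a disconnection event is only delivered on links over which the peer has already sent a message are all assumptions made for the sketch, chosen to mirror the behaviour the description reports.

```python
class ToyDriver:
    """Hypothetical driver; not Spark's actual scheduler backend."""

    def __init__(self, reply_on_register):
        self.reply_on_register = reply_on_register
        self.registered = []

    def register(self, am):
        self.registered.append(am)
        if self.reply_on_register:
            # The workaround: always send one message back on registration.
            am.receive(self, "registered")

    def shutdown(self):
        for am in self.registered:
            am.peer_gone(self)


class ToyAM:
    """Hypothetical application master."""

    def __init__(self):
        self.established = set()
        self.exited = False

    def receive(self, sender, message):
        # Assumption of this toy model: a link only counts as established
        # once the peer has sent something over it.
        self.established.add(id(sender))

    def peer_gone(self, peer):
        # Assumption: a disconnection event is only delivered for
        # established links; otherwise the AM keeps waiting.
        if id(peer) in self.established:
            self.exited = True


def am_exits_after_driver_shutdown(reply_on_register):
    driver = ToyDriver(reply_on_register)
    am = ToyAM()
    driver.register(am)
    driver.shutdown()
    return am.exited
```

In this model, `am_exits_after_driver_shutdown(False)` leaves the AM running, while passing `True` lets it observe the driver going away.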
    
    Also, a minor side issue: when the executor is shutting down, it needs
    to send an "ack" back to the driver when using the netty rpc backend, but
    that "ack" wasn't being sent because the handler was shutting down the rpc
    env before returning. This change delays the shutdown slightly so that
    the ack can be sent back first.
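
The ordering problem with the ack can be sketched in Python as follows. This is a toy model under stated assumptions, not the code in the patch: `ToyRpcEnv`, `dispatch`, both handlers, the 0.2 second delay, and the use of `threading.Timer` are all hypothetical. The only thing taken from the description is that the reply goes out after the handler returns, so a handler that shuts the environment down first loses its own ack.

```python
import threading


class ToyRpcEnv:
    """Hypothetical stand-in for an RPC environment; not Spark's RpcEnv."""

    def __init__(self):
        self.running = True
        self.sent = []
        self.shutdown_timer = None

    def send(self, message):
        # Messages are dropped once the environment has been shut down.
        if not self.running:
            return False
        self.sent.append(message)
        return True

    def shutdown(self):
        self.running = False

    def shutdown_later(self, delay_seconds):
        # Defer the shutdown so a pending reply can still go out.
        self.shutdown_timer = threading.Timer(delay_seconds, self.shutdown)
        self.shutdown_timer.start()


def eager_stop_handler(env):
    env.shutdown()           # environment is gone before the reply is sent
    return "ack"


def deferred_stop_handler(env):
    env.shutdown_later(0.2)  # assumed delay, long enough for the reply
    return "ack"


def dispatch(env, handler):
    reply = handler(env)     # the handler runs first ...
    return env.send(reply)   # ... and its reply is sent only afterwards
```

With `eager_stop_handler` the ack is dropped; with `deferred_stop_handler` it is delivered and the environment still shuts down once the timer fires. A fixed delay is the simplest option but is inherently a race, which fits the description's framing of this as a quick fix.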

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/vanzin/spark SPARK-10987

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/9021.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #9021
    
----
commit 8b7620fcfa0a677865dde799ea933e0e7ee266ab
Author: Marcelo Vanzin <[email protected]>
Date:   2015-10-08T00:35:50Z

    [SPARK-10987] [yarn] Work around for netty rpc disconnection event.
    

----

