[GitHub] spark pull request: [SPARK-5697] Configurable registration retry i...

mccheah Mon, 09 Feb 2015 13:55:26 -0800

GitHub user mccheah opened a pull request:

    https://github.com/apache/spark/pull/4481


    [SPARK-5697] Configurable registration retry interval and max attempts

    Before, the Spark Driver would attempt to connect 3 times to the master,
    with 20 seconds between each attempt, and if the master did not respond,
    the driver would give up.
    
    In practice, however, users may have a long queue of jobs or a busy
    network that makes giving up this early unreasonable. Users should be
    able to let the driver wait longer for the master node to eventually
    process its registration. If they set the retry interval and the
    maximum number of attempts higher, they do not need to manually
    resubmit jobs as often, as long as they are willing to accept that a
    job may fail later if the master truly does crash or become unreachable.
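
The retry behavior described above can be sketched roughly as follows. This is
a minimal illustration only, not the code in this pull request or Spark's
actual implementation: the function name `register_with_retries`, its
parameters, and the `try_register` callback are all hypothetical. The defaults
mirror the old fixed values from the description (3 attempts, 20 seconds
apart), and the sketch assumes no wait is needed after the final attempt.

```python
import time
from typing import Callable


def register_with_retries(
    try_register: Callable[[], bool],
    max_attempts: int = 3,                 # hypothetical name; old fixed value was 3
    retry_interval_seconds: float = 20.0,  # hypothetical name; old fixed value was 20s
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Call try_register up to max_attempts times, pausing between attempts.

    Returns True as soon as one attempt succeeds, or False if every
    attempt fails (the caller then gives up on registration).
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if try_register():
            return True
        # Pause before the next attempt, but not after the final one.
        if attempt < max_attempts:
            sleep(retry_interval_seconds)
    return False
```

With both values exposed as parameters, a user with a busy master could call
something like `register_with_retries(connect, max_attempts=10,
retry_interval_seconds=60.0)`, where `connect` stands in for whatever performs
a single registration attempt, at the cost of taking longer to notice a master
that is truly down.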

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/mccheah/spark allow-different-register-timeouts

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4481.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4481
    
----
commit 4cb5bf8a7f9bd3a2f412819542490e3b91d6aeac
Author: mcheah <[email protected]>
Date:   2015-02-09T21:52:05Z

    [SPARK-5697] Configurable registration retry interval and max attempts
    
    Before, the Spark Driver would attempt to connect 3 times to the master,
    with 20 seconds between each attempt, and if the master did not respond,
    the driver would give up.
    
    In practice, however, users may have a long queue of jobs or a busy
    network that makes giving up this early unreasonable. Users should be
    able to let the driver wait longer for the master node to eventually
    process its registration. If they set the retry interval and the
    maximum number of attempts higher, they do not need to manually
    resubmit jobs as often, as long as they are willing to accept that a
    job may fail later if the master truly does crash or become unreachable.

----

