GitHub user mccheah opened a pull request:
https://github.com/apache/spark/pull/4481
[SPARK-5697] Configurable registration retry interval and max attempts
Previously, the Spark driver would attempt to connect to the master 3
times, with 20 seconds between attempts, and would give up if the master
did not respond.
In practice, however, users may have a long queue of jobs or a busy
network that makes giving up this early unreasonable. Users should be
able to choose to let the driver wait longer for the master node to
eventually process its registration. With a longer retry interval and a
higher maximum number of attempts, users do not need to manually
resubmit jobs as often, as long as they are willing to accept that a
job may fail later if the master truly does crash or become unreachable.
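
To illustrate the trade-off described above, here is a minimal sketch of
a retry loop with a configurable interval and attempt limit. It is not
the code in this pull request: the names register_with_retries,
try_register, max_attempts, interval_seconds and make_busy_master are
hypothetical and exist only for this example. The defaults of 3 attempts
and 20 seconds mirror the previous fixed behaviour described above.

```python
import time
from typing import Callable


def register_with_retries(
    try_register: Callable[[], bool],
    max_attempts: int = 3,
    interval_seconds: float = 20.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call try_register until it succeeds or the attempts run out.

    Returns the 1-based number of the attempt that succeeded and raises
    TimeoutError if every attempt fails. All names are hypothetical.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if try_register():
            return attempt
        # Wait between attempts, but not after the final failed one.
        if attempt < max_attempts:
            sleep(interval_seconds)
    raise TimeoutError(
        f"registration failed after {max_attempts} attempts"
    )


def make_busy_master(responds_on: int) -> Callable[[], bool]:
    """Simulate a master that only answers on the given attempt number."""
    calls = {"count": 0}

    def try_register() -> bool:
        calls["count"] += 1
        return calls["count"] >= responds_on

    return try_register
```

With the defaults, a simulated master that only answers on its fifth
attempt causes a TimeoutError. Raising max_attempts to 10 lets the same
registration succeed, at the cost of a longer wait before a genuinely
unreachable master is detected.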
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/mccheah/spark allow-different-register-timeouts
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/4481.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #4481
----
commit 4cb5bf8a7f9bd3a2f412819542490e3b91d6aeac
Author: mcheah <[email protected]>
Date: 2015-02-09T21:52:05Z
[SPARK-5697] Configurable registration retry interval and max attempts
----