Minor clarification and cleanup to spark-standalone.md

Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/66c20635
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/66c20635
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/66c20635

Branch: refs/heads/master
Commit: 66c20635fa1fe18604bb4042ce31152180cb541d
Parents: 42d8b8e
Author: Aaron Davidson <aa...@databricks.com>
Authored: Thu Oct 10 14:45:12 2013 -0700
Committer: Aaron Davidson <aa...@databricks.com>
Committed: Thu Oct 10 14:45:12 2013 -0700

----------------------------------------------------------------------
 docs/spark-standalone.md | 43 +++++++++++++++++++++++++++++++++----------
 1 file changed, 33 insertions(+), 10 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-spark/blob/66c20635/docs/spark-standalone.md
----------------------------------------------------------------------
diff --git a/docs/spark-standalone.md b/docs/spark-standalone.md
index daf04f1..17066ef 100644
--- a/docs/spark-standalone.md
+++ b/docs/spark-standalone.md
@@ -185,14 +185,27 @@ Utilizing ZooKeeper to provide leader election and some state storage, you can l
 
 Learn more about getting started with ZooKeeper [here](http://zookeeper.apache.org/doc/trunk/zookeeperStarted.html).
 
-**Possible gotcha:** If you have multiple Masters in your cluster but fail to correctly configure the Masters to use ZooKeeper, the Masters will fail to discover each other and think they're all leaders. This will not lead to a healthy cluster state (as all Masters will schedule independently).
-
 **Configuration**
 
-    # May be configured as SPARK_DAEMON_JAVA_OPTS in spark-env.sh
-    spark.deploy.recoveryMode=ZOOKEEPER
-    spark.deploy.zookeeper.url=ZK_URL1:ZK_PORT1,ZK_URL2:ZK_PORT2 # eg 192.168.1.100:2181,192.168.1.101:2181
-    spark.deploy.zookeeper.dir=/spark # OPTIONAL! /spark is the default.
+In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spark-env using this configuration:
+
+<table class="table">
+  <tr><th style="width:21%">System property</th><th>Meaning</th></tr>
+  <tr>
+    <td><code>spark.deploy.recoveryMode</code></td>
+    <td>Set to ZOOKEEPER to enable standby Master recovery mode (default: NONE).</td>
+  </tr>
+  <tr>
+    <td><code>spark.deploy.zookeeper.url</code></td>
+    <td>The ZooKeeper cluster url (e.g., 192.168.1.100:2181,192.168.1.101:2181).</td>
+  </tr>
+  <tr>
+    <td><code>spark.deploy.zookeeper.dir</code></td>
+    <td>The directory in ZooKeeper to store recovery state (default: /spark).</td>
+  </tr>
+</table>
+
+Possible gotcha: If you have multiple Masters in your cluster but fail to correctly configure the Masters to use ZooKeeper, the Masters will fail to discover each other and think they're all leaders. This will not lead to a healthy cluster state (as all Masters will schedule independently).
 
 **Details**
 
@@ -212,12 +225,22 @@ ZooKeeper is the best way to go for production-level high availability, but if y
 
 **Configuration**
 
-    # May be configured as SPARK_DAEMON_JAVA_OPTS in spark-env.sh
-    spark.deploy.recoveryMode=FILESYSTEM
-    spark.deploy.recoveryDirectory=PATH_ACCESSIBLE_TO_MASTER
+In order to enable this recovery mode, you can set SPARK_DAEMON_JAVA_OPTS in spark-env using this configuration:
+
+<table class="table">
+  <tr><th style="width:21%">System property</th><th>Meaning</th></tr>
+  <tr>
+    <td><code>spark.deploy.recoveryMode</code></td>
+    <td>Set to FILESYSTEM to enable single-node recovery mode (default: NONE).</td>
+  </tr>
+  <tr>
+    <td><code>spark.deploy.recoveryDirectory</code></td>
+    <td>The directory in which Spark will store recovery state, accessible from the Master's perspective.</td>
+  </tr>
+</table>
 
 **Details**
 
 * This solution can be used in tandem with a process monitor/manager like [monit](http://mmonit.com/monit/), or just to enable manual recovery via restart.
 * While filesystem recovery seems straightforwardly better than not doing any recovery at all, this mode may be suboptimal for certain development or experimental purposes. In particular, killing a master via stop-master.sh does not clean up its recovery state, so whenever you start a new Master, it will enter recovery mode. This could increase the startup time by up to 1 minute if it needs to wait for all previously-registered Workers/clients to timeout.
-* While it's not officially supported, you could mount an NFS directory as the recovery directory. If the original Master node dies completely, you could then start a Master on a different node, which would correctly recover all previously registered Workers/clients (equivalent to ZooKeeper recovery). Note, however, that you **cannot** have multiple Masters alive concurrently using this approach; you need to upgrade to ZooKeeper to provide leader election for that use-case.
+* While it's not officially supported, you could mount an NFS directory as the recovery directory. If the original Master node dies completely, you could then start a Master on a different node, which would correctly recover all previously registered Workers/applications (equivalent to ZooKeeper recovery). Future applications will have to be able to find the new Master, however, in order to register.
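
For reference, a minimal spark-env.sh sketch of the ZooKeeper recovery configuration added above, assuming the properties are passed as JVM -D flags through SPARK_DAEMON_JAVA_OPTS as the doc describes; the ZooKeeper endpoints shown are placeholder values:

    # spark-env.sh (sketch) -- enable standby-Master recovery via ZooKeeper.
    # 192.168.1.100:2181,192.168.1.101:2181 are placeholder ZooKeeper endpoints.
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=ZOOKEEPER \
      -Dspark.deploy.zookeeper.url=192.168.1.100:2181,192.168.1.101:2181 \
      -Dspark.deploy.zookeeper.dir=/spark"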
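
Similarly, a sketch of the single-node FILESYSTEM recovery configuration, where /var/spark/recovery is a placeholder for whatever directory the Master can actually reach:

    # spark-env.sh (sketch) -- enable single-node recovery backed by the filesystem.
    # /var/spark/recovery is a placeholder; any directory accessible to the Master works.
    export SPARK_DAEMON_JAVA_OPTS="-Dspark.deploy.recoveryMode=FILESYSTEM \
      -Dspark.deploy.recoveryDirectory=/var/spark/recovery"

In both cases the options are read when the daemon starts, so the Master must be restarted for the setting to take effect.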
