Updated Branches:
  refs/heads/master d6bdd5e42 -> a274b0944

Created High-Availability doc and merged it with Using-Zookeeper.


Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/a274b094
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/a274b094
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/a274b094

Branch: refs/heads/master
Commit: a274b0944a7cadcef689482d43faeb6341ccdeb7
Parents: d6bdd5e
Author: Benjamin Mahler <[email protected]>
Authored: Fri Jul 19 13:46:59 2013 -0700
Committer: Benjamin Mahler <[email protected]>
Committed: Fri Jul 19 17:05:24 2013 -0700

----------------------------------------------------------------------
 docs/High-Availability.md    | 49 +++++++++++++++++++++++++++++++++++++++
 docs/Master-Detection.md     | 35 ----------------------------
 docs/Using-ZooKeeper.textile |  5 ----
 3 files changed, 49 insertions(+), 40 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/mesos/blob/a274b094/docs/High-Availability.md
----------------------------------------------------------------------
diff --git a/docs/High-Availability.md b/docs/High-Availability.md
new file mode 100644
index 0000000..4235817
--- /dev/null
+++ b/docs/High-Availability.md
@@ -0,0 +1,49 @@
+High Availability
+=========
+Mesos can run in high-availability mode, in which multiple Mesos masters run 
simultaneously. In this mode there is only one active master, and the others 
masters act as stand-by replacements. These non-active masters will take over 
if the active master fails.
+
+Mesos uses Apache ZooKeeper in to co-ordinate the leader election of masters.
+
+Running mesos in this mode requires ZooKeeper. Mesos is automatically built 
with an included ZooKeeper client library.
+
+Leader election and detection in Mesos is done via ZooKeeper. See: 
http://zookeeper.apache.org/doc/trunk/recipes.html#sc_leaderElection for more 
information how ZooKeeper leader election works.
+
+Use
+---------
+First, a ZooKeeper cluster must be running, and a znode should be created for 
exclusive use by Mesos:
+
+  - Ensure the ZooKeeper cluster is up and running.
+  - Create a znode for exclusive use by mesos. The znode path will need to be 
provided to all slaves / scheduler drivers.
+
+In order to spin up a Mesos cluster using multiple masters for fault-tolerance:
+
+  - Start the mesos-master binaries using the `--zk` flag, e.g. 
`--zk=zk://host1:port1/path,host2:port2/path,...`
+  - Start the mesos-slave binaries with 
`--master=zk://host1:port1/path,host2:port2/path,...`
+  - Start any framework schedulers using the same zk path, the SchedulerDriver 
must be constructed with this path.
+
+Refer to the Scheduler API for how to deal with leadership changes.
+
+Semantics
+---------
+The detector is implemented in the `src/detector` folder. In particular, we 
watch for several ZooKeeper session events:
+
+  - Connection
+  - Reconnection
+  - Session Expiration
+  - ZNode creation, deletion, updates
+
+We also explicitly timeout our sessions, when disconnected from ZooKeeper for 
an amount of time, see: `ZOOKEEPER_SESSION_TIMEOUT`. This is because the 
ZooKeeper client libraries only notify of session expiration upon reconnection. 
These timeouts are of particular interest in the case of network partitions.
+
+Network Partitions
+------------------
+When a network partition occurs, if a particular component is disconnected 
from ZooKeeper, the Master Detector of the partitioned component will induce a 
timeout event. This causes the component to be notified that there is no 
leading master.
+
+When slaves are master-less, they ignore incoming messages from masters to 
ensure that we don't act on a non-leading master's decision.
+
+When masters enter a leader-less state, they commit suicide.
+
+The following semantics are enforced:
+
+  - If a slave is partitioned from the master, it will fail health-checks. The 
master will mark the slave as deactivated and send its tasks to LOST.
+  - Deactivated slaves may not re-register with the master, and are instructed 
to shut down upon any further communication after deactivation.
+  - When a slave is partitioned from ZooKeeper, it will enter a master-less 
state. It will ignore any master messages until a new master is elected.

http://git-wip-us.apache.org/repos/asf/mesos/blob/a274b094/docs/Master-Detection.md
----------------------------------------------------------------------
diff --git a/docs/Master-Detection.md b/docs/Master-Detection.md
deleted file mode 100644
index a8b2268..0000000
--- a/docs/Master-Detection.md
+++ /dev/null
@@ -1,35 +0,0 @@
-Master Detection
-=========
-Leader election and detection in Mesos is done via ZooKeeper. See: 
http://zookeeper.apache.org/doc/trunk/recipes.html#sc_leaderElection for more 
information how ZooKeeper leader election works.
-
-Use
----
-In order to spin up a Mesos cluster using multiple masters for fault-tolerance:
-
-  - Start the mesos-master binaries using the `--zk` flag, e.g. 
`--zk=zk://host1:port1/path,host2:port2/path,...`
-  - Start the mesos-slave binaries with 
`--master=zk://host1:port1/path,host2:port2/path,...`
-
-Semantics
----------
-The detector is implemented in the `src/detector` folder. In particular, we 
watch for several ZooKeeper session events:
-
-  - Connection
-  - Reconnection
-  - Session Expiration
-  - ZNode creation, deletion, updates
-
-We also explicitly timeout our sessions, when disconnected from ZooKeeper for 
an amount of time, see: `ZOOKEEPER_SESSION_TIMEOUT`. This is because the 
ZooKeeper client libraries only notify of session expiration upon reconnection. 
These timeouts are of particular interest in the case of network partitions.
-
-Network Partitions
-------------------
-When a network partition occurs, if a particular component is disconnected 
from ZooKeeper, the Master Detector of the partitioned component will induce a 
timeout event. This causes the component to be notified that there is no 
leading master.
-
-When slaves are master-less, they ignore incoming messages from masters to 
ensure that we don't act on a non-leading master's decision.
-
-When masters enter a leader-less state, they commit suicide.
-
-The following semantics are enforced:
-
-  - If a slave is partitioned from the master, it will fail health-checks. The 
master will mark the slave as deactivated and send its tasks to LOST.
-  - Deactivated slaves may not re-register with the master, and are instructed 
to shut down upon any further communication after deactivation.
-  - When a slave is partitioned from ZooKeeper, it will enter a master-less 
state. It will ignore any master messages until a new master is elected.

http://git-wip-us.apache.org/repos/asf/mesos/blob/a274b094/docs/Using-ZooKeeper.textile
----------------------------------------------------------------------
diff --git a/docs/Using-ZooKeeper.textile b/docs/Using-ZooKeeper.textile
deleted file mode 100644
index 1703b0f..0000000
--- a/docs/Using-ZooKeeper.textile
+++ /dev/null
@@ -1,5 +0,0 @@
-Mesos can run in fault-tolerant mode, in which multiple Mesos masters run 
simultaneously, with one of them being the active master, and the others acting 
as stand-bys ready to take over if the active master fails. Mesos uses Apache 
ZooKeeper in to elect a new active master. 
-
-Fault-tolerant mode requires Mesos to be built with ZooKeeper. This can be 
done with the configure option @--with-included-zookeeper@, which will ensure 
that ZooKeeper (which resides in the @third_party@ directory) gets compiled. It 
is also possible to run Mesos with an external ZooKeeper by using the configure 
option @--with-zookeeper=DIR@, setting @DIR@ to the directory of the external 
ZooKeeper.
-
-To run Mesos in fault-tolerant mode, ZooKeeper has to be up and running. The 
script @third_party/zookeeper-*/bin/zkServer.sh@ can be used to launch 
ZooKeeper (see the ZooKeeper documentation for more information). Once 
ZooKeeper is running, the master daemon, slave daemon(s), and the framework 
schedulers have to be passed a URL to the running ZooKeeper instance. The URL 
is of the form @zoo://host1:port1,host2:port2/znode@, where the @host:port@ 
pairs are ZooKeeper servers and @znode@ is a path to a znode (ZooKeeper's 
equivalent of a directory) for use by Mesos. It is also possible to use the URL 
@zoofile://filename/znode@, in which case @filename@ should contain one 
@host:port@ pair per line. This URL replaces the Mesos master URL (i.e. 
@mesos://@) which is passed when Mesos is not running in fault-tolerant mode. 
Multiple Mesos masters can be executed this way. Mesos will ensure, through 
ZooKeeper, that only one of them is the active master at any given time.

Reply via email to