Updated Branches: refs/heads/master d6bdd5e42 -> a274b0944
Created High-Availability doc and merged it with Using-Zookeeper.

Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/a274b094
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/a274b094
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/a274b094

Branch: refs/heads/master
Commit: a274b0944a7cadcef689482d43faeb6341ccdeb7
Parents: d6bdd5e
Author: Benjamin Mahler <[email protected]>
Authored: Fri Jul 19 13:46:59 2013 -0700
Committer: Benjamin Mahler <[email protected]>
Committed: Fri Jul 19 17:05:24 2013 -0700

----------------------------------------------------------------------
 docs/High-Availability.md    | 49 +++++++++++++++++++++++++++++++++++++++
 docs/Master-Detection.md     | 35 ----------------------------
 docs/Using-ZooKeeper.textile |  5 ----
 3 files changed, 49 insertions(+), 40 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/mesos/blob/a274b094/docs/High-Availability.md
----------------------------------------------------------------------
diff --git a/docs/High-Availability.md b/docs/High-Availability.md
new file mode 100644
index 0000000..4235817
--- /dev/null
+++ b/docs/High-Availability.md
@@ -0,0 +1,49 @@
+High Availability
+=========
+Mesos can run in high-availability mode, in which multiple Mesos masters run simultaneously. In this mode there is only one active master, and the others masters act as stand-by replacements. These non-active masters will take over if the active master fails.
+
+Mesos uses Apache ZooKeeper in to co-ordinate the leader election of masters.
+
+Running mesos in this mode requires ZooKeeper. Mesos is automatically built with an included ZooKeeper client library.
+
+Leader election and detection in Mesos is done via ZooKeeper. See: http://zookeeper.apache.org/doc/trunk/recipes.html#sc_leaderElection for more information how ZooKeeper leader election works.
+
+Use
+---------
+First, a ZooKeeper cluster must be running, and a znode should be created for exclusive use by Mesos:
+
+ - Ensure the ZooKeeper cluster is up and running.
+ - Create a znode for exclusive use by mesos. The znode path will need to be provided to all slaves / scheduler drivers.
+
+In order to spin up a Mesos cluster using multiple masters for fault-tolerance:
+
+ - Start the mesos-master binaries using the `--zk` flag, e.g. `--zk=zk://host1:port1/path,host2:port2/path,...`
+ - Start the mesos-slave binaries with `--master=zk://host1:port1/path,host2:port2/path,...`
+ - Start any framework schedulers using the same zk path, the SchedulerDriver must be constructed with this path.
+
+Refer to the Scheduler API for how to deal with leadership changes.
+
+Semantics
+---------
+The detector is implemented in the `src/detector` folder. In particular, we watch for several ZooKeeper session events:
+
+ - Connection
+ - Reconnection
+ - Session Expiration
+ - ZNode creation, deletion, updates
+
+We also explicitly timeout our sessions, when disconnected from ZooKeeper for an amount of time, see: `ZOOKEEPER_SESSION_TIMEOUT`. This is because the ZooKeeper client libraries only notify of session expiration upon reconnection. These timeouts are of particular interest in the case of network partitions.
+
+Network Partitions
+------------------
+When a network partition occurs, if a particular component is disconnected from ZooKeeper, the Master Detector of the partitioned component will induce a timeout event. This causes the component to be notified that there is no leading master.
+
+When slaves are master-less, they ignore incoming messages from masters to ensure that we don't act on a non-leading master's decision.
+
+When masters enter a leader-less state, they commit suicide.
+
+The following semantics are enforced:
+
+ - If a slave is partitioned from the master, it will fail health-checks. The master will mark the slave as deactivated and send its tasks to LOST.
+ - Deactivated slaves may not re-register with the master, and are instructed to shut down upon any further communication after deactivation.
+ - When a slave is partitioned from ZooKeeper, it will enter a master-less state. It will ignore any master messages until a new master is elected.

http://git-wip-us.apache.org/repos/asf/mesos/blob/a274b094/docs/Master-Detection.md
----------------------------------------------------------------------
diff --git a/docs/Master-Detection.md b/docs/Master-Detection.md
deleted file mode 100644
index a8b2268..0000000
--- a/docs/Master-Detection.md
+++ /dev/null
@@ -1,35 +0,0 @@
-Master Detection
-=========
-Leader election and detection in Mesos is done via ZooKeeper. See: http://zookeeper.apache.org/doc/trunk/recipes.html#sc_leaderElection for more information how ZooKeeper leader election works.
-
-Use
----
-In order to spin up a Mesos cluster using multiple masters for fault-tolerance:
-
- - Start the mesos-master binaries using the `--zk` flag, e.g. `--zk=zk://host1:port1/path,host2:port2/path,...`
- - Start the mesos-slave binaries with `--master=zk://host1:port1/path,host2:port2/path,...`
-
-Semantics
----------
-The detector is implemented in the `src/detector` folder. In particular, we watch for several ZooKeeper session events:
-
- - Connection
- - Reconnection
- - Session Expiration
- - ZNode creation, deletion, updates
-
-We also explicitly timeout our sessions, when disconnected from ZooKeeper for an amount of time, see: `ZOOKEEPER_SESSION_TIMEOUT`. This is because the ZooKeeper client libraries only notify of session expiration upon reconnection. These timeouts are of particular interest in the case of network partitions.
-
-Network Partitions
-------------------
-When a network partition occurs, if a particular component is disconnected from ZooKeeper, the Master Detector of the partitioned component will induce a timeout event. This causes the component to be notified that there is no leading master.
-
-When slaves are master-less, they ignore incoming messages from masters to ensure that we don't act on a non-leading master's decision.
-
-When masters enter a leader-less state, they commit suicide.
-
-The following semantics are enforced:
-
- - If a slave is partitioned from the master, it will fail health-checks. The master will mark the slave as deactivated and send its tasks to LOST.
- - Deactivated slaves may not re-register with the master, and are instructed to shut down upon any further communication after deactivation.
- - When a slave is partitioned from ZooKeeper, it will enter a master-less state. It will ignore any master messages until a new master is elected.

http://git-wip-us.apache.org/repos/asf/mesos/blob/a274b094/docs/Using-ZooKeeper.textile
----------------------------------------------------------------------
diff --git a/docs/Using-ZooKeeper.textile b/docs/Using-ZooKeeper.textile
deleted file mode 100644
index 1703b0f..0000000
--- a/docs/Using-ZooKeeper.textile
+++ /dev/null
@@ -1,5 +0,0 @@
-Mesos can run in fault-tolerant mode, in which multiple Mesos masters run simultaneously, with one of them being the active master, and the others acting as stand-bys ready to take over if the active master fails. Mesos uses Apache ZooKeeper in to elect a new active master.
-
-Fault-tolerant mode requires Mesos to be built with ZooKeeper. This can be done with the configure option @--with-included-zookeeper@, which will ensure that ZooKeeper (which resides in the @third_party@ directory) gets compiled. It is also possible to run Mesos with an external ZooKeeper by using the configure option @--with-zookeeper=DIR@, setting @DIR@ to the directory of the external ZooKeeper.
-
-To run Mesos in fault-tolerant mode, ZooKeeper has to be up and running. The script @third_party/zookeeper-*/bin/zkServer.sh@ can be used to launch ZooKeeper (see the ZooKeeper documentation for more information). Once ZooKeeper is running, the master daemon, slave daemon(s), and the framework schedulers have to be passed a URL to the running ZooKeeper instance. The URL is of the form @zoo://host1:port1,host2:port2/znode@, where the @host:port@ pairs are ZooKeeper servers and @znode@ is a path to a znode (ZooKeeper's equivalent of a directory) for use by Mesos. It is also possible to use the URL @zoofile://filename/znode@, in which case @filename@ should contain one @host:port@ pair per line. This URL replaces the Mesos master URL (i.e. @mesos://@) which is passed when Mesos is not running in fault-tolerant mode. Multiple Mesos masters can be executed this way. Mesos will ensure, through ZooKeeper, that only one of them is the active master at any given time.
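
Illustrative note (not part of the commit): the session-timeout and master-less semantics described in the new High-Availability.md can be sketched roughly as below. This is a minimal Python sketch under stated assumptions, not the code in `src/detector`. The class name, method names, and the injectable clock are all hypothetical, and the 10-second value in the usage note is an arbitrary example rather than the actual value of `ZOOKEEPER_SESSION_TIMEOUT`. The sketch also assumes that a component with a known leader ignores messages from any other master.

```python
import time
from typing import Callable, Optional


class MasterDetectorSketch:
    """Hypothetical detector: tracks the leading master and times out
    a disconnected session locally instead of waiting for reconnection."""

    def __init__(self, session_timeout: float,
                 clock: Callable[[], float] = time.monotonic):
        self.session_timeout = session_timeout  # seconds; illustrative only
        self.clock = clock                      # injectable for testing
        self.leader: Optional[str] = None       # None means "no leading master"
        self.disconnected_at: Optional[float] = None
        self.timed_out = False

    def on_leader_changed(self, leader: Optional[str]) -> None:
        # Called when a znode event reveals a new leading master.
        self.leader = leader

    def on_disconnected(self) -> None:
        # Remember when the connection was first lost.
        if self.disconnected_at is None:
            self.disconnected_at = self.clock()

    def on_reconnected(self) -> None:
        # Connection is back; the leader stays unknown until re-detected.
        self.disconnected_at = None
        self.timed_out = False

    def check_timeout(self) -> bool:
        """Return True exactly once, when the local timeout fires."""
        if self.disconnected_at is None or self.timed_out:
            return False
        if self.clock() - self.disconnected_at >= self.session_timeout:
            self.timed_out = True
            self.leader = None  # component is now master-less
            return True
        return False

    def accepts(self, sender: str) -> bool:
        # Master-less components ignore all master messages; otherwise
        # only the current leader is listened to (an assumption here).
        return self.leader is not None and sender == self.leader
```

A caller would poll `check_timeout()` periodically while disconnected, for example with `MasterDetectorSketch(session_timeout=10.0)`. Once it returns True, `accepts()` rejects every master message until `on_leader_changed()` reports a newly elected master.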
