Repository: aurora
Updated Branches:
  refs/heads/master 277382633 -> 4577de4dd

Adding notes on changing the scheduler quorum size.

Bugs closed: AURORA-1484

Reviewed at https://reviews.apache.org/r/38200/

Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/4577de4d
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/4577de4d
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/4577de4d

Branch: refs/heads/master
Commit: 4577de4dd4b48b4519d120aace8b94215cd1299d
Parents: 2773826
Author: Jeffrey Schroeder <[email protected]>
Authored: Wed Sep 9 08:08:38 2015 -0700
Committer: Bill Farner <[email protected]>
Committed: Wed Sep 9 08:08:38 2015 -0700

----------------------------------------------------------------------
 docs/deploying-aurora-scheduler.md | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/aurora/blob/4577de4d/docs/deploying-aurora-scheduler.md
----------------------------------------------------------------------
diff --git a/docs/deploying-aurora-scheduler.md b/docs/deploying-aurora-scheduler.md
index 8a1e68e..73f7b19 100644
--- a/docs/deploying-aurora-scheduler.md
+++ b/docs/deploying-aurora-scheduler.md
@@ -31,6 +31,9 @@ machines. This guide helps you get the scheduler set up and troubleshoot some c
   - [Tasks are stuck in PENDING forever](#tasks-are-stuck-in-pending-forever)
     - [Symptoms](#symptoms-2)
     - [Solution](#solution-2)
+- [Changing Scheduler Quorum Size](#changing-scheduler-quorum-size)
+  - [Preparation](#preparation)
+  - [Adding New Schedulers](#adding-new-schedulers)
 
 ## Installing Aurora
 The Aurora scheduler is a standalone Java server. As part of the build process it creates a bundle
@@ -287,3 +290,19 @@ slaves are tagged with these two common failure domains to ensure that it can sa
 such that jobs are resilient to failure.
 
 See our [vagrant example](examples/vagrant/upstart/mesos-slave.conf) for details.
+
+## Changing Scheduler Quorum Size
+Special care needs to be taken when changing the size of the Aurora scheduler quorum.
+Since Aurora uses a Mesos replicated log, similar steps need to be followed as when
+[changing the mesos quorum size](http://mesos.apache.org/documentation/latest/operational-guide).
+
+### Preparation
+Increase [-native_log_quorum_size](storage-config.md#-native_log_quorum_size) on each
+existing scheduler and restart them. When updating from 3 to 5 schedulers, the quorum size
+would grow from 2 to 3.
+
+### Adding New Schedulers
+Start the new schedulers with `-native_log_quorum_size` set to the new value. Failing to
+first increase the quorum size on running schedulers can in some cases result in corruption
+or truncating of the replicated log used by Aurora. In that case, see the documentation on
+[recovering from backup](storage-config.md#recovering-from-a-scheduler-backup).
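
The quorum figures in the Preparation section (3 schedulers with a quorum of 2, 5 schedulers with a quorum of 3) are consistent with a strict-majority rule. A minimal Python sketch of that arithmetic, assuming the quorum is always a strict majority of the scheduler count; the function name `majority_quorum` is illustrative only and is not part of Aurora or Mesos:

```python
def majority_quorum(scheduler_count: int) -> int:
    """Return the smallest number of schedulers that forms a strict majority.

    Assumption: the value passed to -native_log_quorum_size should be a
    strict majority of the total number of schedulers.
    """
    if scheduler_count < 1:
        raise ValueError("scheduler_count must be at least 1")
    return scheduler_count // 2 + 1


# The two cluster sizes mentioned in the added documentation.
for count in (3, 5):
    print(f"{count} schedulers: quorum size {majority_quorum(count)}")
```

Under this assumption, the value computed for the target cluster size is the one to set on the existing schedulers before any new scheduler is started.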
