Added: aurora/site/source/documentation/0.22.0/getting-started/vagrant.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/getting-started/vagrant.md?rev=1871319&view=auto
==============================================================================
--- aurora/site/source/documentation/0.22.0/getting-started/vagrant.md (added)
+++ aurora/site/source/documentation/0.22.0/getting-started/vagrant.md Fri Dec 
13 05:37:33 2019
@@ -0,0 +1,154 @@
+A Local Cluster with Vagrant
+============================
+
+This document shows you how to configure a complete cluster using a virtual machine. This setup
+replicates a real cluster on your development machine as closely as possible. After you complete
+the steps outlined here, you will be ready to create and run your first Aurora job.
+
+The following sections describe these steps in detail:
+
+1. [Overview](#overview)
+1. [Install VirtualBox and Vagrant](#install-virtualbox-and-vagrant)
+1. [Clone the Aurora repository](#clone-the-aurora-repository)
+1. [Start the local cluster](#start-the-local-cluster)
+1. [Log onto the VM](#log-onto-the-vm)
+1. [Run your first job](#run-your-first-job)
+1. [Rebuild components](#rebuild-components)
+1. [Shut down or delete your local 
cluster](#shut-down-or-delete-your-local-cluster)
+1. [Troubleshooting](#troubleshooting)
+
+
+Overview
+--------
+
+The Aurora distribution includes a set of scripts that enable you to create a local cluster on
+your development machine. These scripts use [Vagrant](https://www.vagrantup.com/) and
+[VirtualBox](https://www.virtualbox.org/) to run and configure a virtual machine. Once the
+virtual machine is running, the scripts install and initialize Aurora and any required components
+to create the local cluster.
+
+
+Install VirtualBox and Vagrant
+------------------------------
+
+First, download and install [VirtualBox](https://www.virtualbox.org/) on your 
development machine.
+
+Then download and install [Vagrant](https://www.vagrantup.com/). To verify 
that the installation
+was successful, open a terminal window and type the `vagrant` command. You 
should see a list of
+common commands for this tool.
+
+
+Clone the Aurora repository
+---------------------------
+
+To obtain the Aurora source distribution, clone its Git repository using the 
following command:
+
+     git clone git://git.apache.org/aurora.git
+
+
+Start the local cluster
+-----------------------
+
+Now change into the `aurora/` directory, which contains the Aurora source code 
and
+other scripts and tools:
+
+     cd aurora/
+
+To start the local cluster, type the following command:
+
+     vagrant up
+
+This command uses the configuration scripts in the Aurora distribution to:
+
+* Download a Linux system image.
+* Start a virtual machine (VM) and configure it.
+* Install the required build tools on the VM.
+* Install Aurora's requirements (like [Mesos](http://mesos.apache.org/) and
+[Zookeeper](http://zookeeper.apache.org/)) on the VM.
+* Build and install Aurora from source on the VM.
+* Start Aurora's services on the VM.
+
+This process takes several minutes to complete.
+
+You may notice a warning that the guest additions in the VM don't match your version of
+VirtualBox. This should generally be harmless, but you may wish to install a Vagrant plugin to
+take care of mismatches like this for you:
+
+     vagrant plugin install vagrant-vbguest
+
+With this plugin installed, every time you `vagrant up` the plugin will upgrade the guest
+additions for you when a version mismatch is detected. You can read more about the plugin
+[here](https://github.com/dotless-de/vagrant-vbguest).
+
+To verify that Aurora is running on the cluster, visit the following URLs:
+
+* Scheduler - http://192.168.33.7:8081
+* Observer - http://192.168.33.7:1338
+* Mesos Master - http://192.168.33.7:5050
+* Mesos Agent - http://192.168.33.7:5051
+
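+You can also do a quick check from the command line. This is a sketch that assumes the VM is
+reachable at the address above; the scheduler's `/health` endpoint should return a 2xx response
+once the services are up:
+
+     curl -i http://192.168.33.7:8081/health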
+
+Log onto the VM
+---------------
+
+To SSH into the VM, run the following command on your development machine:
+
+     vagrant ssh
+
+To verify that Aurora is installed in the VM, type the `aurora` command. You 
should see a list
+of arguments and possible commands.
+
+The `/vagrant` directory on the VM is mapped to the `aurora/` local directory
+from which you started the cluster. You can edit files inside this directory on your development
+machine and access them from the VM under `/vagrant`.
+
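+For example, a file created on your development machine is immediately visible inside the VM
+(a small sketch; the file name is arbitrary):
+
+     echo 'hello' > hello.txt                   # on your development machine, inside aurora/
+     vagrant ssh -c 'cat /vagrant/hello.txt'    # prints 'hello' from inside the VM
+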
+A pre-installed `clusters.json` file refers to your local cluster as 
`devcluster`, which you
+will use in client commands.
+
+
+Run your first job
+------------------
+
+Now that your cluster is up and running, you are ready to define and run your 
first job in Aurora.
+For more information, see the [Aurora Tutorial](../tutorial/).
+
+
+Rebuild components
+------------------
+
+If you are changing Aurora code and would like to rebuild a component, you can 
use the `aurorabuild`
+command on the VM to build and restart a component.  This is considerably 
faster than destroying
+and rebuilding your VM.
+
+`aurorabuild` accepts a list of components to build and update. Invoke it with no arguments to
+see the list of supported components. For example, to rebuild and restart just the client:
+
+     vagrant ssh -c 'aurorabuild client'
+
+
+Shut down or delete your local cluster
+--------------------------------------
+
+To shut down your local cluster, run the `vagrant halt` command on your development machine. To
+start it again, run the `vagrant up` command.
+
+Once you are finished with your local cluster, or if you would otherwise like to start from
+scratch, you can use the command `vagrant destroy` to power off and delete the virtual machine.
+
+
+Troubleshooting
+---------------
+
+Most Vagrant-related problems can be fixed by the following steps (a combined sketch follows
+the list):
+
+* Destroying the Vagrant environment with `vagrant destroy`
+* Killing any orphaned VMs (see AURORA-499) with the VirtualBox UI or the `VBoxManage`
+  command-line tool
+* Cleaning the repository of build artifacts and other intermediate output with `git clean -fdx`
+* Bringing the Vagrant environment back up with `vagrant up`
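+
+A combined sketch of the sequence above, run from the `aurora/` directory (the `VBoxManage` steps
+are only needed if `vagrant destroy` leaves an orphaned, powered-off VM behind):
+
+     vagrant destroy -f
+     VBoxManage list vms                        # check for orphaned VMs (see AURORA-499)
+     VBoxManage unregistervm <name> --delete    # only for a leftover VM found above
+     git clean -fdx
+     vagrant up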
+
+If that still doesn't solve your problem, make sure to inspect the log files:
+
+* Scheduler: `/var/log/aurora/scheduler.log` or `sudo journalctl -u 
aurora-scheduler`
+* Observer: `/var/log/thermos/observer.log` or `sudo journalctl -u 
thermos-observer`
+* Mesos Master: `/var/log/mesos/mesos-master.INFO` (also see `.WARNING` and 
`.ERROR`)
+* Mesos Agent: `/var/log/mesos/mesos-slave.INFO` (also see `.WARNING` and 
`.ERROR`)

Added: aurora/site/source/documentation/0.22.0/images/CPUavailability.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/CPUavailability.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/CPUavailability.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/CompletedTasks.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/CompletedTasks.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/CompletedTasks.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/HelloWorldJob.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/HelloWorldJob.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/HelloWorldJob.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/RoleJobs.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/RoleJobs.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/RoleJobs.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/RunningJob.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/RunningJob.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/RunningJob.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/ScheduledJobs.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/ScheduledJobs.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/ScheduledJobs.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/TaskBreakdown.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/TaskBreakdown.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/TaskBreakdown.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/aurora_hierarchy.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/aurora_hierarchy.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/aurora_hierarchy.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/aurora_logo.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/aurora_logo.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/aurora_logo.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/components.odg
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/components.odg?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/components.odg
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/components.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/components.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/components.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/debug-client-test.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/debug-client-test.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/debug-client-test.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/debugging-client-test.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/debugging-client-test.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/debugging-client-test.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/killedtask.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/killedtask.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/killedtask.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/lifeofatask.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/lifeofatask.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/lifeofatask.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_adopters_panel_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_adopters_panel_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_adopters_panel_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_at_tellapart_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_at_tellapart_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_at_tellapart_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_at_twitter_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_at_twitter_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/02_19_2015_aurora_at_twitter_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/02_28_2015_apache_aurora_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/02_28_2015_apache_aurora_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/02_28_2015_apache_aurora_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/03_07_2015_aurora_mesos_in_practice_at_twitter_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/03_07_2015_aurora_mesos_in_practice_at_twitter_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/03_07_2015_aurora_mesos_in_practice_at_twitter_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/03_25_2014_introduction_to_aurora_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/03_25_2014_introduction_to_aurora_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/03_25_2014_introduction_to_aurora_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/04_30_2015_monolith_to_microservices_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/04_30_2015_monolith_to_microservices_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/04_30_2015_monolith_to_microservices_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/08_21_2014_past_present_future_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/08_21_2014_past_present_future_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/08_21_2014_past_present_future_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/09_20_2015_shipping_code_with_aurora_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/09_20_2015_shipping_code_with_aurora_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/09_20_2015_shipping_code_with_aurora_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/09_20_2015_twitter_production_scale_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/09_20_2015_twitter_production_scale_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/09_20_2015_twitter_production_scale_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/10_08_2015_mesos_aurora_on_a_small_scale_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/10_08_2015_mesos_aurora_on_a_small_scale_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/10_08_2015_mesos_aurora_on_a_small_scale_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: 
aurora/site/source/documentation/0.22.0/images/presentations/10_08_2015_sla_aware_maintenance_for_operators_thumb.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/presentations/10_08_2015_sla_aware_maintenance_for_operators_thumb.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: 
aurora/site/source/documentation/0.22.0/images/presentations/10_08_2015_sla_aware_maintenance_for_operators_thumb.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/runningtask.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/runningtask.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/runningtask.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/stderr.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/stderr.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/stderr.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/stdout.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/stdout.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/stdout.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/images/storage_hierarchy.png
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/images/storage_hierarchy.png?rev=1871319&view=auto
==============================================================================
Binary file - no diff available.

Propchange: aurora/site/source/documentation/0.22.0/images/storage_hierarchy.png
------------------------------------------------------------------------------
    svn:mime-type = application/octet-stream

Added: aurora/site/source/documentation/0.22.0/index.html.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/index.html.md?rev=1871319&view=auto
==============================================================================
--- aurora/site/source/documentation/0.22.0/index.html.md (added)
+++ aurora/site/source/documentation/0.22.0/index.html.md Fri Dec 13 05:37:33 
2019
@@ -0,0 +1,80 @@
+## Introduction
+
+Apache Aurora is a service scheduler that runs on top of Apache Mesos, 
enabling you to run
+long-running services, cron jobs, and ad-hoc jobs that take advantage of 
Apache Mesos' scalability,
+fault-tolerance, and resource isolation.
+
+We encourage you to ask questions on the [Aurora user 
list](http://aurora.apache.org/community/) or
+the `#aurora` IRC channel on `irc.freenode.net`.
+
+
+## Getting Started
+Information for everyone new to Apache Aurora.
+
+ * [Aurora System Overview](getting-started/overview/)
+ * [Hello World Tutorial](getting-started/tutorial/)
+ * [Local cluster with Vagrant](getting-started/vagrant/)
+
+## Features
+Description of important Aurora features.
+
+ * [Containers](features/containers/)
+ * [Cron Jobs](features/cron-jobs/)
+ * [Custom Executors](features/custom-executors/)
+ * [Job Updates](features/job-updates/)
+ * [Multitenancy](features/multitenancy/)
+ * [Resource Isolation](features/resource-isolation/)
+ * [Scheduling Constraints](features/constraints/)
+ * [Services](features/services/)
+ * [Service Discovery](features/service-discovery/)
+ * [SLA Metrics](features/sla-metrics/)
+ * [SLA Requirements](features/sla-requirements/)
+ * [Webhooks](features/webhooks/)
+
+## Operators
+For those who wish to manage and fine-tune an Aurora cluster.
+
+ * [Installation](operations/installation/)
+ * [Configuration](operations/configuration/)
+ * [Upgrades](operations/upgrades/)
+ * [Troubleshooting](operations/troubleshooting/)
+ * [Monitoring](operations/monitoring/)
+ * [Security](operations/security/)
+ * [Storage](operations/storage/)
+ * [Backup](operations/backup-restore/)
+
+## Reference
+The complete reference of commands, configuration options, and scheduler 
internals.
+
+ * [Task lifecycle](reference/task-lifecycle/)
+ * Configuration (`.aurora` files)
+    - [Configuration Reference](reference/configuration/)
+    - [Configuration Tutorial](reference/configuration-tutorial/)
+    - [Configuration Best Practices](reference/configuration-best-practices/)
+    - [Configuration Templating](reference/configuration-templating/)
+ * Aurora Client
+    - [Client Commands](reference/client-commands/)
+    - [Client Hooks](reference/client-hooks/)
+    - [Client Cluster Configuration](reference/client-cluster-configuration/)
+ * [Scheduler Configuration](reference/scheduler-configuration/)
+ * [Observer Configuration](reference/observer-configuration/)
+ * [Endpoints](reference/scheduler-endpoints/)
+
+## Additional Resources
+ * [Tools integrating with Aurora](additional-resources/tools/)
+ * [Presentation videos and slides](additional-resources/presentations/)
+
+## Developers
+All the information you need to start modifying Aurora and contributing back 
to the project.
+
+ * [Contributing to the project](contributing/)
+ * [Committer's Guide](development/committers-guide/)
+ * [Design Documents](development/design-documents/)
+ * Developing the Aurora components:
+     - [Client](development/client/)
+     - [Scheduler](development/scheduler/)
+     - [Scheduler UI](development/ui/)
+     - [Thermos](development/thermos/)
+     - [Thrift structures](development/thrift/)
+
+

Added: aurora/site/source/documentation/0.22.0/operations/backup-restore.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/operations/backup-restore.md?rev=1871319&view=auto
==============================================================================
--- aurora/site/source/documentation/0.22.0/operations/backup-restore.md (added)
+++ aurora/site/source/documentation/0.22.0/operations/backup-restore.md Fri 
Dec 13 05:37:33 2019
@@ -0,0 +1,80 @@
+# Recovering from a Scheduler Backup
+
+**Be sure to read the entire page before attempting to restore from a backup, 
as it may have
+unintended consequences.**
+
+## Summary
+
+The restoration procedure replaces the existing (possibly corrupted) Mesos replicated log with
+an earlier, backed-up version and requires all schedulers to be taken down temporarily while
+restoring. Once completed, the scheduler state resets to what it was when the backup was created.
+This means any jobs/tasks created or updated after the backup are unknown to 
the scheduler and will
+be killed shortly after the cluster restarts. All other tasks continue 
operating as normal.
+
+Usually, it is a bad idea to restore a backup that is not extremely recent 
(i.e. older than a few
+hours). This is because the scheduler will expect the cluster to look exactly 
as the backup does,
+so any tasks that have been rescheduled since the backup was taken will be 
killed.
+
+The instructions below have been verified in the
+[Vagrant environment](../../getting-started/vagrant/) and, with minor syntax/path changes, should
+be applicable to any Aurora cluster.
+
+Follow these steps to prepare the cluster for restoring from a backup:
+
+##  Preparation
+
+* Stop all scheduler instances.
+
+* Pick a backup to use for rehydrating the mesos-replicated log. Backups can 
be found in the
+directory given to the scheduler as the `-backup_dir` argument. Backups are 
stored in the format
+`scheduler-backup-<yyyy-MM-dd-HH-mm>`.
+
+* If running the Aurora Scheduler in HA mode, pick a single scheduler instance 
to rehydrate.
+
+* Locate the `recovery-tool` in your setup. If Aurora was installed using a 
Debian package
+generated by our `aurora-packaging` script, the recovery tool can be found
+in `/usr/share/aurora/bin/recovery-tool`.
+
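+For example, to see which backups are available on a scheduler host (a sketch; the directory
+shown is an assumption, use whatever was passed as `-backup_dir` in your deployment):
+```
+ls -lt /var/lib/aurora/backups
+# scheduler-backup-2019-12-13-05-37
+# scheduler-backup-2019-12-13-04-37
+```
+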
+## Cleanup
+
+* Delete (or move) the Mesos replicated log path for each scheduler instance. The Mesos
+replicated log file path can be found by looking at the value given to the flag
+`-native_log_file_path` for each instance.
+
+* Initialize the Mesos replicated log files using the mesos-log tool:
+```
+sudo -u <USER> mesos-log initialize --path=<native_log_file_path>
+```
+Where `USER` is the user under which the scheduler instance will be run. For installations using
+Debian packages, the default user will be `aurora`. You may alternatively choose to specify
+a group as well by passing the `-g <GROUP>` option to `sudo`.
+Note that if the user under which the Aurora scheduler instance is run _does 
not_ have permissions
+to read this directory and the files it contains, the instance will fail to 
start.
+
+## Restore from backup
+
+* Run the `recovery-tool`. Wherever the flags match those used for the 
scheduler instance,
+use the same values:
+```
+$ recovery-tool -from BACKUP \
+-to LOG \
+-backup=<selected_backup_location> \
+-native_log_zk_group_path=<native_log_zk_group_path> \
+-native_log_file_path=<native_log_file_path> \
+-zk_endpoints=<zk_endpoints>
+```
+
+## Bring scheduler instances back online
+
+### If running in HA Mode
+
+* Start the rehydrated scheduler instance along with enough cleaned up 
instances to
+meet the `-native_log_quorum_size`. The mesos-replicated log algorithm will 
replenish
+the "blank" scheduler instances with the information from the rehydrated 
instance.
+
+* Start any remaining scheduler instances.
+
+### If running in singleton mode
+
+* Start the single scheduler instance.
+
+

Added: aurora/site/source/documentation/0.22.0/operations/configuration.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/operations/configuration.md?rev=1871319&view=auto
==============================================================================
--- aurora/site/source/documentation/0.22.0/operations/configuration.md (added)
+++ aurora/site/source/documentation/0.22.0/operations/configuration.md Fri Dec 
13 05:37:33 2019
@@ -0,0 +1,380 @@
+# Scheduler Configuration
+
+The Aurora scheduler can take a variety of configuration options through 
command-line arguments.
+Examples are available under `examples/scheduler/`. For a list of available 
Aurora flags and their
+documentation, see [Scheduler Configuration 
Reference](../../reference/scheduler-configuration/).
+
+
+## A Note on Configuration
+Like Mesos, Aurora uses command-line flags for runtime configuration. As such, the Aurora
+"configuration file" is typically a `scheduler.sh` shell script of the following form:
+
+    #!/bin/bash
+    AURORA_HOME=/usr/local/aurora-scheduler
+
+    # Flags controlling the JVM.
+    JAVA_OPTS=(
+      -Xmx2g
+      -Xms2g
+      # GC tuning, etc.
+    )
+
+    # Flags controlling the scheduler.
+    AURORA_FLAGS=(
+      # Port for client RPCs and the web UI
+      -http_port=8081
+      # Log configuration, etc.
+    )
+
+    # Environment variables controlling libmesos
+    export JAVA_HOME=...
+    export GLOG_v=1
+    export LIBPROCESS_PORT=8083
+    export LIBPROCESS_IP=192.168.33.7
+
+    JAVA_OPTS="${JAVA_OPTS[*]}" exec "$AURORA_HOME/bin/aurora-scheduler" 
"${AURORA_FLAGS[@]}"
+
+That way Aurora's current flags are visible in `ps` and in the `/vars` admin 
endpoint.
+
+
+## JVM Configuration
+
+JVM settings are dependent on your environment and cluster size. They might 
require
+custom tuning. As a starting point, we recommend:
+
+* Ensure the initial (`-Xms`) and maximum (`-Xmx`) heap sizes are identical to prevent heap
+  resizing at runtime.
+* Either `-XX:+UseConcMarkSweepGC` or `-XX:+UseG1GC -XX:+UseStringDeduplication` is a sane
+  default for the garbage collector.
+* `-Djava.net.preferIPv4Stack=true` makes sense in most cases as well.
+
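+Applied to the `scheduler.sh` sketch above, these recommendations might look like the following
+(the heap sizes are illustrative and should be tuned for your cluster):
+
+    JAVA_OPTS=(
+      -Xms2g
+      -Xmx2g
+      -XX:+UseG1GC
+      -XX:+UseStringDeduplication
+      -Djava.net.preferIPv4Stack=true
+    )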
+
+## Network Configuration
+
+By default, Aurora binds to all interfaces and auto-discovers its hostname. To reduce ambiguity,
+it helps to hardcode these settings:
+
+    -http_port=8081
+    -ip=192.168.33.7
+    -hostname="aurora1.us-east1.example.org"
+
+Two environment variables control the IP and port used for communication with the Mesos master
+and for the replicated log used by Aurora:
+
+    export LIBPROCESS_PORT=8083
+    export LIBPROCESS_IP=192.168.33.7
+
+It is important that those can be reached from all Mesos master and Aurora 
scheduler instances.
+
+
+## Replicated Log Configuration
+
+Aurora schedulers use ZooKeeper to discover log replicas and elect a leader. 
Only one scheduler is
+leader at a given time - the other schedulers follow log writes and prepare to 
take over as leader
+but do not communicate with the Mesos master. Either 3 or 5 schedulers are 
recommended in a
+production deployment depending on failure tolerance and they must have 
persistent storage.
+
+Below is a summary of scheduler storage configuration flags that either don't 
have default values
+or require attention before deploying in a production environment.
+
+### `-native_log_quorum_size`
+Defines the Mesos replicated log quorum size. In a cluster with `N` 
schedulers, the flag
+`-native_log_quorum_size` should be set to `floor(N/2) + 1`. So in a cluster 
with 1 scheduler
+it should be set to `1`, in a cluster with 3 it should be set to `2`, and in a 
cluster of 5 it
+should be set to `3`.
+
+  Number of schedulers (N) | ```-native_log_quorum_size``` setting (```floor(N/2) + 1```)
+  ------------------------ | -------------------------------------------------------------
+  1                        | 1
+  3                        | 2
+  5                        | 3
+  7                        | 4
+
+*Incorrectly setting this flag will cause data corruption to occur!*
+
+### `-native_log_file_path`
+Location of the Mesos replicated log files. For optimal and consistent 
performance, consider
+allocating a dedicated disk (preferably SSD) for the replicated log. Ensure 
that this disk is not
+used by anything else (e.g. no process logging) and in particular that it is a 
real disk
+and not just a partition.
+
+Even when a dedicated disk is used, switching the Linux kernel I/O scheduler from `CFQ` to
+`deadline` can further help with storage performance in Aurora
+([see this ticket for details](https://issues.apache.org/jira/browse/AURORA-1211)).
+
+### `-native_log_zk_group_path`
+ZooKeeper path used for Mesos replicated log quorum discovery.
+
+See 
[code](https://github.com/apache/aurora/blob/rel/0.22.0/src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java)
 for
+other available Mesos replicated log configuration options and default values.
+
+### Changing the Quorum Size
+Special care needs to be taken when changing the size of the Aurora scheduler 
quorum.
+Since Aurora uses a Mesos replicated log, similar steps need to be followed as 
when
+[changing the Mesos quorum 
size](http://mesos.apache.org/documentation/latest/operational-guide).
+
+As a preparation, increase `-native_log_quorum_size` on each existing 
scheduler and restart them.
+When updating from 3 to 5 schedulers, the quorum size would grow from 2 to 3.
+
+When starting the new schedulers, set `-native_log_quorum_size` to the new value. Failing to
+first increase the quorum size on the running schedulers can in some cases result in corruption
+or truncation of the replicated log used by Aurora. In that case, see the documentation on
+[recovering from backup](../backup-restore/).
+
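+As a sketch for the 3-to-5 example above, the rollout happens in two phases (the value follows
+from the `floor(N/2) + 1` formula):
+
+    # Phase 1: set on each of the 3 existing schedulers, then restart them
+    -native_log_quorum_size=3
+
+    # Phase 2: start the 2 new schedulers with the same value
+    -native_log_quorum_size=3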
+
+## Backup Configuration
+
+Configuration options for the Aurora scheduler backup manager.
+
+* `-backup_interval`: The interval on which the scheduler writes local storage 
backups.
+   The default is every hour.
+* `-backup_dir`: Directory to write backups to. As stated above, this should 
not be co-located on the
+   same disk as the replicated log.
+* `-max_saved_backups`: Maximum number of backups to retain before deleting 
the oldest backup(s).
+
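+A sketch of how these might appear among the scheduler flags (the interval shown is the stated
+default; the directory and retention count are illustrative):
+
+    -backup_interval=1hrs
+    -backup_dir=/var/lib/aurora/backups
+    -max_saved_backups=48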
+
+## Resource Isolation
+
+For proper CPU, memory, and disk isolation as mentioned in our
+[end-user documentation](../../features/resource-isolation/),
+we recommend adding the following isolators to the `--isolation` flag of the Mesos agent:
+
+* `cgroups/cpu`
+* `cgroups/mem`
+* `disk/du`
+
+In addition, we recommend setting the following
+[agent flags](http://mesos.apache.org/documentation/latest/configuration/):
+
+* `--cgroups_limit_swap` to enable memory limits on both memory and swap 
instead of just memory.
+  Alternatively, you could disable swap on your agent hosts.
+* `--cgroups_enable_cfs` to enable hard limits on CPU resources via the CFS 
bandwidth limiting
+  feature.
+* `--enforce_container_disk_quota` to enable disk quota enforcement for 
containers.
+
+To enable the optional GPU support in Mesos, please see the GPU-related flags in the
+[Mesos configuration](http://mesos.apache.org/documentation/latest/configuration/).
+To enable the corresponding feature in Aurora, you have to start the scheduler with the flag:
+
+    -allow_gpu_resource=true
+
+If you want to use revocable resources, first follow the
+[Mesos oversubscription 
documentation](http://mesos.apache.org/documentation/latest/oversubscription/)
+and then set this Aurora scheduler flag to allow receiving revocable Mesos offers:
+
+    -receive_revocable_resources=true
+
+Both CPUs and RAM are supported as revocable resources. The former is enabled by default;
+the latter needs to be enabled via:
+
+    -enable_revocable_ram=true
+
+Unless you want to use the 
[default](https://github.com/apache/aurora/blob/rel/0.22.0/src/main/resources/org/apache/aurora/scheduler/tiers.json)
+tier configuration, you will also have to specify a file path:
+
+    -tier_config=path/to/tiers/config.json
+
+
+## Multi-Framework Setup
+
+Aurora holds onto Mesos offers in order to provide efficient scheduling and
+[preemption](../../features/multitenancy/#preemption). This is problematic in 
multi-framework
+environments as Aurora might starve other frameworks.
+
+At the cost of increased scheduling latency, Aurora can be configured to be more cooperative:
+
+* Lowering `-min_offer_hold_time` (e.g. to `1mins`) can ensure unused offers are returned to
+  Mesos more frequently.
+* Increasing `-offer_filter_duration` (e.g. to `30secs`) will instruct Mesos
+  not to re-offer rejected resources for the given duration.
+
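+Using the example values from the list above, the corresponding scheduler flags would be:
+
+    -min_offer_hold_time=1mins
+    -offer_filter_duration=30secs
+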
+Setting a [minimum amount of 
resources](http://mesos.apache.org/documentation/latest/quota/) for
+each Mesos role can furthermore help to ensure no framework is starved 
entirely.
+
+
+## Containers
+
+Both the Mesos and Docker containerizers require configuration of the Mesos 
agent.
+
+### Mesos Containerizer
+
+The minimal agent configuration requires enabling Docker and Appc image support for the Mesos
+containerizer:
+
+    --containerizers=mesos
+    --image_providers=appc,docker
+    --isolation=filesystem/linux,docker/runtime  # as an addition to your 
other isolators
+
+Further details can be found in the corresponding [Mesos 
documentation](http://mesos.apache.org/documentation/latest/container-image/).
+
+### Docker Containerizer
+
+The [Docker containerizer](http://mesos.apache.org/documentation/latest/docker-containerizer/)
+requires that the Docker engine be installed on each agent host. In addition, it must be enabled
+on the Mesos agents by launching them with the option:
+
+    --containerizers=mesos,docker
+
+If you would like to run a container with a read-only filesystem, it may also be necessary to
+use the scheduler flag `-thermos_home_in_sandbox` in order to set HOME to the sandbox before the
+executor runs. This makes sure that the executor/runner PEX extraction happens inside the
+sandbox instead of the container filesystem root.
+
+If you would like to supply your own parameters to `docker run` when launching 
jobs in docker
+containers, you may use the following flags:
+
+    -allow_docker_parameters
+    -default_docker_parameters
+
+`-allow_docker_parameters` controls whether or not users may pass their own 
configuration parameters
+through the job configuration files. If set to `false` (the default), the 
scheduler will reject
+jobs with custom parameters. *NOTE*: this setting should be used with caution 
as it allows any job
+owner to specify any parameters they wish, including those that may introduce 
security concerns
+(`privileged=true`, for example).
+
+`-default_docker_parameters` allows a cluster operator to specify a universal 
set of parameters that
+should be used for every container that does not have parameters explicitly 
configured at the job
+level. The argument accepts a multimap format:
+
+    -default_docker_parameters="read-only=true,tmpfs=/tmp,tmpfs=/run"
+
+### Common Options
+
+The following Aurora options work for both containerizers.
+
+The scheduler flag `-global_container_mounts` allows mounting paths from the host (i.e., the
+agent machine) into all containers on that host. The format is a comma-separated list of
+`host_path:container_path[:mode]` tuples. For example,
+`-global_container_mounts=/opt/secret_keys_dir:/mnt/secret_keys_dir:ro` mounts
+`/opt/secret_keys_dir` from the agent into all launched containers. Valid modes are `ro` and `rw`.
+
+
+## Thermos Process Logs
+
+### Log destination
+By default, Thermos will write process stdout/stderr to log files in the 
sandbox. Process object
+configuration allows specifying alternate log file destinations like streamed 
stdout/stderr or
+suppression of all log output. Default behavior can be configured for the 
entire cluster with the
+following flag (through the `-thermos_executor_flags` argument to the Aurora 
scheduler):
+
+    --runner-logger-destination=both
+
+The `both` configuration sends logs to files and also streams them to the parent stdout/stderr.
+
+See [Configuration Reference](../../reference/configuration/#logger) for all 
destination options.
+
+### Log rotation
+By default, Thermos will not rotate the stdout/stderr logs from child 
processes and they will grow
+without bound. An individual user may change this behavior via configuration 
on the Process object,
+but it may also be desirable to change the default configuration for the 
entire cluster.
+In order to enable rotation by default, the following flags can be applied to 
Thermos (through the
+`-thermos_executor_flags` argument to the Aurora scheduler):
+
+    --runner-logger-mode=rotate
+    --runner-rotate-log-size-mb=100
+    --runner-rotate-log-backups=10
+
+In the above example, each instance of the Thermos runner will rotate 
stderr/stdout logs once they
+reach 100 MiB in size and keep a maximum of 10 backups. If a user has provided 
a custom setting for
+their process, it will override these default settings.
+
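+Both the destination and rotation settings are passed to Thermos through the same scheduler
+argument. A sketch of how the flags compose (exact quoting depends on how your scheduler flags
+are assembled):
+
+    -thermos_executor_flags="--runner-logger-destination=both --runner-logger-mode=rotate --runner-rotate-log-size-mb=100 --runner-rotate-log-backups=10"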
+
+## Thermos Executor Wrapper
+
+If you need to do computation before starting the Thermos executor (for 
example, setting a different
+`--announcer-hostname` parameter for every executor), then the Thermos 
executor should be invoked
+inside a wrapper script. In such a case, the Aurora scheduler should be started with
+`-thermos_executor_path` pointing to the wrapper script and 
`-thermos_executor_resources` set to a
+comma separated string of all the resources that should be copied into the 
sandbox (including the
+original Thermos executor). Ensure the wrapper script does not access 
resources outside of the
+sandbox, as when the script is run from within a Docker container those 
resources may not exist.
+
+For example, to wrap the executor inside a simple wrapper, the scheduler would be started with:
+`-thermos_executor_path=/path/to/wrapper.sh -thermos_executor_resources=/usr/share/aurora/bin/thermos_executor.pex`
+
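+A minimal sketch of such a wrapper (the executor file name inside the sandbox and the hostname
+computation are assumptions; adapt them to your environment):
+
+    #!/bin/bash
+    # wrapper.sh - shipped into the sandbox alongside the real executor via
+    # -thermos_executor_resources, so the pex is available under its base name.
+    ANNOUNCER_HOSTNAME="$(hostname -f)"   # example of a per-host computation
+    exec ./thermos_executor.pex "$@" --announcer-hostname="${ANNOUNCER_HOSTNAME}"
+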
+## Custom Executors
+
+The scheduler can be configured to utilize a custom executor by specifying the 
`-custom_executor_config` flag.
+The flag must be set to the path of a valid executor configuration file.
+
+For more information on this feature please see the custom executors 
[documentation](../../features/custom-executors/).
+
+## A note on increasing executor overhead
+
+Increasing executor overhead on an existing cluster, whether for custom executors or for
+Thermos, will result in degraded preemption performance until all tasks that began life with the
+previous, lower-overhead executor configuration are preempted/restarted.
+
+## Controlling MTTA via Update Affinity
+
+When there is high resource contention in your cluster, you may experience noticeably elevated
+job update times, as well as high task churn across the cluster. This is due to Aurora's
first-fit scheduling
+algorithm. To alleviate this, you can enable update affinity where the 
Scheduler will make a best-effort
+attempt to reuse the same agent for the updated task (so long as the resources 
for the job are not being
+increased).
+
+To enable this in the Scheduler, you can set the following options:
+
+    -enable_update_affinity=true
+    -update_affinity_reservation_hold_time=3mins
+
+You will need to tune the hold time to match the behavior you see in your 
cluster. If you have extremely
+high update throughput, you might have to extend it as processing updates 
could easily add significant
+delays between scheduling attempts. You may also have to tune scheduling 
parameters to achieve the
+throughput you need in your cluster. Some relevant settings (with defaults) 
are:
+
+    -max_schedule_attempts_per_sec=40
+    -initial_schedule_penalty=1secs
+    -max_schedule_penalty=1mins
+    -scheduling_max_batch_size=3
+    -max_tasks_per_schedule_attempt=5
+
+There are metrics exposed by the Scheduler which can provide guidance on where 
the bottleneck is.
+Example metrics to look at:
+
+    - schedule_attempts_blocks (if this number is greater than 0, then task throughput is
+                                hitting limits controlled by -max_schedule_attempts_per_sec)
+    - scheduled_task_penalty_* (metrics around scheduling penalties for tasks, 
if the numbers here are high
+                                then you could have high contention for 
resources)
+
+Most likely you'll run into limits with the number of update instances that 
can be processed per minute
+before you run into any other limits. So if your total work done per minute 
starts to exceed 2k instances,
+you may need to extend the update_affinity_reservation_hold_time.
+
+## Cluster Maintenance
+
+Aurora performs maintenance-related task drains. How often the scheduler polls for maintenance
+work can be controlled via the scheduler option:
+
+    -host_maintenance_polling_interval=1min
+
+## Enforcing SLA limitations
+
+Since tasks can specify their own `SLAPolicy`, the cluster needs to limit these SLA requirements.
+Too aggressive a requirement can permanently block any type of maintenance work
+(e.g., OS/kernel/security upgrades) on a host and hold it hostage.
+
+An operator can control the limits for SLA requirements via these scheduler 
configuration options:
+
+    -max_sla_duration_secs=2hrs
+    -min_required_instances_for_sla_check=20
+
+_Note: These limits only apply for `CountSlaPolicy` and `PercentageSlaPolicy`._
+
+### Limiting Coordinator SLA
+
+With `CoordinatorSlaPolicy` the SLA calculation is off-loaded to an external HTTP service. Some
+relevant scheduler configuration options are:
+
+    -sla_coordinator_timeout=1min
+    -max_parallel_coordinated_maintenance=10
+
+Handing off the SLA calculation to an external service can potentially block maintenance on
+hosts for an indefinite amount of time (either due to a misconfigured coordinator or a
+legitimately degraded service). In those situations, the following metrics will be helpful to
+identify the offending tasks.
+
+    sla_coordinator_user_errors_*     (counter tracking number of times the 
coordinator for the task
+                                       returned a bad response.)
+    sla_coordinator_errors_*          (counter tracking number of times the 
scheduler was not able
+                                       to communicate with the coordinator of 
the task.)
+    sla_coordinator_lock_starvation_* (counter tracking number of times the 
scheduler was not able to
+                                       get the lock for the coordinator of the 
task.)
+

Added: aurora/site/source/documentation/0.22.0/operations/installation.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/operations/installation.md?rev=1871319&view=auto
==============================================================================
--- aurora/site/source/documentation/0.22.0/operations/installation.md (added)
+++ aurora/site/source/documentation/0.22.0/operations/installation.md Fri Dec 
13 05:37:33 2019
@@ -0,0 +1,256 @@
+# Installing Aurora
+
+Source and binary distributions can be found on our
+[downloads](https://aurora.apache.org/downloads/) page. Installing from binary packages is
+recommended for most users.
+
+- [Installing the scheduler](#installing-the-scheduler)
+- [Installing worker components](#installing-worker-components)
+- [Installing the client](#installing-the-client)
+- [Installing Mesos](#installing-mesos)
+- [Troubleshooting](#troubleshooting)
+
+If our binary packages don't suit you, our package build toolchain makes it easy to build your
+own packages. See the [instructions](https://github.com/apache/aurora-packaging) to learn how.
+
+
+## Machine profiles
+
+Given that many of these components communicate over the network, there are 
numerous ways you could
+assemble them to create an Aurora cluster.  The simplest way is to think in 
terms of three machine
+profiles:
+
+### Coordinator
+**Components**: ZooKeeper, Aurora scheduler, Mesos master
+
+A small number of machines (typically 3 or 5) responsible for cluster 
orchestration.  In most cases
+it is fine to co-locate these components in anything but very large clusters 
(> 1000 machines).
+Beyond that point, operators will likely want to manage these services on 
separate machines.
+In particular, you will want to use separate ZooKeeper ensembles for leader 
election and
+service discovery. Otherwise a service discovery error or outage can take down 
the entire cluster.
+
+In practice, 5 coordinators have been shown to reliably manage clusters with 
tens of thousands of
+machines.
+
+### Worker
+**Components**: Aurora executor, Aurora observer, Mesos agent
+
+The bulk of the cluster, where services will actually run.
+
+### Client
+**Components**: Aurora client, Aurora admin client
+
+Any machines that users submit jobs from.
+
+
+## Installing the scheduler
+### Ubuntu Trusty
+
+1. Install Mesos
+   Skip down to [install mesos](#mesos-on-ubuntu-trusty), then run:
+
+        sudo start mesos-master
+
+2. Install ZooKeeper
+
+        sudo apt-get install -y zookeeperd
+
+3. Install the Aurora scheduler
+
+        sudo add-apt-repository -y ppa:openjdk-r/ppa
+        sudo apt-get update
+        sudo apt-get install -y openjdk-8-jre-headless wget
+
+        sudo update-alternatives --set java 
/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java
+
+        wget -c 
https://apache.bintray.com/aurora/ubuntu-trusty/aurora-scheduler_0.17.0_amd64.deb
+        sudo dpkg -i aurora-scheduler_0.17.0_amd64.deb
+
+### CentOS 7
+
+1. Install Mesos
+   Skip down to [install mesos](#mesos-on-centos-7), then run:
+
+        sudo systemctl start mesos-master
+
+2. Install ZooKeeper
+
+        sudo rpm -Uvh 
https://archive.cloudera.com/cdh4/one-click-install/redhat/6/x86_64/cloudera-cdh-4-0.x86_64.rpm
+        sudo yum install -y java-1.8.0-openjdk-headless zookeeper-server
+
+        sudo service zookeeper-server init
+        sudo systemctl start zookeeper-server
+
+3. Install the Aurora scheduler
+
+        sudo yum install -y wget
+
+        wget -c 
https://apache.bintray.com/aurora/centos-7/aurora-scheduler-0.17.0-1.el7.centos.aurora.x86_64.rpm
+        sudo yum install -y 
aurora-scheduler-0.17.0-1.el7.centos.aurora.x86_64.rpm
+
+### Finalizing
+By default, the scheduler will start in an uninitialized mode.  This is 
because external
+coordination is necessary to be certain operator error does not result in a 
quorum of schedulers
+starting up and believing their databases are empty when in fact they should 
be re-joining a
+cluster.
+
+Because of this, a fresh install of the scheduler will need intervention to 
start up.  First,
+stop the scheduler service.
+Ubuntu: `sudo stop aurora-scheduler`
+CentOS: `sudo systemctl stop aurora`
+
+Now initialize the database:
+
+    sudo -u aurora mkdir -p /var/lib/aurora/scheduler/db
+    sudo -u aurora mesos-log initialize --path=/var/lib/aurora/scheduler/db
+
+Now you can start the scheduler back up.
+Ubuntu: `sudo start aurora-scheduler`
+CentOS: `sudo systemctl start aurora`
+
+
+## Installing worker components
+### Ubuntu Trusty
+
+1. Install Mesos
+   Skip down to [install mesos](#mesos-on-ubuntu-trusty), then run:
+
+        start mesos-slave
+
+2. Install Aurora executor and observer
+
+        sudo apt-get install -y python2.7 wget
+
+        # NOTE: This appears to be a missing dependency of the mesos deb 
package and is needed
+        # for the python mesos native bindings.
+        sudo apt-get -y install libcurl4-nss-dev
+
+        wget -c 
https://apache.bintray.com/aurora/ubuntu-trusty/aurora-executor_0.17.0_amd64.deb
+        sudo dpkg -i aurora-executor_0.17.0_amd64.deb
+
+### CentOS 7
+
+1. Install Mesos
+   Skip down to [install mesos](#mesos-on-centos-7), then run:
+
+        sudo systemctl start mesos-slave
+
+2. Install Aurora executor and observer
+
+        sudo yum install -y python2 wget
+
+        wget -c 
https://apache.bintray.com/aurora/centos-7/aurora-executor-0.17.0-1.el7.centos.aurora.x86_64.rpm
+        sudo yum install -y 
aurora-executor-0.17.0-1.el7.centos.aurora.x86_64.rpm
+
+### Worker Configuration
+The executor typically does not require configuration. Command-line arguments can
+be passed to the executor via a command-line argument on the scheduler
+(`-thermos_executor_flags`).
+
+The observer needs to be configured to look at the correct Mesos directory in order to find task
+sandboxes. You should first find the Mesos working directory by looking for the Mesos agent
+`--work_dir` flag. You should see something like:
+
+        ps -eocmd | grep "mesos-slave" | grep -v grep | tr ' ' '\n' | grep 
"\--work_dir"
+        --work_dir=/var/lib/mesos
+
+If the flag is not set, you can view the default value like so:
+
+        mesos-slave --help
+        Usage: mesos-slave [options]
+
+          ...
+          --work_dir=VALUE      Directory path to place framework work 
directories
+                                (default: /tmp/mesos)
+          ...
+
+The value you find for `--work_dir`, `/var/lib/mesos` in this example, should 
match the Aurora
+observer value for `--mesos-root`.  You can look for that setting in a similar 
way on a worker
+node by grepping for `thermos_observer` and `--mesos-root`.  If the flag is 
not set, you can view
+the default value like so:
+
+        thermos_observer -h
+        Options:
+          ...
+          --mesos-root=MESOS_ROOT
+                                The mesos root directory to search for Thermos
+                                executor sandboxes [default: /var/lib/mesos]
+          ...
+
+In this case the default is `/var/lib/mesos` and we have a match. If there is no match, you can
+either adjust the mesos-slave (agent) start script(s) and restart the agent(s) or else adjust the
+Aurora observer start scripts and restart the observers.  To adjust the Aurora observer:
+
+#### Ubuntu Trusty
+
+    sudo sh -c 'echo "MESOS_ROOT=/tmp/mesos" >> /etc/default/thermos'
+
+#### CentOS 7
+
+Make an edit to add the `--mesos-root` flag resulting in something like:
+
+    grep -A5 OBSERVER_ARGS /etc/sysconfig/thermos
+    OBSERVER_ARGS=(
+      --port=1338
+      --mesos-root=/tmp/mesos
+      --log_to_disk=NONE
+      --log_to_stderr=google:INFO
+    )
+
+
+## Installing the client
+### Ubuntu Trusty
+
+    sudo apt-get install -y python2.7 wget
+
+    wget -c https://apache.bintray.com/aurora/ubuntu-trusty/aurora-tools_0.17.0_amd64.deb
+    sudo dpkg -i aurora-tools_0.17.0_amd64.deb
+
+### CentOS 7
+
+    sudo yum install -y python2 wget
+
+    wget -c https://apache.bintray.com/aurora/centos-7/aurora-tools-0.17.0-1.el7.centos.aurora.x86_64.rpm
+    sudo yum install -y aurora-tools-0.17.0-1.el7.centos.aurora.x86_64.rpm
+
+### Mac OS X
+
+    brew update
+    brew install aurora-cli
+
+### Client Configuration
+Client configuration lives in a JSON file that describes the available clusters and how to reach
+them.  By default this file is at `/etc/aurora/clusters.json`.
+
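+A minimal sketch of what this file might contain for a single, ZooKeeper-discovered cluster is
+shown below (the cluster name, ZooKeeper address, and paths are illustrative; consult the client
+cluster configuration reference for the full set of supported keys):
+
+    [{
+      "name": "devcluster",
+      "zk": "192.168.33.7",
+      "scheduler_zk_path": "/aurora/scheduler",
+      "auth_mechanism": "UNAUTHENTICATED",
+      "slave_run_directory": "latest",
+      "slave_root": "/var/lib/mesos"
+    }]
+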
+Jobs may be submitted to the scheduler using the client, and are described with
+[job configurations](../../reference/configuration/) expressed in `.aurora` 
files.  Typically you will
+maintain a single job configuration file to describe one or more deployment 
environments (e.g.
+dev, test, prod) for a production job.
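+
+As a rough sketch of the shape of such a file (names and resource values here are purely
+illustrative; see the configuration reference above for the authoritative schema), a trivial
+`.aurora` file might look like:
+
+    hello = Process(name = 'hello', cmdline = 'echo hello world')
+
+    task = SequentialTask(
+      processes = [hello],
+      resources = Resources(cpu = 0.1, ram = 16*MB, disk = 16*MB))
+
+    jobs = [Service(
+      cluster = 'devcluster', role = 'www-data', environment = 'prod', name = 'hello',
+      task = task)]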
+
+
+## Installing Mesos
+Mesos uses a single package for the Mesos master and agent.  As a result, the 
package dependencies
+are identical for both.
+
+### Mesos on Ubuntu Trusty
+
+    sudo apt-key adv --keyserver keyserver.ubuntu.com --recv E56151BF
+    DISTRO=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
+    CODENAME=$(lsb_release -cs)
+
+    echo "deb http://repos.mesosphere.io/${DISTRO} ${CODENAME} main" | \
+      sudo tee /etc/apt/sources.list.d/mesosphere.list
+    sudo apt-get -y update
+
+    # Use `apt-cache showpkg mesos | grep [version]` to find the exact version.
+    sudo apt-get -y install mesos=1.1.0-2.0.107.ubuntu1404
+
+### Mesos on CentOS 7
+
+    sudo rpm -Uvh https://repos.mesosphere.io/el/7/noarch/RPMS/mesosphere-el-repo-7-1.noarch.rpm
+    sudo yum -y install mesos-1.1.0
+
+
+## Troubleshooting
+
+So you've started your first cluster and are running into some issues? We've 
collected some common
+stumbling blocks and solutions in our [Troubleshooting 
guide](../troubleshooting/) to help get you moving.

Added: aurora/site/source/documentation/0.22.0/operations/monitoring.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/operations/monitoring.md?rev=1871319&view=auto
==============================================================================
--- aurora/site/source/documentation/0.22.0/operations/monitoring.md (added)
+++ aurora/site/source/documentation/0.22.0/operations/monitoring.md Fri Dec 13 
05:37:33 2019
@@ -0,0 +1,181 @@
+# Monitoring your Aurora cluster
+
+Before you start running important services in your Aurora cluster, it's 
important to set up
+monitoring and alerting of Aurora itself.  Most of your monitoring can be 
against the scheduler,
+since it will give you a global view of what's going on.
+
+## Reading stats
+The scheduler exposes a *lot* of instrumentation data via its HTTP interface. 
You can get a quick
+peek at the first few of these in our vagrant image:
+
+    $ vagrant ssh -c 'curl -s localhost:8081/vars | head'
+    async_tasks_completed 1004
+    attribute_store_fetch_all_events 15
+    attribute_store_fetch_all_events_per_sec 0.0
+    attribute_store_fetch_all_nanos_per_event 0.0
+    attribute_store_fetch_all_nanos_total 3048285
+    attribute_store_fetch_all_nanos_total_per_sec 0.0
+    attribute_store_fetch_one_events 3391
+    attribute_store_fetch_one_events_per_sec 0.0
+    attribute_store_fetch_one_nanos_per_event 0.0
+    attribute_store_fetch_one_nanos_total 454690753
+
+These values are served as `Content-Type: text/plain`, with each line 
containing a space-separated metric
+name and value. Values may be integers, doubles, or strings (note: strings are 
static, others
+may be dynamic).
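+
+If you are writing your own collector, the format is simple enough to parse with a few lines of
+code. A minimal sketch (the URL assumes a scheduler listening on localhost:8081; numeric values
+are converted where possible, everything else is kept as a string):
+
+    import urllib2
+
+    def fetch_vars(url='http://localhost:8081/vars'):
+        """Fetch /vars and return a dict of stat name -> int, float, or str."""
+        stats = {}
+        for line in urllib2.urlopen(url).read().splitlines():
+            name, _, value = line.partition(' ')
+            for cast in (int, float):
+                try:
+                    value = cast(value)
+                    break
+                except ValueError:
+                    pass
+            stats[name] = value
+        return stats
+
+    print fetch_vars()['jvm_uptime_secs']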
+
+If your monitoring infrastructure prefers JSON, the scheduler exports that as 
well:
+
+    $ vagrant ssh -c 'curl -s localhost:8081/vars.json | python -mjson.tool | head'
+    {
+        "async_tasks_completed": 1009,
+        "attribute_store_fetch_all_events": 15,
+        "attribute_store_fetch_all_events_per_sec": 0.0,
+        "attribute_store_fetch_all_nanos_per_event": 0.0,
+        "attribute_store_fetch_all_nanos_total": 3048285,
+        "attribute_store_fetch_all_nanos_total_per_sec": 0.0,
+        "attribute_store_fetch_one_events": 3409,
+        "attribute_store_fetch_one_events_per_sec": 0.0,
+        "attribute_store_fetch_one_nanos_per_event": 0.0,
+
+This will be the same data as above, served with `Content-Type: 
application/json`.
+
+## Viewing live stat samples on the scheduler
+The scheduler uses the Twitter commons stats library, which keeps an internal 
time-series database
+of exported variables - nearly everything in `/vars` is available for instant 
graphing.  This is
+useful for debugging, but is not a replacement for an external monitoring 
system.
+
+You can view these graphs on a scheduler at `/graphview`.  It supports some composition and
+aggregation of values, which can be invaluable when triaging a problem.  For example, if you have
+the scheduler running in vagrant, check out these links:
+
+* [simple graph](http://192.168.33.7:8081/graphview?query=jvm_uptime_secs)
+* [complex composition](http://192.168.33.7:8081/graphview?query=rate\(scheduler_log_native_append_nanos_total\)%2Frate\(scheduler_log_native_append_events\)%2F1e6)
+
+### Counters and gauges
+Among numeric stats, there are two fundamental types of stats exported: 
_counters_ and _gauges_.
+Counters are guaranteed to be monotonically-increasing for the lifetime of a 
process, while gauges
+may decrease in value.  Aurora uses counters to represent things like the 
number of times an event
+has occurred, and gauges to capture things like the current length of a queue. 
 Counters are a
+natural fit for accurate composition into [rate 
ratios](http://en.wikipedia.org/wiki/Rate_ratio)
+(useful for sample-resistant latency calculation), while gauges are not.
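+
+To make the rate-ratio idea concrete, here is a small sketch of how an external poller might turn
+two counter samples into a windowed average latency. The stat names are the replicated log
+counters described under "Important stats" below; the polling interval is arbitrary:
+
+    import time, urllib2
+
+    def sample(name, url='http://localhost:8081/vars'):
+        """Return the current value of a single numeric stat from /vars."""
+        for line in urllib2.urlopen(url).read().splitlines():
+            key, _, value = line.partition(' ')
+            if key == name:
+                return float(value)
+        raise KeyError(name)
+
+    nanos_0 = sample('scheduler_log_native_append_nanos_total')
+    events_0 = sample('scheduler_log_native_append_events')
+    time.sleep(60)
+    nanos_1 = sample('scheduler_log_native_append_nanos_total')
+    events_1 = sample('scheduler_log_native_append_events')
+
+    # rate(nanos_total) / rate(events) = average nanoseconds per log append over the window.
+    if events_1 > events_0:
+        print 'avg append latency: %.2f ms' % ((nanos_1 - nanos_0) / (events_1 - events_0) / 1e6)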
+
+# Alerting
+
+## Quickstart
+If you are looking for just bare-minimum alerting to get something in place 
quickly, set up alerting
+on `framework_registered` and `task_store_LOST`. These will give you a decent 
picture of overall
+health.
+
+## A note on thresholds
+One of the most difficult things in monitoring is choosing alert thresholds. 
With many of these
+stats, there is no value we can offer as a threshold that will be guaranteed 
to work for you. It
+will depend on the size of your cluster, number of jobs, churn of tasks in the 
cluster, etc. We
+recommend you start with a strict value after viewing a small amount of 
collected data, and then
+adjust thresholds as you see fit. Feel free to ask us if you would like to 
validate that your alerts
+and thresholds make sense.
+
+## Important stats
+
+### `jvm_uptime_secs`
+Type: integer counter
+
+The number of seconds the JVM process has been running. Comes from
+[RuntimeMXBean#getUptime()](http://docs.oracle.com/javase/7/docs/api/java/lang/management/RuntimeMXBean.html#getUptime\(\))
+
+Detecting resets (decreasing values) on this stat will tell you that the 
scheduler is failing to
+stay alive.
+
+Look at the scheduler logs to identify the reason the scheduler is exiting.
+
+### `system_load_avg`
+Type: double gauge
+
+The current load average of the system for the last minute. Comes from
+[OperatingSystemMXBean#getSystemLoadAverage()](http://docs.oracle.com/javase/7/docs/api/java/lang/management/OperatingSystemMXBean.html?is-external=true#getSystemLoadAverage\(\)).
+
+A high sustained value suggests that the scheduler machine may be 
over-utilized.
+
+Use standard unix tools like `top` and `ps` to track down the offending 
process(es).
+
+### `process_cpu_cores_utilized`
+Type: double gauge
+
+The current number of CPU cores in use by the JVM process. This should not 
exceed the number of
+logical CPU cores on the machine. Derived from
+[OperatingSystemMXBean#getProcessCpuTime()](http://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html)
+
+A high sustained value indicates that the scheduler is overworked. Due to 
current internal design
+limitations, if this value is sustained at `1`, there is a good chance the 
scheduler is under water.
+
+There are two main inputs that tend to drive this figure: task scheduling 
attempts and status
+updates from Mesos.  You may see activity in the scheduler logs to give an 
indication of where
+time is being spent.  Beyond that, it really takes good familiarity with the 
code to effectively
+triage this.  We suggest engaging with an Aurora developer.
+
+### `task_store_LOST`
+Type: integer gauge
+
+The number of tasks stored in the scheduler that are in the `LOST` state, and 
have been rescheduled.
+
+If this value is increasing at a high rate, it is a sign of trouble.
+
+There are many sources of `LOST` tasks in Mesos: the scheduler, master, agent, 
and executor can all
+trigger this.  The first step is to look in the scheduler logs for `LOST` to 
identify where the
+state changes are originating.
+
+### `scheduler_resource_offers`
+Type: integer counter
+
+The number of resource offers that the scheduler has received.
+
+For a healthy scheduler, this value must be increasing over time.
+
+Assuming the scheduler is up and otherwise healthy, you will want to check if the master thinks it
+is sending offers. You should also look at the master's web interface to see if it has a large
+number of outstanding offers that it is waiting to have returned.
+
+### `framework_registered`
+Type: binary integer counter
+
+Will be `1` for the leading scheduler that is registered with the Mesos master, `0` for passive
+schedulers.
+
+A sustained period without a `1` (or where `sum() != 1`) warrants 
investigation.
+
+If there is no leading scheduler, look in the scheduler and master logs for 
why.  If there are
+multiple schedulers claiming leadership, this suggests a split brain and 
warrants filing a critical
+bug.
+
+### `rate(scheduler_log_native_append_nanos_total)/rate(scheduler_log_native_append_events)`
+Type: rate ratio of integer counters
+
+This composes two counters to compute a windowed figure for the latency of 
replicated log writes.
+
+A hike in this value suggests disk bandwidth contention.
+
+Look in scheduler logs for any reported oddness with saving to the replicated 
log. Also use
+standard tools like `vmstat` and `iotop` to identify whether the disk has 
become slow or
+over-utilized. We suggest using a dedicated disk for the replicated log to 
mitigate this.
+
+### `timed_out_tasks`
+Type: integer counter
+
+Tracks the number of times the scheduler has given up waiting
+(for `-transient_task_state_timeout`) to hear back about a task in a transient state
+(e.g. `ASSIGNED`, `KILLING`), moving it to `LOST` before rescheduling it.
+
+This value is currently known to increase occasionally when the scheduler 
fails over
+([AURORA-740](https://issues.apache.org/jira/browse/AURORA-740)). However, any 
large spike in this
+value warrants investigation.
+
+The scheduler will log when it times out a task. You should trace the task ID 
of the timed out
+task into the master, agent, and/or executors to determine where the message 
was dropped.
+
+### `http_500_responses_events`
+Type: integer counter
+
+The total number of HTTP 500 status responses sent by the scheduler. Includes 
API and asset serving.
+
+An increase warrants investigation.
+
+Look in scheduler logs to identify why the scheduler returned a 500; there should be a stack trace.

Added: aurora/site/source/documentation/0.22.0/operations/security.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/operations/security.md?rev=1871319&view=auto
==============================================================================
--- aurora/site/source/documentation/0.22.0/operations/security.md (added)
+++ aurora/site/source/documentation/0.22.0/operations/security.md Fri Dec 13 
05:37:33 2019
@@ -0,0 +1,362 @@
+Securing your Aurora Cluster
+============================
+
+Aurora integrates with [Apache Shiro](http://shiro.apache.org/) to provide 
security
+controls for its API. In addition to providing some useful features out of the 
box, Shiro
+also allows Aurora cluster administrators to adapt the security system to 
their organization’s
+existing infrastructure. The announcer in the Aurora thermos executor also 
supports security
+controls for talking to ZooKeeper.
+
+
+- [Enabling Security](#enabling-security)
+- [Authentication](#authentication)
+       - [HTTP Basic Authentication](#http-basic-authentication)
+               - [Server Configuration](#server-configuration)
+               - [Client Configuration](#client-configuration)
+       - [HTTP SPNEGO Authentication 
(Kerberos)](#http-spnego-authentication-kerberos)
+               - [Server Configuration](#server-configuration-1)
+               - [Client Configuration](#client-configuration-1)
+- [Authorization](#authorization)
+       - [Using an INI file to define security 
controls](#using-an-ini-file-to-define-security-controls)
+               - [Caveats](#caveats)
+       - [Implementing Delegated Authorization](#implementing-delegated-authorization)
+- [Implementing a Custom Realm](#implementing-a-custom-realm)
+       - [Packaging a realm module](#packaging-a-realm-module)
+- [Announcer Authentication](#announcer-authentication)
+    - [ZooKeeper authentication 
configuration](#zookeeper-authentication-configuration)
+    - [Executor settings](#executor-settings)
+- [Scheduler HTTPS](#scheduler-https)
+- [Known Issues](#known-issues)
+
+# Enabling Security
+
+There are two major components of security:
+[authentication and 
authorization](http://en.wikipedia.org/wiki/Authentication#Authorization).  A
+cluster administrator may choose the approach used for each, and may also 
implement custom
+mechanisms for either.  Later sections describe the options available. To enable authentication
+for the announcer, see [Announcer Authentication](#announcer-authentication).
+
+
+# Authentication
+
+At a minimum, the scheduler must be configured with instructions for how to process authentication
+credentials.  There are currently two built-in authentication schemes -
+[HTTP Basic 
Authentication](http://en.wikipedia.org/wiki/Basic_access_authentication), and
+[SPNEGO](http://en.wikipedia.org/wiki/SPNEGO) (Kerberos).
+
+## HTTP Basic Authentication
+
+Basic Authentication is a very quick way to add *some* security.  It is 
supported
+by all major browsers and HTTP client libraries with minimal work.  However,
+before relying on Basic Authentication you should be aware of the [security
+considerations](http://tools.ietf.org/html/rfc2617#section-4).
+
+### Server Configuration
+
+At a minimum you need to set the following command-line flags on the scheduler:
+
+```
+-http_authentication_mechanism=BASIC
+-shiro_realm_modules=INI_AUTHNZ
+-shiro_ini_path=path/to/security.ini
+```
+
+And create a security.ini file like so:
+
+```
+[users]
+sally = apple, admin
+
+[roles]
+admin = *
+```
+
+The details of the security.ini file are explained below. Note that this file 
contains plaintext,
+unhashed passwords.
+
+### Client Configuration
+
+To configure the client for HTTP Basic authentication, add an entry to 
~/.netrc with your credentials
+
+```
+% cat ~/.netrc
+# ...
+
+machine aurora.example.com
+login sally
+password apple
+
+# ...
+```
+
+No changes are required to `clusters.json`.
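+
+As a quick manual check that the credentials are being sent (note that which endpoints actually
+require authentication depends on your authorization configuration), you can issue a request with
+HTTP Basic credentials by hand, e.g.:
+
+```
+% curl -u sally:apple http://aurora.example.com:8081/vars | head
+```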
+
+## HTTP SPNEGO Authentication (Kerberos)
+
+### Server Configuration
+At a minimum you need to set the following command-line flags on the scheduler:
+
+```
+-http_authentication_mechanism=NEGOTIATE
+-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ
+-kerberos_server_principal=HTTP/[email protected]
+-kerberos_server_keytab=path/to/aurora.example.com.keytab
+-shiro_ini_path=path/to/security.ini
+```
+
+And create a security.ini file like so:
+
+```
+% cat path/to/security.ini
+[users]
+sally = _, admin
+
+[roles]
+admin = *
+```
+
+What's going on here? First, Aurora must be configured to request Kerberos 
credentials when presented with an
+unauthenticated request. This is achieved by setting
+
+```
+-http_authentication_mechanism=NEGOTIATE
+```
+
+Next, a Realm module must be configured to **authenticate** the current 
request using the Kerberos
+credentials that were requested. Aurora ships with a realm module that can do this:
+
+```
+-shiro_realm_modules=KERBEROS5_AUTHN[,...]
+```
+
+The Kerberos5Realm requires a keytab file and a server principal name. The 
principal name will usually
+be in the form `HTTP/[email protected]`.
+
+```
+-kerberos_server_principal=HTTP/[email protected]
+-kerberos_server_keytab=path/to/aurora.example.com.keytab
+```
+
+The Kerberos5 realm module is authentication-only. For scheduler security to 
work you must also
+enable a realm module that provides an Authorizer implementation. For example, 
to do this using the
+IniShiroRealmModule:
+
+```
+-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ
+```
+
+You can then configure authorization using a security.ini file as described 
below
+(the password field is ignored). You must configure the realm module with the 
path to this file:
+
+```
+-shiro_ini_path=path/to/security.ini
+```
+
+### Client Configuration
+To use Kerberos on the client-side you must build Kerberos-enabled client 
binaries. Do this with
+
+```
+./pants binary src/main/python/apache/aurora/kerberos:kaurora
+./pants binary src/main/python/apache/aurora/kerberos:kaurora_admin
+```
+
+You must also configure each cluster where you've enabled Kerberos on the 
scheduler
+to use Kerberos authentication. Do this by setting `auth_mechanism` to 
`KERBEROS`
+in `clusters.json`.
+
+```
+% cat ~/.aurora/clusters.json
+{
+    "devcluster": {
+        "auth_mechanism": "KERBEROS",
+        ...
+    },
+    ...
+}
+```
+
+# Authorization
+Given a means to authenticate the entity a client claims they are, we need to 
define what privileges they have.
+
+## Using an INI file to define security controls
+
+The simplest security configuration for Aurora is an INI file on the 
scheduler.  For small
+clusters, or clusters where the users and access controls change relatively 
infrequently, this is
+likely the preferred approach.  However you may want to avoid this approach if 
access permissions
+are rapidly changing, or if your access control information already exists in 
another system.
+
+You can enable INI-based configuration with the following scheduler command line arguments:
+
+```
+-http_authentication_mechanism=BASIC
+-shiro_ini_path=path/to/security.ini
+```
+
+*note* As the argument name reveals, this is using Shiro’s
+[IniRealm](http://shiro.apache.org/configuration.html#Configuration-INIConfiguration)
 behind
+the scenes.
+
+The INI file will contain two sections - users and roles.  Here’s an example 
for what might
+be in security.ini:
+
+```
+[users]
+sally = apple, admin
+jim = 123456, accounting
+becky = letmein, webapp
+larry = 654321,accounting
+steve = password
+
+[roles]
+admin = *
+accounting = thrift.AuroraAdmin:setQuota
+webapp = thrift.AuroraSchedulerManager:*:webapp
+```
+
+The users section defines user credentials and the role(s) they are members of.  These lines
+are of the format `<user> = <password>[, <role>...]`.  As you probably 
noticed, the passwords are
+in plaintext and as a result read access to this file should be restricted.
+
+In this configuration, each user has different privileges for actions in the 
cluster because
+of the roles they are a part of:
+
+* admin is granted all privileges
+* accounting may adjust the amount of resource quota for any role
+* webapp represents a collection of jobs that represents a service, and its 
members may create and modify any jobs owned by it
+
+### Caveats
+You might find documentation on the Internet suggesting there are additional 
sections in `shiro.ini`,
+like `[main]` and `[urls]`. These are not supported by Aurora as it uses a 
different mechanism to configure
+those parts of Shiro. Think of Aurora's `security.ini` as a subset with only 
`[users]` and `[roles]` sections.
+
+## Implementing Delegated Authorization
+
+It is possible to leverage Shiro's `runAs` feature by implementing a custom Servlet Filter that
+provides the capability and passing its fully qualified class name to the command line argument
+`-shiro_after_auth_filter`. The filter is registered in the same filter chain as the Shiro auth
+filters and is placed after them, which ensures it is invoked only after the Shiro filters have
+had a chance to authenticate the request.
+
+# Implementing a Custom Realm
+
+Since Aurora’s security is backed by [Apache 
Shiro](https://shiro.apache.org), you can implement a
+custom [Realm](http://shiro.apache.org/realm.html) to define 
organization-specific security behavior.
+
+In addition to using Shiro's standard APIs to implement a Realm you can link 
against Aurora to
+access the type-safe Permissions Aurora uses. See the Javadoc for 
`org.apache.aurora.scheduler.spi`
+for more information.
+
+## Packaging a realm module
+Package your custom Realm(s) with a Guice module that exposes a `Set<Realm>` 
multibinding.
+
+```java
+package com.example;
+
+import com.google.inject.AbstractModule;
+import com.google.inject.multibindings.Multibinder;
+import org.apache.shiro.realm.Realm;
+
+public class MyRealmModule extends AbstractModule {
+  @Override
+  public void configure() {
+    Realm myRealm = new MyRealm();
+
+    Multibinder.newSetBinder(binder(), 
Realm.class).addBinding().toInstance(myRealm);
+  }
+
+  static class MyRealm implements Realm {
+    // Realm implementation.
+  }
+}
+```
+
+To use your module in the scheduler, include it as a realm module based on its 
fully-qualified
+class name:
+
+```
+-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ,com.example.MyRealmModule
+```
+
+
+# Announcer Authentication
+The Thermos executor can be configured to authenticate with ZooKeeper and 
include
+an 
[ACL](https://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#sc_ZooKeeperAccessControl)
+on the nodes it creates, which will specify
+the privileges of clients to perform different actions on these nodes.  This
+feature is enabled by specifying an ACL configuration file to the executor 
with the
+`--announcer-zookeeper-auth-config` command line argument.
+
+When this feature is _not_ enabled, nodes created by the executor will have 
'world/all' permission
+(`ZOO_OPEN_ACL_UNSAFE`).  In most production environments, operators should 
specify an ACL and
+limit access.
+
+## ZooKeeper Authentication Configuration
+The configuration file must be formatted as JSON with the following schema:
+
+```json
+{
+  "auth": [
+    {
+      "scheme": "<scheme>",
+      "credential": "<plain_credential>"
+    }
+  ],
+  "acl": [
+    {
+      "scheme": "<scheme>",
+      "credential": "<plain_credential>",
+      "permissions": {
+        "read": <bool>,
+        "write": <bool>,
+        "create": <bool>,
+        "delete": <bool>,
+        "admin": <bool>
+      }
+    }
+  ]
+}
+```
+
+The `scheme`
+defines the encoding of the credential field.  Note that these fields are 
passed directly to
+ZooKeeper (except in the case of _digest_ scheme, where the executor will hash 
and encode
+the credential appropriately before passing it to ZooKeeper). In addition to 
`acl`, a list of
+authentication credentials must be provided in `auth` to use for the 
connection.
+
+All properties of the `permissions` object will default to False if not 
provided.
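+
+As an illustrative example (the `aurora:changeme` credential is a placeholder), a configuration
+using the _digest_ scheme that authenticates the connection and grants that same identity full
+permissions on the announced nodes might look like:
+
+```json
+{
+  "auth": [
+    {"scheme": "digest", "credential": "aurora:changeme"}
+  ],
+  "acl": [
+    {
+      "scheme": "digest",
+      "credential": "aurora:changeme",
+      "permissions": {
+        "read": true,
+        "write": true,
+        "create": true,
+        "delete": true,
+        "admin": true
+      }
+    }
+  ]
+}
+```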
+
+## Executor settings
+To enable the executor to authenticate against ZK, 
`--announcer-zookeeper-auth-config` should be
+set to the configuration file.
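+
+Since executor arguments are supplied by the scheduler, one way to distribute this setting
+(assuming the file has been placed at `/etc/aurora/zk-auth.json` on every agent) is via the
+scheduler's `-thermos_executor_flags` argument, for example:
+
+```
+-thermos_executor_flags="--announcer-zookeeper-auth-config=/etc/aurora/zk-auth.json"
+```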
+
+
+# Scheduler HTTPS
+
+The Aurora scheduler does not provide native HTTPS support 
([AURORA-343](https://issues.apache.org/jira/browse/AURORA-343)).
+It is therefore recommended to deploy it behind an HTTPS capable reverse proxy 
such as nginx or Apache2.
+
+A simple setup is to launch both the reverse proxy and the Aurora scheduler on 
the same port, but
+bind the reverse proxy to the public IP of the host and the scheduler to 
localhost:
+
+    -ip=127.0.0.1
+    -http_port=8081
+
+If your clients connect to the scheduler via 
[`proxy_url`](../../reference/scheduler-configuration/),
+you can update it to `https`. If you use the ZooKeeper based discovery 
instead, the scheduler
+needs to be launched via
+
+    -serverset_endpoint_name=https
+
+in order to announce its HTTPS support within ZooKeeper.
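+
+A corresponding reverse proxy definition could then look roughly like the following nginx sketch
+(the bind address, server name, and certificate paths are placeholders; adapt them to your
+environment):
+
+```
+server {
+  # Bind to the host's public address; the scheduler itself listens on 127.0.0.1:8081.
+  listen              192.0.2.10:8081 ssl;
+  server_name         aurora.example.com;
+  ssl_certificate     /etc/nginx/ssl/aurora.example.com.crt;
+  ssl_certificate_key /etc/nginx/ssl/aurora.example.com.key;
+
+  location / {
+    proxy_pass http://127.0.0.1:8081;
+  }
+}
+```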
+
+
+# Known Issues
+
+While the APIs and SPIs we ship with are stable as of 0.8.0, we are aware of 
several incremental
+improvements. Please follow, vote, or send patches.
+
+Relevant tickets:
+* [AURORA-1248](https://issues.apache.org/jira/browse/AURORA-1248): Client 
retries 4xx errors
+* [AURORA-1279](https://issues.apache.org/jira/browse/AURORA-1279): Remove 
kerberos-specific build targets
+* [AURORA-1291](https://issues.apache.org/jira/browse/AURORA-1291): Consider defining a JSON format in place of INI
+* [AURORA-1179](https://issues.apache.org/jira/browse/AURORA-1179): Support hashed passwords in security.ini
+* [AURORA-1295](https://issues.apache.org/jira/browse/AURORA-1295): Support 
security for the ReadOnlyScheduler service

Added: aurora/site/source/documentation/0.22.0/operations/storage.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.22.0/operations/storage.md?rev=1871319&view=auto
==============================================================================
--- aurora/site/source/documentation/0.22.0/operations/storage.md (added)
+++ aurora/site/source/documentation/0.22.0/operations/storage.md Fri Dec 13 
05:37:33 2019
@@ -0,0 +1,96 @@
+# Aurora Scheduler Storage
+
+- [Overview](#overview)
+- [Storage Semantics](#storage-semantics)
+  - [Reads, writes, modifications](#reads-writes-modifications)
+    - [Read lifecycle](#read-lifecycle)
+    - [Write lifecycle](#write-lifecycle)
+  - [Atomicity, consistency and 
isolation](#atomicity-consistency-and-isolation)
+  - [Population on restart](#population-on-restart)
+
+
+## Overview
+
+The Aurora scheduler maintains data that needs to be persisted to survive failovers and restarts.
+For example:
+
+* Task configurations and scheduled task instances
+* Job update configurations and update progress
+* Production resource quotas
+* Mesos resource offer host attributes
+
+Aurora solves its persistence needs by leveraging the
+[Mesos implementation of a Paxos replicated 
log](http://mesos.apache.org/documentation/latest/replicated-log-internals/)
+[[1]](https://ramcloud.stanford.edu/~ongaro/userstudy/paxos.pdf)
+[[2]](http://en.wikipedia.org/wiki/State_machine_replication) with a key-value
+[LevelDB](https://github.com/google/leveldb) storage as the persistence medium.
+
+Conceptually, it can be represented by the following major components:
+
+* Volatile storage: in-memory cache of all available data. Implemented via 
in-memory
+[H2 Database](http://www.h2database.com/html/main.html) and accessed via
+[MyBatis](http://mybatis.github.io/mybatis-3/).
+* Log manager: interface between Aurora storage and Mesos replicated log. The 
default schema format
+is [thrift](https://github.com/apache/thrift). Data is stored in serialized 
binary form.
+* Snapshot manager: all data is periodically persisted in Mesos replicated log 
in a single snapshot.
+This helps establish periodic recovery checkpoints and speeds up volatile storage recovery on
+restart.
+* Backup manager: as a precaution, snapshots are periodically written out into 
backup files.
+This solves a [disaster recovery problem](../backup-restore/)
+in case of a complete loss or corruption of Mesos log files.
+
+![Storage hierarchy](../images/storage_hierarchy.png)
+
+
+## Storage Semantics
+
+This section covers implementation details of the Aurora storage system. Understanding them can
+sometimes be useful when investigating performance issues.
+
+### Reads, writes, modifications
+
+All services in Aurora access data via a set of predefined store interfaces 
(aka stores) logically
+grouped by the type of data they serve. Every interface defines a specific set 
of operations allowed
+on the data thus abstracting out the storage access and the actual persistence 
implementation. The
+latter is especially important in view of a general immutability of persisted 
data. With the Mesos
+replicated log as the underlying persistence solution, data can be read and 
written easily but not
+modified. All modifications are simulated by saving new versions of modified 
objects. This feature
+and general performance considerations justify the existence of the volatile 
in-memory store.
+
+#### Read lifecycle
+
+There are two types of reads available in Aurora: consistent and 
weakly-consistent. The difference
+is explained [below](#atomicity-consistency-and-isolation).
+
+All reads are served from the volatile storage, making reads generally cheap operations
+from a performance standpoint. The majority of the volatile stores are
represented by the
+in-memory H2 database. This allows for rich schema definitions, queries and 
relationships that
+key-value storage is unable to match.
+
+#### Write lifecycle
+
+Writes are more involved operations since, in addition to updating the volatile store, data has to
+be appended to the replicated log. Data is not available for reads until fully acknowledged by both
+the replicated log and volatile storage.
+
+### Atomicity, consistency and isolation
+
+Aurora uses [write-ahead 
logging](http://en.wikipedia.org/wiki/Write-ahead_logging) to ensure
+consistency between replicated and volatile storage. In Aurora, data is first 
written into the
+replicated log and only then updated in the volatile store.
+
+Aurora storage uses read-write locks to serialize data mutations and provide 
consistent view of the
+available data. The available `Storage` interface exposes 3 major types of 
operations:
+* `consistentRead` - access is locked using reader's lock and provides 
consistent view on read
+* `weaklyConsistentRead` - access is lock-less. Delivers best contention 
performance but may result
+in stale reads
+* `write` - access is fully serialized by using writer's lock. Operation 
success requires both
+volatile and replicated writes to succeed.
+
+The consistency of the volatile store is enforced via H2 transactional 
isolation.
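+
+The following toy sketch (Python, not Aurora's actual Java implementation) illustrates the ordering
+guarantees described above: writes take an exclusive lock and append to the log before touching the
+volatile store, consistent reads take the same lock, and weakly-consistent reads skip locking
+entirely:
+
+    import threading
+
+    class ToyStorage(object):
+        def __init__(self, log):
+            self._lock = threading.Lock()    # stand-in for a reader/writer lock
+            self._log = log                  # replicated log analogue
+            self._volatile = {}              # in-memory store analogue
+
+        def write(self, key, value):
+            with self._lock:
+                self._log.append((key, value))   # write-ahead: log first...
+                self._volatile[key] = value      # ...then update the volatile store
+
+        def consistent_read(self, key):
+            with self._lock:                     # serialized against writers
+                return self._volatile.get(key)
+
+        def weakly_consistent_read(self, key):
+            return self._volatile.get(key)       # lock-free, may observe stale data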
+
+### Population on restart
+
+Any time a scheduler restarts, it restores its volatile state from the most 
recent position recorded
+in the replicated log by restoring the snapshot and replaying individual log 
entries on top to fully
+recover the state up to the last write.

