Added: aurora/site/source/documentation/0.21.0/development/committers-guide.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/development/committers-guide.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/development/committers-guide.md 
(added)
+++ aurora/site/source/documentation/0.21.0/development/committers-guide.md Tue 
Sep 11 05:28:10 2018
@@ -0,0 +1,105 @@
+Committer's Guide
+=================
+
+Information for official Apache Aurora committers.
+
+Setting up your email account
+-----------------------------
+Once your Apache ID has been set up, you can configure your account, add ssh keys, and set up an
+email forwarding address at
+
+    http://id.apache.org
+
+Additional instructions for setting up your new committer email can be found at
+
+    http://www.apache.org/dev/user-email.html
+
+The recommended setup is to configure all services (mailing lists, JIRA, 
ReviewBoard) to send
+emails to your @apache.org email address.
+
+
+Creating a gpg key for releases
+-------------------------------
+In order to create a release candidate you will need a gpg key published to an 
external key server
+and that key will need to be added to our KEYS file as well.
+
+1. Create a key:
+
+               gpg --gen-key
+
+2. Add your gpg key to the Apache Aurora KEYS file:
+
+               git clone https://gitbox.apache.org/repos/asf/aurora
+               (gpg --list-sigs <KEY ID> && gpg --armor --export <KEY ID>) >> KEYS
+               git add KEYS && git commit -m "Adding gpg key for <APACHE ID>"
+               ./rbt post -o -g
+
+3. Publish the key to an external key server:
+
+               gpg --keyserver pgp.mit.edu --send-keys <KEY ID>
+
+4. Commit the updated KEYS file to the Apache Aurora svn dist locations listed below:
+
+               https://dist.apache.org/repos/dist/dev/aurora/KEYS
+               https://dist.apache.org/repos/dist/release/aurora/KEYS
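+
+A sketch of one way to do this, assuming a local checkout of each dist repository (shown here for
+the dev location; repeat for the release location):
+
+               svn co https://dist.apache.org/repos/dist/dev/aurora aurora-dist-dev
+               cp KEYS aurora-dist-dev/KEYS
+               svn commit -m "Adding gpg key for <APACHE ID>" aurora-dist-dev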
+
+5. Add your key to git config for use with the release scripts:
+
+               git config --global user.signingkey <KEY ID>
+
+
+Creating a release
+------------------
+The following will guide you through the steps to create a release candidate, 
vote, and finally an
+official Apache Aurora release. Before starting, your gpg key should be in the KEYS file and you
+must have access to commit to the dist.a.o repositories.
+
+1. Ensure that all issues resolved for this release candidate are tagged with the correct Fix
+Version in JIRA; the changelog script will use this to generate the CHANGELOG in step #3.
+To assign the fix version:
+
+    * Look up the [previous release date](https://issues.apache.org/jira/browse/aurora/?selectedTab=com.atlassian.jira.jira-projects-plugin:versions-panel).
+    * Query all issues resolved after that release date: `project = AURORA AND status in (resolved, Closed) and fixVersion is empty and resolutiondate >= "YYYY/MM/DD"`
+    * In the upper right corner of the query result, select Tools > Bulk Edit.
+    * Select all issues > edit issue > set 'Change Fix Version/s' to the 
release version.
+    * Make sure to uncheck 'Send mail for this update' at the bottom.
+
+2. Prepare RELEASE-NOTES.md for the release. This just boils down to removing 
the "(Not yet
+released)" suffix from the impending release.
+
+3. Create a release candidate. This will automatically update the CHANGELOG and commit it, create a
+branch, and update the current version within the trunk. To create a minor version update and
+publish it, run
+
+               ./build-support/release/release-candidate -l m -p
+
+4. Update, if necessary, the draft email created from the `release-candidate` script in step #3 and
+send the [VOTE] email to the dev@ mailing list. You can verify the release signature and checksums
+by running
+
+               ./build-support/release/verify-release-candidate
+
+5. Wait for the vote to complete. If the vote fails, close the vote by replying to the initial [VOTE]
+email sent in step #4, editing the subject to [RESULT][VOTE] ... and noting the failure reason
+(example [here](http://markmail.org/message/d4d6xtvj7vgwi76f)). You'll also need to manually revert
+the commits generated by the release candidate script that incremented the snapshot version and
+updated the changelog. Once that is done, address any issues, go back to step #1, and run again,
+this time using the -r flag to increment the release candidate version. This will automatically
+clean up the release candidate rc0 branch and source distribution.
+
+               ./build-support/release/release-candidate -l m -r 1 -p
+
+6. Once the vote has successfully passed, create the release.
+
+**IMPORTANT: make sure to use the correct release at this final step (e.g.: `-r 1` if the rc1 candidate
+has been voted for). Once the release tag is pushed it will be very hard to undo, because a remote
+git pre-receive hook explicitly forbids release tag manipulations.**
+
+               ./build-support/release/release
+
+7. Update the draft email created from the `release` script in step #6 to include the Apache IDs of
+all binding votes and send the [RESULT][VOTE] email to the dev@ mailing list.
+
+8. Update the [Aurora Website](http://aurora.apache.org/) by following the
+[instructions](https://svn.apache.org/repos/asf/aurora/site/README.md) on the 
ASF Aurora SVN repo.
+Remember to add a blog post under source/blog and regenerate the site before 
committing.

Added: aurora/site/source/documentation/0.21.0/development/db-migration.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/development/db-migration.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/development/db-migration.md (added)
+++ aurora/site/source/documentation/0.21.0/development/db-migration.md Tue Sep 
11 05:28:10 2018
@@ -0,0 +1,34 @@
+DB Migrations
+=============
+
+Changes to the DB schema should be made in the form of migrations. This 
ensures that all changes
+are applied correctly after a DB dump from a previous version is restored.
+
+DB migrations are managed through a system built on top of
+[MyBatis Migrations](http://www.mybatis.org/migrations/). The migrations are run automatically when
+a snapshot is restored; no manual interaction is required by cluster operators.
+
+Upgrades
+--------
+When adding or altering tables or changing data, in addition to making the change in
+[schema.sql](../../src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql),
 a new
+migration class should be created under the 
org.apache.aurora.scheduler.storage.db.migration
+package. The class should implement the 
[MigrationScript](https://github.com/mybatis/migrations/blob/master/src/main/java/org/apache/ibatis/migration/MigrationScript.java)
+interface (see 
[V001_TestMigration](https://github.com/apache/aurora/blob/rel/0.21.0/src/test/java/org/apache/aurora/scheduler/storage/db/testmigration/V001_TestMigration.java)
+as an example). The upgrade and downgrade scripts are defined in this class. 
When restoring a
+snapshot the list of migrations on the classpath is compared to the list of 
applied changes in the
+DB. Any changes that have not yet been applied are executed and their 
downgrade script is stored
+alongside the changelog entry in the database to facilitate downgrades in the event of a rollback.
+
+Downgrades
+----------
+If, while running migrations, a rollback is detected, i.e. a change exists in 
the DB changelog that
+does not exist on the classpath, the downgrade script associated with each 
affected change is
+applied.
+
+Baselines
+---------
+After enough time has passed (at least 1 official release), it should be safe 
to baseline migrations
+if desired. This can be accomplished by ensuring the changes from migrations 
have been applied to
+[schema.sql](../../src/main/resources/org/apache/aurora/scheduler/storage/db/schema.sql)
 and then
+removing the corresponding migration classes and adding a migration to remove 
the changelog entries.
\ No newline at end of file

Added: aurora/site/source/documentation/0.21.0/development/design-documents.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/development/design-documents.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/development/design-documents.md 
(added)
+++ aurora/site/source/documentation/0.21.0/development/design-documents.md Tue 
Sep 11 05:28:10 2018
@@ -0,0 +1,23 @@
+Design Documents
+================
+
+Since its inception as an Apache project, larger feature additions to the
+Aurora code base have been discussed in the form of design documents. Design documents
+are living documents until a consensus has been reached to implement a feature
+in the proposed form.
+
+Current and past documents:
+
+* [Command Hooks for the Aurora Client](../design/command-hooks/)
+* [Dynamic 
Reservations](https://docs.google.com/document/d/19gV8Po6DIHO14tOC7Qouk8RnboY8UCfRTninwn_5-7c/edit)
+* [GPU Resources in 
Aurora](https://docs.google.com/document/d/1J9SIswRMpVKQpnlvJAMAJtKfPP7ZARFknuyXl-2aZ-M/edit)
+* [Health Checks for 
Updates](https://docs.google.com/document/d/1KOO0LC046k75TqQqJ4c0FQcVGbxvrn71E10wAjMorVY/edit)
+* [JobUpdateDiff thrift 
API](https://docs.google.com/document/d/1Fc_YhhV7fc4D9Xv6gJzpfooxbK4YWZcvzw6Bd3qVTL8/edit)
+* [REST API 
RFC](https://docs.google.com/document/d/11_lAsYIRlD5ETRzF2eSd3oa8LXAHYFD8rSetspYXaf4/edit)
+* [Revocable Mesos offers in 
Aurora](https://docs.google.com/document/d/1r1WCHgmPJp5wbrqSZLsgtxPNj3sULfHrSFmxp2GyPTo/edit)
+* [Supporting the Mesos Universal 
Containerizer](https://docs.google.com/document/d/111T09NBF2zjjl7HE95xglsDpRdKoZqhCRM5hHmOfTLA/edit?usp=sharing)
+* [Tier Management In Apache 
Aurora](https://docs.google.com/document/d/1erszT-HsWf1zCIfhbqHlsotHxWUvDyI2xUwNQQQxLgs/edit?usp=sharing)
+* [Ubiquitous 
Jobs](https://docs.google.com/document/d/12hr6GnUZU3mc7xsWRzMi3nQILGB-3vyUxvbG-6YmvdE/edit)
+* [Pluggable 
Scheduling](https://docs.google.com/document/d/1fVHLt9AF-YbOCVCDMQmi5DATVusn-tqY8DldKbjVEm0/edit)
+
+Design documents can be found in the Aurora issue tracker via the query 
[`project = AURORA AND text ~ "docs.google.com" ORDER BY 
created`](https://issues.apache.org/jira/browse/AURORA-1528?jql=project%20%3D%20AURORA%20AND%20text%20~%20%22docs.google.com%22%20ORDER%20BY%20created).

Added: 
aurora/site/source/documentation/0.21.0/development/design/command-hooks.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/development/design/command-hooks.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/development/design/command-hooks.md 
(added)
+++ aurora/site/source/documentation/0.21.0/development/design/command-hooks.md 
Tue Sep 11 05:28:10 2018
@@ -0,0 +1,102 @@
+# Command Hooks for the Aurora Client
+
+## Introduction/Motivation
+
+We've got hooks in the client that surround API calls. These are
+pretty awkward, because they don't correlate with user actions. For
+example, suppose we wanted a policy that said users weren't allowed to
+kill all instances of a production job at once.
+
+Right now, all that we could hook would be the "killJob" api call. But
+kill (at least in newer versions of the client) normally runs in
+batches. If a user called killall, what we would see on the API level
+is a series of "killJob" calls, each of which specified a batch of
+instances. We wouldn't be able to distinguish between really killing
+all instances of a job (which is forbidden under this policy), and
+carefully killing in batches (which is permitted.) In each case, the
+hook would just see a series of API calls, and couldn't find out what
+the actual command being executed was!
+
+For most policy enforcement, what we really want to be able to do is
+look at and vet the commands that a user is performing, not the API
+calls that the client uses to implement those commands.
+
+So I propose that we add a new kind of hook, which surrounds noun/verb
+commands. A hook will register itself to handle a collection of (noun,
+verb) pairs. Whenever any of those noun/verb commands are invoked, the
+hook's methods will be called around the execution of the verb. A
+pre-hook will have the ability to reject a command, preventing the
+verb from being executed.
+
+## Registering Hooks
+
+These hooks will be registered via configuration plugins. A configuration 
plugin
+can register hooks using an API. Hooks registered this way are, effectively,
+hardwired into the client executable.
+
+The order of execution of hooks is unspecified: they may be called in
+any order. There is no way to guarantee that one hook will execute
+before some other hook.
+
+
+### Global Hooks
+
+Hooks registered by the python call are called _global_ hooks,
+because they will run for all configurations, whether or not they
+specify any hooks in the configuration file.
+
+In the implementation, hooks are registered in the module
+`apache.aurora.client.cli.command_hooks`, using the class
+`GlobalCommandHookRegistry`. A global hook can be registered by calling
+`GlobalCommandHookRegistry.register_command_hook` in a configuration plugin.
+
+### The API
+
+    class CommandHook(object):
+      @property
+      def name(self):
+        """Returns a name for the hook."""
+
+      def get_nouns(self):
+        """Return the nouns that have verbs that should invoke this hook."""
+
+      def get_verbs(self, noun):
+        """Return the verbs for a particular noun that should invoke this hook."""
+
+      @abstractmethod
+      def pre_command(self, noun, verb, context, commandline):
+        """Execute a hook before invoking a verb.
+        * noun: the noun being invoked.
+        * verb: the verb being invoked.
+        * context: the context object that will be used to invoke the verb.
+          The options object will be initialized before calling the hook
+        * commandline: the original argv collection used to invoke the client.
+        Returns: True if the command should be allowed to proceed; False if 
the command
+        should be rejected.
+        """
+
+      def post_command(self, noun, verb, context, commandline, result):
+        """Execute a hook after invoking a verb.
+        * noun: the noun being invoked.
+        * verb: the verb being invoked.
+        * context: the context object that will be used to invoke the verb.
+          The options object will be initialized before calling the hook
+        * commandline: the original argv collection used to invoke the client.
+        * result: the result code returned by the verb.
+        Returns: nothing
+        """
+
+    class GlobalCommandHookRegistry(object):
+      @classmethod
+      def register_command_hook(cls, hook):
+        pass
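+
+As an illustration of how a configuration plugin might register a global hook through this API,
+here is a minimal sketch. The `DenyKillAllHook` class and its policy are hypothetical, and the
+import of `CommandHook` from the registry module is an assumption; only
+`GlobalCommandHookRegistry.register_command_hook` is taken from the description above:
+
+    from apache.aurora.client.cli.command_hooks import CommandHook, GlobalCommandHookRegistry
+
+    class DenyKillAllHook(CommandHook):
+      """Hypothetical pre-hook that rejects 'job killall' outright."""
+
+      @property
+      def name(self):
+        return 'deny_killall'
+
+      def get_nouns(self):
+        return ['job']
+
+      def get_verbs(self, noun):
+        return ['killall'] if noun == 'job' else []
+
+      def pre_command(self, noun, verb, context, commandline):
+        # Returning False rejects the command before the verb executes.
+        return False
+
+      def post_command(self, noun, verb, context, commandline, result):
+        pass
+
+    # Called from a configuration plugin so the hook becomes a global hook.
+    GlobalCommandHookRegistry.register_command_hook(DenyKillAllHook())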
+
+### Skipping Hooks
+
+To skip a hook, a user uses a command-line option, `--skip-hooks`. The option 
can either
+specify specific hooks to skip, or "all":
+
+* `aurora --skip-hooks=all job create east/bozo/devel/myjob` will create a job
+  without running any hooks.
+* `aurora --skip-hooks=test,iq job create east/bozo/devel/myjob` will create a job,
+  and will skip only the hooks named "test" and "iq".

Added: aurora/site/source/documentation/0.21.0/development/scheduler.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/development/scheduler.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/development/scheduler.md (added)
+++ aurora/site/source/documentation/0.21.0/development/scheduler.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,118 @@
+Developing the Aurora Scheduler
+===============================
+
+The Aurora scheduler is written in Java and built with [Gradle](http://gradle.org).
+
+
+Prerequisite
+============
+
+When using Apache Aurora checked out from the source repository or the binary
+distribution, the Gradle wrapper and JavaScript dependencies are provided.
+However, you need to manually install them when using the source release
+downloads:
+
+1. Install Gradle following the instructions on the [Gradle web 
site](http://gradle.org)
+2. From the root directory of the Apache Aurora project generate the Gradle
+wrapper by running:
+
+    gradle wrapper
+
+
+Getting Started
+===============
+
+You will need Java 8 installed and on your `PATH` or unzipped somewhere with 
`JAVA_HOME` set. Then
+
+    ./gradlew tasks
+
+will bootstrap the build system and show available tasks. This can take a 
while the first time you
+run it but subsequent runs will be much faster due to cached artifacts.
+
+Running the Tests
+-----------------
+Aurora has a comprehensive unit test suite. To run the tests use
+
+    ./gradlew build
+
+Gradle will only re-run tests when their dependencies have changed. To force a re-run of all
+tests use
+
+    ./gradlew clean build
+
+Running the build with code quality checks
+------------------------------------------
+To speed up development iteration, the plain gradle commands will not run 
static analysis tools.
+However, you should run them before posting a review diff, and **always** before pushing a
+commit to origin/master.
+
+    ./gradlew build -Pq
+
+Running integration tests
+-------------------------
+To run the same tests that are run in the Apache Aurora continuous integration
+environment:
+
+    ./build-support/jenkins/build.sh
+
+In addition, there is an end-to-end test that runs a suite of aurora commands
+using a virtual cluster:
+
+    ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh
+
+Creating a bundle for deployment
+--------------------------------
+Gradle can create a zip file containing Aurora, all of its dependencies, and a 
launch script with
+
+    ./gradlew distZip
+
+or a tar file containing the same files with
+
+    ./gradlew distTar
+
+The output file will be written to `dist/distributions/aurora-scheduler.zip` or
+`dist/distributions/aurora-scheduler.tar`.
+
+
+
+Developing Aurora Java code
+===========================
+
+Setting up an IDE
+-----------------
+Gradle can generate project files for your IDE. To generate an IntelliJ IDEA 
project run
+
+    ./gradlew idea
+
+and import the generated `aurora.ipr` file.
+
+Adding or Upgrading a Dependency
+--------------------------------
+New dependencies can be added from Maven central by adding a `compile` 
dependency to `build.gradle`.
+For example, to add a dependency on `com.example`'s `example-lib` 1.0 add this 
block:
+
+    compile 'com.example:example-lib:1.0'
+
+NOTE: Anyone thinking about adding a new dependency should first familiarize 
themselves with the
+Apache Foundation's third-party licensing
+[policy](http://www.apache.org/legal/resolved.html#category-x).
+
+
+
+Developing the Aurora Build System
+==================================
+
+Bootstrapping Gradle
+--------------------
+The following files were autogenerated by `gradle wrapper` using gradle's
+[Wrapper](http://www.gradle.org/docs/current/dsl/org.gradle.api.tasks.wrapper.Wrapper.html)
 plugin and
+should not be modified directly:
+
+    ./gradlew
+    ./gradlew.bat
+    ./gradle/wrapper/gradle-wrapper.jar
+    ./gradle/wrapper/gradle-wrapper.properties
+
+To upgrade Gradle, unpack the new version somewhere, run `/path/to/new/gradle wrapper` in the
+repository root, and commit the changed files.
+

Added: aurora/site/source/documentation/0.21.0/development/thermos.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/development/thermos.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/development/thermos.md (added)
+++ aurora/site/source/documentation/0.21.0/development/thermos.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,126 @@
+The Python components of Aurora are built using 
[Pants](https://pantsbuild.github.io).
+
+
+Python Build Conventions
+========================
+The Python code is laid out according to the following conventions:
+
+1. 1 `BUILD` per 3rd level directory. For a list of current top-level packages 
run:
+
+        % find src/main/python -maxdepth 3 -mindepth 3 -type d |\
+        while read dname; do echo $dname |\
+            sed 's@src/main/python/\(.*\)/\(.*\)/\(.*\).*@\1.\2.\3@'; done
+
+2.  Each `BUILD` file exports 1
+    
[`python_library`](https://pantsbuild.github.io/build_dictionary.html#bdict_python_library)
+    that provides a
+    [`setup_py`](https://pantsbuild.github.io/build_dictionary.html#setup_py)
+    containing each
+    
[`python_binary`](https://pantsbuild.github.io/build_dictionary.html#python_binary)
+    in the `BUILD` file, named the same as the directory it's in so that it 
can be referenced
+    without a ':' character. The `sources` field in the `python_library` will 
almost always be
+    `rglobs('*.py')`.
+
+3.  Other BUILD files may only depend on this single public `python_library`
+    target. Any other target is considered a private implementation detail and
+    should be prefixed with an `_`.
+
+4.  `python_binary` targets are always named the same as the exported console 
script.
+
+5.  `python_binary` targets must have identical `dependencies` to the 
`python_library` exported
+    by the package and must use `entry_point`.
+
+    This means a PEX file generated by pants will contain exactly the same files that will be
+    available on the `PYTHONPATH` in the case of `pip install` of the 
corresponding library
+    target. This will help our migration off of Pants in the future.
+
+Annotated example - apache.thermos.runner
+-----------------------------------------
+
+    % find src/main/python/apache/thermos/runner
+    src/main/python/apache/thermos/runner
+    src/main/python/apache/thermos/runner/__init__.py
+    src/main/python/apache/thermos/runner/thermos_runner.py
+    src/main/python/apache/thermos/runner/BUILD
+    % cat src/main/python/apache/thermos/runner/BUILD
+    # License boilerplate omitted
+    import os
+
+
+    # Private target so that a setup_py can exist without a circular 
dependency. Only targets within
+    # this file should depend on this.
+    python_library(
+      name = '_runner',
+      # The target covers every python file under this directory and 
subdirectories.
+      sources = rglobs('*.py'),
+      dependencies = [
+        '3rdparty/python:twitter.common.app',
+        '3rdparty/python:twitter.common.log',
+        # Source dependencies are always referenced without a ':'.
+        'src/main/python/apache/thermos/common',
+        'src/main/python/apache/thermos/config',
+        'src/main/python/apache/thermos/core',
+      ],
+    )
+
+    # Binary target for thermos_runner.pex. Nothing should depend on this - 
it's only used as an
+    # argument to ./pants binary.
+    python_binary(
+      name = 'thermos_runner',
+      # Use entry_point, not source so the files used here are the same ones 
tests see.
+      entry_point = 'apache.thermos.bin.thermos_runner',
+      dependencies = [
+        # Notice that we depend only on the single private target from this 
BUILD file here.
+        ':_runner',
+      ],
+    )
+
+    # The public library that everyone importing the runner symbols uses.
+    # The test targets and any other dependent source code should depend on 
this.
+    python_library(
+      name = 'runner',
+      dependencies = [
+        # Again, notice that we depend only on the single private target from 
this BUILD file here.
+        ':_runner',
+      ],
+      # We always provide a setup_py. This will cause any dependee libraries 
to automatically
+      # reference this library in their requirements.txt rather than copy the 
source files into their
+      # sdist.
+      provides = setup_py(
+        # Conventionally named and versioned.
+        name = 'apache.thermos.runner',
+        version = open(os.path.join(get_buildroot(), 
'.auroraversion')).read().strip().upper(),
+      ).with_binaries({
+        # Every binary in this file should also be repeated here.
+        # Always use the dict-form of .with_binaries so that commands with 
dashes in their names are
+        # supported.
+        # The console script name is always the same as the PEX with .pex 
stripped.
+        'thermos_runner': ':thermos_runner',
+      }),
+    )
+
+
+
+Thermos Test resources
+======================
+
+The Aurora source repository and distributions contain several
+[binary files](../../src/test/resources/org/apache/thermos/root/checkpoints) to
+qualify the backwards-compatibility of thermos with checkpoint data. Since
+thermos persists state to disk (to be read by the thermos observer), it is important that we have
+tests that prevent regressions affecting the ability to parse 
previously-written data.
+
+The files included represent persisted checkpoints that exercise different
+features of thermos. The existing files should not be modified unless
+we are accepting backwards incompatibility, such as with a major release.
+
+It is not practical to write source code to generate these files on the fly,
+as source would be vulnerable to drift (e.g. due to refactoring) in ways
+that would undermine the goal of ensuring backwards compatibility.
+
+The most common reason to add a new checkpoint file would be to provide
+coverage for new thermos features that alter the data format. This is
+accomplished by writing and running a
+[job configuration](../../reference/configuration/) that exercises the 
feature, and
+copying the checkpoint file from the sandbox directory; by default this is
+`/var/run/thermos/checkpoints/<aurora task id>`.

Added: aurora/site/source/documentation/0.21.0/development/thrift.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/development/thrift.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/development/thrift.md (added)
+++ aurora/site/source/documentation/0.21.0/development/thrift.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,54 @@
+Thrift
+======
+
+Aurora uses [Apache Thrift](https://thrift.apache.org/) for representing structured data in the
+client/server RPC protocol as well as for internal data storage. While Thrift is capable of
+correctly handling additions and renames of existing members, field removals must be done
+carefully to ensure backwards compatibility and provide a predictable deprecation cycle. This
+document describes general guidelines for making Thrift schema changes to existing fields in
+[api.thrift](https://github.com/apache/aurora/blob/rel/0.21.0/api/src/main/thrift/org/apache/aurora/gen/api.thrift).
+
+It is highly recommended to go through the
+[Thrift: The Missing 
Guide](http://diwakergupta.github.io/thrift-missing-guide/) first to refresh on
+basic Thrift schema concepts.
+
+Checklist
+---------
+Every existing Thrift schema modification is unique in its requirements and 
must be analyzed
+carefully to identify its scope and expected consequences. The following 
checklist may help in that
+analysis:
+* Is this a new field/struct? If yes, go ahead
+* Is this a pure field/struct rename without any type/structure change? If 
yes, go ahead and rename
+* Anything else, read further to make sure your change is properly planned
+
+Deprecation cycle
+-----------------
+Any time a breaking change (e.g.: field replacement or removal) is required, 
the following cycle
+must be followed:
+
+### vCurrent
+The change is applied in a way that does not prevent a scheduler/client at this version from
+communicating with a scheduler/client from vCurrent-1.
+* Do not remove or rename the old field
+* Add a new field as an eventual replacement of the old one and implement a 
dual read/write
+anywhere the old field is used. If a thrift struct is mapped in the DB store 
make sure both columns
+are marked as `NOT NULL`
+* Check 
[storage.thrift](https://github.com/apache/aurora/blob/rel/0.21.0/api/src/main/thrift/org/apache/aurora/gen/storage.thrift)
 to see if
+the affected struct is stored in Aurora scheduler storage. If so, it's almost 
certainly also
+necessary to perform a [DB migration](../db-migration/).
+* Add a deprecation jira ticket into the vCurrent+1 release candidate
+* Add a TODO for the deprecated field mentioning the jira ticket
+
+### vCurrent+1
+Finalize the change by removing the deprecated fields from the Thrift schema.
+* Drop any dual read/write routines added in the previous version
+* Remove thrift backfilling in scheduler
+* Remove the deprecated Thrift field
+
+Testing
+-------
+It's always advisable to test your changes in the local vagrant environment to 
build more
+confidence that your change is backwards compatible. It's easy to simulate different
+client/scheduler versions by playing with `aurorabuild` command. See [this 
document](../../getting-started/vagrant/)
+for more.
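+
+For example (a sketch; the exact `aurorabuild` invocation is described in the document linked
+above, so treat the arguments below as an assumption):
+
+    vagrant ssh
+    aurorabuild all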
+

Added: aurora/site/source/documentation/0.21.0/development/ui.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/development/ui.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/development/ui.md (added)
+++ aurora/site/source/documentation/0.21.0/development/ui.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,46 @@
+Developing the Aurora Scheduler UI
+==================================
+
+Installing bower (optional)
+----------------------------
+Third party JS libraries used in Aurora (located at 
3rdparty/javascript/bower_components) are
+managed by bower, a JS dependency manager. Bower is only required if you plan 
to add, remove or
+update JS libraries. Bower can be installed using the following command:
+
+    npm install -g bower
+
+Bower depends on node.js and npm. The easiest way to install node on a mac is 
via brew:
+
+    brew install node
+
+For more node.js installation options refer to 
https://github.com/joyent/node/wiki/Installation.
+
+More info on installing and using bower can be found at: http://bower.io/. 
Once installed, you can
+use the following commands to view and modify the bower repo at
+3rdparty/javascript/bower_components
+
+    bower list
+    bower install <library name>
+    bower remove <library name>
+    bower update <library name>
+    bower help
+
+
+Faster Iteration in Vagrant
+---------------------------
+The scheduler serves UI assets from the classpath. For production deployments 
this means the assets
+are served from within a jar. However, for faster development iteration, the 
vagrant image is
+configured to add the `scheduler` subtree of `/vagrant/dist/resources/main` to 
the head of
+`CLASSPATH`. This path is configured as a shared filesystem to the path on the 
host system where
+your Aurora repository lives. This means that any updates under 
`dist/resources/main/scheduler` in
+your checkout will be reflected immediately in the UI served from within the 
vagrant image.
+
+The one caveat to this is that this path is under `dist` not `src`. This is 
because the assets must
+be processed by gradle before they can be served. So, unfortunately, you 
cannot just save your local
+changes and see them reflected in the UI; you must first run `./gradlew processResources`. This is
+less than ideal, but better than having to restart the scheduler after every 
change. Additionally,
+gradle makes this process somewhat easier with the use of the `--continuous` 
flag. If you run:
+`./gradlew processResources --continuous` gradle will monitor the filesystem 
for changes and run the
+task automatically as necessary. This doesn't quite provide hot-reload 
capabilities, but it does
+allow for <5s from save to changes being visible in the UI with no further
action required on the
+part of the developer.

Added: aurora/site/source/documentation/0.21.0/features/constraints.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/constraints.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/constraints.md (added)
+++ aurora/site/source/documentation/0.21.0/features/constraints.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,126 @@
+Scheduling Constraints
+======================
+
+By default, Aurora will pick any random agent with sufficient resources
+in order to schedule a task. This scheduling choice can be further
+restricted with the help of constraints.
+
+
+Mesos Attributes
+----------------
+
+Data centers are often organized with hierarchical failure domains.  Common 
failure domains
+include hosts, racks, rows, and PDUs.  If you have this information available, 
it is wise to tag
+the Mesos agent with them as
+[attributes](https://mesos.apache.org/documentation/attributes-resources/).
+
+The Mesos agent `--attributes` command line argument can be used to mark 
agents with
+static key/value pairs, so-called attributes (not to be confused with 
`--resources`, which are
+dynamic and accounted).
+
+For example, consider the host `cluster1-aaa-03-sr2` and its following 
attributes (given in
+key:value format): `host:cluster1-aaa-03-sr2` and `rack:aaa`.
+
+Aurora makes these attributes available for matching with scheduling 
constraints.
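+
+For instance, the attributes above could be declared on the agent roughly as follows (a sketch;
+exact flag placement depends on how your agents are launched):
+
+    mesos-slave --attributes="host:cluster1-aaa-03-sr2;rack:aaa" ...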
+
+
+Limit Constraints
+-----------------
+
+Limit constraints allow you to control machine diversity using constraints. The 
below
+constraint ensures that no more than two instances of your job may run on a 
single host.
+Think of this as a "group by" limit.
+
+    Service(
+      name = 'webservice',
+      role = 'www-data',
+      constraints = {
+        'host': 'limit:2',
+      }
+      ...
+    )
+
+
+Likewise, you can use constraints to control rack diversity, e.g. at
+most one task per rack:
+
+    constraints = {
+      'rack': 'limit:1',
+    }
+
+Use these constraints sparingly as they can dramatically reduce Tasks' 
schedulability.
+Further details are available in the reference documentation on
+[Scheduling 
Constraints](../../reference/configuration/#specifying-scheduling-constraints).
+
+
+
+Value Constraints
+-----------------
+
+Value constraints can be used to express that a certain attribute with a 
certain value
+should be present on a Mesos agent. For example, the following job would only 
be
+scheduled on nodes that claim to have an `SSD` as their disk.
+
+    Service(
+      name = 'webservice',
+      role = 'www-data',
+      constraints = {
+        'disk': 'SSD',
+      }
+      ...
+    )
+
+
+Further details are available in the reference documentation on
+[Scheduling 
Constraints](../../reference/configuration/#specifying-scheduling-constraints).
+
+
+Running stateful services
+-------------------------
+
+Aurora is best suited to run stateless applications, but it also accommodates stateful services
+like databases, or services that otherwise need to always run on the same 
machines.
+
+### Dedicated attribute
+
+Most of the Mesos attributes are arbitrary and available for custom use.  There is one exception,
+though: the `dedicated` attribute.  Aurora treats this specially: only matching jobs are allowed to
+run on these machines, and matching jobs will only be scheduled on these machines.
+
+
+#### Syntax
+The dedicated attribute has semantic meaning. The format is `$role(/.*)?`. 
When a job is created,
+the scheduler requires that the `$role` component matches the `role` field in 
the job
+configuration, and will reject the job creation otherwise.  The remainder of 
the attribute is
+free-form. We've developed the idiom of formatting this attribute as 
`$role/$job`, but do not
+enforce this. For example: a job `devcluster/www-data/prod/hello` with a 
dedicated constraint set as
+`www-data/web.multi` will have its tasks scheduled only on Mesos agents 
configured with:
+`--attributes=dedicated:www-data/web.multi`.
+
+A wildcard (`*`) may be used for the role portion of the dedicated attribute, 
which will allow any
+owner to elect for a job to run on the host(s). For example: tasks from both
+`devcluster/www-data/prod/hello` and `devcluster/vagrant/test/hello` with a 
dedicated constraint
+formatted as `*/web.multi` will be scheduled only on Mesos agents configured 
with
+`--attributes=dedicated:*/web.multi`. This may be useful when assembling a 
virtual cluster of
+machines sharing the same set of traits or requirements.
+
+##### Example
+Consider the following agent command line:
+
+    mesos-slave --attributes="dedicated:db_team/redis" ...
+
+And this job configuration:
+
+    Service(
+      name = 'redis',
+      role = 'db_team',
+      constraints = {
+        'dedicated': 'db_team/redis'
+      }
+      ...
+    )
+
+The job configuration is indicating that it should only be scheduled on agents 
with the attribute
+`dedicated:db_team/redis`.  Additionally, Aurora will prevent any tasks that 
do _not_ have that
+constraint from running on those agents.
+

Added: aurora/site/source/documentation/0.21.0/features/containers.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/containers.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/containers.md (added)
+++ aurora/site/source/documentation/0.21.0/features/containers.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,130 @@
+Containers
+==========
+
+Aurora supports several containerizers, notably the Mesos containerizer and 
the Docker
+containerizer. The Mesos containerizer uses native OS features directly to 
provide isolation between
+containers, while the Docker containerizer delegates container management to 
the Docker engine.
+
+The support for launching container images via both containerizers has to be
+[enabled by a cluster operator](../../operations/configuration/#containers).
+
+Mesos Containerizer
+-------------------
+
+The Mesos containerizer is the native Mesos containerizer solution. It allows 
tasks to be
+run with an array of [pluggable isolators](../resource-isolation/) and can 
launch tasks using
+[Docker](https://github.com/docker/docker/blob/master/image/spec/v1.md) images,
+[AppC](https://github.com/appc/spec/blob/master/SPEC.md) images, or directly 
on the agent host
+filesystem.
+
+The following example (available in our [Vagrant 
environment](../../getting-started/vagrant/))
+launches a hello world example within a `debian/jessie` Docker image:
+
+    $ cat /vagrant/examples/jobs/hello_docker_image.aurora
+    hello_loop = Process(
+      name = 'hello',
+      cmdline = """
+        while true; do
+          echo hello world
+          sleep 10
+        done
+      """)
+
+    task = Task(
+      processes = [hello_loop],
+      resources = Resources(cpu=1, ram=1*MB, disk=8*MB)
+    )
+
+    jobs = [
+      Service(
+        cluster = 'devcluster',
+        environment = 'devel',
+        role = 'www-data',
+        name = 'hello_docker_image',
+        task = task,
+        container = Mesos(image=DockerImage(name='debian', tag='jessie'))
+      )
+    ]
+
+Docker and Appc images are designated using an appropriate `image` property of 
the `Mesos`
+configuration object. If either `container` or `image` is left unspecified, 
the host filesystem
+will be used. Further details of how to specify images can be found in the
+[Reference Documentation](../../reference/configuration/#mesos-object).
+
+By default, Aurora launches processes as the Linux user with the same name as the job's role (e.g.
+`www-data` in the example above). This user has to exist on the host filesystem. If it does not
+exist within the container image, it will be created automatically. Otherwise, this user and its
+primary group have to exist in the image with matching uid/gid.
+
+For more information on the Mesos containerizer filesystem, namespace, and 
isolator features, visit
+[Mesos 
Containerizer](http://mesos.apache.org/documentation/latest/mesos-containerizer/)
 and
+[Mesos Container 
Images](http://mesos.apache.org/documentation/latest/container-image/).
+
+
+Docker Containerizer
+--------------------
+
+The Docker containerizer launches container images using the Docker engine. It 
may often provide
+more advanced features than the native Mesos containerizer, but has to be installed separately
+on each agent host, in addition to Mesos.
+
+Starting with the 0.17.0 release, `image` can be specified with a 
`{{docker.image[name][tag]}}` binder so that
+the tag can be resolved to a concrete image digest. This ensures that the job 
always uses the same image
+across restarts, even if the version identified by the tag has been updated, 
guaranteeing that only job
+updates can mutate configuration.
+
+Example (available in the [Vagrant 
environment](../../getting-started/vagrant/)):
+
+    $ cat /vagrant/examples/jobs/hello_docker_engine.aurora
+    hello_loop = Process(
+      name = 'hello',
+      cmdline = """
+        while true; do
+          echo hello world
+          sleep 10
+        done
+      """)
+
+    task = Task(
+      processes = [hello_loop],
+      resources = Resources(cpu=1, ram=1*MB, disk=8*MB)
+    )
+
+    jobs = [
+      Service(
+        cluster = 'devcluster',
+        environment = 'devel',
+        role = 'www-data',
+        name = 'hello_docker',
+        task = task,
+        container = Docker(image = 'python:2.7')
+      ), Service(
+        cluster = 'devcluster',
+        environment = 'devel',
+        role = 'www-data',
+        name = 'hello_docker_engine_binding',
+        task = task,
+        container = Docker(image = '{{docker.image[library/python][2.7]}}')
+      )
+    ]
+
+Note that this feature requires a v2 Docker registry. If using a private Docker registry, its URL
+must be specified in the `clusters.json` configuration file under the key `docker_registry`.
+If not specified, `docker_registry` defaults to `https://registry-1.docker.io` (Docker Hub).
+
+Example:
+
+    # clusters.json
+    [{
+      "name": "devcluster",
+      ...
+      "docker_registry": "https://registry.example.com"
+    }]
+
+Details of how to use Docker via the Docker engine can be found in the
+[Reference Documentation](../../reference/configuration/#docker-object). 
Please note that in order to
+correctly execute processes inside a job, the Docker container must have 
Python 2.7 and potentitally
+further Mesos dependencies installed. This limitation does not hold for Docker 
containers used via
+the Mesos containerizer.
+
+For more information on launching Docker containers through the Docker 
containerizer, visit
+[Docker 
Containerizer](http://mesos.apache.org/documentation/latest/docker-containerizer/)

Added: aurora/site/source/documentation/0.21.0/features/cron-jobs.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/cron-jobs.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/cron-jobs.md (added)
+++ aurora/site/source/documentation/0.21.0/features/cron-jobs.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,124 @@
+# Cron Jobs
+
+Aurora supports execution of scheduled jobs on a Mesos cluster using 
cron-style syntax.
+
+- [Overview](#overview)
+- [Collision Policies](#collision-policies)
+- [Failure recovery](#failure-recovery)
+- [Interacting with cron jobs via the Aurora 
CLI](#interacting-with-cron-jobs-via-the-aurora-cli)
+       - [cron schedule](#cron-schedule)
+       - [cron deschedule](#cron-deschedule)
+       - [cron start](#cron-start)
+       - [job killall, job restart, job 
kill](#job-killall-job-restart-job-kill)
+- [Technical Note About Syntax](#technical-note-about-syntax)
+- [Caveats](#caveats)
+       - [Failovers](#failovers)
+       - [Collision policy is best-effort](#collision-policy-is-best-effort)
+       - [Timezone Configuration](#timezone-configuration)
+
+## Overview
+
+A job is identified as a cron job by the presence of a
+`cron_schedule` attribute containing a cron-style schedule in the
+[`Job`](../../reference/configuration/#job-objects) object. Examples of cron 
schedules
+include "every 5 minutes" (`*/5 * * * *`), "Fridays at 17:00" (`* 17 * * 
FRI`), and
+"the 1st and 15th day of the month at 03:00" (`0 3 1,15 *`).
+
+Example (available in the [Vagrant 
environment](../../getting-started/vagrant/)):
+
+    $ cat /vagrant/examples/jobs/cron_hello_world.aurora
+    # A cron job that runs every 5 minutes.
+    jobs = [
+      Job(
+        cluster = 'devcluster',
+        role = 'www-data',
+        environment = 'test',
+        name = 'cron_hello_world',
+        cron_schedule = '*/5 * * * *',
+        task = SimpleTask(
+          'cron_hello_world',
+          'echo "Hello world from cron, the time is now $(date --rfc-822)"'),
+      ),
+    ]
+
+## Collision Policies
+
+The `cron_collision_policy` field specifies the scheduler's behavior when a 
new cron job is
+triggered while an older run hasn't finished. The scheduler has two policies 
available:
+
+* `KILL_EXISTING`: The default policy - on a collision the old instances are killed and instances
+with the current configuration are started.
+* `CANCEL_NEW`: On a collision the new run is cancelled.
+
+Note that the use of `CANCEL_NEW` is likely a code smell - interrupted cron 
jobs should be able
+to recover their progress on a subsequent invocation; otherwise they risk 
having their work queue
+grow faster than they can process it.
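+
+For example, a minimal sketch of selecting a policy in the job configuration, reusing the cron job
+shown above (only `cron_collision_policy` is the relevant addition):
+
+    jobs = [
+      Job(
+        cluster = 'devcluster',
+        role = 'www-data',
+        environment = 'test',
+        name = 'cron_hello_world',
+        cron_schedule = '*/5 * * * *',
+        # Cancel a newly triggered run if the previous run is still active.
+        cron_collision_policy = 'CANCEL_NEW',
+        task = SimpleTask(
+          'cron_hello_world',
+          'echo "Hello world from cron, the time is now $(date --rfc-822)"'),
+      ),
+    ]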
+
+## Failure recovery
+
+Unlike with services, which Aurora will always re-execute regardless of exit 
status, instances of
+cron jobs retry according to the `max_task_failures` attribute of the
+[Task](../../reference/configuration/#task-object) object. To get 
"run-until-success" semantics,
+set `max_task_failures` to `-1`.
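+
+For instance, a minimal sketch (the process and resource values below are illustrative; only
+`max_task_failures` is the setting of interest):
+
+    hello = Process(name = 'hello', cmdline = 'echo "Hello world from cron"')
+
+    task = Task(
+      processes = [hello],
+      resources = Resources(cpu = 0.1, ram = 16*MB, disk = 16*MB),
+      # Retry failed runs indefinitely, i.e. "run-until-success".
+      max_task_failures = -1,
+    )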
+
+## Interacting with cron jobs via the Aurora CLI
+
+Most interaction with cron jobs takes place using the `cron` subcommand. See 
`aurora cron -h`
+for up-to-date usage instructions.
+
+### cron schedule
+Schedules a new cron job on the Aurora cluster for later runs or replaces the 
existing cron template
+with a new one. Only future runs will be affected; any existing active tasks are left intact.
+
+    $ aurora cron schedule devcluster/www-data/test/cron_hello_world 
/vagrant/examples/jobs/cron_hello_world.aurora
+
+### cron deschedule
+Deschedules a cron job, preventing future runs but allowing current runs to 
complete.
+
+    $ aurora cron deschedule devcluster/www-data/test/cron_hello_world
+
+### cron start
+Start a cron job immediately, outside of its normal cron schedule.
+
+    $ aurora cron start devcluster/www-data/test/cron_hello_world
+
+### job killall, job restart, job kill
+Cron jobs create instances running on the cluster that you can interact with 
like normal Aurora
+tasks with `job kill` and `job restart`.
+
+
+## Technical Note About Syntax
+
+`cron_schedule` uses a restricted subset of BSD crontab syntax. While the
+execution engine currently uses Quartz, the schedule parsing is custom, a 
subset of FreeBSD
+[crontab(5)](http://www.freebsd.org/cgi/man.cgi?crontab(5)) syntax. See
+[the 
source](https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/cron/CrontabEntry.java#L106-L124)
+for details.
+
+
+## Caveats
+
+### Failovers
+No failover recovery. Aurora does not record the latest minute it fired
+triggers for across failovers. Therefore it's possible to miss triggers
+on failover. Note that this behavior may change in the future.
+
+It's necessary to sync time between schedulers with something like `ntpd`.
+Clock skew could cause double or missed triggers in the case of a failover.
+
+### Collision policy is best-effort
+Aurora aims to always have *at least one copy* of a given instance running at 
a time - it's
+an AP system, meaning it chooses Availability and Partition Tolerance at the 
expense of
+Consistency.
+
+If your collision policy was `CANCEL_NEW` and a task has terminated but
+Aurora has not noticed this, Aurora will go ahead and create your new
+task.
+
+If your collision policy was `KILL_EXISTING` and a task was marked `LOST`
+but not yet GCed, Aurora will go ahead and create your new task without
+attempting to kill the old one (outside the GC interval).
+
+### Timezone Configuration
+Cron timezone is configured independently of JVM timezone with the 
`-cron_timezone` flag and
+defaults to UTC.

Added: aurora/site/source/documentation/0.21.0/features/custom-executors.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/custom-executors.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/custom-executors.md (added)
+++ aurora/site/source/documentation/0.21.0/features/custom-executors.md Tue 
Sep 11 05:28:10 2018
@@ -0,0 +1,166 @@
+Custom Executors
+================
+
+If the need arises to use a Mesos executor other than the Thermos executor, 
the scheduler can be
+configured to utilize a custom executor by specifying the 
`-custom_executor_config` flag.
+The flag must be set to the path of a valid executor configuration file.
+
+The configuration file must be a valid **JSON array** and contain, at minimum,
+one executor configuration including the name, command, and resources fields, and
+must be pointed to by the `-custom_executor_config` flag when the scheduler is
+started.
+
+### Array Entry
+
+Property                 | Description
+-----------------------  | ---------------------------------
+executor (required)      | Description of executor.
+task_prefix (required)   | Prefix given to tasks launched with this executor's 
configuration.
+volume_mounts (optional) | Volumes to be mounted in container running executor.
+
+#### executor
+
+Property                 | Description
+-----------------------  | ---------------------------------
+name (required)          | Name of the executor.
+command (required)       | How to run the executor.
+resources (required)     | Overhead to use for each executor instance.
+
+#### command
+
+Property                 | Description
+-----------------------  | ---------------------------------
+value (required)         | The command to execute.
+arguments (optional)     | A list of arguments to pass to the command.
+uris (optional)          | List of resources to download into the task sandbox.
+shell (optional)         | Run executor via shell.
+
+A note on the command property (from 
[mesos.proto](https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto)):
+
+```
+1) If 'shell == true', the command will be launched via shell
+   (i.e., /bin/sh -c 'value'). The 'value' specified will be
+   treated as the shell command. The 'arguments' will be ignored.
+2) If 'shell == false', the command will be launched by passing
+   arguments to an executable. The 'value' specified will be
+   treated as the filename of the executable. The 'arguments'
+   will be treated as the arguments to the executable. This is
+   similar to how POSIX exec families launch processes (i.e.,
+   execlp(value, arguments(0), arguments(1), ...)).
+```
+
+##### uris (list)
+* Follows the [Mesos Fetcher 
schema](http://mesos.apache.org/documentation/latest/fetcher/)
+
+Property                 | Description
+-----------------------  | ---------------------------------
+value (required)         | Path to the resource needed in the sandbox.
+executable (optional)    | Change resource to be executable via chmod.
+extract (optional)       | Extract files from packed or compressed archives 
into the sandbox.
+cache (optional)         | Use caching mechanism provided by Mesos for 
resources.
+
+#### resources (list)
+
+Property             | Description
+-------------------  | ---------------------------------
+name (required)      | Name of the resource: cpus or mem.
+type (required)      | Type of resource. Should always be SCALAR.
+scalar (required)    | Value in float for cpus or int for mem (in MBs)
+
+### volume_mounts (list)
+
+Property                     | Description
+---------------------------  | ---------------------------------
+host_path (required)         | Host path to mount inside the container.
+container_path (required)    | Path inside the container where `host_path` 
will be mounted.
+mode (required)              | Mode in which to mount the volume, Read-Write 
(RW) or Read-Only (RO).
+
+A sample configuration is as follows:
+
+```json
+[
+    {
+      "executor": {
+        "name": "myExecutor",
+        "command": {
+          "value": "myExecutor.a",
+          "shell": "false",
+          "arguments": [
+            "localhost:2181",
+            "-verbose",
+            "-config myConfiguration.config"
+          ],
+          "uris": [
+            {
+              "value": "/dist/myExecutor.a",
+              "executable": true,
+              "extract": false,
+              "cache": true
+            },
+            {
+              "value": "/home/user/myConfiguration.config",
+              "executable": false,
+              "extract": false,
+              "cache": false
+            }
+          ]
+        },
+        "resources": [
+          {
+            "name": "cpus",
+            "type": "SCALAR",
+            "scalar": {
+              "value": 1.00
+            }
+          },
+          {
+            "name": "mem",
+            "type": "SCALAR",
+            "scalar": {
+              "value": 512
+            }
+          }
+        ]
+      },
+      "volume_mounts": [
+        {
+          "mode": "RO",
+          "container_path": "/path/on/container",
+          "host_path": "/path/to/host/directory"
+        },
+        {
+          "mode": "RW",
+          "container_path": "/container",
+          "host_path": "/host"
+        }
+      ],
+      "task_prefix": "my-executor-"
+    }
+]
+```
+
+It should be noted that if you do not use Thermos or a Thermos based executor, 
links in the scheduler's
+Web UI for tasks will not work (at least for the time being).
+Some information about launched tasks can still be accessed via the Mesos Web 
UI or via the Aurora Client.
+
+### Using a custom executor
+
+To launch tasks using a custom executor,
+an [ExecutorConfig](../../reference/configuration/#executorconfig-objects) 
object must be added to
+the Job or Service object. The `name` parameter of ExecutorConfig must match 
the name of an executor
+defined in the JSON object provided to the scheduler at startup time.
+
+For example, if we desire to launch tasks using `myExecutor` (defined above), 
we may do so in
+the following manner:
+
+```
+jobs = [Service(
+  task = task,
+  cluster = 'devcluster',
+  role = 'www-data',
+  environment = 'prod',
+  name = 'hello',
+  executor_config = ExecutorConfig(name='myExecutor'))]
+```
+
+This will create a Service Job which will launch tasks using myExecutor 
instead of Thermos.

Added: aurora/site/source/documentation/0.21.0/features/job-updates.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/job-updates.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/job-updates.md (added)
+++ aurora/site/source/documentation/0.21.0/features/job-updates.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,123 @@
+Aurora Job Updates
+==================
+
+`Job` configurations can be updated at any point in their lifecycle.
+Usually updates are done incrementally using a process called a *rolling
+upgrade*, in which Tasks are upgraded in small groups, one group at a
+time.  Updates are done using various Aurora Client commands.
+
+
+Rolling Job Updates
+-------------------
+
+There are several sub-commands to manage job updates:
+
+    aurora update start <job key> <configuration file>
+    aurora update info <job key>
+    aurora update pause <job key>
+    aurora update resume <job key>
+    aurora update abort <job key>
+    aurora update list <cluster>
+
+When you `start` a job update, the command will return once it has sent the
+instructions to the scheduler.  At that point, you may view detailed
+progress for the update with the `info` subcommand, in addition to viewing
+graphical progress in the web browser.  You may also get a full listing of
+in-progress updates in a cluster with `list`.
+
+Once an update has been started, you can `pause` to keep the update but halt
+progress.  This can be useful for tasks like debugging a partially-updated
+job to determine whether you would like to proceed.  You can `resume` to
+proceed.
+
+You may `abort` a job update regardless of the state it is in. This will
+instruct the scheduler to completely abandon the job update and leave the job
+in the current (possibly partially-updated) state.
+
+For a configuration update, the Aurora Scheduler calculates required changes
+by examining the current job config state and the new desired job config.
+It then starts a *rolling batched update process* by going through every batch
+and performing these operations, in order:
+
+- If an instance is not present in the scheduler but is present in
+  the new config, then the instance is created.
+- If an instance is present in both the scheduler and the new config, then
+  the scheduler diffs both task configs. If it detects any changes, it
+  performs an instance update by killing the instance with the old config and
+  creating an instance with the new config.
+- If an instance is present in the scheduler but isn't in the new config,
+  then that instance is killed.
+
+The Aurora Scheduler continues through the instance list until all tasks are
+updated and in `RUNNING`. If the scheduler determines the update is not going
+well (based on the criteria specified in the UpdateConfig), it cancels the 
update.
+
+Update cancellation runs a procedure similar to the update sequence described
+above, but in reverse order. New instance configs are swapped with old instance
+configs and batch updates proceed backwards from the point where the update failed.
+For example, batches (0,1,2) (3,4,5) (6,7,8-FAIL) are rolled back in the order
+(8,7,6) (5,4,3) (2,1,0).
+
+For details on how to control a job update, please see the
+[UpdateConfig](../../reference/configuration/#updateconfig-objects) 
configuration object.
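+
+As an illustrative sketch (parameter names follow the UpdateConfig documentation and should be
+verified against the configuration reference for your Aurora version), an update that rolls
+through instances three at a time might be configured as:
+
+    update_config = UpdateConfig(
+      batch_size = 3,              # update three instances per batch
+      max_per_shard_failures = 1,  # failures tolerated per instance before it counts as failed
+      max_total_failures = 2,      # abort the update after two failed instances
+      rollback_on_failure = True   # automatically roll back a failed update
+    )
+
+    jobs = [Service(
+      task = task,
+      cluster = 'devcluster',
+      role = 'www-data',
+      environment = 'prod',
+      name = 'hello',
+      update_config = update_config
+    )]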
+
+
+Coordinated Job Updates
+------------------------
+
+Some Aurora services may benefit from having more control over updates by 
explicitly
+acknowledging ("heartbeating") job update progress. This may be helpful for 
mission-critical
+service updates where explicit job health monitoring is vital during the 
entire job update
+lifecycle. Such job updates would rely on an external service (or a custom 
client) periodically
+pulsing an active coordinated job update via a
+[pulseJobUpdate 
RPC](https://github.com/apache/aurora/blob/rel/0.21.0/api/src/main/thrift/org/apache/aurora/gen/api.thrift).
+
+A coordinated update is defined by setting a positive
+[pulse_interval_secs](../../reference/configuration/#updateconfig-objects) value in the job
+configuration file. If no pulses are received within the specified interval, the update will be
+blocked. A blocked update is unable to continue rolling forward (or rolling back) but retains its
+active status. It may only be unblocked by a fresh `pulseJobUpdate` call.
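+
+For example (a sketch; the scheduler enforces a minimum allowed pulse interval, so check your
+cluster's limits), an update expecting a heartbeat at least every 60 seconds could be configured as:
+
+    update_config = UpdateConfig(
+      batch_size = 2,
+      pulse_interval_secs = 60   # block the update if no pulseJobUpdate RPC arrives within 60 seconds
+    )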
+
+NOTE: A coordinated update starts in `ROLL_FORWARD_AWAITING_PULSE` state and 
will not make any
+progress until the first pulse arrives. However, a paused update 
(`ROLL_FORWARD_PAUSED` or
+`ROLL_BACK_PAUSED`) is still considered active and upon resuming will 
immediately make progress
+provided the pulse interval has not expired.
+
+
+SLA-Aware Updates
+-----------------
+
+Updates can take advantage of [Custom SLA 
Requirements](../../features/sla-requirements/) and
+specify the `sla_aware=True` option within
+[UpdateConfig](../../reference/configuration/#updateconfig-objects) to only 
update instances if
+the action will maintain the task's SLA requirements. This feature allows 
updates to avoid killing
+too many instances in the face of unexpected failures outside of the update 
range.
+
+See [Using the `sla_aware` option](../../reference/configuration/#using-the-sla-aware-option)
+for more information on how to use this feature.
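+
+As a minimal sketch (assuming the job already defines SLA requirements as described in the linked
+documentation), enabling SLA-aware updates only requires the flag on UpdateConfig:
+
+    update_config = UpdateConfig(
+      batch_size = 1,
+      sla_aware = True   # only update an instance if killing it keeps the task's SLA satisfied
+    )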
+
+
+Canary Deployments
+------------------
+
+Canary deployments are a pattern for rolling out updates to a subset of job 
instances,
+in order to test different code versions alongside the actual production job.
+It is a risk-mitigation strategy for job owners and is commonly used in a form where
+job instance 0 runs with a different configuration than instances 1-N.
+
+For example, consider a job with 4 instances that each
+request 1 core of cpu, 1 GB of RAM, and 1 GB of disk space as specified
+in the configuration file `hello_world.aurora`. If you want to
+update it so it requests 2 GB of RAM instead of 1, you can create a new
+configuration file called `new_hello_world.aurora` and
+issue
+
+    aurora update start <job_key_value>/0-1 new_hello_world.aurora
+
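+Here, `new_hello_world.aurora` would be identical to `hello_world.aurora` except for the RAM
+request in the task's resources, roughly (a sketch; the surrounding task definition is unchanged):
+
+    resources = Resources(cpu = 1.0, ram = 2*GB, disk = 1*GB)   # previously ram = 1*GB
+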
+This results in instances 0 and 1 having 1 cpu, 2 GB of RAM, and 1 GB of disk 
space,
+while instances 2 and 3 have 1 cpu, 1 GB of RAM, and 1 GB of disk space. If 
instance 3
+dies and restarts, it restarts with 1 cpu, 1 GB RAM, and 1 GB disk space.
+
+This means there are two task configurations for the same job at the same time,
+each valid for a different range of instances. While this isn't a recommended
+pattern, it is valid and supported by the Aurora scheduler.

Added: aurora/site/source/documentation/0.21.0/features/mesos-fetcher.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/mesos-fetcher.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/mesos-fetcher.md (added)
+++ aurora/site/source/documentation/0.21.0/features/mesos-fetcher.md Tue Sep 
11 05:28:10 2018
@@ -0,0 +1,46 @@
+Mesos Fetcher
+=============
+
+Mesos has support for downloading resources into the sandbox through the
+use of the [Mesos Fetcher](http://mesos.apache.org/documentation/latest/fetcher/).
+
+Aurora supports passing URIs to the Mesos Fetcher dynamically by including
+a list of URIs in job submissions.
+
+How to use
+----------
+The scheduler flag `-enable_mesos_fetcher` must be set to true.
+
+Currently only the scheduler side of this feature has been implemented,
+so a modification to the existing client or a custom Thrift client is required
+to make use of this feature.
+
+If using a custom Thrift client, the list of URIs must be included in 
TaskConfig
+as the `mesosFetcherUris` field.
+
+Each Mesos Fetcher URI has the following data members:
+
+|Property | Description|
+|---------|------|
+|value (required)  |Path to the resource needed in the sandbox.|
+|extract (optional)|Extract files from packed or compressed archives into the 
sandbox.|
+|cache (optional) | Use caching mechanism provided by Mesos for resources.|
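+
+For illustration only (the URLs below are hypothetical and the exact encoding depends on the
+Thrift client you use), a `mesosFetcherUris` value with two entries might look like:
+
+    [
+      {
+        "value": "https://example.com/mypackage.tar.gz",
+        "extract": true,
+        "cache": false
+      },
+      {
+        "value": "https://example.com/myConfiguration.config",
+        "extract": false,
+        "cache": true
+      }
+    ]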
+
+Note that this structure is very similar to the one provided for downloading
+resources needed for a [custom executor](../../operations/configuration/).
+
+This is because both features use the Mesos fetcher to retrieve resources into
+the sandbox. However, the custom executor feature uses a static set of URIs configured on the
+server side, whereas the Mesos Fetcher feature uses a dynamic set of URIs provided at job
+submission time.
+
+Security Implications
+---------------------
+There are security implications that must be taken into account when enabling this feature.
+**Enabling this feature may potentially enable any job-submitting user to perform a privilege escalation.**
+
+Until a more thorough solution is created, one step that has been taken to mitigate this issue
+is to statically mark every user-submitted URI as non-executable. This is in contrast to the set of URIs
+configured for the custom executor feature, which may mark any URI as executable.
+
+If the need arises to mark a downloaded URI as executable, please consider 
using the custom executor feature.
\ No newline at end of file

Added: aurora/site/source/documentation/0.21.0/features/multitenancy.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/multitenancy.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/multitenancy.md (added)
+++ aurora/site/source/documentation/0.21.0/features/multitenancy.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,82 @@
+Multitenancy
+============
+
+Aurora is a multi-tenant system that can run jobs from multiple clients/tenants.
+Going beyond the [resource isolation on an individual host](../resource-isolation/), it is
+crucial to prevent those jobs from stepping on each other's toes.
+
+
+Job Namespaces
+--------------
+
+The namespace for jobs in Aurora follows a hierarchical structure. This is 
meant to make it easier
+to differentiate between different jobs. A job key consists of four parts. The 
four parts are
+`<cluster>/<role>/<environment>/<jobname>` in that order:
+
+* Cluster refers to the name of a particular Aurora installation.
+* Role names are user accounts.
+* Environment names are namespaces.
+* Jobname is the custom name of your job.
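+
+For example, using the names from the examples elsewhere in this documentation, the job key
+
+    devcluster/www-data/prod/hello
+
+refers to the job named `hello` run by the `www-data` role in the `prod` environment of the
+`devcluster` cluster.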
+
+Role names correspond to user accounts. They are used for
+[authentication](../../operations/security/), as the Linux user used to run jobs, and for the
+assignment of [quota](#preemption). If you don't know what accounts are available, contact your
+sysadmin.
+
+The environment component in the job key serves as a namespace. The values for
+environment are validated in the scheduler. By default, any of `devel`, `test`,
+`production`, and any value matching the regular expression `staging[0-9]*` is allowed.
+This validation can be changed to any arbitrary regular expression by setting the scheduler
+option `allowed_job_environments`.
+
+None of the values imply any difference in the scheduling behavior. Conventionally, the
+"environment" is set so as to indicate a certain level of stability in the behavior of the job
+by ensuring that an appropriate level of testing has been performed on the application code.
+For example, in the case of a typical Job, releases may progress through the following phases in
+order of increasing level of stability: `devel`, `test`, `staging`, `production`.
+
+
+Configuration Tiers
+-------------------
+
+Tier is a predefined bundle of task configuration options. Aurora schedules 
tasks and assigns them
+resources based on their tier assignment. The default scheduler tier 
configuration allows for
+3 tiers:
+
+ - `revocable`: The `revocable` tier requires the task to run with 
[revocable](../resource-isolation/#oversubscription)
+ resources.
+ - `preemptible`: Setting the task’s tier to `preemptible` allows for the possibility of that
+ task being [preempted](#preemption) by other tasks when the cluster is running low on resources.
+ - `preferred`: The `preferred` tier prevents the task from using 
[revocable](../resource-isolation/#oversubscription)
+ resources and from being [preempted](#preemption).
+
+Since it is possible that a cluster is configured with a custom tier 
configuration, users should
+consult their cluster administrator to be informed of the tiers supported by 
the cluster. Attempts
+to schedule jobs with an unsupported tier will be rejected by the scheduler.
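+
+The tier is set via the `tier` parameter of the [`Job`](../../reference/configuration/#job-objects)
+object. A sketch (check with your cluster administrator which tier names are configured):
+
+    jobs = [Service(
+      task = task,
+      cluster = 'devcluster',
+      role = 'www-data',
+      environment = 'prod',
+      name = 'hello',
+      tier = 'preferred'   # never use revocable resources and never be preempted
+    )]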
+
+
+Preemption
+----------
+
+In order to guarantee that important production jobs are always running, 
Aurora supports
+preemption.
+
+Consider a pending job that is a candidate for scheduling, but a resource shortage
+prevents it from being scheduled. Active tasks can become the victim of preemption if:
+
+ - both candidate and victim are owned by the same role and the
+   [priority](../../reference/configuration/#job-objects) of a victim is lower 
than the
+   [priority](../../reference/configuration/#job-objects) of the candidate.
+ - OR a victim is a `preemptible` or `revocable` [tier](#configuration-tiers) 
task and the candidate
+   is a `preferred` [tier](#configuration-tiers) task.
+
+In other words, tasks from `preferred` 
[tier](../../reference/configuration/#job-objects) jobs may
+preempt tasks from any `preemptible` or `revocable` job. However, a 
`preferred` task may only be
+preempted by tasks from `preferred` jobs in the same role with higher 
[priority](../../reference/configuration/#job-objects).
+
+Aurora requires resource quotas for [production non-dedicated jobs](../../reference/configuration/#job-objects).
+Quota is enforced at the job role level and, when set, defines a non-preemptible pool of compute
+resources within that role. All job types (service, adhoc, or cron) require role resource quota
+unless a job has a [dedicated constraint](../constraints/#dedicated-attribute) set.
+
+To grant quota to a particular role in production, an operator can use the 
command
+`aurora_admin set_quota`.
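+
+A hypothetical invocation granting the `www-data` role 100 CPU cores, 100 GiB of RAM, and 500 GiB
+of disk on `devcluster` might look like the following (verify the argument format against
+`aurora_admin set_quota --help` for your version):
+
+    aurora_admin set_quota devcluster www-data 100 100GB 500GB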

Added: aurora/site/source/documentation/0.21.0/features/resource-isolation.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/resource-isolation.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/resource-isolation.md 
(added)
+++ aurora/site/source/documentation/0.21.0/features/resource-isolation.md Tue 
Sep 11 05:28:10 2018
@@ -0,0 +1,181 @@
+Resource Isolation and Sizing
+==============================
+
+This document assumes Aurora and Mesos have been configured
+using our [recommended resource isolation 
settings](../../operations/configuration/#resource-isolation).
+
+- [Isolation](#isolation)
+- [Sizing](#sizing)
+- [Oversubscription](#oversubscription)
+
+
+Isolation
+---------
+
+Aurora is a multi-tenant system; a single software instance runs on a
+server, serving multiple clients/tenants. To share resources among
+tenants, it leverages Mesos for isolation of:
+
+* CPU
+* GPU
+* memory
+* disk space
+* ports
+
+CPU is a soft limit, and handled differently from memory and disk space.
+Too low a CPU value results in throttling your application and
+slowing it down. Memory and disk space are both hard limits; when your
+application goes over these values, it's killed.
+
+### CPU Isolation
+
+Mesos can be configured to use a quota-based CPU scheduler (the *Completely Fair Scheduler*)
+to provide consistent and predictable performance.
+This is effectively a guarantee of resources -- you receive at least what
+you requested, but also no more than you've requested.
+
+The scheduler gives applications a CPU quota for every 100 ms interval.
+When an application uses its quota for an interval, it is throttled for
+the rest of the 100 ms. Usage resets for each interval and unused
+quota does not carry over.
+
+For example, an application specifying 4.0 CPU has access to 400 ms of
+CPU time every 100 ms. This CPU quota can be used in different ways,
+depending on the application and available resources. Consider the
+scenarios shown in this diagram.
+
+![CPU Availability](../images/CPUavailability.png)
+
+* *Scenario A*: the application can use up to 4 cores continuously for
+every 100 ms interval. It is never throttled and starts processing
+new requests immediately.
+
+* *Scenario B* : the application uses up to 8 cores (depending on
+availability) but is throttled after 50 ms. The CPU quota resets at the
+start of each new 100 ms interval.
+
+* *Scenario C* : is like Scenario A, but there is a garbage collection
+event in the second interval that consumes all CPU quota. The
+application throttles for the remaining 75 ms of that interval and
+cannot service requests until the next interval. In this example, the
+garbage collection finished in one interval but, depending on how much
+garbage needs collecting, it may take more than one interval and further
+delay service of requests.
+
+*Technical Note*: Mesos considers logical cores, also known as
+hyperthreading or SMT cores, as the unit of CPU.
+
+### Memory Isolation
+
+Mesos uses dedicated memory allocation. Your application always has
+access to the amount of memory specified in your configuration. The
+application's memory use is defined as the sum of the resident set size
+(RSS) of all processes in a shard. Each shard is considered
+independently.
+
+In other words, say you specified a memory size of 10GB. Each shard
+would receive 10GB of memory. If an individual shard's memory demands
+exceed 10GB, that shard is killed, but the other shards continue
+working.
+
+*Technical note*: Total memory size is not enforced at allocation time,
+so your application can request more than its allocation without getting
+an ENOMEM. However, it will be killed shortly after.
+
+### Disk Space
+
+Disk space used by your application is defined as the sum of the files'
+disk space in your application's directory, including the `stdout` and
+`stderr` logged from your application. Each shard is considered
+independently. You should use off-node storage for your application's
+data whenever possible.
+
+In other words, say you specified disk space size of 100MB. Each shard
+would receive 100MB of disk space. If an individual shard's disk space
+demands exceed 100MB, that shard is killed, but the other shards
+continue working.
+
+After your application finishes running, its allocated disk space is
+reclaimed. Thus, your job's final action should move any disk content
+that you want to keep, such as logs, to your home file system or other
+less transitory storage. Disk reclamation takes place an undefined
+period after the application finish time; until then, the disk contents
+are still available but you shouldn't count on them being so.
+
+*Technical note*: Disk space is not enforced at write time, so your
+application can write above its quota without getting an ENOSPC, but it
+will be killed shortly after. This is subject to change.
+
+### GPU Isolation
+
+GPU isolation will be supported for Nvidia devices starting from Mesos 1.0.
+Access to the allocated units will be exclusive with no sharing between tasks
+allowed (e.g. no fractional GPU allocation). For more details, see the
+[Mesos design 
document](https://docs.google.com/document/d/10GJ1A80x4nIEo8kfdeo9B11PIbS1xJrrB4Z373Ifkpo/edit#heading=h.w84lz7p4eexl)
+and the [Mesos agent 
configuration](http://mesos.apache.org/documentation/latest/configuration/).
+
+### Other Resources
+
+Other resources, such as network bandwidth, do not have any performance
+guarantees. For some resources, such as memory bandwidth, there are no
+practical sharing methods so some application combinations collocated on
+the same host may cause contention.
+
+
+Sizing
+-------
+
+### CPU Sizing
+
+To correctly size Aurora-run Mesos tasks, specify a per-shard CPU value
+that lets the task run at its desired performance when at peak load
+distributed across all shards. Include reserve capacity of at least 50%,
+possibly more, depending on how critical your service is (or how
+confident you are about your original estimate :-)), ideally by
+increasing the number of shards to also improve resiliency. When running
+your application, observe its CPU stats over time. If consistently at or
+near your quota during peak load, you should consider increasing either
+per-shard CPU or the number of shards.
+
+### Memory Sizing
+
+Size for your application's peak requirement. Observe the per-instance
+memory statistics over time, as memory requirements can vary over
+different periods. Remember that if your application exceeds its memory
+value, it will be killed, so you should also add a safety margin of
+around 10-20%. If you have the ability to do so, you may also want to
+put alerts on the per-instance memory.
+
+### Disk Space Sizing
+
+Size for your application's peak requirement. Rotate and discard log
+files as needed to stay within your quota. When running a Java process,
+add the maximum size of the Java heap to your disk space requirement, in
+order to account for an out of memory error dumping the heap
+into the application's sandbox space.
+
+### GPU Sizing
+
+GPU is highly dependent on your application requirements and is only limited
+by the number of physical GPU units available on a target box.
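+
+Putting the sizing guidance together, a per-instance resource request might look like the
+following sketch (the process name and values are illustrative; the `gpu` parameter is only
+meaningful on GPU-enabled clusters):
+
+    server = Process(name = 'server', cmdline = './run_server.sh')
+
+    task = SequentialTask(
+      processes = [server],
+      resources = Resources(
+        cpu  = 2.0,      # peak CPU need plus ~50% reserve
+        ram  = 1200*MB,  # peak RSS plus a 10-20% safety margin
+        disk = 2*GB,     # logs plus room for a potential JVM heap dump
+        gpu  = 1         # whole GPU units only; no fractional allocation
+      )
+    )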
+
+
+Oversubscription
+----------------
+
+Mesos supports [oversubscription of machine 
resources](http://mesos.apache.org/documentation/latest/oversubscription/)
+via the concept of revocable tasks. In contrast to non-revocable tasks, 
revocable tasks are best-effort.
+Mesos reserves the right to throttle or even kill them if they might affect 
existing high-priority
+user-facing services.
+
+As of today, the only revocable resources supported by Aurora are CPU and RAM. A job can
+opt in to use those by specifying the `revocable` [Configuration Tier](../../features/multitenancy/#configuration-tiers).
+A revocable job will only be scheduled using revocable resources, even if 
there are plenty of
+non-revocable resources available.
+
+The Aurora scheduler must be [configured to receive revocable 
offers](../../operations/configuration/#resource-isolation)
+from Mesos and accept revocable jobs. If not configured properly, revocable tasks will never get
+assigned to hosts and will stay in `PENDING`.
+
+For details on how to mark a job as being revocable, see the
+[Configuration Reference](../../reference/configuration/).

Added: aurora/site/source/documentation/0.21.0/features/service-discovery.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/service-discovery.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/service-discovery.md 
(added)
+++ aurora/site/source/documentation/0.21.0/features/service-discovery.md Tue 
Sep 11 05:28:10 2018
@@ -0,0 +1,43 @@
+Service Discovery
+=================
+
+It is possible for the Aurora executor to announce tasks into ServerSets for
+the purpose of service discovery.  ServerSets use the Zookeeper [group 
membership 
pattern](http://zookeeper.apache.org/doc/trunk/recipes.html#sc_outOfTheBox)
+of which there are several reference implementations:
+
+  - [C++](https://github.com/apache/mesos/blob/master/src/zookeeper/group.cpp)
+  - 
[Java](https://github.com/twitter/commons/blob/master/src/java/com/twitter/common/zookeeper/ServerSetImpl.java#L221)
+  - 
[Python](https://github.com/twitter/commons/blob/master/src/python/twitter/common/zookeeper/serverset/serverset.py#L51)
+
+These can also be used natively in Finagle using the 
[ZookeeperServerSetCluster](https://github.com/twitter/finagle/blob/master/finagle-serversets/src/main/scala/com/twitter/finagle/zookeeper/ZookeeperServerSetCluster.scala).
+
+For more information about how to configure announcing, see the [Configuration 
Reference](../../reference/configuration/).
+
+Using Mesos DiscoveryInfo
+-------------------------
+Experimental support for populating DiscoveryInfo in Mesos is introduced in Aurora. This can be
+used to build a custom service discovery system that does not use ZooKeeper. Please see the
+`Service Discovery` section in the
+[Mesos Framework Development guide](http://mesos.apache.org/documentation/latest/app-framework-development-guide/)
+for an explanation of the protobuf message in Mesos.
+
+To use this feature, enable the `--populate_discovery_info` flag on the scheduler. All jobs
+started by the scheduler afterwards will have their portmap populated to Mesos and discoverable
+via the `/state` endpoint on the Mesos master and agent.
+
+### Using Mesos DNS
+An example is using [Mesos-DNS](https://github.com/mesosphere/mesos-dns), which is able to
+generate multiple DNS records. With the current implementation, the example job with key
+`devcluster/vagrant/test/http-example` generates at least the following:
+
+1. An A record for `http_example.test.vagrant.aurora.mesos` (which only includes the IP address);
+2. A [SRV record](https://en.wikipedia.org/wiki/SRV_record) for
+ `_http_example.test.vagrant._tcp.aurora.mesos`, which includes the IP address and every port. This
+  should only be used if the service has one port.
+3. A SRV record `_{port-name}._http_example.test.vagrant._tcp.aurora.mesos` for each port name
+  defined. This should be used when the service has multiple ports. For this to work properly,
+  `-populate_discovery_info` must be added to the scheduler's configuration.
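+
+For example, assuming a resolver pointed at Mesos-DNS and the default `.mesos` domain, the SRV
+record from item 2 can be queried with a standard DNS tool:
+
+    dig +short _http_example.test.vagrant._tcp.aurora.mesos SRV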
+
+Things to note:
+
+1. The domain part (".mesos" in above example) can be configured in [Mesos 
DNS](http://mesosphere.github.io/mesos-dns/docs/configuration-parameters.html);
+2. Right now, portmap and port aliases in the announcer object are not reflected in
+   DiscoveryInfo, and are therefore not visible in Mesos DNS records either. This is because they
+   are only resolved in Thermos executors.

Added: aurora/site/source/documentation/0.21.0/features/services.md
URL: 
http://svn.apache.org/viewvc/aurora/site/source/documentation/0.21.0/features/services.md?rev=1840515&view=auto
==============================================================================
--- aurora/site/source/documentation/0.21.0/features/services.md (added)
+++ aurora/site/source/documentation/0.21.0/features/services.md Tue Sep 11 
05:28:10 2018
@@ -0,0 +1,116 @@
+Long-running Services
+=====================
+
+Jobs that always restart on completion, whether successful or unsuccessful,
+are called services. This is useful for long-running processes
+such as web services that should always be running, unless stopped explicitly.
+
+
+Service Specification
+---------------------
+
+A job is identified as a service by the presence of the flag
+`service=True` in the [`Job`](../../reference/configuration/#job-objects) object.
+The `Service` alias can be used as shorthand for `Job` with `service=True`.
+
+Example (available in the [Vagrant 
environment](../../getting-started/vagrant/)):
+
+    $ cat /vagrant/examples/jobs/hello_world.aurora
+    hello = Process(
+      name = 'hello',
+      cmdline = """
+        while true; do
+          echo hello world
+          sleep 10
+        done
+      """)
+
+    task = SequentialTask(
+      processes = [hello],
+      resources = Resources(cpu = 1.0, ram = 128*MB, disk = 128*MB)
+    )
+
+    jobs = [
+      Service(
+        task = task,
+        cluster = 'devcluster',
+        role = 'www-data',
+        environment = 'prod',
+        name = 'hello'
+      )
+    ]
+
+
+Jobs without the service bit set only restart up to `max_task_failures` times 
and only if they
+terminated unsuccessfully either due to human error or machine failure (see the
+[`Job`](../../reference/configuration/#job-objects) object for details).
+
+
+Ports
+-----
+
+In order to be useful, most services have to bind to one or more ports. Aurora enables this
+use case via the [`thermos.ports` namespace](../../reference/configuration/#thermos-namespace),
+which allows processes to request arbitrarily named ports:
+
+
+    nginx = Process(
+      name = 'nginx',
+      cmdline = './run_nginx.sh -port {{thermos.ports[http]}}'
+    )
+
+
+When this process is included in a job, the job will be allocated a port, and 
the command line
+will be replaced with something like:
+
+    ./run_nginx.sh -port 42816
+
+Where 42816 happens to be the allocated port.
+
+For details on how to enable clients to discover this dynamically assigned 
port, see our
+[Service Discovery](../service-discovery/) documentation.
+
+
+Health Checking
+---------------
+
+Typically, the Thermos executor monitors processes within a task only by liveness of the forked
+process. In addition to that, Aurora has support for rudimentary health checking: either via HTTP
+or via custom shell scripts.
+
+For example, simply by requesting a `health` port, a process can request to be 
health checked
+via repeated calls to the `/health` endpoint:
+
+    nginx = Process(
+      name = 'nginx',
+      cmdline = './run_nginx.sh -port {{thermos.ports[health]}}'
+    )
+
+Please see the
+[configuration 
reference](../../reference/configuration/#healthcheckconfig-objects)
+for configuration options for this feature.
+
+Starting with the 0.17.0 release, job updates rely only on task health-checks 
by introducing
+a `min_consecutive_successes` parameter on the HealthCheckConfig object. This 
parameter represents
+the number of successful health checks needed before a task is moved into the 
`RUNNING` state. Tasks
+that do not have enough successful health checks within the first `n` 
attempts, are moved to the
+`FAILED` state, where `n = ceil(initial_interval_secs/interval_secs) + 
max_consecutive_failures +
+min_consecutive_successes`. In order to accommodate variability during task 
warm up, `initial_interval_secs`
+will act as a grace period. Any health-check failures during the first `m` 
attempts are ignored and
+do not count towards `max_consecutive_failures`, where `m = 
ceil(initial_interval_secs/interval_secs)`.
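+
+As a sketch of how these parameters fit together (values are illustrative; parameter names follow
+the HealthCheckConfig reference linked below):
+
+    health_check_config = HealthCheckConfig(
+      initial_interval_secs = 10,     # grace period: failures in the first ceil(10/5) = 2 checks are ignored
+      interval_secs = 5,              # run a health check every 5 seconds
+      min_consecutive_successes = 2,  # require 2 consecutive successes before moving to RUNNING
+      max_consecutive_failures = 3    # afterwards, 3 consecutive failures mark the task FAILED
+    )
+
+    jobs = [Service(
+      task = task,
+      cluster = 'devcluster',
+      role = 'www-data',
+      environment = 'prod',
+      name = 'hello',
+      health_check_config = health_check_config
+    )]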
+
+As [job updates](../job-updates/) are based only on health checks, it is not necessary to set
+`watch_secs` to the worst-case update time; it can instead be set to 0. The scheduler considers a
+task that is in the `RUNNING` state to be healthy and proceeds to update the next batch of instances.
+For details on how to control health checks, please see the
+[HealthCheckConfig](../../reference/configuration/#healthcheckconfig-objects) 
configuration object.
+Existing jobs that do not configure a health-check can fall-back to using 
`watch_secs` to
+monitor a task before considering it healthy.
+
+You can pause health checking by touching a file inside of your sandbox, named 
`.healthchecksnooze`.
+As long as that file is present, health checks will be disabled, enabling 
users to gather core
+dumps or other performance measurements without worrying about Aurora's health 
check killing
+their process.
+
+WARNING: Remember to remove this when you are done, otherwise your instance 
will have permanently
+disabled health checks.

