Repository: aurora Updated Branches: refs/heads/master 095009596 -> f28f41a70
http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/security.md
----------------------------------------------------------------------
diff --git a/docs/security.md b/docs/security.md
deleted file mode 100644
index 32bea42..0000000
--- a/docs/security.md
+++ /dev/null
@@ -1,279 +0,0 @@

Aurora integrates with [Apache Shiro](http://shiro.apache.org/) to provide security
controls for its API. In addition to providing some useful features out of the box, Shiro
also allows Aurora cluster administrators to adapt the security system to their organization's
existing infrastructure.

- [Enabling Security](#enabling-security)
- [Authentication](#authentication)
  - [HTTP Basic Authentication](#http-basic-authentication)
    - [Server Configuration](#server-configuration)
    - [Client Configuration](#client-configuration)
  - [HTTP SPNEGO Authentication (Kerberos)](#http-spnego-authentication-kerberos)
    - [Server Configuration](#server-configuration-1)
    - [Client Configuration](#client-configuration-1)
- [Authorization](#authorization)
  - [Using an INI file to define security controls](#using-an-ini-file-to-define-security-controls)
    - [Caveats](#caveats)
- [Implementing a Custom Realm](#implementing-a-custom-realm)
  - [Packaging a realm module](#packaging-a-realm-module)
- [Known Issues](#known-issues)

# Enabling Security

There are two major components of security:
[authentication and authorization](http://en.wikipedia.org/wiki/Authentication#Authorization). A
cluster administrator may choose the approach used for each, and may also implement custom
mechanisms for either. Later sections describe the options available.

# Authentication

At a minimum, the scheduler must be configured with instructions for how to process authentication
credentials. There are currently two built-in authentication schemes:
[HTTP Basic Authentication](http://en.wikipedia.org/wiki/Basic_access_authentication) and
[SPNEGO](http://en.wikipedia.org/wiki/SPNEGO) (Kerberos).

## HTTP Basic Authentication

Basic Authentication is a very quick way to add *some* security. It is supported
by all major browsers and HTTP client libraries with minimal work. However,
before relying on Basic Authentication you should be aware of the [security
considerations](http://tools.ietf.org/html/rfc2617#section-4).

### Server Configuration

At a minimum you need to set the following command-line flags on the scheduler:

```
-http_authentication_mechanism=BASIC
-shiro_realm_modules=INI_AUTHNZ
-shiro_ini_path=path/to/security.ini
```

And create a `security.ini` file like so:

```
[users]
sally = apple, admin

[roles]
admin = *
```

The details of the `security.ini` file are explained below. Note that this file contains plaintext,
unhashed passwords.

### Client Configuration

To configure the client for HTTP Basic authentication, add an entry to `~/.netrc` with your credentials:

```
% cat ~/.netrc
# ...

machine aurora.example.com
login sally
password apple

# ...
```

No changes are required to `clusters.json`.
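The `aurora` client picks these credentials up from `~/.netrc` automatically. For ad-hoc calls
against a Basic-auth-protected scheduler from a script, any plain HTTP client can supply the same
credentials. The sketch below is illustrative only and assumes the hypothetical
`aurora.example.com:8081` address and the `sally`/`apple` credentials from the example above.

```python
# Illustrative only: request a scheduler page directly, supplying the Basic-auth
# credentials from the example security.ini (sally/apple). The host and port are
# placeholders for your scheduler's address.
import requests

SCHEDULER_URL = 'http://aurora.example.com:8081'

resp = requests.get(SCHEDULER_URL + '/scheduler', auth=('sally', 'apple'))
print(resp.status_code)  # 200 once the credentials are accepted
```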
## HTTP SPNEGO Authentication (Kerberos)

### Server Configuration

At a minimum you need to set the following command-line flags on the scheduler:

```
-http_authentication_mechanism=NEGOTIATE
-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ
-kerberos_server_principal=HTTP/aurora.example.com@EXAMPLE.COM
-kerberos_server_keytab=path/to/aurora.example.com.keytab
-shiro_ini_path=path/to/security.ini
```

And create a `security.ini` file like so:

```
% cat path/to/security.ini
[users]
sally = _, admin

[roles]
admin = *
```

What's going on here? First, Aurora must be configured to request Kerberos credentials when presented with an
unauthenticated request. This is achieved by setting

```
-http_authentication_mechanism=NEGOTIATE
```

Next, a Realm module must be configured to **authenticate** the current request using the Kerberos
credentials that were requested. Aurora ships with a realm module that can do this:

```
-shiro_realm_modules=KERBEROS5_AUTHN[,...]
```

The Kerberos5Realm requires a keytab file and a server principal name. The principal name will usually
be in the form `HTTP/aurora.example.com@EXAMPLE.COM`.

```
-kerberos_server_principal=HTTP/aurora.example.com@EXAMPLE.COM
-kerberos_server_keytab=path/to/aurora.example.com.keytab
```

The Kerberos5 realm module is authentication-only. For scheduler security to work you must also
enable a realm module that provides an Authorizer implementation. For example, to do this using the
IniShiroRealmModule:

```
-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ
```

You can then configure authorization using a `security.ini` file as described below
(the password field is ignored). You must configure the realm module with the path to this file:

```
-shiro_ini_path=path/to/security.ini
```

### Client Configuration

To use Kerberos on the client side you must build Kerberos-enabled client binaries. Do this with:

```
./pants binary src/main/python/apache/aurora/kerberos:kaurora
./pants binary src/main/python/apache/aurora/kerberos:kaurora_admin
```

You must also configure each cluster where you've enabled Kerberos on the scheduler
to use Kerberos authentication. Do this by setting `auth_mechanism` to `KERBEROS`
in `clusters.json`:

```
% cat ~/.aurora/clusters.json
{
  "devcluster": {
    "auth_mechanism": "KERBEROS",
    ...
  },
  ...
}
```

# Authorization

Given a means to authenticate the entity a client claims to be, we need to define what privileges it has.

## Using an INI file to define security controls

The simplest security configuration for Aurora is an INI file on the scheduler. For small
clusters, or clusters where the users and access controls change relatively infrequently, this is
likely the preferred approach. However, you may want to avoid this approach if access permissions
are rapidly changing, or if your access control information already exists in another system.

You can enable INI-based configuration with the following scheduler command-line arguments:

```
-http_authentication_mechanism=BASIC
-shiro_ini_path=path/to/security.ini
```

*note* As the argument name reveals, this uses Shiro's
[IniRealm](http://shiro.apache.org/configuration.html#Configuration-INIConfiguration) behind
the scenes.

The INI file will contain two sections - users and roles.
Here's an example of what might be in `security.ini`:

```
[users]
sally = apple, admin
jim = 123456, accounting
becky = letmein, webapp
larry = 654321, accounting
steve = password

[roles]
admin = *
accounting = thrift.AuroraAdmin:setQuota
webapp = thrift.AuroraSchedulerManager:*:webapp
```

The users section defines user credentials and the role(s) they are members of. These lines
are of the format `<user> = <password>[, <role>...]`. As you probably noticed, the passwords are
in plaintext, so read access to this file should be restricted.

In this configuration, each user has different privileges for actions in the cluster because
of the roles they are a part of:

* admin is granted all privileges
* accounting may adjust the amount of resource quota for any role
* webapp represents a collection of jobs that make up a service, and its members may create and modify any jobs owned by it

### Caveats

You might find documentation on the Internet suggesting there are additional sections in `shiro.ini`,
like `[main]` and `[urls]`. These are not supported by Aurora, as it uses a different mechanism to configure
those parts of Shiro. Think of Aurora's `security.ini` as a subset with only `[users]` and `[roles]` sections.

## Implementing Delegated Authorization

It is possible to leverage Shiro's `runAs` feature by implementing a custom Servlet Filter that provides
the capability and passing its fully qualified class name to the command-line argument
`-shiro_after_auth_filter`. The filter is registered in the same filter chain as the Shiro auth filters
and is placed after them, which ensures that the filter is invoked only after the Shiro filters have had
a chance to authenticate the request.

# Implementing a Custom Realm

Since Aurora's security is backed by [Apache Shiro](https://shiro.apache.org), you can implement a
custom [Realm](http://shiro.apache.org/realm.html) to define organization-specific security behavior.

In addition to using Shiro's standard APIs to implement a Realm, you can link against Aurora to
access the type-safe Permissions Aurora uses. See the Javadoc for `org.apache.aurora.scheduler.spi`
for more information.

## Packaging a realm module

Package your custom Realm(s) with a Guice module that exposes a `Set<Realm>` multibinding.

```java
package com.example;

import com.google.inject.AbstractModule;
import com.google.inject.multibindings.Multibinder;
import org.apache.shiro.realm.Realm;

public class MyRealmModule extends AbstractModule {
  @Override
  public void configure() {
    Realm myRealm = new MyRealm();

    Multibinder.newSetBinder(binder(), Realm.class).addBinding().toInstance(myRealm);
  }

  static class MyRealm implements Realm {
    // Realm implementation.
  }
}
```

To use your module in the scheduler, include it as a realm module based on its fully-qualified
class name:

```
-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ,com.example.MyRealmModule
```

# Known Issues

While the APIs and SPIs we ship with are stable as of 0.8.0, we are aware of several incremental
improvements. Please follow, vote, or send patches.
- -Relevant tickets: -* [AURORA-343](https://issues.apache.org/jira/browse/AURORA-343): HTTPS support -* [AURORA-1248](https://issues.apache.org/jira/browse/AURORA-1248): Client retries 4xx errors -* [AURORA-1279](https://issues.apache.org/jira/browse/AURORA-1279): Remove kerberos-specific build targets -* [AURORA-1293](https://issues.apache.org/jira/browse/AURORA-1291): Consider defining a JSON format in place of INI -* [AURORA-1179](https://issues.apache.org/jira/browse/AURORA-1179): Supported hashed passwords in security.ini -* [AURORA-1295](https://issues.apache.org/jira/browse/AURORA-1295): Support security for the ReadOnlyScheduler service http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/sla.md ---------------------------------------------------------------------- diff --git a/docs/sla.md b/docs/sla.md deleted file mode 100644 index a558e00..0000000 --- a/docs/sla.md +++ /dev/null @@ -1,177 +0,0 @@ -Aurora SLA Measurement --------------- - -- [Overview](#overview) -- [Metric Details](#metric-details) - - [Platform Uptime](#platform-uptime) - - [Job Uptime](#job-uptime) - - [Median Time To Assigned (MTTA)](#median-time-to-assigned-\(mtta\)) - - [Median Time To Running (MTTR)](#median-time-to-running-\(mttr\)) -- [Limitations](#limitations) - -## Overview - -The primary goal of the feature is collection and monitoring of Aurora job SLA (Service Level -Agreements) metrics that defining a contractual relationship between the Aurora/Mesos platform -and hosted services. - -The Aurora SLA feature is by default only enabled for service (non-cron) -production jobs (`"production = True"` in your `.aurora` config). It can be enabled for -non-production services via the scheduler command line flag `-sla_non_prod_metrics`. - -Counters that track SLA measurements are computed periodically within the scheduler. -The individual instance metrics are refreshed every minute (configurable via -`sla_stat_refresh_interval`). The instance counters are subsequently aggregated by -relevant grouping types before exporting to scheduler `/vars` endpoint (when using `vagrant` -that would be `http://192.168.33.7:8081/vars`) - -## Metric Details - -### Platform Uptime - -*Aggregate amount of time a job spends in a non-runnable state due to platform unavailability -or scheduling delays. This metric tracks Aurora/Mesos uptime performance and reflects on any -system-caused downtime events (tasks LOST or DRAINED). Any user-initiated task kills/restarts -will not degrade this metric.* - -**Collection scope:** - -* Per job - `sla_<job_key>_platform_uptime_percent` -* Per cluster - `sla_cluster_platform_uptime_percent` - -**Units:** percent - -A fault in the task environment may cause the Aurora/Mesos to have different views on the task state -or lose track of the task existence. In such cases, the service task is marked as LOST and -rescheduled by Aurora. For example, this may happen when the task stays in ASSIGNED or STARTING -for too long or the Mesos slave becomes unhealthy (or disappears completely). The time between -task entering LOST and its replacement reaching RUNNING state is counted towards platform downtime. - -Another example of a platform downtime event is the administrator-requested task rescheduling. This -happens during planned Mesos slave maintenance when all slave tasks are marked as DRAINED and -rescheduled elsewhere. - -To accurately calculate Platform Uptime, we must separate platform incurred downtime from user -actions that put a service instance in a non-operational state. 
It is simpler to isolate -user-incurred downtime and treat all other downtime as platform incurred. - -Currently, a user can cause a healthy service (task) downtime in only two ways: via `killTasks` -or `restartShards` RPCs. For both, their affected tasks leave an audit state transition trail -relevant to uptime calculations. By applying a special "SLA meaning" to exposed task state -transition records, we can build a deterministic downtime trace for every given service instance. - -A task going through a state transition carries one of three possible SLA meanings -(see [SlaAlgorithm.java](../src/main/java/org/apache/aurora/scheduler/sla/SlaAlgorithm.java) for -sla-to-task-state mapping): - -* Task is UP: starts a period where the task is considered to be up and running from the Aurora - platform standpoint. - -* Task is DOWN: starts a period where the task cannot reach the UP state for some - non-user-related reason. Counts towards instance downtime. - -* Task is REMOVED from SLA: starts a period where the task is not expected to be UP due to - user initiated action or failure. We ignore this period for the uptime calculation purposes. - -This metric is recalculated over the last sampling period (last minute) to account for -any UP/DOWN/REMOVED events. It ignores any UP/DOWN events not immediately adjacent to the -sampling interval as well as adjacent REMOVED events. - -### Job Uptime - -*Percentage of the job instances considered to be in RUNNING state for the specified duration -relative to request time. This is a purely application side metric that is considering aggregate -uptime of all RUNNING instances. Any user- or platform initiated restarts directly affect -this metric.* - -**Collection scope:** We currently expose job uptime values at 5 pre-defined -percentiles (50th,75th,90th,95th and 99th): - -* `sla_<job_key>_job_uptime_50_00_sec` -* `sla_<job_key>_job_uptime_75_00_sec` -* `sla_<job_key>_job_uptime_90_00_sec` -* `sla_<job_key>_job_uptime_95_00_sec` -* `sla_<job_key>_job_uptime_99_00_sec` - -**Units:** seconds -You can also get customized real-time stats from aurora client. See `aurora sla -h` for -more details. - -### Median Time To Assigned (MTTA) - -*Median time a job spends waiting for its tasks to be assigned to a host. This is a combined -metric that helps track the dependency of scheduling performance on the requested resources -(user scope) as well as the internal scheduler bin-packing algorithm efficiency (platform scope).* - -**Collection scope:** - -* Per job - `sla_<job_key>_mtta_ms` -* Per cluster - `sla_cluster_mtta_ms` -* Per instance size (small, medium, large, x-large, xx-large). Size are defined in: -[ResourceAggregates.java](../src/main/java/org/apache/aurora/scheduler/base/ResourceAggregates.java) - * By CPU: - * `sla_cpu_small_mtta_ms` - * `sla_cpu_medium_mtta_ms` - * `sla_cpu_large_mtta_ms` - * `sla_cpu_xlarge_mtta_ms` - * `sla_cpu_xxlarge_mtta_ms` - * By RAM: - * `sla_ram_small_mtta_ms` - * `sla_ram_medium_mtta_ms` - * `sla_ram_large_mtta_ms` - * `sla_ram_xlarge_mtta_ms` - * `sla_ram_xxlarge_mtta_ms` - * By DISK: - * `sla_disk_small_mtta_ms` - * `sla_disk_medium_mtta_ms` - * `sla_disk_large_mtta_ms` - * `sla_disk_xlarge_mtta_ms` - * `sla_disk_xxlarge_mtta_ms` - -**Units:** milliseconds - -MTTA only considers instances that have already reached ASSIGNED state and ignores those -that are still PENDING. This ensures straggler instances (e.g. with unreasonable resource -constraints) do not affect metric curves. 
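All of the counters above are exported through the scheduler `/vars` endpoint, so they can be pulled
into an external monitoring system with a few lines of code. The sketch below is a rough
illustration, assuming the vagrant scheduler address mentioned earlier and a plain-text
`name value` line format for `/vars`; adjust the parsing to whatever your scheduler build actually emits.

```python
# Illustrative sketch: pull selected cluster-level SLA counters from /vars.
# Assumes a plain-text "name value" line format and the vagrant scheduler address.
import requests

VARS_URL = 'http://192.168.33.7:8081/vars'
WANTED = ('sla_cluster_platform_uptime_percent', 'sla_cluster_mtta_ms', 'sla_cluster_mttr_ms')

def fetch_sla_vars():
    metrics = {}
    for line in requests.get(VARS_URL).text.splitlines():
        name, _, value = line.partition(' ')
        if name in WANTED:
            metrics[name] = float(value)
    return metrics

print(fetch_sla_vars())
```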
- -### Median Time To Running (MTTR) - -*Median time a job waits for its tasks to reach RUNNING state. This is a comprehensive metric -reflecting on the overall time it takes for the Aurora/Mesos to start executing user content.* - -**Collection scope:** - -* Per job - `sla_<job_key>_mttr_ms` -* Per cluster - `sla_cluster_mttr_ms` -* Per instance size (small, medium, large, x-large, xx-large). Size are defined in: -[ResourceAggregates.java](../src/main/java/org/apache/aurora/scheduler/base/ResourceAggregates.java) - * By CPU: - * `sla_cpu_small_mttr_ms` - * `sla_cpu_medium_mttr_ms` - * `sla_cpu_large_mttr_ms` - * `sla_cpu_xlarge_mttr_ms` - * `sla_cpu_xxlarge_mttr_ms` - * By RAM: - * `sla_ram_small_mttr_ms` - * `sla_ram_medium_mttr_ms` - * `sla_ram_large_mttr_ms` - * `sla_ram_xlarge_mttr_ms` - * `sla_ram_xxlarge_mttr_ms` - * By DISK: - * `sla_disk_small_mttr_ms` - * `sla_disk_medium_mttr_ms` - * `sla_disk_large_mttr_ms` - * `sla_disk_xlarge_mttr_ms` - * `sla_disk_xxlarge_mttr_ms` - -**Units:** milliseconds - -MTTR only considers instances in RUNNING state. This ensures straggler instances (e.g. with -unreasonable resource constraints) do not affect metric curves. - -## Limitations - -* The availability of Aurora SLA metrics is bound by the scheduler availability. - -* All metrics are calculated at a pre-defined interval (currently set at 1 minute). - Scheduler restarts may result in missed collections. http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/storage-config.md ---------------------------------------------------------------------- diff --git a/docs/storage-config.md b/docs/storage-config.md deleted file mode 100644 index 7c64841..0000000 --- a/docs/storage-config.md +++ /dev/null @@ -1,153 +0,0 @@ -# Storage Configuration And Maintenance - -- [Overview](#overview) -- [Scheduler storage configuration flags](#scheduler-storage-configuration-flags) - - [Mesos replicated log configuration flags](#mesos-replicated-log-configuration-flags) - - [-native_log_quorum_size](#-native_log_quorum_size) - - [-native_log_file_path](#-native_log_file_path) - - [-native_log_zk_group_path](#-native_log_zk_group_path) - - [Backup configuration flags](#backup-configuration-flags) - - [-backup_interval](#-backup_interval) - - [-backup_dir](#-backup_dir) - - [-max_saved_backups](#-max_saved_backups) -- [Recovering from a scheduler backup](#recovering-from-a-scheduler-backup) - - [Summary](#summary) - - [Preparation](#preparation) - - [Cleanup and re-initialize Mesos replicated log](#cleanup-and-re-initialize-mesos-replicated-log) - - [Restore from backup](#restore-from-backup) - - [Cleanup](#cleanup) - -## Overview - -This document summarizes Aurora storage configuration and maintenance details and is -intended for use by anyone deploying and/or maintaining Aurora. - -For a high level overview of the Aurora storage architecture refer to [this document](storage.md). - -## Scheduler storage configuration flags - -Below is a summary of scheduler storage configuration flags that either don't have default values -or require attention before deploying in a production environment. - -### Mesos replicated log configuration flags - -#### -native_log_quorum_size -Defines the Mesos replicated log quorum size. See -[the replicated log configuration document](deploying-aurora-scheduler.md#replicated-log-configuration) -on how to choose the right value. - -#### -native_log_file_path -Location of the Mesos replicated log files. 
Consider allocating a dedicated disk (preferably SSD) -for Mesos replicated log files to ensure optimal storage performance. - -#### -native_log_zk_group_path -ZooKeeper path used for Mesos replicated log quorum discovery. - -See [code](../src/main/java/org/apache/aurora/scheduler/log/mesos/MesosLogStreamModule.java) for -other available Mesos replicated log configuration options and default values. - -### Backup configuration flags - -Configuration options for the Aurora scheduler backup manager. - -#### -backup_interval -The interval on which the scheduler writes local storage backups. The default is every hour. - -#### -backup_dir -Directory to write backups to. - -#### -max_saved_backups -Maximum number of backups to retain before deleting the oldest backup(s). - -## Recovering from a scheduler backup - -**Be sure to read the entire page before attempting to restore from a backup, as it may have -unintended consequences.** - -### Summary - -The restoration procedure replaces the existing (possibly corrupted) Mesos replicated log with an -earlier, backed up, version and requires all schedulers to be taken down temporarily while -restoring. Once completed, the scheduler state resets to what it was when the backup was created. -This means any jobs/tasks created or updated after the backup are unknown to the scheduler and will -be killed shortly after the cluster restarts. All other tasks continue operating as normal. - -Usually, it is a bad idea to restore a backup that is not extremely recent (i.e. older than a few -hours). This is because the scheduler will expect the cluster to look exactly as the backup does, -so any tasks that have been rescheduled since the backup was taken will be killed. - -Instructions below have been verified in [Vagrant environment](vagrant.md) and with minor -syntax/path changes should be applicable to any Aurora cluster. - -### Preparation - -Follow these steps to prepare the cluster for restoring from a backup: - -* Stop all scheduler instances - -* Consider blocking external traffic on a port defined in `-http_port` for all schedulers to -prevent users from interacting with the scheduler during the restoration process. This will help -troubleshooting by reducing the scheduler log noise and prevent users from making changes that will -be erased after the backup snapshot is restored. - -* Configure `aurora_admin` access to run all commands listed in - [Restore from backup](#restore-from-backup) section locally on the leading scheduler: - * Make sure the [clusters.json](client-commands.md#cluster-configuration) file configured to - access scheduler directly. Set `scheduler_uri` setting and remove `zk`. Since leader can get - re-elected during the restore steps, consider doing it on all scheduler replicas. - * Depending on your particular security approach you will need to either turn off scheduler - authorization by removing scheduler `-http_authentication_mechanism` flag or make sure the - direct scheduler access is properly authorized. E.g.: in case of Kerberos you will need to make - a `/etc/hosts` file change to match your local IP to the scheduler URL configured in keytabs: - - <local_ip> <scheduler_domain_in_keytabs> - -* Next steps are required to put scheduler into a partially disabled state where it would still be -able to accept storage recovery requests but unable to schedule or change task states. This may be -accomplished by updating the following scheduler configuration options: - * Set `-mesos_master_address` to a non-existent zk address. 
This will prevent scheduler from - registering with Mesos. E.g.: `-mesos_master_address=zk://localhost:1111/mesos/master` - * `-max_registration_delay` - set to sufficiently long interval to prevent registration timeout - and as a result scheduler suicide. E.g: `-max_registration_delay=360mins` - * Make sure `-reconciliation_initial_delay` option is set high enough (e.g.: `365days`) to - prevent accidental task GC. This is important as scheduler will attempt to reconcile the cluster - state and will kill all tasks when restarted with an empty Mesos replicated log. - -* Restart all schedulers - -### Cleanup and re-initialize Mesos replicated log - -Get rid of the corrupted files and re-initialize Mesos replicated log: - -* Stop schedulers -* Delete all files under `-native_log_file_path` on all schedulers -* Initialize Mesos replica's log file: `sudo mesos-log initialize --path=<-native_log_file_path>` -* Start schedulers - -### Restore from backup - -At this point the scheduler is ready to rehydrate from the backup: - -* Identify the leading scheduler by: - * examining the `scheduler_lifecycle_LEADER_AWAITING_REGISTRATION` metric at the scheduler - `/vars` endpoint. Leader will have 1. All other replicas - 0. - * examining scheduler logs - * or examining Zookeeper registration under the path defined by `-zk_endpoints` - and `-serverset_path` - -* Locate the desired backup file, copy it to the leading scheduler's `-backup_dir` folder and stage -recovery by running the following command on a leader -`aurora_admin scheduler_stage_recovery --bypass-leader-redirect <cluster> scheduler-backup-<yyyy-MM-dd-HH-mm>` - -* At this point, the recovery snapshot is staged and available for manual verification/modification -via `aurora_admin scheduler_print_recovery_tasks --bypass-leader-redirect` and -`scheduler_delete_recovery_tasks --bypass-leader-redirect` commands. -See `aurora_admin help <command>` for usage details. - -* Commit recovery. This instructs the scheduler to overwrite the existing Mesos replicated log with -the provided backup snapshot and initiate a mandatory failover -`aurora_admin scheduler_commit_recovery --bypass-leader-redirect <cluster>` - -### Cleanup -Undo any modification done during [Preparation](#preparation) sequence. - http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/storage.md ---------------------------------------------------------------------- diff --git a/docs/storage.md b/docs/storage.md deleted file mode 100644 index 6ffed54..0000000 --- a/docs/storage.md +++ /dev/null @@ -1,88 +0,0 @@ -#Aurora Scheduler Storage - -- [Overview](#overview) -- [Reads, writes, modifications](#reads-writes-modifications) - - [Read lifecycle](#read-lifecycle) - - [Write lifecycle](#write-lifecycle) -- [Atomicity, consistency and isolation](#atomicity-consistency-and-isolation) -- [Population on restart](#population-on-restart) - -## Overview - -Aurora scheduler maintains data that need to be persisted to survive failovers and restarts. -For example: - -* Task configurations and scheduled task instances -* Job update configurations and update progress -* Production resource quotas -* Mesos resource offer host attributes - -Aurora solves its persistence needs by leveraging the Mesos implementation of a Paxos replicated -log [[1]](https://ramcloud.stanford.edu/~ongaro/userstudy/paxos.pdf) -[[2]](http://en.wikipedia.org/wiki/State_machine_replication) with a key-value -[LevelDB](https://github.com/google/leveldb) storage as persistence media. 
Conceptually, it can be represented by the following major components:

* Volatile storage: in-memory cache of all available data. Implemented via an in-memory
[H2 Database](http://www.h2database.com/html/main.html) and accessed via
[MyBatis](http://mybatis.github.io/mybatis-3/).
* Log manager: interface between Aurora storage and the Mesos replicated log. The default schema format
is [thrift](https://github.com/apache/thrift). Data is stored in serialized binary form.
* Snapshot manager: all data is periodically persisted in the Mesos replicated log in a single snapshot.
This helps establish periodic recovery checkpoints and speeds up volatile storage recovery on
restart.
* Backup manager: as a precaution, snapshots are periodically written out into backup files.
This solves a [disaster recovery problem](storage-config.md#recovering-from-a-scheduler-backup)
in case of a complete loss or corruption of Mesos log files.

## Reads, writes, modifications

All services in Aurora access data via a set of predefined store interfaces (aka stores) logically
grouped by the type of data they serve. Every interface defines a specific set of operations allowed
on the data, thus abstracting out the storage access and the actual persistence implementation. The
latter is especially important in view of the general immutability of persisted data. With the Mesos
replicated log as the underlying persistence solution, data can be read and written easily but not
modified. All modifications are simulated by saving new versions of modified objects. This feature
and general performance considerations justify the existence of the volatile in-memory store.

### Read lifecycle

There are two types of reads available in Aurora: consistent and weakly-consistent. The difference
is explained [below](#atomicity-consistency-and-isolation).

All reads are served from the volatile storage, making reads generally cheap storage operations
from the performance standpoint. The majority of the volatile stores are backed by the
in-memory H2 database. This allows for rich schema definitions, queries and relationships that
key-value storage is unable to match.

### Write lifecycle

Writes are more involved operations since, in addition to updating the volatile store, data has to be
appended to the replicated log. Data is not available for reads until fully acknowledged by both the
replicated log and volatile storage.

## Atomicity, consistency and isolation

Aurora uses [write-ahead logging](http://en.wikipedia.org/wiki/Write-ahead_logging) to ensure
consistency between replicated and volatile storage. In Aurora, data is first written into the
replicated log and only then updated in the volatile store.

Aurora storage uses read-write locks to serialize data mutations and provide a consistent view of the
available data. The `Storage` interface exposes 3 major types of operations:

* `consistentRead` - access is locked using the reader's lock and provides a consistent view on read
* `weaklyConsistentRead` - access is lock-less. Delivers the best contention performance but may result
in stale reads
* `write` - access is fully serialized by using the writer's lock. Operation success requires both
volatile and replicated writes to succeed.

The consistency of the volatile store is enforced via H2 transactional isolation.
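As a purely conceptual illustration of the write-ahead ordering just described (this is not Aurora
code; the real scheduler uses the Mesos replicated log, H2, and thrift serialization rather than the
toy structures below), a store that appends every mutation to a log before touching its in-memory
view can always rebuild that view by replaying the log:

```python
# Conceptual sketch of write-ahead logging, not Aurora code: a mutation is
# appended to a durable log before the in-memory view is updated, so the
# volatile state can always be rebuilt by replaying the log.
import threading

class WriteAheadStore:
    def __init__(self):
        self._log = []                   # stands in for the replicated log
        self._state = {}                 # stands in for the volatile in-memory store
        self._lock = threading.RLock()   # stands in for the writer's lock

    def write(self, key, value):
        with self._lock:
            self._log.append((key, value))   # 1. append to the log first
            self._state[key] = value         # 2. then update the volatile store

    def consistent_read(self, key):
        with self._lock:                     # serialized with writes
            return self._state.get(key)

    def recover(self):
        """Rebuild the volatile state by replaying the log, as on restart."""
        with self._lock:
            self._state = {}
            for key, value in self._log:
                self._state[key] = value
```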
- -## Population on restart - -Any time a scheduler restarts, it restores its volatile state from the most recent position recorded -in the replicated log by restoring the snapshot and replaying individual log entries on top to fully -recover the state up to the last write. - http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/task-lifecycle.md ---------------------------------------------------------------------- diff --git a/docs/task-lifecycle.md b/docs/task-lifecycle.md deleted file mode 100644 index 5d6456c..0000000 --- a/docs/task-lifecycle.md +++ /dev/null @@ -1,146 +0,0 @@ -# Task Lifecycle - -When Aurora reads a configuration file and finds a `Job` definition, it: - -1. Evaluates the `Job` definition. -2. Splits the `Job` into its constituent `Task`s. -3. Sends those `Task`s to the scheduler. -4. The scheduler puts the `Task`s into `PENDING` state, starting each - `Task`'s life cycle. - - - - -Please note, a couple of task states described below are missing from -this state diagram. - - -## PENDING to RUNNING states - -When a `Task` is in the `PENDING` state, the scheduler constantly -searches for machines satisfying that `Task`'s resource request -requirements (RAM, disk space, CPU time) while maintaining configuration -constraints such as "a `Task` must run on machines dedicated to a -particular role" or attribute limit constraints such as "at most 2 -`Task`s from the same `Job` may run on each rack". When the scheduler -finds a suitable match, it assigns the `Task` to a machine and puts the -`Task` into the `ASSIGNED` state. - -From the `ASSIGNED` state, the scheduler sends an RPC to the slave -machine containing `Task` configuration, which the slave uses to spawn -an executor responsible for the `Task`'s lifecycle. When the scheduler -receives an acknowledgment that the machine has accepted the `Task`, -the `Task` goes into `STARTING` state. - -`STARTING` state initializes a `Task` sandbox. When the sandbox is fully -initialized, Thermos begins to invoke `Process`es. Also, the slave -machine sends an update to the scheduler that the `Task` is -in `RUNNING` state. - - - -## RUNNING to terminal states - -There are various ways that an active `Task` can transition into a terminal -state. By definition, it can never leave this state. However, depending on -nature of the termination and the originating `Job` definition -(e.g. `service`, `max_task_failures`), a replacement `Task` might be -scheduled. - -### Natural Termination: FINISHED, FAILED - -A `RUNNING` `Task` can terminate without direct user interaction. For -example, it may be a finite computation that finishes, even something as -simple as `echo hello world.`, or it could be an exceptional condition in -a long-lived service. If the `Task` is successful (its underlying -processes have succeeded with exit status `0` or finished without -reaching failure limits) it moves into `FINISHED` state. If it finished -after reaching a set of failure limits, it goes into `FAILED` state. - -A terminated `TASK` which is subject to rescheduling will be temporarily -`THROTTLED`, if it is considered to be flapping. A task is flapping, if its -previous invocation was terminated after less than 5 minutes (scheduler -default). The time penalty a task has to remain in the `THROTTLED` state, -before it is eligible for rescheduling, increases with each consecutive -failure. - -### Forceful Termination: KILLING, RESTARTING - -You can terminate a `Task` by issuing an `aurora job kill` command, which -moves it into `KILLING` state. 
The scheduler then sends the slave a -request to terminate the `Task`. If the scheduler receives a successful -response, it moves the Task into `KILLED` state and never restarts it. - -If a `Task` is forced into the `RESTARTING` state via the `aurora job restart` -command, the scheduler kills the underlying task but in parallel schedules -an identical replacement for it. - -In any case, the responsible executor on the slave follows an escalation -sequence when killing a running task: - - 1. If a `HttpLifecycleConfig` is not present, skip to (4). - 2. Send a POST to the `graceful_shutdown_endpoint` and wait 5 seconds. - 3. Send a POST to the `shutdown_endpoint` and wait 5 seconds. - 4. Send SIGTERM (`kill`) and wait at most `finalization_wait` seconds. - 5. Send SIGKILL (`kill -9`). - -If the executor notices that all `Process`es in a `Task` have aborted -during this sequence, it will not proceed with subsequent steps. -Note that graceful shutdown is best-effort, and due to the many -inevitable realities of distributed systems, it may not be performed. - -### Unexpected Termination: LOST - -If a `Task` stays in a transient task state for too long (such as `ASSIGNED` -or `STARTING`), the scheduler forces it into `LOST` state, creating a new -`Task` in its place that's sent into `PENDING` state. - -In addition, if the Mesos core tells the scheduler that a slave has -become unhealthy (or outright disappeared), the `Task`s assigned to that -slave go into `LOST` state and new `Task`s are created in their place. -From `PENDING` state, there is no guarantee a `Task` will be reassigned -to the same machine unless job constraints explicitly force it there. - -### Giving Priority to Production Tasks: PREEMPTING - -Sometimes a Task needs to be interrupted, such as when a non-production -Task's resources are needed by a higher priority production Task. This -type of interruption is called a *pre-emption*. When this happens in -Aurora, the non-production Task is killed and moved into -the `PREEMPTING` state when both the following are true: - -- The task being killed is a non-production task. -- The other task is a `PENDING` production task that hasn't been - scheduled due to a lack of resources. - -The scheduler UI shows the non-production task was preempted in favor of -the production task. At some point, tasks in `PREEMPTING` move to `KILLED`. - -Note that non-production tasks consuming many resources are likely to be -preempted in favor of production tasks. - -### Making Room for Maintenance: DRAINING - -Cluster operators can set slave into maintenance mode. This will transition -all `Task` running on this slave into `DRAINING` and eventually to `KILLED`. -Drained `Task`s will be restarted on other slaves for which no maintenance -has been announced yet. - - - -## State Reconciliation - -Due to the many inevitable realities of distributed systems, there might -be a mismatch of perceived and actual cluster state (e.g. a machine returns -from a `netsplit` but the scheduler has already marked all its `Task`s as -`LOST` and rescheduled them). - -Aurora regularly runs a state reconciliation process in order to detect -and correct such issues (e.g. by killing the errant `RUNNING` tasks). -By default, the proper detection of all failure scenarios and inconsistencies -may take up to an hour. - -To emphasize this point: there is no uniqueness guarantee for a single -instance of a job in the presence of network partitions. 
If the `Task` -requires that, it should be baked in at the application level using a -distributed coordination service such as Zookeeper. http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/test-resource-generation.md ---------------------------------------------------------------------- diff --git a/docs/test-resource-generation.md b/docs/test-resource-generation.md deleted file mode 100644 index e78e742..0000000 --- a/docs/test-resource-generation.md +++ /dev/null @@ -1,24 +0,0 @@ -# Generating test resources - -## Background -The Aurora source repository and distributions contain several -[binary files](../src/test/resources/org/apache/thermos/root/checkpoints) to -qualify the backwards-compatibility of thermos with checkpoint data. Since -thermos persists state to disk, to be read by the thermos observer), it is important that we have -tests that prevent regressions affecting the ability to parse previously-written data. - -## Generating test files -The files included represent persisted checkpoints that exercise different -features of thermos. The existing files should not be modified unless -we are accepting backwards incompatibility, such as with a major release. - -It is not practical to write source code to generate these files on the fly, -as source would be vulnerable to drift (e.g. due to refactoring) in ways -that would undermine the goal of ensuring backwards compatibility. - -The most common reason to add a new checkpoint file would be to provide -coverage for new thermos features that alter the data format. This is -accomplished by writing and running a -[job configuration](configuration-reference.md) that exercises the feature, and -copying the checkpoint file from the sandbox directory, by default this is -`/var/run/thermos/checkpoints/<aurora task id>`. http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/thrift-deprecation.md ---------------------------------------------------------------------- diff --git a/docs/thrift-deprecation.md b/docs/thrift-deprecation.md deleted file mode 100644 index 62a71bc..0000000 --- a/docs/thrift-deprecation.md +++ /dev/null @@ -1,54 +0,0 @@ -# Thrift API Changes - -## Overview -Aurora uses [Apache Thrift](https://thrift.apache.org/) for representing structured data in -client/server RPC protocol as well as for internal data storage. While Thrift is capable of -correctly handling additions and renames of the existing members, field removals must be done -carefully to ensure backwards compatibility and provide predictable deprecation cycle. This -document describes general guidelines for making Thrift schema changes to the existing fields in -[api.thrift](../api/src/main/thrift/org/apache/aurora/gen/api.thrift). - -It is highly recommended to go through the -[Thrift: The Missing Guide](http://diwakergupta.github.io/thrift-missing-guide/) first to refresh on -basic Thrift schema concepts. - -## Checklist -Every existing Thrift schema modification is unique in its requirements and must be analyzed -carefully to identify its scope and expected consequences. The following checklist may help in that -analysis: -* Is this a new field/struct? If yes, go ahead -* Is this a pure field/struct rename without any type/structure change? 
If yes, go ahead and rename -* Anything else, read further to make sure your change is properly planned - -## Deprecation cycle -Any time a breaking change (e.g.: field replacement or removal) is required, the following cycle -must be followed: - -### vCurrent -Change is applied in a way that does not break scheduler/client with this version to -communicate with scheduler/client from vCurrent-1. -* Do not remove or rename the old field -* Add a new field as an eventual replacement of the old one and implement a dual read/write -anywhere the old field is used. If a thrift struct is mapped in the DB store make sure both columns -are marked as `NOT NULL` -* Check [storage.thrift](../api/src/main/thrift/org/apache/aurora/gen/storage.thrift) to see if the -affected struct is stored in Aurora scheduler storage. If so, you most likely need to backfill -existing data to ensure both fields are populated eagerly on startup. See -[this patch](https://reviews.apache.org/r/43172) as a real-life example of thrift-struct -backfilling. IMPORTANT: backfilling implementation needs to ensure both fields are populated. This -is critical to enable graceful scheduler upgrade as well as rollback to the old version if needed. -* Add a deprecation jira ticket into the vCurrent+1 release candidate -* Add a TODO for the deprecated field mentioning the jira ticket - -### vCurrent+1 -Finalize the change by removing the deprecated fields from the Thrift schema. -* Drop any dual read/write routines added in the previous version -* Remove thrift backfilling in scheduler -* Remove the deprecated Thrift field - -## Testing -It's always advisable to test your changes in the local vagrant environment to build more -confidence that you change is backwards compatible. It's easy to simulate different -client/scheduler versions by playing with `aurorabuild` command. See [this document](vagrant.md) -for more. - http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/tools.md ---------------------------------------------------------------------- diff --git a/docs/tools.md b/docs/tools.md deleted file mode 100644 index 2ae550d..0000000 --- a/docs/tools.md +++ /dev/null @@ -1,16 +0,0 @@ -# Tools - -Various tools integrate with Aurora. Is there a tool missing? Let us know, or submit a patch to add it! 
* Load-balancing technology used to direct traffic to services running on Aurora
  - [synapse](https://github.com/airbnb/synapse) based on HAProxy
  - [aurproxy](https://github.com/tellapart/aurproxy) based on nginx
  - [jobhopper](https://github.com/benley/aurora-jobhopper) performing HTTP redirects for easy developer and administrator access

* Monitoring
  - [collectd-aurora](https://github.com/zircote/collectd-aurora) for cluster monitoring using collectd
  - [Prometheus Aurora exporter](https://github.com/tommyulfsparre/aurora_exporter) for cluster monitoring using Prometheus
  - [Prometheus service discovery integration](http://prometheus.io/docs/operating/configuration/#zookeeper-serverset-sd-configurations-serverset_sd_config) for discovering and monitoring services running on Aurora

* Packaging and deployment
  - [aurora-packaging](https://github.com/apache/aurora-packaging), the source of the official Aurora packages

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/tutorial.md
----------------------------------------------------------------------
diff --git a/docs/tutorial.md b/docs/tutorial.md
deleted file mode 100644
index 95539ef..0000000
--- a/docs/tutorial.md
+++ /dev/null
@@ -1,260 +0,0 @@

# Aurora Tutorial

This tutorial shows how to use the Aurora scheduler to run (and "`printf-debug`")
a hello world program on Mesos. This is the recommended document for new Aurora users
to start getting up to speed on the system.

- [Prerequisite](#prerequisite)
- [The Script](#the-script)
- [Aurora Configuration](#aurora-configuration)
- [Creating the Job](#creating-the-job)
- [Watching the Job Run](#watching-the-job-run)
- [Cleanup](#cleanup)
- [Next Steps](#next-steps)

## Prerequisite

This tutorial assumes you are running [Aurora locally using Vagrant](vagrant.md).
However, in general the instructions are also applicable to any other
[Aurora installation](installing.md).

Unless otherwise stated, all commands are to be run from the root of the aurora
repository clone.

## The Script

Our "hello world" application is a simple Python script that loops
forever, displaying the time every few seconds. Copy the code below and
put it in a file named `hello_world.py` in the root of your Aurora repository clone
(note: this directory is the same as `/vagrant` inside the Vagrant VMs).

The script has an intentional bug, which we will explain later on.

<!-- NOTE: If you are changing this file, be sure to also update examples/vagrant/test_tutorial.sh.
-->
```python
import time

def main():
  SLEEP_DELAY = 10
  # Python ninjas - ignore this blatant bug.
  for i in xrang(100):
    print("Hello world! The time is now: %s. Sleeping for %d secs" % (
      time.asctime(), SLEEP_DELAY))
    time.sleep(SLEEP_DELAY)

if __name__ == "__main__":
  main()
```

## Aurora Configuration

Once we have our script/program, we need to create a *configuration
file* that tells Aurora how to manage and launch our Job. Save the code below
in the file `hello_world.aurora`.

<!-- NOTE: If you are changing this file, be sure to also update examples/vagrant/test_tutorial.sh.
-->
```python
pkg_path = '/vagrant/hello_world.py'

# we use a trick here to make the configuration change with
# the contents of the file, for simplicity. In a normal setting, packages would be
# versioned, and the version number would be changed in the configuration.
-import hashlib -with open(pkg_path, 'rb') as f: - pkg_checksum = hashlib.md5(f.read()).hexdigest() - -# copy hello_world.py into the local sandbox -install = Process( - name = 'fetch_package', - cmdline = 'cp %s . && echo %s && chmod +x hello_world.py' % (pkg_path, pkg_checksum)) - -# run the script -hello_world = Process( - name = 'hello_world', - cmdline = 'python -u hello_world.py') - -# describe the task -hello_world_task = SequentialTask( - processes = [install, hello_world], - resources = Resources(cpu = 1, ram = 1*MB, disk=8*MB)) - -jobs = [ - Service(cluster = 'devcluster', - environment = 'devel', - role = 'www-data', - name = 'hello_world', - task = hello_world_task) -] -``` - -There is a lot going on in that configuration file: - -1. From a "big picture" viewpoint, it first defines two -Processes. Then it defines a Task that runs the two Processes in the -order specified in the Task definition, as well as specifying what -computational and memory resources are available for them. Finally, -it defines a Job that will schedule the Task on available and suitable -machines. This Job is the sole member of a list of Jobs; you can -specify more than one Job in a config file. - -2. At the Process level, it specifies how to get your code into the -local sandbox in which it will run. It then specifies how the code is -actually run once the second Process starts. - -For more about Aurora configuration files, see the [Configuration -Tutorial](configuration-tutorial.md) and the [Aurora + Thermos -Reference](configuration-reference.md) (preferably after finishing this -tutorial). - - -## Creating the Job - -We're ready to launch our job! To do so, we use the Aurora Client to -issue a Job creation request to the Aurora scheduler. - -Many Aurora Client commands take a *job key* argument, which uniquely -identifies a Job. A job key consists of four parts, each separated by a -"/". The four parts are `<cluster>/<role>/<environment>/<jobname>` -in that order: - -* Cluster refers to the name of a particular Aurora installation. -* Role names are user accounts existing on the slave machines. If you -don't know what accounts are available, contact your sysadmin. -* Environment names are namespaces; you can count on `test`, `devel`, -`staging` and `prod` existing. -* Jobname is the custom name of your job. - -When comparing two job keys, if any of the four parts is different from -its counterpart in the other key, then the two job keys identify two separate -jobs. If all four values are identical, the job keys identify the same job. - -The `clusters.json` [client configuration](client-cluster-configuration.md) -for the Aurora scheduler defines the available cluster names. -For Vagrant, from the top-level of your Aurora repository clone, do: - - $ vagrant ssh - -Followed by: - - vagrant@aurora:~$ cat /etc/aurora/clusters.json - -You'll see something like the following. The `name` value shown here, corresponds to a job key's cluster value. - -```javascript -[{ - "name": "devcluster", - "zk": "192.168.33.7", - "scheduler_zk_path": "/aurora/scheduler", - "auth_mechanism": "UNAUTHENTICATED", - "slave_run_directory": "latest", - "slave_root": "/var/lib/mesos" -}] -``` - -The Aurora Client command that actually runs our Job is `aurora job create`. It creates a Job as -specified by its job key and configuration file arguments and runs it. 
- - aurora job create <cluster>/<role>/<environment>/<jobname> <config_file> - -Or for our example: - - aurora job create devcluster/www-data/devel/hello_world /vagrant/hello_world.aurora - -After entering our virtual machine using `vagrant ssh`, this returns: - - vagrant@aurora:~$ aurora job create devcluster/www-data/devel/hello_world /vagrant/hello_world.aurora - INFO] Creating job hello_world - INFO] Checking status of devcluster/www-data/devel/hello_world - Job create succeeded: job url=http://aurora.local:8081/scheduler/www-data/devel/hello_world - - -## Watching the Job Run - -Now that our job is running, let's see what it's doing. Access the -scheduler web interface at `http://$scheduler_hostname:$scheduler_port/scheduler` -Or when using `vagrant`, `http://192.168.33.7:8081/scheduler` -First we see what Jobs are scheduled: - - - -Click on your user name, which in this case was `www-data`, and we see the Jobs associated -with that role: - - - -If you click on your `hello_world` Job, you'll see: - - - -Oops, looks like our first job didn't quite work! The task is temporarily throttled for -having failed on every attempt of the Aurora scheduler to run it. We have to figure out -what is going wrong. - -On the Completed tasks tab, we see all past attempts of the Aurora scheduler to run our job. - - - -We can navigate to the Task page of a failed run by clicking on the host link. - - - -Once there, we see that the `hello_world` process failed. The Task page -captures the standard error and standard output streams and makes them available. -Clicking through to `stderr` on the failed `hello_world` process, we see what happened. - - - -It looks like we made a typo in our Python script. We wanted `xrange`, -not `xrang`. Edit the `hello_world.py` script to use the correct function -and save it as `hello_world_v2.py`. Then update the `hello_world.aurora` -configuration to the newest version. - -In order to try again, we can now instruct the scheduler to update our job: - - vagrant@aurora:~$ aurora update start devcluster/www-data/devel/hello_world /vagrant/hello_world.aurora - INFO] Starting update for: hello_world - Job update has started. View your update progress at http://aurora.local:8081/scheduler/www-data/devel/hello_world/update/8ef38017-e60f-400d-a2f2-b5a8b724e95b - -This time, the task comes up. - - - -By again clicking on the host, we inspect the Task page, and see that the -`hello_world` process is running. - - - -We then inspect the output by clicking on `stdout` and see our process' -output: - - - -## Cleanup - -Now that we're done, we kill the job using the Aurora client: - - vagrant@aurora:~$ aurora job killall devcluster/www-data/devel/hello_world - INFO] Killing tasks for job: devcluster/www-data/devel/hello_world - INFO] Instances to be killed: [0] - Successfully killed instances [0] - Job killall succeeded - -The job page now shows the `hello_world` tasks as completed. - - - -## Next Steps - -Now that you've finished this Tutorial, you should read or do the following: - -- [The Aurora Configuration Tutorial](configuration-tutorial.md), which provides more examples - and best practices for writing Aurora configurations. You should also look at - the [Aurora + Thermos Configuration Reference](configuration-reference.md). -- The [Aurora User Guide](user-guide.md) provides an overview of how Aurora, Mesos, and - Thermos work "under the hood". -- Explore the Aurora Client - use `aurora -h`, and read the - [Aurora Client Commands](client-commands.md) document. 
http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/user-guide.md ---------------------------------------------------------------------- diff --git a/docs/user-guide.md b/docs/user-guide.md deleted file mode 100644 index 656296c..0000000 --- a/docs/user-guide.md +++ /dev/null @@ -1,244 +0,0 @@ -Aurora User Guide ------------------ - -- [Overview](#user-content-overview) -- [Job Lifecycle](#user-content-job-lifecycle) - - [Task Updates](#user-content-task-updates) - - [HTTP Health Checking](#user-content-http-health-checking) -- [Service Discovery](#user-content-service-discovery) -- [Configuration](#user-content-configuration) -- [Creating Jobs](#user-content-creating-jobs) -- [Interacting With Jobs](#user-content-interacting-with-jobs) - -Overview --------- - -This document gives an overview of how Aurora works under the hood. -It assumes you've already worked through the "hello world" example -job in the [Aurora Tutorial](tutorial.md). Specifics of how to use Aurora are **not** - given here, but pointers to documentation about how to use Aurora are -provided. - -Aurora is a Mesos framework used to schedule *jobs* onto Mesos. Mesos -cares about individual *tasks*, but typical jobs consist of dozens or -hundreds of task replicas. Aurora provides a layer on top of Mesos with -its `Job` abstraction. An Aurora `Job` consists of a task template and -instructions for creating near-identical replicas of that task (modulo -things like "instance id" or specific port numbers which may differ from -machine to machine). - -How many tasks make up a Job is complicated. On a basic level, a Job consists of -one task template and instructions for creating near-idential replicas of that task -(otherwise referred to as "instances" or "shards"). - -However, since Jobs can be updated on the fly, a single Job identifier or *job key* -can have multiple job configurations associated with it. - -For example, consider when I have a Job with 4 instances that each -request 1 core of cpu, 1 GB of RAM, and 1 GB of disk space as specified -in the configuration file `hello_world.aurora`. I want to -update it so it requests 2 GB of RAM instead of 1. I create a new -configuration file to do that called `new_hello_world.aurora` and -issue a `aurora update start <job_key_value>/0-1 new_hello_world.aurora` -command. - -This results in instances 0 and 1 having 1 cpu, 2 GB of RAM, and 1 GB of disk space, -while instances 2 and 3 have 1 cpu, 1 GB of RAM, and 1 GB of disk space. If instance 3 -dies and restarts, it restarts with 1 cpu, 1 GB RAM, and 1 GB disk space. - -So that means there are two simultaneous task configurations for the same Job -at the same time, just valid for different ranges of instances. - -This isn't a recommended pattern, but it is valid and supported by the -Aurora scheduler. This most often manifests in the "canary pattern" where -instance 0 runs with a different configuration than instances 1-N to test -different code versions alongside the actual production job. - -A task can merely be a single *process* corresponding to a single -command line, such as `python2.6 my_script.py`. However, a task can also -consist of many separate processes, which all run within a single -sandbox. For example, running multiple cooperating agents together, -such as `logrotate`, `installer`, master, or slave processes. This is -where Thermos comes in. 
While Aurora provides a `Job` abstraction on top of Mesos `Task`s, Thermos provides a
`Process` abstraction underneath Mesos `Task`s and serves as part of the Aurora
framework's executor.

You define `Job`s, `Task`s, and `Process`es in a configuration file. Configuration files
are written in Python and make use of the Pystachio templating language. They end in a
`.aurora` extension.

Pystachio is a type-checked dictionary templating library.

> TL;DR
>
> - Aurora manages jobs made of tasks.
> - Mesos manages tasks made of processes.
> - Thermos manages processes.
> - All are defined in a `.aurora` configuration file.

Each `Task` has a *sandbox* created when the `Task` starts and garbage collected when it
finishes. All of a `Task`'s processes run in its sandbox, so processes can share state by
using a shared current working directory.

The sandbox garbage collection policy considers many factors, most importantly age and
size. It makes a best-effort attempt to keep sandboxes around as long as possible
post-task so that service owners can inspect data and logs, should the `Task` have
completed abnormally. But you can't design your applications assuming sandboxes will be
around forever; if you need logs or other state to outlive the sandbox, build log saving
or other checkpointing mechanisms directly into your application or into your `Job`
description.


Job Lifecycle
-------------

`Job`s and their `Task`s have various states that are described in the
[Task Lifecycle](task-lifecycle.md). However, in day-to-day use you'll primarily be
concerned with launching new jobs and updating existing ones.


### Task Updates

`Job` configurations can be updated at any point in their lifecycle. Usually updates are
done incrementally using a process called a *rolling upgrade*, in which Tasks are upgraded
in small groups, one group at a time. Updates are done using various Aurora client
commands.

For a configuration update, the Aurora client calculates the required changes by comparing
the current job config state with the new desired job config. It then starts a rolling,
batched update process by going through every batch and performing these operations:

- If an instance is present in the scheduler but isn't in the new config, that instance is
  killed.
- If an instance is not present in the scheduler but is present in the new config, the
  instance is created.
- If an instance is present in both the scheduler and the new config, the client diffs
  both task configs. If it detects any changes, it performs an instance update by killing
  the instance running the old config and adding an instance with the new config.

The Aurora client continues through the instance list until all tasks are updated, in
`RUNNING`, and healthy for a configurable amount of time. If the client determines the
update is not going well (a percentage of health checks have failed), it cancels the
update.

Update cancellation runs a procedure similar to the update sequence described above, but
in reverse order. New instance configs are swapped back to the old instance configs, and
batch updates proceed backwards from the point where the update failed. For example, if
the batches were (0,1,2), (3,4,5), (6,7,8) and the failure happened on instance 8, the
rollback runs in the order (8,7,6), (5,4,3), (2,1,0).

### HTTP Health Checking

The Executor implements a protocol for rudimentary control of a task via HTTP. Tasks
subscribe to this protocol by declaring a port named `health`.
Take for example this configuration snippet:

    nginx = Process(
      name = 'nginx',
      cmdline = './run_nginx.sh -port {{thermos.ports[health]}}')

When this Process is included in a job, the job will be allocated a port, and the command
line will be replaced with something like:

    ./run_nginx.sh -port 42816

where 42816 happens to be the allocated port. Typically, the Executor monitors Processes
within a task only by liveness of the forked process. However, when a `health` port is
allocated, it also sends periodic HTTP health checks. A task requesting a `health` port
must handle the following requests:

| HTTP request  | Description                           |
| ------------- | ------------------------------------- |
| `GET /health` | Inquires whether the task is healthy. |

Please see the
[configuration reference](configuration-reference.md#user-content-healthcheckconfig-objects)
for configuration options for this feature.

#### Snoozing Health Checks

If you need to pause your health check, you can do so by touching a file inside your
sandbox named `.healthchecksnooze`.

As long as that file is present, health checks are disabled, enabling users to gather core
dumps or take other measurements without worrying about Aurora's health checks killing
their process.

WARNING: Remember to remove this file when you are done, otherwise your instance will have
permanently disabled health checks.


Configuration
-------------

You define and configure your Jobs (and their Tasks and Processes) in Aurora configuration
files. Their filenames end with the `.aurora` suffix, and you write them in Python making
use of the Pystachio templating language, along with specific Aurora, Mesos, and Thermos
commands and methods. See the [Configuration Guide and Reference](configuration-reference.md)
and [Configuration Tutorial](configuration-tutorial.md).

Service Discovery
-----------------

It is possible for the Aurora executor to announce tasks into ServerSets for the purpose
of service discovery. ServerSets use the ZooKeeper
[group membership pattern](http://zookeeper.apache.org/doc/trunk/recipes.html#sc_outOfTheBox),
of which there are several reference implementations:

  - [C++](https://github.com/apache/mesos/blob/master/src/zookeeper/group.cpp)
  - [Java](https://github.com/twitter/commons/blob/master/src/java/com/twitter/common/zookeeper/ServerSetImpl.java#L221)
  - [Python](https://github.com/twitter/commons/blob/master/src/python/twitter/common/zookeeper/serverset/serverset.py#L51)

These can also be used natively in Finagle via the
[ZookeeperServerSetCluster](https://github.com/twitter/finagle/blob/master/finagle-serversets/src/main/scala/com/twitter/finagle/zookeeper/ZookeeperServerSetCluster.scala).

For more information about how to configure announcing, see the
[Configuration Reference](configuration-reference.md).

Creating Jobs
-------------

You create and manipulate Aurora Jobs with the Aurora client, whose command-line commands
all start with `aurora`. See [Aurora Client Commands](client-commands.md) for details
about the Aurora client.

Interacting With Jobs
---------------------

You interact with Aurora jobs either via:

- Read-only Web UIs

  Part of the output from creating a new Job is a URL for the Job's scheduler UI page.
  For example:

      vagrant@precise64:~$ aurora job create devcluster/www-data/prod/hello \
        /vagrant/examples/jobs/hello_world.aurora
      INFO] Creating job hello
      INFO] Response from scheduler: OK (message: 1 new tasks pending for job www-data/prod/hello)
      INFO] Job url: http://precise64:8081/scheduler/www-data/prod/hello

  The "Job url" goes to the Job's scheduler UI page. To go to the overall scheduler UI
  page, stop at the "scheduler" part of the URL, in this case `http://precise64:8081/scheduler`.

  You can also reach the scheduler UI page via the client command `aurora job open`:

      aurora job open [<cluster>[/<role>[/<env>/<job_name>]]]

  If only the cluster is specified, it goes directly to that cluster's scheduler main page.
  If the role is specified, it goes to the top-level role page. If the full job key is
  specified, it goes directly to the job page, where you can inspect individual tasks.

  Once you click through to a role page, you see Jobs arranged separately as pending jobs,
  active jobs, and finished jobs. Jobs are arranged by role, typically a service account
  for production jobs and user accounts for test or development jobs.

- The Aurora client

  See [client commands](client-commands.md).

http://git-wip-us.apache.org/repos/asf/aurora/blob/f28f41a7/docs/vagrant.md
----------------------------------------------------------------------
diff --git a/docs/vagrant.md b/docs/vagrant.md
deleted file mode 100644
index 3bc201f..0000000
--- a/docs/vagrant.md
+++ /dev/null
@@ -1,137 +0,0 @@
Getting Started
===============

This document shows you how to configure a complete cluster using a virtual machine. This
setup replicates a real cluster on your development machine as closely as possible. After
you complete the steps outlined here, you will be ready to create and run your first
Aurora job.

The following sections describe these steps in detail:

1. [Overview](#user-content-overview)
1. [Install VirtualBox and Vagrant](#user-content-install-virtualbox-and-vagrant)
1. [Clone the Aurora repository](#user-content-clone-the-aurora-repository)
1. [Start the local cluster](#user-content-start-the-local-cluster)
1. [Log onto the VM](#user-content-log-onto-the-vm)
1. [Run your first job](#user-content-run-your-first-job)
1. [Rebuild components](#user-content-rebuild-components)
1. [Shut down or delete your local cluster](#user-content-shut-down-or-delete-your-local-cluster)
1. [Troubleshooting](#user-content-troubleshooting)


Overview
--------

The Aurora distribution includes a set of scripts that enable you to create a local
cluster on your development machine. These scripts use [Vagrant](https://www.vagrantup.com/)
and [VirtualBox](https://www.virtualbox.org/) to run and configure a virtual machine. Once
the virtual machine is running, the scripts install and initialize Aurora and any required
components to create the local cluster.


Install VirtualBox and Vagrant
------------------------------

First, download and install [VirtualBox](https://www.virtualbox.org/) on your development
machine.

Then download and install [Vagrant](https://www.vagrantup.com/). To verify that the
installation was successful, open a terminal window and type the `vagrant` command. You
should see a list of common commands for this tool.
Clone the Aurora repository
---------------------------

To obtain the Aurora source distribution, clone its Git repository using the following
command:

    git clone git://git.apache.org/aurora.git


Start the local cluster
-----------------------

Now change into the `aurora/` directory, which contains the Aurora source code and other
scripts and tools:

    cd aurora/

To start the local cluster, type the following command:

    vagrant up

This command uses the configuration scripts in the Aurora distribution to:

* Download a Linux system image.
* Start a virtual machine (VM) and configure it.
* Install the required build tools on the VM.
* Install Aurora's requirements (like [Mesos](http://mesos.apache.org/) and
  [Zookeeper](http://zookeeper.apache.org/)) on the VM.
* Build and install Aurora from source on the VM.
* Start Aurora's services on the VM.

This process takes several minutes to complete.

To verify that Aurora is running on the cluster, visit the following URLs:

* Scheduler - http://192.168.33.7:8081
* Observer - http://192.168.33.7:1338
* Mesos Master - http://192.168.33.7:5050
* Mesos Slave - http://192.168.33.7:5051


Log onto the VM
---------------

To SSH into the VM, run the following command on your development machine:

    vagrant ssh

To verify that Aurora is installed in the VM, type the `aurora` command. You should see a
list of arguments and possible commands.

The `/vagrant` directory on the VM is mapped to the `aurora/` local directory from which
you started the cluster. You can edit files inside this directory on your development
machine and access them from the VM under `/vagrant`.

A pre-installed `clusters.json` file refers to your local cluster as `devcluster`, which
you will use in client commands.


Run your first job
------------------

Now that your cluster is up and running, you are ready to define and run your first job in
Aurora. For more information, see the [Aurora Tutorial](tutorial.md).


Rebuild components
------------------

If you are changing Aurora code and would like to rebuild a component, you can use the
`aurorabuild` command on the VM to build and restart a component. This is considerably
faster than destroying and rebuilding your VM.

`aurorabuild` accepts a list of components to build and update. For example, to rebuild
and restart the client:

    vagrant ssh -c 'aurorabuild client'

To get a list of supported components, invoke the `aurorabuild` command with no arguments.


Shut down or delete your local cluster
--------------------------------------

To shut down your local cluster, run the `vagrant halt` command on your development
machine. To start it again, run the `vagrant up` command.

Once you are finished with your local cluster, or if you would like to start from scratch,
you can use the `vagrant destroy` command to shut down and delete the virtual machine and
its file system.


Troubleshooting
---------------

Most Vagrant-related problems can be fixed with the following steps:

* Destroy the Vagrant environment with `vagrant destroy`.
* Kill any orphaned VMs (see AURORA-499) with the VirtualBox UI or the `VBoxManage`
  command-line tool.
* Clean the repository of build artifacts and other intermediate output with `git clean -fdx`.
* Bring up the Vagrant environment again with `vagrant up`.
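As a quick sanity check after `vagrant up`, or while troubleshooting, you can poll the endpoints listed above from your development machine. The script below is only an illustrative helper, not something shipped with Aurora; the endpoint list simply mirrors the URLs given earlier in this document, and any failure just means that service did not answer over HTTP.

```python
# check_cluster.py -- hypothetical helper, not part of the Aurora repo.
# Polls the local-cluster endpoints listed in this document and reports
# which ones respond over HTTP.
import urllib2

ENDPOINTS = {
    'Scheduler': 'http://192.168.33.7:8081',
    'Observer': 'http://192.168.33.7:1338',
    'Mesos Master': 'http://192.168.33.7:5050',
    'Mesos Slave': 'http://192.168.33.7:5051',
}

def main():
    for name, url in sorted(ENDPOINTS.items()):
        try:
            urllib2.urlopen(url, timeout=5)
            print('%-12s OK      %s' % (name, url))
        except Exception as e:  # connection refused, timeout, HTTP error, ...
            print('%-12s FAILED  %s (%s)' % (name, url, e))

if __name__ == '__main__':
    main()
```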
