Repository: aurora Updated Branches: refs/heads/master e32f4fbd1 -> 18533141c
Extend the resource isolation and oversubscription documentation I had to answer a couple of questions regarding these over the recent weeks and thought it might make sense to update the docs accordingly. Reviewed at https://reviews.apache.org/r/51602/ Project: http://git-wip-us.apache.org/repos/asf/aurora/repo Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/18533141 Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/18533141 Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/18533141 Branch: refs/heads/master Commit: 18533141cdf7d4a1e0ce016073274f169548f354 Parents: e32f4fb Author: Stephan Erb <[email protected]> Authored: Sun Sep 4 00:02:44 2016 +0200 Committer: Stephan Erb <[email protected]> Committed: Sun Sep 4 00:02:44 2016 +0200 ---------------------------------------------------------------------- docs/features/multitenancy.md | 1 + docs/features/resource-isolation.md | 54 +++++++++++++++++--------------- docs/operations/configuration.md | 45 +++++++++++++++++++++----- 3 files changed, 67 insertions(+), 33 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/aurora/blob/18533141/docs/features/multitenancy.md ---------------------------------------------------------------------- diff --git a/docs/features/multitenancy.md b/docs/features/multitenancy.md index cb45beb..301170d 100644 --- a/docs/features/multitenancy.md +++ b/docs/features/multitenancy.md @@ -40,6 +40,7 @@ Configuration Tiers Tier is a predefined bundle of task configuration options. Aurora schedules tasks and assigns them resources based on their tier assignment. The default scheduler tier configuration allows for 3 tiers: + - `revocable`: The `revocable` tier requires the task to run with [revocable](resource-isolation.md#oversubscription) resources. 
- `preemptible`: Setting the task's tier to `preemptible` allows for the possibility of that task http://git-wip-us.apache.org/repos/asf/aurora/blob/18533141/docs/features/resource-isolation.md ---------------------------------------------------------------------- diff --git a/docs/features/resource-isolation.md b/docs/features/resource-isolation.md index 59da823..01c5b40 100644 --- a/docs/features/resource-isolation.md +++ b/docs/features/resource-isolation.md @@ -1,6 +1,9 @@ Resources Isolation and Sizing ============================== +This document assumes Aurora and Mesos have been configured +using our [recommended resource isolation settings](../operations/configuration.md#resource-isolation). + - [Isolation](#isolation) - [Sizing](#sizing) - [Oversubscription](#oversubscription) @@ -11,11 +14,13 @@ Isolation Aurora is a multi-tenant system; a single software instance runs on a server, serving multiple clients/tenants. To share resources among -tenants, it implements isolation of: +tenants, it leverages Mesos for isolation of: * CPU +* GPU * memory * disk space +* ports CPU is a soft limit, and handled differently from memory and disk space. Too low a CPU value results in throttling your application and @@ -24,10 +29,10 @@ application goes over these values, it's killed. ### CPU Isolation -Mesos uses a quota based CPU scheduler (the *Completely Fair Scheduler*) -to provide consistent and predictable performance. This is effectively -a guarantee of resources -- you receive at least what you requested, but -also no more than you've requested. +Mesos can be configured to use a quota based CPU scheduler (the *Completely* +*Fair Scheduler*) to provide consistent and predictable performance. +This is effectively a guarantee of resources -- you receive at least what +you requested, but also no more than you've requested. The scheduler gives applications a CPU quota for every 100 ms interval. 
When an application uses its quota for an interval, it is throttled for @@ -103,11 +108,11 @@ will be killed shortly after. This is subject to change. ### GPU Isolation -GPU isolation will be supported for Nvidia devices starting from Mesos 0.29.0. +GPU isolation will be supported for Nvidia devices starting from Mesos 1.0. Access to the allocated units will be exclusive with no sharing between tasks -allowed (e.g. no fractional GPU allocation). Until official documentation is released, -see [Mesos design document](https://docs.google.com/document/d/10GJ1A80x4nIEo8kfdeo9B11PIbS1xJrrB4Z373Ifkpo/edit#heading=h.w84lz7p4eexl) -for more details. +allowed (e.g. no fractional GPU allocation). For more details, see the +[Mesos design document](https://docs.google.com/document/d/10GJ1A80x4nIEo8kfdeo9B11PIbS1xJrrB4Z373Ifkpo/edit#heading=h.w84lz7p4eexl) +and the [Mesos agent configuration](http://mesos.apache.org/documentation/latest/configuration/). ### Other Resources @@ -154,26 +159,23 @@ into the application's sandbox space. GPU is highly dependent on your application requirements and is only limited by the number of physical GPU units available on a target box. + Oversubscription ---------------- -**WARNING**: This feature is currently in alpha status. Do not use it in production clusters! - -Mesos [supports a concept of revocable tasks](http://mesos.apache.org/documentation/latest/oversubscription/) -by oversubscribing machine resources by the amount deemed safe to not affect the existing -non-revocable tasks. Aurora now supports revocable jobs via a `tier` setting set to `revocable` -value. - -The Aurora scheduler must be configured to receive revocable offers from Mesos and accept revocable -jobs. If not configured properly revocable tasks will never get assigned to hosts and will stay in -`PENDING`. 
Set these scheduler flag to allow receiving revocable Mesos offers: - - -receive_revocable_resources=true - -Specify a tier configuration file path (unless you want to use the [default](../../src/main/resources/org/apache/aurora/scheduler/tiers.json)): +Mesos supports [oversubscription of machine resources](http://mesos.apache.org/documentation/latest/oversubscription/) +via the concept of revocable tasks. In contrast to non-revocable tasks, revocable tasks are best-effort. +Mesos reserves the right to throttle or even kill them if they might affect existing high-priority +user-facing services. - -tier_config=path/to/tiers/config.json +As of today, the only revocable resource supported by Aurora are CPU resources. A job can opt-in to +use those by specifying the `revocable` [Configuration Tier](../features/multitenancy.md#configuration-tiers). +A revocable job will only be scheduled using revocable CPU resources, even if there are plenty of +non-revocable resources available. +The Aurora scheduler must be [configured to receive revocable offers](../operations/configuration.md#resource-isolation) +from Mesos and accept revocable jobs. If not configured properly revocable tasks will never get +assigned to hosts and will stay in `PENDING`. -See the [Configuration Reference](../reference/configuration.md) for details on how to mark a job -as being revocable. +For details on how to mark a job as being revocable, see the +[Configuration Reference](../reference/configuration.md). http://git-wip-us.apache.org/repos/asf/aurora/blob/18533141/docs/operations/configuration.md ---------------------------------------------------------------------- diff --git a/docs/operations/configuration.md b/docs/operations/configuration.md index 350ea77..85787b0 100644 --- a/docs/operations/configuration.md +++ b/docs/operations/configuration.md @@ -90,17 +90,48 @@ or truncating of the replicated log used by Aurora. 
In that case, see the docume Configuration options for the Aurora scheduler backup manager. -### `-backup_interval` -The interval on which the scheduler writes local storage backups. The default is every hour. +* `-backup_interval`: The interval on which the scheduler writes local storage backups. The default is every hour. +* `-backup_dir`: Directory to write backups to. +* `-max_saved_backups`: Maximum number of backups to retain before deleting the oldest backup(s). -### `-backup_dir` -Directory to write backups to. -### `-max_saved_backups` -Maximum number of backups to retain before deleting the oldest backup(s). +## Resource Isolation +For proper CPU, memory, and disk isolation as mentioned in our [enduser documentation](../features/resource-isolation.md), +we recommend to add the following isolators to the `--isolation` flag of the Mesos agent: -## Process Logs +* `cgroups/cpu` +* `cgroups/mem` +* `disk/du` + +In addition, we recommend to set the following [agent flags](http://mesos.apache.org/documentation/latest/configuration/): + +* `--cgroups_limit_swap` to enable memory limits on both memory and swap instead of just memory. + Alternatively, you could disable swap on your agent hosts. +* `--cgroups_enable_cfs` to enable hard limits on CPU resources via the CFS bandwidth limiting + feature. +* `--enforce_container_disk_quota` to enable disk quota enforcement for containers. + +To enable the optional GPU support in Mesos, please see the GPU related flags in the +[Mesos configuration](http://mesos.apache.org/documentation/latest/configuration/). 
+To enable the corresponding feature in Aurora, you have to start the scheduler with the +flag + + -allow_gpu_resource=true + +If you want to use revocable resources, first follow the +[Mesos oversubscription documentation](http://mesos.apache.org/documentation/latest/oversubscription/) +and then set set this Aurora scheduler flag to allow receiving revocable Mesos offers: + + -receive_revocable_resources=true + +Unless you want to use the [default](../../src/main/resources/org/apache/aurora/scheduler/tiers.json) +tier configuration, you will also have to specify a file path: + + -tier_config=path/to/tiers/config.json + + +## Thermos Process Logs ### Log destination By default, Thermos will write process stdout/stderr to log files in the sandbox. Process object
