Repository: aurora
Updated Branches:
  refs/heads/master f25c9c1de -> 9f6a6606c


Use 'Mesos agent' instead of 'Mesos slave' in docs

There are a few references left: those that refer to the `mesos-slave` command line
and those that document `clusters.json` attributes that use the term `slave`.

I am picking up this work from kts as the whole renaming has gained momentum in Mesos
https://github.com/apache/mesos/commit/24e1e098035ce918e0b73c9b3c751418d5c06064.

Bugs closed: AURORA-1451

Reviewed at https://reviews.apache.org/r/47495/


Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/9f6a6606
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/9f6a6606
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/9f6a6606

Branch: refs/heads/master
Commit: 9f6a6606cbe49450050b47f7a225322656b770b2
Parents: f25c9c1
Author: Stephan Erb <[email protected]>
Authored: Sun May 22 12:09:14 2016 +0200
Committer: Stephan Erb <[email protected]>
Committed: Sun May 22 12:09:54 2016 +0200

----------------------------------------------------------------------
 docs/development/client.md                     |  4 ++--
 docs/features/constraints.md                   | 18 +++++++++---------
 docs/features/sla-metrics.md                   |  4 ++--
 docs/getting-started/overview.md               |  4 +++-
 docs/getting-started/tutorial.md               |  2 +-
 docs/getting-started/vagrant.md                |  2 +-
 docs/operations/configuration.md               |  8 ++++----
 docs/operations/installation.md                |  4 ++--
 docs/operations/monitoring.md                  |  4 ++--
 docs/reference/client-cluster-configuration.md |  6 +++---
 docs/reference/client-commands.md              |  2 +-
 docs/reference/configuration-tutorial.md       |  2 +-
 docs/reference/configuration.md                |  8 ++++----
 docs/reference/task-lifecycle.md               | 20 ++++++++++----------
 14 files changed, 45 insertions(+), 43 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/development/client.md
----------------------------------------------------------------------
diff --git a/docs/development/client.md b/docs/development/client.md
index 41d44be..079c471 100644
--- a/docs/development/client.md
+++ b/docs/development/client.md
@@ -22,8 +22,8 @@ Running/Debugging

 For manually testing client changes against a cluster, we use [Vagrant](https://www.vagrantup.com/).
 To start a virtual cluster, you need to install Vagrant, and then run `vagrant up` for the root of
-the aurora workspace. This will create a vagrant host named "devcluster", with a mesos master, a set
-of mesos agents, and an aurora scheduler.
+the aurora workspace. This will create a vagrant host named "devcluster", with a Mesos master, a set
+of Mesos agents, and an Aurora scheduler.

 If you have a change you would like to test in your local cluster, you'll rebuild the client:


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/features/constraints.md
----------------------------------------------------------------------
diff --git a/docs/features/constraints.md b/docs/features/constraints.md
index 2a6a15e..866e4b9 100644
--- a/docs/features/constraints.md
+++ b/docs/features/constraints.md
@@ -1,7 +1,7 @@
 Scheduling Constraints
 ======================

-By default, Aurora will pick any random slave with sufficient resources
+By default, Aurora will pick any random agent with sufficient resources
 in order to schedule a task. This scheduling choice can be further
 restricted with the help of constraints.

@@ -11,10 +11,10 @@ Mesos Attributes

 Data centers are often organized with hierarchical failure domains. Common failure domains
 include hosts, racks, rows, and PDUs. If you have this information available, it is wise to tag
-the Mesos slave with them as
+the Mesos agent with them as
 [attributes](https://mesos.apache.org/documentation/attributes-resources/).
-The Mesos slave `--attributes` command line argument can be used to mark slaves with
+The Mesos agent `--attributes` command line argument can be used to mark agents with
 static key/value pairs, so called attributes (not to be confused with `--resources`, which are
 dynamic and accounted).

@@ -58,7 +58,7 @@ Value Constraints
 -----------------

 Value constraints can be used to express that a certain attribute with a certain value
-should be present on a Mesos slave. For example, the following job would only be
+should be present on a Mesos agent. For example, the following job would only be
 scheduled on nodes that claim to have an `SSD` as their disk.

     Service(
@@ -94,18 +94,18 @@ the scheduler requires that the `$role` component matches the `role` field in th
 configuration, and will reject the job creation otherwise. The remainder of the attribute is
 free-form. We've developed the idiom of formatting this attribute as `$role/$job`, but do not
 enforce this. For example: a job `devcluster/www-data/prod/hello` with a dedicated constraint set as
-`www-data/web.multi` will have its tasks scheduled only on Mesos slaves configured with:
+`www-data/web.multi` will have its tasks scheduled only on Mesos agents configured with:
 `--attributes=dedicated:www-data/web.multi`.

 A wildcard (`*`) may be used for the role portion of the dedicated attribute, which will allow any
 owner to elect for a job to run on the host(s). For example: tasks from both
 `devcluster/www-data/prod/hello` and `devcluster/vagrant/test/hello` with a dedicated constraint
-formatted as `*/web.multi` will be scheduled only on Mesos slaves configured with
+formatted as `*/web.multi` will be scheduled only on Mesos agents configured with
 `--attributes=dedicated:*/web.multi`. This may be useful when assembling a virtual cluster of
 machines sharing the same set of traits or requirements.

 ##### Example

-Consider the following slave command line:
+Consider the following agent command line:

     mesos-slave --attributes="dedicated:db_team/redis" ...
@@ -120,7 +120,7 @@ And this job configuration:
       ...
     )

-The job configuration is indicating that it should only be scheduled on slaves with the attribute
+The job configuration is indicating that it should only be scheduled on agents with the attribute
 `dedicated:db_team/redis`. Additionally, Aurora will prevent any tasks that do _not_ have that
-constraint from running on those slaves.
+constraint from running on those agents.


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/features/sla-metrics.md
----------------------------------------------------------------------
diff --git a/docs/features/sla-metrics.md b/docs/features/sla-metrics.md
index 11a1ced..932b5dc 100644
--- a/docs/features/sla-metrics.md
+++ b/docs/features/sla-metrics.md
@@ -45,11 +45,11 @@ will not degrade this metric.*
 A fault in the task environment may cause the Aurora/Mesos to have different views on the task state
 or lose track of the task existence. In such cases, the service task is marked as LOST and
 rescheduled by Aurora. For example, this may happen when the task stays in ASSIGNED or STARTING
-for too long or the Mesos slave becomes unhealthy (or disappears completely). The time between
+for too long or the Mesos agent becomes unhealthy (or disappears completely). The time between
 task entering LOST and its replacement reaching RUNNING state is counted towards platform downtime.

 Another example of a platform downtime event is the administrator-requested task rescheduling. This
-happens during planned Mesos slave maintenance when all slave tasks are marked as DRAINED and
+happens during planned Mesos agent maintenance when all agent tasks are marked as DRAINED and
 rescheduled elsewhere.

 To accurately calculate Platform Uptime, we must separate platform incurred downtime from user


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/getting-started/overview.md
----------------------------------------------------------------------
diff --git a/docs/getting-started/overview.md b/docs/getting-started/overview.md
index f49c911..492c92a 100644
--- a/docs/getting-started/overview.md
+++ b/docs/getting-started/overview.md
@@ -53,6 +53,8 @@ a functioning Aurora cluster.
   When a user task is launched, the agent will launch the executor (in the context of a Linux cgroup
   or Docker container depending upon the environment), which will in turn fork user processes.

+  In earlier versions of Mesos and Aurora, the Mesos agent was known as the Mesos slave.
+
 Jobs, Tasks and Processes
 --------------------------

@@ -73,7 +75,7 @@ A task can merely be a single *process* corresponding to a single
 command line, such as `python2.7 my_script.py`. However, a task can also
 consist of many separate processes, which all run within a single
 sandbox. For example, running multiple cooperating agents together,
-such as `logrotate`, `installer`, master, or slave processes. This is
+such as `logrotate`, `installer`, master, or agent processes. This is
 where Thermos comes in. While Aurora provides a `Job` abstraction on
 top of Mesos `Tasks`, Thermos provides a `Process` abstraction
 underneath Mesos `Task`s and serves as part of the Aurora framework's


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/getting-started/tutorial.md
----------------------------------------------------------------------
diff --git a/docs/getting-started/tutorial.md b/docs/getting-started/tutorial.md
index 94fdc38..1037ad7 100644
--- a/docs/getting-started/tutorial.md
+++ b/docs/getting-started/tutorial.md
@@ -122,7 +122,7 @@ identifies a Job. A job key consists of four parts, each separated by a
 in that order:

 * Cluster refers to the name of a particular Aurora installation.
-* Role names are user accounts existing on the slave machines. If you
+* Role names are user accounts existing on the agent machines. If you
 don't know what accounts are available, contact your sysadmin.
 * Environment names are namespaces; you can count on `test`, `devel`,
 `staging` and `prod` existing.


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/getting-started/vagrant.md
----------------------------------------------------------------------
diff --git a/docs/getting-started/vagrant.md b/docs/getting-started/vagrant.md
index 5c3ba05..4460600 100644
--- a/docs/getting-started/vagrant.md
+++ b/docs/getting-started/vagrant.md
@@ -85,7 +85,7 @@ To verify that Aurora is running on the cluster, visit the following URLs:
 * Scheduler - http://192.168.33.7:8081
 * Observer - http://192.168.33.7:1338
 * Mesos Master - http://192.168.33.7:5050
-* Mesos Slave - http://192.168.33.7:5051
+* Mesos Agent - http://192.168.33.7:5051


 Log onto the VM


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/operations/configuration.md
----------------------------------------------------------------------
diff --git a/docs/operations/configuration.md b/docs/operations/configuration.md
index 8f7f92a..65cf64a 100644
--- a/docs/operations/configuration.md
+++ b/docs/operations/configuration.md
@@ -75,7 +75,7 @@ other available Mesos replicated log configuration options and default values.
 ### Changing the Quorum Size
 Special care needs to be taken when changing the size of the Aurora scheduler quorum.
 Since Aurora uses a Mesos replicated log, similar steps need to be followed as when
-[changing the mesos quorum size](http://mesos.apache.org/documentation/latest/operational-guide).
+[changing the Mesos quorum size](http://mesos.apache.org/documentation/latest/operational-guide).

 As a preparation, increase `-native_log_quorum_size` on each existing scheduler and restart them.
 When updating from 3 to 5 schedulers, the quorum size would grow from 2 to 3.
@@ -148,7 +148,7 @@ For example, to wrap the executor inside a simple wrapper, the scheduler will be
 ### Docker containers
 In order for Aurora to launch jobs using docker containers, a few extra configuration options
 must be set. The [docker containerizer](http://mesos.apache.org/documentation/latest/docker-containerizer/)
-must be enabled on the mesos slaves by launching them with the `--containerizers=docker,mesos` option.
+must be enabled on the Mesos agents by launching them with the `--containerizers=docker,mesos` option.

 By default, Aurora will configure Mesos to copy the file specified in `-thermos_executor_path`
 into the container's sandbox. If using a wrapper script to launch the thermos executor,
@@ -158,10 +158,10 @@ wrapper script and executor are correctly copied into the sandbox. Finally, ensu
 script does not access resources outside of the sandbox, as when the script is run from within a
 docker container those resources will not exist.

-A scheduler flag, `-global_container_mounts` allows mounting paths from the host (i.e., the slave)
+A scheduler flag, `-global_container_mounts` allows mounting paths from the host (i.e the agent machine)
 into all containers on that host. The format is a comma separated list of host_path:container_path[:mode]
 tuples. For example `-global_container_mounts=/opt/secret_keys_dir:/mnt/secret_keys_dir:ro` mounts
-`/opt/secret_keys_dir` from the slaves into all launched containers. Valid modes are `ro` and `rw`.
+`/opt/secret_keys_dir` from the agents into all launched containers. Valid modes are `ro` and `rw`.

 If you would like to run a container with a read-only filesystem, it may also be necessary to
 pass to use the scheduler flag `-thermos_home_in_sandbox` in order to set HOME to the sandbox


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/operations/installation.md
----------------------------------------------------------------------
diff --git a/docs/operations/installation.md b/docs/operations/installation.md
index f026c01..d3fb529 100644
--- a/docs/operations/installation.md
+++ b/docs/operations/installation.md
@@ -145,7 +145,7 @@ The executor typically does not require configuration. Command line arguments c
 be passed to the executor using a command line argument on the scheduler.

 The observer needs to be configured to look at the correct mesos directory in order to find task
-sandboxes. You should 1st find the Mesos working directory by looking for the Mesos slave
+sandboxes. You should 1st find the Mesos working directory by looking for the Mesos agent
 `--work_dir` flag. You should see something like:

     ps -eocmd | grep "mesos-slave" | grep -v grep | tr ' ' '\n' | grep "\--work_dir"
@@ -237,7 +237,7 @@ dev, test, prod) for a production job.


 ## Installing Mesos
-Mesos uses a single package for the Mesos master and slave. As a result, the package dependencies
+Mesos uses a single package for the Mesos master and agent. As a result, the package dependencies
 are identical for both.

 ### Mesos on Ubuntu Trusty


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/operations/monitoring.md
----------------------------------------------------------------------
diff --git a/docs/operations/monitoring.md b/docs/operations/monitoring.md
index 3cb2a79..2ceda62 100644
--- a/docs/operations/monitoring.md
+++ b/docs/operations/monitoring.md
@@ -119,7 +119,7 @@ The number of tasks stored in the scheduler that are in the `LOST` state, and ha

 If this value is increasing at a high rate, it is a sign of trouble.

-There are many sources of `LOST` tasks in Mesos: the scheduler, master, slave, and executor can all
+There are many sources of `LOST` tasks in Mesos: the scheduler, master, agent, and executor can all
 trigger this. The first step is to look in the scheduler logs for `LOST` to identify where the
 state changes are originating.

@@ -169,7 +169,7 @@ This value is currently known to increase occasionally when the scheduler fails
 value warrants investigation.

 The scheduler will log when it times out a task. You should trace the task ID of the timed out
-task into the master, slave, and/or executors to determine where the message was dropped.
+task into the master, agent, and/or executors to determine where the message was dropped.

 ### `http_500_responses_events`
 Type: integer counter


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/reference/client-cluster-configuration.md
----------------------------------------------------------------------
diff --git a/docs/reference/client-cluster-configuration.md b/docs/reference/client-cluster-configuration.md
index ee02ca1..5a86cda 100644
--- a/docs/reference/client-cluster-configuration.md
+++ b/docs/reference/client-cluster-configuration.md
@@ -27,8 +27,8 @@ The following properties may be set:
   **Property**             | **Type** | **Description**
   :------------------------| :------- | :--------------
    **name**                | String   | Cluster name (Required)
-   **slave_root**          | String   | Path to mesos slave work dir (Required)
-   **slave_run_directory** | String   | Name of mesos slave run dir (Required)
+   **slave_root**          | String   | Path to Mesos agent work dir (Required)
+   **slave_run_directory** | String   | Name of Mesos agent run dir (Required)
    **zk**                  | String   | Hostname of ZooKeeper instance used to resolve Aurora schedulers.
    **zk_port**             | Integer  | Port of ZooKeeper instance used to locate Aurora schedulers (Default: 2181)
    **scheduler_zk_path**   | String   | ZooKeeper path under which scheduler instances are registered.
@@ -46,7 +46,7 @@ any job keys identifying jobs running within the cluster.

 ### `slave_root`

-The path on the mesos slaves where executing tasks can be found. It is used in combination with the
+The path on the Mesos agents where executing tasks can be found. It is used in combination with the
 `slave_run_directory` property by `aurora task run` and `aurora task ssh` to change into the sandbox
 directory after connecting to the host. This value should match the value passed to `mesos-slave`
 as `-work_dir`.


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/reference/client-commands.md
----------------------------------------------------------------------
diff --git a/docs/reference/client-commands.md b/docs/reference/client-commands.md
index 84a8bd4..582c96a 100644
--- a/docs/reference/client-commands.md
+++ b/docs/reference/client-commands.md
@@ -86,7 +86,7 @@ refer to different Jobs.
 For example, job key `cluster2/foo/prod/workhorse` is
 different from `cluster1/tyg/test/workhorse.`

-Role names are user accounts existing on the slave machines. If you don't know what accounts
+Role names are user accounts existing on the agent machines. If you don't know what accounts
 are available, contact your sysadmin.

 Environment names are namespaces; you can count on `prod`, `devel` and `test` existing.


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/reference/configuration-tutorial.md
----------------------------------------------------------------------
diff --git a/docs/reference/configuration-tutorial.md b/docs/reference/configuration-tutorial.md
index c40022b..c0e573b 100644
--- a/docs/reference/configuration-tutorial.md
+++ b/docs/reference/configuration-tutorial.md
@@ -230,7 +230,7 @@ working directory.

 Typically, you save this code somewhere. You then need to define a Process
 in your `.aurora` configuration file that fetches the code from that somewhere
-to where the slave can see it. For a public cloud, that can be anywhere public on
+to where the agent can see it. For a public cloud, that can be anywhere public on
 the Internet, such as S3. For a private cloud internal storage, you need to put in
 on an accessible HDFS cluster or similar storage.


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/reference/configuration.md
----------------------------------------------------------------------
diff --git a/docs/reference/configuration.md b/docs/reference/configuration.md
index eb0af3e..e77ee60 100644
--- a/docs/reference/configuration.md
+++ b/docs/reference/configuration.md
@@ -410,12 +410,12 @@ no announcement will take place. For more information about ServerSets, see the
 documentation.

 By default, the hostname in the registered endpoints will be the `--hostname` parameter
-that is passed to the mesos slave. To override the hostname value, the executor can be started
+that is passed to the mesos agent. To override the hostname value, the executor can be started
 with `--announcer-hostname=<overriden_value>`. If you decide to use `--announcer-hostname` and if
 the overriden value needs to change for every executor, then the executor has to be started inside a wrapper, see [Executor Wrapper](../operations/configuration.md#thermos-executor-wrapper).

 For example, if you want the hostname in the endpoint to be an IP address instead of the hostname,
-the `--hostname` parameter to the mesos slave can be set to the machine IP or the executor can
+the `--hostname` parameter to the mesos agent can be set to the machine IP or the executor can
 be started with `--announcer-hostname=<host_ip>` while wrapping the executor inside a script.

 | object | type | description
@@ -445,7 +445,7 @@ find a static port 80. No port would be requested of or allocated by Mesos.

 Static ports should be used cautiously as Aurora does nothing to prevent two
 tasks with the same static port allocations from being co-scheduled.
-External constraints such as slave attributes should be used to enforce such
+External constraints such as agent attributes should be used to enforce such
 guarantees should they be needed.

 ### Container Objects
@@ -570,7 +570,7 @@ Aurora client or Aurora-provided services.

 ### mesos Namespace

-The `mesos` namespace contains variables which relate to the `mesos` slave
+The `mesos` namespace contains variables which relate to the `mesos` agent
 which launched the task. The `instance` variable can be used
 to distinguish between Task replicas.


http://git-wip-us.apache.org/repos/asf/aurora/blob/9f6a6606/docs/reference/task-lifecycle.md
----------------------------------------------------------------------
diff --git a/docs/reference/task-lifecycle.md b/docs/reference/task-lifecycle.md
index 1477364..4dcb481 100644
--- a/docs/reference/task-lifecycle.md
+++ b/docs/reference/task-lifecycle.md
@@ -26,14 +26,14 @@ particular role" or attribute limit constraints such as "at most 2
 finds a suitable match, it assigns the `Task` to a machine and puts the
 `Task` into the `ASSIGNED` state.

-From the `ASSIGNED` state, the scheduler sends an RPC to the slave
-machine containing `Task` configuration, which the slave uses to spawn
+From the `ASSIGNED` state, the scheduler sends an RPC to the agent
+machine containing `Task` configuration, which the agent uses to spawn
 an executor responsible for the `Task`'s lifecycle. When the scheduler
 receives an acknowledgment that the machine has accepted the `Task`,
 the `Task` goes into `STARTING` state.

 `STARTING` state initializes a `Task` sandbox. When the sandbox is fully
-initialized, Thermos begins to invoke `Process`es. Also, the slave
+initialized, Thermos begins to invoke `Process`es. Also, the agent
 machine sends an update to the scheduler that the `Task` is
 in `RUNNING` state.

@@ -67,7 +67,7 @@ failure.
 ### Forceful Termination: KILLING, RESTARTING

 You can terminate a `Task` by issuing an `aurora job kill` command, which
-moves it into `KILLING` state. The scheduler then sends the slave a
+moves it into `KILLING` state. The scheduler then sends the agent a
 request to terminate the `Task`. If the scheduler receives a successful
 response, it moves the Task into `KILLED` state and never restarts it.

@@ -75,7 +75,7 @@ If a `Task` is forced into the `RESTARTING` state via the `aurora job restart`
 command, the scheduler kills the underlying task but in parallel schedules
 an identical replacement for it.

-In any case, the responsible executor on the slave follows an escalation
+In any case, the responsible executor on the agent follows an escalation
 sequence when killing a running task:

 1. If a `HttpLifecycleConfig` is not present, skip to (4).
@@ -95,9 +95,9 @@ If a `Task` stays in a transient task state for too long (such as `ASSIGNED`
 or `STARTING`), the scheduler forces it into `LOST` state, creating a new
 `Task` in its place that's sent into `PENDING` state.

-In addition, if the Mesos core tells the scheduler that a slave has
+In addition, if the Mesos core tells the scheduler that a agent has
 become unhealthy (or outright disappeared), the `Task`s assigned to that
-slave go into `LOST` state and new `Task`s are created in their place.
+agent go into `LOST` state and new `Task`s are created in their place.
 From `PENDING` state, there is no guarantee a `Task` will be reassigned
 to the same machine unless job constraints explicitly force it there.

@@ -121,9 +121,9 @@ preempted in favor of production tasks.

 ### Making Room for Maintenance: DRAINING

-Cluster operators can set slave into maintenance mode. This will transition
-all `Task` running on this slave into `DRAINING` and eventually to `KILLED`.
-Drained `Task`s will be restarted on other slaves for which no maintenance
+Cluster operators can set agent into maintenance mode. This will transition
+all `Task` running on this agent into `DRAINING` and eventually to `KILLED`.
+Drained `Task`s will be restarted on other agents for which no maintenance
 has been announced yet.
