Repository: aurora
Updated Branches:
  refs/heads/master 041b286f1 -> 2535c467d

Update and slightly extend the beginner tutorial

Reviewed at https://reviews.apache.org/r/41844/

Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/2535c467
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/2535c467
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/2535c467

Branch: refs/heads/master
Commit: 2535c467d8898036a2043f9c42e4485f3bee8b5e
Parents: 041b286
Author: Stephan Erb <[email protected]>
Authored: Sun Jan 3 21:37:28 2016 +0100
Committer: Bill Farner <[email protected]>
Committed: Wed Jan 6 21:43:42 2016 -0800

----------------------------------------------------------------------
 README.md                           |   8 +-
 docs/README.md                      |   5 +-
 docs/developing-aurora-scheduler.md |   2 +-
 docs/images/CompletedTasks.png      | Bin 0 -> 45851 bytes
 docs/images/HelloWorldJob.png       | Bin 44392 -> 46966 bytes
 docs/images/RoleJobs.png            | Bin 59734 -> 57755 bytes
 docs/images/RunningJob.png          | Bin 0 -> 63635 bytes
 docs/images/ScheduledJobs.png       | Bin 40758 -> 31732 bytes
 docs/images/TaskBreakdown.png       | Bin 89794 -> 63530 bytes
 docs/images/killedtask.png          | Bin 73312 -> 60667 bytes
 docs/images/runningtask.png         | Bin 58821 -> 48214 bytes
 docs/images/stderr.png              | Bin 16176 -> 9684 bytes
 docs/images/stdout.png              | Bin 32941 -> 24263 bytes
 docs/tools.md                       |   2 +-
 docs/tutorial.md                    | 159 +++++++++++++++----------------
 examples/vagrant/test_tutorial.sh   |   6 +-
 16 files changed, 86 insertions(+), 96 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index 52d850c..fc8642f 100644
--- a/README.md
+++ b/README.md
@@ -1,7 +1,7 @@
-Apache Aurora lets you use an [Apache Mesos](http://mesos.apache.org) cluster as a private cloud.
-It supports running long-running services, cron jobs, and ad-hoc jobs.
+[Apache Aurora](https://aurora.apache.org/) lets you use an [Apache Mesos](http://mesos.apache.org) +cluster as a private cloud. It supports running long-running services, cron jobs, and ad-hoc jobs. Aurora aims to make it extremely quick and easy to take a built application and run it on machines in a cluster, with an emphasis on reliability. It provides basic operations to manage services running in a cluster, such as rolling upgrades. @@ -9,12 +9,10 @@ running in a cluster, such as rolling upgrades. To very concisely describe Aurora, it is like a distributed monit or distributed supervisord that you can instruct to do things like _run 100 of these, somewhere, forever_. -https://aurora.apache.org/ - ## Features -Aurora is build for users _and_ operators. +Aurora is built for users _and_ operators. * User-facing Features: - Management of long-running services http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/README.md ---------------------------------------------------------------------- diff --git a/docs/README.md b/docs/README.md index 55dc4db..8ebc061 100644 --- a/docs/README.md +++ b/docs/README.md @@ -5,7 +5,7 @@ Apache Aurora is a service scheduler that runs on top of Apache Mesos, enabling * Operators: For those that wish to manage and fine-tune an Aurora cluster. * Developers: All the information you need to start modifying Aurora and contributing back to the project. -We encourage you to ask questions on the [Aurora developer list](http://aurora.apache.org/community/) or the `#aurora` IRC channel on `irc.freenode.net`. +We encourage you to ask questions on the [Aurora user list](http://aurora.apache.org/community/) or the `#aurora` IRC channel on `irc.freenode.net`. 
## Users * [Install Aurora on virtual machines on your private machine](vagrant.md) @@ -19,13 +19,12 @@ We encourage you to ask questions on the [Aurora developer list](http://aurora.a ## Operators * [Installation](installing.md) - * [Deployment and cluster configuraiton](deploying-aurora-scheduler.md) + * [Deployment and cluster configuration](deploying-aurora-scheduler.md) * [Security](security.md) * [Monitoring](monitoring.md) * [Hooks for Aurora Client API](hooks.md) * [Scheduler Storage](storage.md) * [Scheduler Storage and Maintenance](storage-config.md) - * [Scheduler Storage Performance Tuning](scheduler-storage.md) * [SLA Measurement](sla.md) * [Resource Isolation and Sizing](resources.md) http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/developing-aurora-scheduler.md ---------------------------------------------------------------------- diff --git a/docs/developing-aurora-scheduler.md b/docs/developing-aurora-scheduler.md index 40e123d..a7a929e 100644 --- a/docs/developing-aurora-scheduler.md +++ b/docs/developing-aurora-scheduler.md @@ -56,7 +56,7 @@ environment: In addition, there is an end-to-end test that runs a suite of aurora commands using a virtual cluster: - bash src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh + ./src/test/sh/org/apache/aurora/e2e/test_end_to_end.sh http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/CompletedTasks.png ---------------------------------------------------------------------- diff --git a/docs/images/CompletedTasks.png b/docs/images/CompletedTasks.png new file mode 100644 index 0000000..343e111 Binary files /dev/null and b/docs/images/CompletedTasks.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/HelloWorldJob.png ---------------------------------------------------------------------- diff --git a/docs/images/HelloWorldJob.png b/docs/images/HelloWorldJob.png index 7a89575..4977227 100644 Binary files a/docs/images/HelloWorldJob.png and 
b/docs/images/HelloWorldJob.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/RoleJobs.png ---------------------------------------------------------------------- diff --git a/docs/images/RoleJobs.png b/docs/images/RoleJobs.png index d41ee0b..95ffa49 100644 Binary files a/docs/images/RoleJobs.png and b/docs/images/RoleJobs.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/RunningJob.png ---------------------------------------------------------------------- diff --git a/docs/images/RunningJob.png b/docs/images/RunningJob.png new file mode 100644 index 0000000..4e7a045 Binary files /dev/null and b/docs/images/RunningJob.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/ScheduledJobs.png ---------------------------------------------------------------------- diff --git a/docs/images/ScheduledJobs.png b/docs/images/ScheduledJobs.png index 21bfcae..0be4731 100644 Binary files a/docs/images/ScheduledJobs.png and b/docs/images/ScheduledJobs.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/TaskBreakdown.png ---------------------------------------------------------------------- diff --git a/docs/images/TaskBreakdown.png b/docs/images/TaskBreakdown.png index 125a4e9..fab702c 100644 Binary files a/docs/images/TaskBreakdown.png and b/docs/images/TaskBreakdown.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/killedtask.png ---------------------------------------------------------------------- diff --git a/docs/images/killedtask.png b/docs/images/killedtask.png index b173698..bb1bc92 100644 Binary files a/docs/images/killedtask.png and b/docs/images/killedtask.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/runningtask.png ---------------------------------------------------------------------- diff --git a/docs/images/runningtask.png b/docs/images/runningtask.png index 
7f7553c..9e7e31a 100644 Binary files a/docs/images/runningtask.png and b/docs/images/runningtask.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/stderr.png ---------------------------------------------------------------------- diff --git a/docs/images/stderr.png b/docs/images/stderr.png index b83ccfa..6076a84 100644 Binary files a/docs/images/stderr.png and b/docs/images/stderr.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/images/stdout.png ---------------------------------------------------------------------- diff --git a/docs/images/stdout.png b/docs/images/stdout.png index fb4e0b7..d614097 100644 Binary files a/docs/images/stdout.png and b/docs/images/stdout.png differ http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/tools.md ---------------------------------------------------------------------- diff --git a/docs/tools.md b/docs/tools.md index c14aa2e..2ae550d 100644 --- a/docs/tools.md +++ b/docs/tools.md @@ -1,6 +1,6 @@ # Tools -Various tools integrate with Aurora. There is a tool missing? Let us know, or submit a patch to add it! +Various tools integrate with Aurora. Is there a tool missing? Let us know, or submit a patch to add it! * Load-balacing technology used to direct traffic to services running on Aurora - [synapse](https://github.com/airbnb/synapse) based on HAProxy http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/docs/tutorial.md ---------------------------------------------------------------------- diff --git a/docs/tutorial.md b/docs/tutorial.md index 1bdc1ca..95539ef 100644 --- a/docs/tutorial.md +++ b/docs/tutorial.md @@ -1,65 +1,52 @@ -Aurora Tutorial ---------------- +# Aurora Tutorial -- [Introduction](#introduction) -- [Setup: Install Aurora](#setup-install-aurora) +This tutorial shows how to use the Aurora scheduler to run (and "`printf-debug`") +a hello world program on Mesos. 
This is the recommended document for new Aurora users +to start getting up to speed on the system. + +- [Prerequisite](#setup-install-aurora) - [The Script](#the-script) - [Aurora Configuration](#aurora-configuration) -- [What's Going On In That Configuration File?](#whats-going-on-in-that-configuration-file) - [Creating the Job](#creating-the-job) - [Watching the Job Run](#watching-the-job-run) - [Cleanup](#cleanup) - [Next Steps](#next-steps) -## Introduction - -This tutorial shows how to use the Aurora scheduler to run (and -"`printf-debug`") a hello world program on Mesos. The operational -hierarchy is: -- Aurora manages and schedules jobs for Mesos to run. -- Mesos manages the individual tasks that make up a job. -- Thermos manages the individual processes that make up a task. +## Prerequisite -This is the recommended first Aurora users document to read to start -getting up to speed on the system. +This tutorial assumes you are running [Aurora locally using Vagrant](vagrant.md). +However, in general the instructions are also applicable to any other +[Aurora installation](installing.md). -To get help, email questions to the Aurora Developer List, -[[email protected]](mailto:[email protected]) +Unless otherwise stated, all commands are to be run from the root of the aurora +repository clone. -## Setup: Install Aurora - -You use the Aurora client and web UI to interact with Aurora jobs. To -install it locally, see [vagrant.md](vagrant.md). The remainder of this -Tutorial assumes you are running Aurora using Vagrant. Unless otherwise stated, -all commands are to be run from the root of the aurora repository clone. ## The Script Our "hello world" application is a simple Python script that loops forever, displaying the time every few seconds. Copy the code below and -put it in a file named `hello_world.py` in the root of your Aurora repository clone (Note: -this directory is the same as `/vagrant` inside the Vagrant VMs). 
+put it in a file named `hello_world.py` in the root of your Aurora repository clone +(Note: this directory is the same as `/vagrant` inside the Vagrant VMs). The script has an intentional bug, which we will explain later on. <!-- NOTE: If you are changing this file, be sure to also update examples/vagrant/test_tutorial.sh. --> ```python -import sys import time -def main(argv): +def main(): SLEEP_DELAY = 10 # Python ninjas - ignore this blatant bug. for i in xrang(100): print("Hello world! The time is now: %s. Sleeping for %d secs" % ( time.asctime(), SLEEP_DELAY)) - sys.stdout.flush() time.sleep(SLEEP_DELAY) if __name__ == "__main__": - main(sys.argv) + main() ``` ## Aurora Configuration @@ -88,7 +75,7 @@ install = Process( # run the script hello_world = Process( name = 'hello_world', - cmdline = 'python hello_world.py') + cmdline = 'python -u hello_world.py') # describe the task hello_world_task = SequentialTask( @@ -104,14 +91,7 @@ jobs = [ ] ``` -For more about Aurora configuration files, see the [Configuration -Tutorial](configuration-tutorial.md) and the [Aurora + Thermos -Reference](configuration-reference.md) (preferably after finishing this -tutorial). - -## What's Going On In That Configuration File? - -More than you might think. +There is a lot going on in that configuration file: 1. From a "big picture" viewpoint, it first defines two Processes. Then it defines a Task that runs the two Processes in the @@ -125,6 +105,12 @@ specify more than one Job in a config file. local sandbox in which it will run. It then specifies how the code is actually run once the second Process starts. +For more about Aurora configuration files, see the [Configuration +Tutorial](configuration-tutorial.md) and the [Aurora + Thermos +Reference](configuration-reference.md) (preferably after finishing this +tutorial). + + ## Creating the Job We're ready to launch our job! 
To do so, we use the Aurora Client to @@ -133,39 +119,42 @@ issue a Job creation request to the Aurora scheduler. Many Aurora Client commands take a *job key* argument, which uniquely identifies a Job. A job key consists of four parts, each separated by a "/". The four parts are `<cluster>/<role>/<environment>/<jobname>` -in that order. When comparing two job keys, if any of the -four parts is different from its counterpart in the other key, then the -two job keys identify two separate jobs. If all four values are -identical, the job keys identify the same job. +in that order: + +* Cluster refers to the name of a particular Aurora installation. +* Role names are user accounts existing on the slave machines. If you +don't know what accounts are available, contact your sysadmin. +* Environment names are namespaces; you can count on `test`, `devel`, +`staging` and `prod` existing. +* Jobname is the custom name of your job. -`/etc/aurora/clusters.json` within the Aurora scheduler has the available -cluster names. For Vagrant, from the top-level of your Aurora repository clone, -do: +When comparing two job keys, if any of the four parts is different from +its counterpart in the other key, then the two job keys identify two separate +jobs. If all four values are identical, the job keys identify the same job. + +The `clusters.json` [client configuration](client-cluster-configuration.md) +for the Aurora scheduler defines the available cluster names. +For Vagrant, from the top-level of your Aurora repository clone, do: $ vagrant ssh Followed by: - vagrant@precise64:~$ cat /etc/aurora/clusters.json + vagrant@aurora:~$ cat /etc/aurora/clusters.json -You'll see something like: +You'll see something like the following. The `name` value shown here, corresponds to a job key's cluster value. 
```javascript [{ "name": "devcluster", "zk": "192.168.33.7", "scheduler_zk_path": "/aurora/scheduler", - "auth_mechanism": "UNAUTHENTICATED" + "auth_mechanism": "UNAUTHENTICATED", + "slave_run_directory": "latest", + "slave_root": "/var/lib/mesos" }] ``` -Use a `name` value for your job key's cluster value. - -Role names are user accounts existing on the slave machines. If you don't know what accounts -are available, contact your sysadmin. - -Environment names are namespaces; you can count on `prod`, `devel` and `test` existing. - The Aurora Client command that actually runs our Job is `aurora job create`. It creates a Job as specified by its job key and configuration file arguments and runs it. @@ -175,20 +164,13 @@ Or for our example: aurora job create devcluster/www-data/devel/hello_world /vagrant/hello_world.aurora -This returns: - - $ vagrant ssh - Welcome to Ubuntu 12.04 LTS (GNU/Linux 3.2.0-23-generic x86_64) +After entering our virtual machine using `vagrant ssh`, this returns: - * Documentation: https://help.ubuntu.com/ - Welcome to your Vagrant-built virtual machine. - Last login: Fri Jan 3 02:18:55 2014 from 10.0.2.2 - vagrant@precise64:~$ aurora job create devcluster/www-data/devel/hello_world \ - /vagrant/hello_world.aurora + vagrant@aurora:~$ aurora job create devcluster/www-data/devel/hello_world /vagrant/hello_world.aurora INFO] Creating job hello_world - INFO] Response from scheduler: OK (message: 1 new tasks pending for job - www-data/devel/hello_world) - INFO] Job url: http://precise64:8081/scheduler/www-data/devel/hello_world + INFO] Checking status of devcluster/www-data/devel/hello_world + Job create succeeded: job url=http://aurora.local:8081/scheduler/www-data/devel/hello_world + ## Watching the Job Run @@ -208,27 +190,40 @@ If you click on your `hello_world` Job, you'll see:  -Oops, looks like our first job didn't quite work! The task failed, so we have -to figure out what went wrong. +Oops, looks like our first job didn't quite work! 
The task is temporarily throttled for +having failed on every attempt of the Aurora scheduler to run it. We have to figure out +what is going wrong. + +On the Completed tasks tab, we see all past attempts of the Aurora scheduler to run our job. + + -Access the page for our Task by clicking on its host. +We can navigate to the Task page of a failed run by clicking on the host link.  -Once there, we see that the -`hello_world` process failed. The Task page captures the standard error and -standard output streams and makes them available. Clicking through -to `stderr` on the failed `hello_world` process, we see what happened. +Once there, we see that the `hello_world` process failed. The Task page +captures the standard error and standard output streams and makes them available. +Clicking through to `stderr` on the failed `hello_world` process, we see what happened.  It looks like we made a typo in our Python script. We wanted `xrange`, -not `xrang`. Edit the `hello_world.py` script to use the correct function and -we will try again. +not `xrang`. Edit the `hello_world.py` script to use the correct function +and save it as `hello_world_v2.py`. Then update the `hello_world.aurora` +configuration to the newest version. + +In order to try again, we can now instruct the scheduler to update our job: + + vagrant@aurora:~$ aurora update start devcluster/www-data/devel/hello_world /vagrant/hello_world.aurora + INFO] Starting update for: hello_world + Job update has started. View your update progress at http://aurora.local:8081/scheduler/www-data/devel/hello_world/update/8ef38017-e60f-400d-a2f2-b5a8b724e95b + +This time, the task comes up. - aurora update start devcluster/www-data/devel/hello_world /vagrant/hello_world.aurora + -This time, the task comes up, we inspect the page, and see that the +By again clicking on the host, we inspect the Task page, and see that the `hello_world` process is running.  
@@ -242,11 +237,11 @@ output: Now that we're done, we kill the job using the Aurora client: - vagrant@precise64:~$ aurora job killall devcluster/www-data/devel/hello_world + vagrant@aurora:~$ aurora job killall devcluster/www-data/devel/hello_world INFO] Killing tasks for job: devcluster/www-data/devel/hello_world - INFO] Response from scheduler: OK (message: Tasks killed.) - INFO] Job url: http://precise64:8081/scheduler/www-data/devel/hello_world - vagrant@precise64:~$ + INFO] Instances to be killed: [0] + Successfully killed instances [0] + Job killall succeeded The job page now shows the `hello_world` tasks as completed. http://git-wip-us.apache.org/repos/asf/aurora/blob/2535c467/examples/vagrant/test_tutorial.sh ---------------------------------------------------------------------- diff --git a/examples/vagrant/test_tutorial.sh b/examples/vagrant/test_tutorial.sh index 4628241..ae06fd2 100755 --- a/examples/vagrant/test_tutorial.sh +++ b/examples/vagrant/test_tutorial.sh @@ -34,7 +34,6 @@ function require_healthy { function write_test_files { cat > hello_world.py <<EOF -import sys import time def main(argv): @@ -43,11 +42,10 @@ def main(argv): for i in xrang(100): print("Hello world! The time is now: %s. Sleeping for %d secs" % ( time.asctime(), SLEEP_DELAY)) - sys.stdout.flush() time.sleep(SLEEP_DELAY) if __name__ == "__main__": - main(sys.argv) + main() EOF cat > hello_world.aurora <<EOF @@ -68,7 +66,7 @@ install = Process( # run the script hello_world = Process( name = 'hello_world', - cmdline = 'python hello_world.py') + cmdline = 'python -u hello_world.py') # describe the task hello_world_task = SequentialTask(
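
A note on the job key format that the updated `docs/tutorial.md` describes (`<cluster>/<role>/<environment>/<jobname>`): the comparison rule stated there, that two keys name the same job only when all four parts match, can be sketched in a few lines of Python. This is only an illustration under stated assumptions. `JobKey`, `parse_job_key`, and `same_job` are hypothetical names invented here and are not part of the Aurora client, and the sketch assumes that none of the four parts is empty or contains a "/".

```python
from collections import namedtuple

# Hypothetical helper type for illustration; not part of the Aurora client API.
JobKey = namedtuple("JobKey", ["cluster", "role", "environment", "name"])


def parse_job_key(text):
    """Split '<cluster>/<role>/<environment>/<jobname>' into its four parts."""
    parts = text.split("/")
    # Assumption: exactly four non-empty parts, none containing '/'.
    if len(parts) != 4 or not all(parts):
        raise ValueError(
            "expected <cluster>/<role>/<environment>/<jobname>, got %r" % text)
    return JobKey(*parts)


def same_job(key_a, key_b):
    """Two job keys identify the same job only if all four parts are equal."""
    return parse_job_key(key_a) == parse_job_key(key_b)


if __name__ == "__main__":
    key = parse_job_key("devcluster/www-data/devel/hello_world")
    print(key.cluster, key.role, key.environment, key.name)
    # Differs only in the environment part, so these are two separate jobs.
    print(same_job("devcluster/www-data/devel/hello_world",
                   "devcluster/www-data/prod/hello_world"))
```

Using a named tuple keeps the four parts addressable by name while still giving field-by-field equality for free, which is all the comparison rule needs.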
