Added: aurora/site/publish/documentation/0.16.0/operations/monitoring/index.html URL: http://svn.apache.org/viewvc/aurora/site/publish/documentation/0.16.0/operations/monitoring/index.html?rev=1762695&view=auto ============================================================================== --- aurora/site/publish/documentation/0.16.0/operations/monitoring/index.html (added) +++ aurora/site/publish/documentation/0.16.0/operations/monitoring/index.html Wed Sep 28 18:23:53 2016 @@ -0,0 +1,321 @@ +<!DOCTYPE html> +<html lang="en"> + <head> + <meta charset="utf-8"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <title>Apache Aurora</title> + <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css"> + <link href="/assets/css/main.css" rel="stylesheet"> + <!-- Analytics --> + <script type="text/javascript"> + var _gaq = _gaq || []; + _gaq.push(['_setAccount', 'UA-45879646-1']); + _gaq.push(['_setDomainName', 'apache.org']); + _gaq.push(['_trackPageview']); + + (function() { + var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; + ga.src = ('https:' == document.location.protocol ? 
'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; + var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); + })(); + </script> + </head> + <body> + <div class="container-fluid section-header"> + <div class="container"> + <div class="nav nav-bar"> + <a href="/"><img src="/assets/img/aurora_logo_dkbkg.svg" width="300" alt="Transparent Apache Aurora logo with dark background"/></a> + <ul class="nav navbar-nav navbar-right"> + <li><a href="/documentation/latest/">Documentation</a></li> + <li><a href="/community/">Community</a></li> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/blog/">Blog</a></li> + </ul> + </div> + </div> +</div> + + <div class="container-fluid"> + <div class="container content"> + <div class="col-md-12 documentation"> +<h5 class="page-header text-uppercase">Documentation +<select onChange="window.location.href='/documentation/' + this.value + '/operations/monitoring/'" + value="0.16.0"> + <option value="0.16.0" + selected="selected"> + 0.16.0 + (latest) + </option> + <option value="0.15.0" + > + 0.15.0 + </option> + <option value="0.14.0" + > + 0.14.0 + </option> + <option value="0.13.0" + > + 0.13.0 + </option> + <option value="0.12.0" + > + 0.12.0 + </option> + <option value="0.11.0" + > + 0.11.0 + </option> + <option value="0.10.0" + > + 0.10.0 + </option> + <option value="0.9.0" + > + 0.9.0 + </option> + <option value="0.8.0" + > + 0.8.0 + </option> + <option value="0.7.0-incubating" + > + 0.7.0-incubating + </option> + <option value="0.6.0-incubating" + > + 0.6.0-incubating + </option> + <option value="0.5.0-incubating" + > + 0.5.0-incubating + </option> +</select> +</h5> +<h1 id="monitoring-your-aurora-cluster">Monitoring your Aurora cluster</h1> + +<p>Before you start running important services in your Aurora cluster, it’s important to set up +monitoring and alerting of Aurora itself. 
Most of your monitoring can be against the scheduler, +since it will give you a global view of what’s going on.</p> + +<h2 id="reading-stats">Reading stats</h2> + +<p>The scheduler exposes a <em>lot</em> of instrumentation data via its HTTP interface. You can get a quick +peek at the first few of these in our vagrant image:</p> +<pre class="highlight plaintext"><code>$ vagrant ssh -c 'curl -s localhost:8081/vars | head' +async_tasks_completed 1004 +attribute_store_fetch_all_events 15 +attribute_store_fetch_all_events_per_sec 0.0 +attribute_store_fetch_all_nanos_per_event 0.0 +attribute_store_fetch_all_nanos_total 3048285 +attribute_store_fetch_all_nanos_total_per_sec 0.0 +attribute_store_fetch_one_events 3391 +attribute_store_fetch_one_events_per_sec 0.0 +attribute_store_fetch_one_nanos_per_event 0.0 +attribute_store_fetch_one_nanos_total 454690753 +</code></pre> + +<p>These values are served as <code>Content-Type: text/plain</code>, with each line containing a space-separated metric +name and value. 
Values may be integers, doubles, or strings (note: strings are static, others +may be dynamic).</p> + +<p>If your monitoring infrastructure prefers JSON, the scheduler exports that as well:</p> +<pre class="highlight plaintext"><code>$ vagrant ssh -c 'curl -s localhost:8081/vars.json | python -mjson.tool | head' +{ + "async_tasks_completed": 1009, + "attribute_store_fetch_all_events": 15, + "attribute_store_fetch_all_events_per_sec": 0.0, + "attribute_store_fetch_all_nanos_per_event": 0.0, + "attribute_store_fetch_all_nanos_total": 3048285, + "attribute_store_fetch_all_nanos_total_per_sec": 0.0, + "attribute_store_fetch_one_events": 3409, + "attribute_store_fetch_one_events_per_sec": 0.0, + "attribute_store_fetch_one_nanos_per_event": 0.0, +</code></pre> + +<p>This will be the same data as above, served with <code>Content-Type: application/json</code>.</p> + +<h2 id="viewing-live-stat-samples-on-the-scheduler">Viewing live stat samples on the scheduler</h2> + +<p>The scheduler uses the Twitter commons stats library, which keeps an internal time-series database +of exported variables - nearly everything in <code>/vars</code> is available for instant graphing. This is +useful for debugging, but is not a replacement for an external monitoring system.</p> + +<p>You can view these graphs on a scheduler at <code>/graphview</code>. It supports some composition and +aggregation of values, which can be invaluable when triaging a problem. For example, if you have +the scheduler running in vagrant, check out these links: +<a href="http://192.168.33.7:8081/graphview?query=jvm_uptime_secs">simple graph</a> +<a href="http://192.168.33.7:8081/graphview?query=rate(scheduler_log_native_append_nanos_total)%2Frate(scheduler_log_native_append_events)%2F1e6">complex composition</a></p> + +<h3 id="counters-and-gauges">Counters and gauges</h3> + +<p>Among numeric stats, there are two fundamental types of stats exported: <em>counters</em> and <em>gauges</em>. 
+Counters are guaranteed to be monotonically-increasing for the lifetime of a process, while gauges +may decrease in value. Aurora uses counters to represent things like the number of times an event +has occurred, and gauges to capture things like the current length of a queue. Counters are a +natural fit for accurate composition into <a href="http://en.wikipedia.org/wiki/Rate_ratio">rate ratios</a> +(useful for sample-resistant latency calculation), while gauges are not.</p> + +<h1 id="alerting">Alerting</h1> + +<h2 id="quickstart">Quickstart</h2> + +<p>If you are looking for just bare-minimum alerting to get something in place quickly, set up alerting +on <code>framework_registered</code> and <code>task_store_LOST</code>. These will give you a decent picture of overall +health.</p> + +<h2 id="a-note-on-thresholds">A note on thresholds</h2> + +<p>One of the most difficult things in monitoring is choosing alert thresholds. With many of these +stats, there is no value we can offer as a threshold that will be guaranteed to work for you. It +will depend on the size of your cluster, number of jobs, churn of tasks in the cluster, etc. We +recommend you start with a strict value after viewing a small amount of collected data, and then +adjust thresholds as you see fit. Feel free to ask us if you would like to validate that your alerts +and thresholds make sense.</p> + +<h2 id="important-stats">Important stats</h2> + +<h3 id="jvm_uptime_secs"><code>jvm_uptime_secs</code></h3> + +<p>Type: integer counter</p> + +<p>The number of seconds the JVM process has been running. 
Comes from +<a href="http://docs.oracle.com/javase/7/docs/api/java/lang/management/RuntimeMXBean.html#getUptime()">RuntimeMXBean#getUptime()</a>.</p> + +<p>Detecting resets (decreasing values) on this stat will tell you that the scheduler is failing to +stay alive.</p> + +<p>Look at the scheduler logs to identify the reason the scheduler is exiting.</p> + +<h3 id="system_load_avg"><code>system_load_avg</code></h3> + +<p>Type: double gauge</p> + +<p>The current load average of the system for the last minute. Comes from +<a href="http://docs.oracle.com/javase/7/docs/api/java/lang/management/OperatingSystemMXBean.html?is-external=true#getSystemLoadAverage()">OperatingSystemMXBean#getSystemLoadAverage()</a>.</p> + +<p>A high sustained value suggests that the scheduler machine may be over-utilized.</p> + +<p>Use standard Unix tools like <code>top</code> and <code>ps</code> to track down the offending process(es).</p> + +<h3 id="process_cpu_cores_utilized"><code>process_cpu_cores_utilized</code></h3> + +<p>Type: double gauge</p> + +<p>The current number of CPU cores in use by the JVM process. This should not exceed the number of +logical CPU cores on the machine. Derived from +<a href="http://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/OperatingSystemMXBean.html">OperatingSystemMXBean#getProcessCpuTime()</a>.</p> + +<p>A high sustained value indicates that the scheduler is overworked. Due to current internal design +limitations, if this value is sustained at <code>1</code>, there is a good chance the scheduler is under water.</p> + +<p>There are two main inputs that tend to drive this figure: task scheduling attempts and status +updates from Mesos. Activity in the scheduler logs may give an indication of where +time is being spent. Beyond that, it really takes good familiarity with the code to effectively +triage this. 
We suggest engaging with an Aurora developer.</p> + +<h3 id="task_store_lost"><code>task_store_LOST</code></h3> + +<p>Type: integer gauge</p> + +<p>The number of tasks stored in the scheduler that are in the <code>LOST</code> state, and have been rescheduled.</p> + +<p>If this value is increasing at a high rate, it is a sign of trouble.</p> + +<p>There are many sources of <code>LOST</code> tasks in Mesos: the scheduler, master, agent, and executor can all +trigger this. The first step is to look in the scheduler logs for <code>LOST</code> to identify where the +state changes are originating.</p> + +<h3 id="scheduler_resource_offers"><code>scheduler_resource_offers</code></h3> + +<p>Type: integer counter</p> + +<p>The number of resource offers that the scheduler has received.</p> + +<p>For a healthy scheduler, this value must be increasing over time.</p> + +<p>Assuming the scheduler is up and otherwise healthy, you will want to check if the master thinks it +is sending offers. You should also look at the master’s web interface to see if it has a large +number of outstanding offers that it is waiting to be returned.</p> + +<h3 id="framework_registered"><code>framework_registered</code></h3> + +<p>Type: binary integer counter</p> + +<p>Will be <code>1</code> for the leading scheduler that is registered with the Mesos master, <code>0</code> for passive +schedulers.</p> + +<p>A sustained period without a <code>1</code> (or where <code>sum() != 1</code>) warrants investigation.</p> + +<p>If there is no leading scheduler, look in the scheduler and master logs for why. 
If there are +multiple schedulers claiming leadership, this suggests a split brain and warrants filing a critical +bug.</p> + +<h3 id="rate-scheduler_log_native_append_nanos_total-rate-scheduler_log_native_append_events"><code>rate(scheduler_log_native_append_nanos_total)/rate(scheduler_log_native_append_events)</code></h3> + +<p>Type: rate ratio of integer counters</p> + +<p>This composes two counters to compute a windowed figure for the latency of replicated log writes.</p> + +<p>A hike in this value suggests disk bandwidth contention.</p> + +<p>Look in scheduler logs for any reported oddness with saving to the replicated log. Also use +standard tools like <code>vmstat</code> and <code>iotop</code> to identify whether the disk has become slow or +over-utilized. We suggest using a dedicated disk for the replicated log to mitigate this.</p> + +<h3 id="timed_out_tasks"><code>timed_out_tasks</code></h3> + +<p>Type: integer counter</p> + +<p>Tracks the number of times the scheduler has given up while waiting +(for <code>-transient_task_state_timeout</code>) to hear back about a task that is in a transient state +(e.g. <code>ASSIGNED</code>, <code>KILLING</code>), and has moved to <code>LOST</code> before rescheduling.</p> + +<p>This value is currently known to increase occasionally when the scheduler fails over +(<a href="https://issues.apache.org/jira/browse/AURORA-740">AURORA-740</a>). However, any large spike in this +value warrants investigation.</p> + +<p>The scheduler will log when it times out a task. You should trace the task ID of the timed out +task into the master, agent, and/or executors to determine where the message was dropped.</p> + +<h3 id="http_500_responses_events"><code>http_500_responses_events</code></h3> + +<p>Type: integer counter</p> + +<p>The total number of HTTP 500 status responses sent by the scheduler. 
Includes API and asset serving.</p> + +<p>An increase warrants investigation.</p> + +<p>Look in scheduler logs to identify why the scheduler returned a 500; there should be a stack trace.</p> + +</div> + + </div> + </div> + <div class="container-fluid section-footer buffer"> + <div class="container"> + <div class="row"> + <div class="col-md-2 col-md-offset-1"><h3>Quick Links</h3> + <ul> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/community/">Mailing Lists</a></li> + <li><a href="http://issues.apache.org/jira/browse/AURORA">Issue Tracking</a></li> + <li><a href="/documentation/latest/contributing/">How To Contribute</a></li> + </ul> + </div> + <div class="col-md-2"><h3>The ASF</h3> + <ul> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </div> + <div class="col-md-6"> + <p class="disclaimer">© 2014-2016 <a href="http://www.apache.org/">Apache Software Foundation</a>. Licensed under the <a href="http://www.apache.org/licenses/">Apache License v2.0</a>. The <a href="https://www.flickr.com/photos/trondk/12706051375/">Aurora Borealis IX photo</a> displayed on the homepage is available under a <a href="https://creativecommons.org/licenses/by-nc-nd/2.0/">Creative Commons BY-NC-ND 2.0 license</a>. Apache, Apache Aurora, and the Apache feather logo are trademarks of The Apache Software Foundation.</p> + </div> + </div> + </div> + + </body> +</html>
Added: aurora/site/publish/documentation/0.16.0/operations/security/index.html URL: http://svn.apache.org/viewvc/aurora/site/publish/documentation/0.16.0/operations/security/index.html?rev=1762695&view=auto ============================================================================== --- aurora/site/publish/documentation/0.16.0/operations/security/index.html (added) +++ aurora/site/publish/documentation/0.16.0/operations/security/index.html Wed Sep 28 18:23:53 2016 @@ -0,0 +1,462 @@ +<!DOCTYPE html> +<html lang="en"> + <head> + <meta charset="utf-8"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <title>Apache Aurora</title> + <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css"> + <link href="/assets/css/main.css" rel="stylesheet"> + <!-- Analytics --> + <script type="text/javascript"> + var _gaq = _gaq || []; + _gaq.push(['_setAccount', 'UA-45879646-1']); + _gaq.push(['_setDomainName', 'apache.org']); + _gaq.push(['_trackPageview']); + + (function() { + var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; + ga.src = ('https:' == document.location.protocol ? 
'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; + var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); + })(); + </script> + </head> + <body> + <div class="container-fluid section-header"> + <div class="container"> + <div class="nav nav-bar"> + <a href="/"><img src="/assets/img/aurora_logo_dkbkg.svg" width="300" alt="Transparent Apache Aurora logo with dark background"/></a> + <ul class="nav navbar-nav navbar-right"> + <li><a href="/documentation/latest/">Documentation</a></li> + <li><a href="/community/">Community</a></li> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/blog/">Blog</a></li> + </ul> + </div> + </div> +</div> + + <div class="container-fluid"> + <div class="container content"> + <div class="col-md-12 documentation"> +<h5 class="page-header text-uppercase">Documentation +<select onChange="window.location.href='/documentation/' + this.value + '/operations/security/'" + value="0.16.0"> + <option value="0.16.0" + selected="selected"> + 0.16.0 + (latest) + </option> + <option value="0.15.0" + > + 0.15.0 + </option> + <option value="0.14.0" + > + 0.14.0 + </option> + <option value="0.13.0" + > + 0.13.0 + </option> + <option value="0.12.0" + > + 0.12.0 + </option> + <option value="0.11.0" + > + 0.11.0 + </option> + <option value="0.10.0" + > + 0.10.0 + </option> + <option value="0.9.0" + > + 0.9.0 + </option> + <option value="0.8.0" + > + 0.8.0 + </option> + <option value="0.7.0-incubating" + > + 0.7.0-incubating + </option> + <option value="0.6.0-incubating" + > + 0.6.0-incubating + </option> + <option value="0.5.0-incubating" + > + 0.5.0-incubating + </option> +</select> +</h5> +<h1 id="securing-your-aurora-cluster">Securing your Aurora Cluster</h1> + +<p>Aurora integrates with <a href="http://shiro.apache.org/">Apache Shiro</a> to provide security +controls for its API. 
In addition to providing some useful features out of the box, Shiro +also allows Aurora cluster administrators to adapt the security system to their organization’s +existing infrastructure. The announcer in the Aurora Thermos executor also supports security +controls for talking to ZooKeeper.</p> + +<ul> +<li><a href="#enabling-security">Enabling Security</a></li> +<li><a href="#authentication">Authentication</a> + +<ul> +<li><a href="#http-basic-authentication">HTTP Basic Authentication</a> + +<ul> +<li><a href="#server-configuration">Server Configuration</a></li> +<li><a href="#client-configuration">Client Configuration</a></li> +</ul></li> +<li><a href="#http-spnego-authentication-kerberos">HTTP SPNEGO Authentication (Kerberos)</a> + +<ul> +<li><a href="#server-configuration-1">Server Configuration</a></li> +<li><a href="#client-configuration-1">Client Configuration</a></li> +</ul></li> +</ul></li> +<li><a href="#authorization">Authorization</a> + +<ul> +<li><a href="#using-an-ini-file-to-define-security-controls">Using an INI file to define security controls</a> + +<ul> +<li><a href="#caveats">Caveats</a></li> +</ul></li> +</ul></li> +<li><a href="#implementing-a-custom-realm">Implementing a Custom Realm</a> + +<ul> +<li><a href="#packaging-a-realm-module">Packaging a realm module</a></li> +</ul></li> +<li><a href="#known-issues">Known Issues</a></li> +<li><a href="#announcer-authentication">Announcer Authentication</a> + +<ul> +<li><a href="#zookeeper-authentication-configuration">ZooKeeper authentication configuration</a></li> +<li><a href="#executor-settings">Executor settings</a></li> +</ul></li> +</ul> + +<h1 id="enabling-security">Enabling Security</h1> + +<p>There are two major components of security: +<a href="http://en.wikipedia.org/wiki/Authentication#Authorization">authentication and authorization</a>. A +cluster administrator may choose the approach used for each, and may also implement custom +mechanisms for either. 
Later sections describe the options available. To enable authentication + for the announcer, see <a href="#announcer-authentication">Announcer Authentication</a>.</p> + +<h1 id="authentication">Authentication</h1> + +<p>At a minimum, the scheduler must be configured with instructions for how to process authentication +credentials. There are currently two built-in authentication schemes - +<a href="http://en.wikipedia.org/wiki/Basic_access_authentication">HTTP Basic Authentication</a>, and +<a href="http://en.wikipedia.org/wiki/SPNEGO">SPNEGO</a> (Kerberos).</p> + +<h2 id="http-basic-authentication">HTTP Basic Authentication</h2> + +<p>Basic Authentication is a very quick way to add <em>some</em> security. It is supported +by all major browsers and HTTP client libraries with minimal work. However, +before relying on Basic Authentication you should be aware of the <a href="http://tools.ietf.org/html/rfc2617#section-4">security +considerations</a>.</p> + +<h3 id="server-configuration">Server Configuration</h3> + +<p>At a minimum you need to set 3 command-line flags on the scheduler:</p> +<pre class="highlight plaintext"><code>-http_authentication_mechanism=BASIC +-shiro_realm_modules=INI_AUTHNZ +-shiro_ini_path=path/to/security.ini +</code></pre> + +<p>And create a security.ini file like so:</p> +<pre class="highlight plaintext"><code>[users] +sally = apple, admin + +[roles] +admin = * +</code></pre> + +<p>The details of the security.ini file are explained below. Note that this file contains plaintext, +unhashed passwords.</p> + +<h3 id="client-configuration">Client Configuration</h3> + +<p>To configure the client for HTTP Basic authentication, add an entry to ~/.netrc with your credentials:</p> +<pre class="highlight plaintext"><code>% cat ~/.netrc +# ... + +machine aurora.example.com +login sally +password apple + +# ... 
+</code></pre> + +<p>No changes are required to <code>clusters.json</code>.</p> + +<h2 id="http-spnego-authentication-kerberos">HTTP SPNEGO Authentication (Kerberos)</h2> + +<h3 id="server-configuration">Server Configuration</h3> + +<p>At a minimum you need to set 5 command-line flags on the scheduler:</p> +<pre class="highlight plaintext"><code>-http_authentication_mechanism=NEGOTIATE +-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ +-kerberos_server_principal=HTTP/aurora.example....@example.com +-kerberos_server_keytab=path/to/aurora.example.com.keytab +-shiro_ini_path=path/to/security.ini +</code></pre> + +<p>And create a security.ini file like so:</p> +<pre class="highlight plaintext"><code>% cat path/to/security.ini +[users] +sally = _, admin + +[roles] +admin = * +</code></pre> + +<p>What’s going on here? First, Aurora must be configured to request Kerberos credentials when presented with an +unauthenticated request. This is achieved by setting:</p> +<pre class="highlight plaintext"><code>-http_authentication_mechanism=NEGOTIATE +</code></pre> + +<p>Next, a Realm module must be configured to <strong>authenticate</strong> the current request using the Kerberos +credentials that were requested. Aurora ships with a realm module that can do this:</p> +<pre class="highlight plaintext"><code>-shiro_realm_modules=KERBEROS5_AUTHN[,...] +</code></pre> + +<p>The Kerberos5Realm requires a keytab file and a server principal name. The principal name will usually +be in the form <code>HTTP/aurora.example....@example.com</code>.</p> +<pre class="highlight plaintext"><code>-kerberos_server_principal=HTTP/aurora.example....@example.com +-kerberos_server_keytab=path/to/aurora.example.com.keytab +</code></pre> + +<p>The Kerberos5 realm module is authentication-only. For scheduler security to work you must also +enable a realm module that provides an Authorizer implementation. 
For example, to do this using the +IniShiroRealmModule:</p> +<pre class="highlight plaintext"><code>-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ +</code></pre> + +<p>You can then configure authorization using a security.ini file as described below +(the password field is ignored). You must configure the realm module with the path to this file:</p> +<pre class="highlight plaintext"><code>-shiro_ini_path=path/to/security.ini +</code></pre> + +<h3 id="client-configuration">Client Configuration</h3> + +<p>To use Kerberos on the client-side you must build Kerberos-enabled client binaries. Do this with:</p> +<pre class="highlight plaintext"><code>./pants binary src/main/python/apache/aurora/kerberos:kaurora +./pants binary src/main/python/apache/aurora/kerberos:kaurora_admin +</code></pre> + +<p>You must also configure each cluster where you’ve enabled Kerberos on the scheduler +to use Kerberos authentication. Do this by setting <code>auth_mechanism</code> to <code>KERBEROS</code> +in <code>clusters.json</code>.</p> +<pre class="highlight plaintext"><code>% cat ~/.aurora/clusters.json +{ + "devcluster": { + "auth_mechanism": "KERBEROS", + ... + }, + ... +} +</code></pre> + +<h1 id="authorization">Authorization</h1> + +<p>Once a client’s claimed identity can be authenticated, we need to define what privileges that identity has.</p> + +<h2 id="using-an-ini-file-to-define-security-controls">Using an INI file to define security controls</h2> + +<p>The simplest security configuration for Aurora is an INI file on the scheduler. For small +clusters, or clusters where the users and access controls change relatively infrequently, this is +likely the preferred approach. 
However, you may want to avoid this approach if access permissions +are rapidly changing, or if your access control information already exists in another system.</p> + +<p>You can enable INI-based configuration with the following scheduler command line arguments:</p> +<pre class="highlight plaintext"><code>-http_authentication_mechanism=BASIC +-shiro_ini_path=path/to/security.ini +</code></pre> + +<p><em>note</em> As the argument name reveals, this is using Shiro’s +<a href="http://shiro.apache.org/configuration.html#Configuration-INIConfiguration">IniRealm</a> behind +the scenes.</p> + +<p>The INI file will contain two sections - users and roles. Here’s an example for what might +be in security.ini:</p> +<pre class="highlight plaintext"><code>[users] +sally = apple, admin +jim = 123456, accounting +becky = letmein, webapp +larry = 654321,accounting +steve = password + +[roles] +admin = * +accounting = thrift.AuroraAdmin:setQuota +webapp = thrift.AuroraSchedulerManager:*:webapp +</code></pre> + +<p>The users section defines user credentials and the role(s) they are members of. These lines +are of the format <code>&lt;user&gt; = &lt;password&gt;[, &lt;role&gt;...]</code>. As you probably noticed, the passwords are +in plaintext and as a result read access to this file should be restricted.</p> + +<p>In this configuration, each user has different privileges for actions in the cluster because +of the roles they are a part of:</p> + +<ul> +<li>admin is granted all privileges</li> +<li>accounting may adjust the amount of resource quota for any role</li> +<li>webapp represents a collection of jobs that represents a service, and its members may create and modify any jobs owned by it</li> +</ul> + +<h3 id="caveats">Caveats</h3> + +<p>You might find documentation on the Internet suggesting there are additional sections in <code>shiro.ini</code>, +like <code>[main]</code> and <code>[urls]</code>. 
These are not supported by Aurora as it uses a different mechanism to configure +those parts of Shiro. Think of Aurora’s <code>security.ini</code> as a subset with only <code>[users]</code> and <code>[roles]</code> sections.</p> + +<h2 id="implementing-delegated-authorization">Implementing Delegated Authorization</h2> + +<p>It is possible to leverage Shiro’s <code>runAs</code> feature by implementing a custom Servlet Filter that provides +the capability and passing its fully qualified class name to the command line argument +<code>-shiro_after_auth_filter</code>. The filter is registered in the same filter chain as the Shiro auth filters +and is placed after the Shiro auth filters in the filter chain. This ensures that the Filter is invoked +after the Shiro filters have had a chance to authenticate the request.</p> + +<h1 id="implementing-a-custom-realm">Implementing a Custom Realm</h1> + +<p>Since Aurora’s security is backed by <a href="https://shiro.apache.org">Apache Shiro</a>, you can implement a +custom <a href="http://shiro.apache.org/realm.html">Realm</a> to define organization-specific security behavior.</p> + +<p>In addition to using Shiro’s standard APIs to implement a Realm you can link against Aurora to +access the type-safe Permissions Aurora uses. 
See the Javadoc for <code>org.apache.aurora.scheduler.spi</code> +for more information.</p> + +<h2 id="packaging-a-realm-module">Packaging a realm module</h2> + +<p>Package your custom Realm(s) with a Guice module that exposes a <code>Set<Realm></code> multibinding.</p> +<pre class="highlight java"><code><span style="color: #000000;font-weight: bold">package</span> <span style="background-color: #f8f8f8">com</span><span style="color: #000000;font-weight: bold">.</span><span style="color: #008080">example</span><span style="color: #000000;font-weight: bold">;</span> + +<span style="color: #000000;font-weight: bold">import</span> <span style="color: #555555">com.google.inject.AbstractModule</span><span style="color: #000000;font-weight: bold">;</span> +<span style="color: #000000;font-weight: bold">import</span> <span style="color: #555555">com.google.inject.multibindings.Multibinder</span><span style="color: #000000;font-weight: bold">;</span> +<span style="color: #000000;font-weight: bold">import</span> <span style="color: #555555">org.apache.shiro.realm.Realm</span><span style="color: #000000;font-weight: bold">;</span> + +<span style="color: #000000;font-weight: bold">public</span> <span style="color: #000000;font-weight: bold">class</span> <span style="color: #445588;font-weight: bold">MyRealmModule</span> <span style="color: #000000;font-weight: bold">extends</span> <span style="background-color: #f8f8f8">AbstractModule</span> <span style="color: #000000;font-weight: bold">{</span> + <span style="color: #3c5d5d;font-weight: bold">@Override</span> + <span style="color: #000000;font-weight: bold">public</span> <span style="color: #445588;font-weight: bold">void</span> <span style="background-color: #f8f8f8">configure</span><span style="color: #000000;font-weight: bold">()</span> <span style="color: #000000;font-weight: bold">{</span> + <span style="background-color: #f8f8f8">Realm</span> <span style="background-color: #f8f8f8">myRealm</span> <span style="color: 
#000000;font-weight: bold">=</span> <span style="color: #000000;font-weight: bold">new</span> <span style="background-color: #f8f8f8">MyRealm</span><span style="color: #000000;font-weight: bold">();</span> + + <span style="background-color: #f8f8f8">Multibinder</span><span style="color: #000000;font-weight: bold">.</span><span style="color: #008080">newSetBinder</span><span style="color: #000000;font-weight: bold">(</span><span style="background-color: #f8f8f8">binder</span><span style="color: #000000;font-weight: bold">(),</span> <span style="background-color: #f8f8f8">Realm</span><span style="color: #000000;font-weight: bold">.</span><span style="color: #008080">class</span><span style="color: #000000;font-weight: bold">).</span><span style="color: #008080">addBinding</span><span style="color: #000000;font-weight: bold">().</span><span style="color: #008080">toInstance</span><span style="color: #000000;font-weight: bold">(</span><span style="background-color: #f8f8f8">myRealm</span><span style="color: #000000;font-weight: bold">);</span> + <span style="color: #000000;font-weight: bold">}</span> + + <span style="color: #000000;font-weight: bold">static</span> <span style="color: #000000;font-weight: bold">class</span> <span style="color: #445588;font-weight: bold">MyRealm</span> <span style="color: #000000;font-weight: bold">implements</span> <span style="background-color: #f8f8f8">Realm</span> <span style="color: #000000;font-weight: bold">{</span> + <span style="color: #999988;font-style: italic">// Realm implementation.</span> + <span style="color: #000000;font-weight: bold">}</span> +<span style="color: #000000;font-weight: bold">}</span> +</code></pre> + +<p>To use your module in the scheduler, include it as a realm module based on its fully-qualified +class name:</p> +<pre class="highlight plaintext"><code>-shiro_realm_modules=KERBEROS5_AUTHN,INI_AUTHNZ,com.example.MyRealmModule +</code></pre> + +<h1 id="known-issues">Known Issues</h1> + +<p>While the APIs 
and SPIs we ship with are stable as of 0.8.0, we are aware of several incremental +improvements. Please follow, vote, or send patches.</p> + +<p>Relevant tickets:</p> + +<ul> +<li><a href="https://issues.apache.org/jira/browse/AURORA-343">AURORA-343</a>: HTTPS support</li> +<li><a href="https://issues.apache.org/jira/browse/AURORA-1248">AURORA-1248</a>: Client retries 4xx errors</li> +<li><a href="https://issues.apache.org/jira/browse/AURORA-1279">AURORA-1279</a>: Remove kerberos-specific build targets</li> +<li><a href="https://issues.apache.org/jira/browse/AURORA-1291">AURORA-1293</a>: Consider defining a JSON format in place of INI</li> +<li><a href="https://issues.apache.org/jira/browse/AURORA-1179">AURORA-1179</a>: Supported hashed passwords in security.ini</li> +<li><a href="https://issues.apache.org/jira/browse/AURORA-1295">AURORA-1295</a>: Support security for the ReadOnlyScheduler service</li> +</ul> + +<h1 id="announcer-authentication">Announcer Authentication</h1> + +<p>The Thermos executor can be configured to authenticate with ZooKeeper and include +an <a href="https://zookeeper.apache.org/doc/current/zookeeperProgrammers.html#sc_ZooKeeperAccessControl">ACL</a> +on the nodes it creates, which will specify +the privileges of clients to perform different actions on these nodes. This +feature is enabled by specifying an ACL configuration file to the executor with the +<code>--announcer-zookeeper-auth-config</code> command line argument.</p> + +<p>When this feature is <em>not</em> enabled, nodes created by the executor will have ‘world/all’ permission +(<code>ZOO_OPEN_ACL_UNSAFE</code>). 
In most production environments, operators should specify an ACL and +limit access.</p> + +<h2 id="zookeeper-authentication-configuration">ZooKeeper Authentication Configuration</h2> + +<p>The configuration file must be formatted as JSON with the following schema:</p> +<pre class="highlight json"><code><span style="background-color: #f8f8f8">{</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"auth"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="background-color: #f8f8f8">[</span><span style="color: #bbbbbb"> + </span><span style="background-color: #f8f8f8">{</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"scheme"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #d14">"<scheme>"</span><span style="background-color: #f8f8f8">,</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"credential"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #d14">"<plain_credential>"</span><span style="color: #bbbbbb"> + </span><span style="background-color: #f8f8f8">}</span><span style="color: #bbbbbb"> + </span><span style="background-color: #f8f8f8">],</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"acl"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="background-color: #f8f8f8">[</span><span style="color: #bbbbbb"> + </span><span style="background-color: #f8f8f8">{</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"scheme"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #d14">"<scheme>"</span><span style="background-color: #f8f8f8">,</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"credential"</span><span style="background-color: 
#f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #d14">"<plain_credential>"</span><span style="background-color: #f8f8f8">,</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"permissions"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="background-color: #f8f8f8">{</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"read"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #a61717;background-color: #e3d2d2"><bool></span><span style="background-color: #f8f8f8">,</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"write"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #a61717;background-color: #e3d2d2"><bool></span><span style="background-color: #f8f8f8">,</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"create"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #a61717;background-color: #e3d2d2"><bool></span><span style="background-color: #f8f8f8">,</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"delete"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #a61717;background-color: #e3d2d2"><bool></span><span style="background-color: #f8f8f8">,</span><span style="color: #bbbbbb"> + </span><span style="color: #000080">"admin"</span><span style="background-color: #f8f8f8">:</span><span style="color: #bbbbbb"> </span><span style="color: #a61717;background-color: #e3d2d2"><bool></span><span style="color: #bbbbbb"> + </span><span style="background-color: #f8f8f8">}</span><span style="color: #bbbbbb"> + </span><span style="background-color: #f8f8f8">}</span><span style="color: #bbbbbb"> + </span><span style="background-color: 
#f8f8f8">]</span><span style="color: #bbbbbb"> +</span><span style="background-color: #f8f8f8">}</span><span style="color: #bbbbbb"> +</span></code></pre> + +<p>The <code>scheme</code> +defines the encoding of the credential field. Note that these fields are passed directly to +ZooKeeper (except in the case of <em>digest</em> scheme, where the executor will hash and encode +the credential appropriately before passing it to ZooKeeper). In addition to <code>acl</code>, a list of +authentication credentials must be provided in <code>auth</code> to use for the connection.</p> + +<p>All properties of the <code>permissions</code> object will default to False if not provided.</p> + +<h2 id="executor-settings">Executor settings</h2> + +<p>To enable the executor to authenticate against ZK, <code>--announcer-zookeeper-auth-config</code> should be +set to the configuration file.</p> + +</div> + + </div> + </div> + <div class="container-fluid section-footer buffer"> + <div class="container"> + <div class="row"> + <div class="col-md-2 col-md-offset-1"><h3>Quick Links</h3> + <ul> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/community/">Mailing Lists</a></li> + <li><a href="http://issues.apache.org/jira/browse/AURORA">Issue Tracking</a></li> + <li><a href="/documentation/latest/contributing/">How To Contribute</a></li> + </ul> + </div> + <div class="col-md-2"><h3>The ASF</h3> + <ul> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </div> + <div class="col-md-6"> + <p class="disclaimer">© 2014-2016 <a href="http://www.apache.org/">Apache Software Foundation</a>. Licensed under the <a href="http://www.apache.org/licenses/">Apache License v2.0</a>. 
The <a href="https://www.flickr.com/photos/trondk/12706051375/">Aurora Borealis IX photo</a> displayed on the homepage is available under a <a href="https://creativecommons.org/licenses/by-nc-nd/2.0/">Creative Commons BY-NC-ND 2.0 license</a>. Apache, Apache Aurora, and the Apache feather logo are trademarks of The Apache Software Foundation.</p> + </div> + </div> + </div> + + </body> +</html> Added: aurora/site/publish/documentation/0.16.0/operations/storage/index.html URL: http://svn.apache.org/viewvc/aurora/site/publish/documentation/0.16.0/operations/storage/index.html?rev=1762695&view=auto ============================================================================== --- aurora/site/publish/documentation/0.16.0/operations/storage/index.html (added) +++ aurora/site/publish/documentation/0.16.0/operations/storage/index.html Wed Sep 28 18:23:53 2016 @@ -0,0 +1,230 @@ +<!DOCTYPE html> +<html lang="en"> + <head> + <meta charset="utf-8"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <title>Apache Aurora</title> + <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css"> + <link href="/assets/css/main.css" rel="stylesheet"> + <!-- Analytics --> + <script type="text/javascript"> + var _gaq = _gaq || []; + _gaq.push(['_setAccount', 'UA-45879646-1']); + _gaq.push(['_setDomainName', 'apache.org']); + _gaq.push(['_trackPageview']); + + (function() { + var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; + ga.src = ('https:' == document.location.protocol ? 
'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; + var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); + })(); + </script> + </head> + <body> + <div class="container-fluid section-header"> + <div class="container"> + <div class="nav nav-bar"> + <a href="/"><img src="/assets/img/aurora_logo_dkbkg.svg" width="300" alt="Transparent Apache Aurora logo with dark background"/></a> + <ul class="nav navbar-nav navbar-right"> + <li><a href="/documentation/latest/">Documentation</a></li> + <li><a href="/community/">Community</a></li> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/blog/">Blog</a></li> + </ul> + </div> + </div> +</div> + + <div class="container-fluid"> + <div class="container content"> + <div class="col-md-12 documentation"> +<h5 class="page-header text-uppercase">Documentation +<select onChange="window.location.href='/documentation/' + this.value + '/operations/storage/'" + value="0.16.0"> + <option value="0.16.0" + selected="selected"> + 0.16.0 + (latest) + </option> + <option value="0.15.0" + > + 0.15.0 + </option> + <option value="0.14.0" + > + 0.14.0 + </option> + <option value="0.13.0" + > + 0.13.0 + </option> + <option value="0.12.0" + > + 0.12.0 + </option> + <option value="0.11.0" + > + 0.11.0 + </option> + <option value="0.10.0" + > + 0.10.0 + </option> + <option value="0.9.0" + > + 0.9.0 + </option> + <option value="0.8.0" + > + 0.8.0 + </option> + <option value="0.7.0-incubating" + > + 0.7.0-incubating + </option> + <option value="0.6.0-incubating" + > + 0.6.0-incubating + </option> + <option value="0.5.0-incubating" + > + 0.5.0-incubating + </option> +</select> +</h5> +<h1 id="aurora-scheduler-storage">Aurora Scheduler Storage</h1> + +<ul> +<li><a href="#overview">Overview</a></li> +<li><a href="#replicated-log-configuration">Replicated Log Configuration</a></li> +<li><a href="#replicated-log-configuration">Backup Configuration</a></li> +<li><a href="#storage-semantics">Storage 
Semantics</a> + +<ul> +<li><a href="#reads-writes-modifications">Reads, writes, modifications</a></li> +<li><a href="#read-lifecycle">Read lifecycle</a></li> +<li><a href="#write-lifecycle">Write lifecycle</a></li> +<li><a href="#atomicity-consistency-and-isolation">Atomicity, consistency and isolation</a></li> +<li><a href="#population-on-restart">Population on restart</a></li> +</ul></li> +</ul> + +<h2 id="overview">Overview</h2> + +<p>The Aurora scheduler maintains data that must be persisted to survive failovers and restarts. +For example:</p> + +<ul> +<li>Task configurations and scheduled task instances</li> +<li>Job update configurations and update progress</li> +<li>Production resource quotas</li> +<li>Mesos resource offer host attributes</li> +</ul> + +<p>Aurora solves its persistence needs by leveraging the Mesos implementation of a Paxos replicated +log <a href="https://ramcloud.stanford.edu/~ongaro/userstudy/paxos.pdf">[1]</a> +<a href="http://en.wikipedia.org/wiki/State_machine_replication">[2]</a> with key-value +<a href="https://github.com/google/leveldb">LevelDB</a> storage as the persistence medium.</p> + +<p>Conceptually, the storage system consists of the following major components:</p> + +<ul> +<li>Volatile storage: in-memory cache of all available data. Implemented via an in-memory +<a href="http://www.h2database.com/html/main.html">H2 Database</a> and accessed via +<a href="http://mybatis.github.io/mybatis-3/">MyBatis</a>.</li> +<li>Log manager: interface between Aurora storage and the Mesos replicated log. The default schema format +is <a href="https://github.com/apache/thrift">Thrift</a>. Data is stored in serialized binary form.</li> +<li>Snapshot manager: all data is periodically persisted in the Mesos replicated log in a single snapshot. +This helps establish periodic recovery checkpoints and speeds up volatile storage recovery on +restart.</li> +<li>Backup manager: as a precaution, snapshots are periodically written out into backup files. 
+This solves a <a href="../backup-restore/">disaster recovery problem</a> +in case of a complete loss or corruption of Mesos log files.</li> +</ul> + +<p><img alt="Storage hierarchy" src="../../images/storage_hierarchy.png" /></p> + +<h2 id="storage-semantics">Storage Semantics</h2> + +<p>This section covers implementation details of the Aurora storage system. Understanding them can be useful +when investigating performance issues.</p> + +<h3 id="reads-writes-modifications">Reads, writes, modifications</h3> + +<p>All services in Aurora access data via a set of predefined store interfaces (aka stores), logically +grouped by the type of data they serve. Every interface defines a specific set of operations allowed +on the data, thus abstracting out the storage access and the actual persistence implementation. The +latter is especially important given that persisted data is generally immutable. With the Mesos +replicated log as the underlying persistence solution, data can be read and written easily but not +modified. All modifications are simulated by saving new versions of modified objects. This constraint, +together with general performance considerations, justifies the existence of the volatile in-memory store.</p> + +<h4 id="read-lifecycle">Read lifecycle</h4> + +<p>There are two types of reads available in Aurora: consistent and weakly-consistent. The difference +is explained <a href="#atomicity-consistency-and-isolation">below</a>.</p> + +<p>All reads are served from the volatile storage, which makes reads generally cheap storage operations +from the performance standpoint. The majority of the volatile stores are backed by the +in-memory H2 database. This allows for rich schema definitions, queries and relationships that +key-value storage is unable to match.</p> + +<h4 id="write-lifecycle">Write lifecycle</h4> + +<p>Writes are more involved operations: in addition to updating the volatile store, data has to be +appended to the replicated log. 
Data is not available for reads until fully acknowledged by both +the replicated log and volatile storage.</p> + +<h3 id="atomicity-consistency-and-isolation">Atomicity, consistency and isolation</h3> + +<p>Aurora uses <a href="http://en.wikipedia.org/wiki/Write-ahead_logging">write-ahead logging</a> to ensure +consistency between replicated and volatile storage. In Aurora, data is first written into the +replicated log and only then updated in the volatile store.</p> + +<p>Aurora storage uses read-write locks to serialize data mutations and provide a consistent view of the +available data. The <code>Storage</code> interface exposes three major types of operations:</p> + +<ul> +<li><code>consistentRead</code>: access is locked using a reader’s lock and provides a consistent view on read.</li> +<li><code>weaklyConsistentRead</code>: access is lock-less. It delivers the best performance under contention but may result +in stale reads.</li> +<li><code>write</code>: access is fully serialized using a writer’s lock. The operation succeeds only if both the +volatile and replicated writes succeed.</li> +</ul> + +<p>The consistency of the volatile store is enforced via H2 transactional isolation.</p> + +<h3 id="population-on-restart">Population on restart</h3> + +<p>Any time a scheduler restarts, it restores its volatile state from the most recent position recorded +in the replicated log by restoring the snapshot and replaying individual log entries on top to fully +recover the state up to the last write.</p> + +</div> + + </div> + </div> + <div class="container-fluid section-footer buffer"> + <div class="container"> + <div class="row"> + <div class="col-md-2 col-md-offset-1"><h3>Quick Links</h3> + <ul> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/community/">Mailing Lists</a></li> + <li><a href="http://issues.apache.org/jira/browse/AURORA">Issue Tracking</a></li> + <li><a href="/documentation/latest/contributing/">How To Contribute</a></li> + </ul> + </div> + <div class="col-md-2"><h3>The ASF</h3> + <ul> + <li><a 
href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </div> + <div class="col-md-6"> + <p class="disclaimer">© 2014-2016 <a href="http://www.apache.org/">Apache Software Foundation</a>. Licensed under the <a href="http://www.apache.org/licenses/">Apache License v2.0</a>. The <a href="https://www.flickr.com/photos/trondk/12706051375/">Aurora Borealis IX photo</a> displayed on the homepage is available under a <a href="https://creativecommons.org/licenses/by-nc-nd/2.0/">Creative Commons BY-NC-ND 2.0 license</a>. Apache, Apache Aurora, and the Apache feather logo are trademarks of The Apache Software Foundation.</p> + </div> + </div> + </div> + + </body> +</html> Added: aurora/site/publish/documentation/0.16.0/reference/client-cluster-configuration/index.html URL: http://svn.apache.org/viewvc/aurora/site/publish/documentation/0.16.0/reference/client-cluster-configuration/index.html?rev=1762695&view=auto ============================================================================== --- aurora/site/publish/documentation/0.16.0/reference/client-cluster-configuration/index.html (added) +++ aurora/site/publish/documentation/0.16.0/reference/client-cluster-configuration/index.html Wed Sep 28 18:23:53 2016 @@ -0,0 +1,258 @@ +<!DOCTYPE html> +<html lang="en"> + <head> + <meta charset="utf-8"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <title>Apache Aurora</title> + <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css"> + <link href="/assets/css/main.css" rel="stylesheet"> + <!-- Analytics --> + <script type="text/javascript"> + var _gaq = _gaq || []; + _gaq.push(['_setAccount', 'UA-45879646-1']); + _gaq.push(['_setDomainName', 'apache.org']); + 
_gaq.push(['_trackPageview']); + + (function() { + var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; + ga.src = ('https:' == document.location.protocol ? 'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; + var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); + })(); + </script> + </head> + <body> + <div class="container-fluid section-header"> + <div class="container"> + <div class="nav nav-bar"> + <a href="/"><img src="/assets/img/aurora_logo_dkbkg.svg" width="300" alt="Transparent Apache Aurora logo with dark background"/></a> + <ul class="nav navbar-nav navbar-right"> + <li><a href="/documentation/latest/">Documentation</a></li> + <li><a href="/community/">Community</a></li> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/blog/">Blog</a></li> + </ul> + </div> + </div> +</div> + + <div class="container-fluid"> + <div class="container content"> + <div class="col-md-12 documentation"> +<h5 class="page-header text-uppercase">Documentation +<select onChange="window.location.href='/documentation/' + this.value + '/reference/client-cluster-configuration/'" + value="0.16.0"> + <option value="0.16.0" + selected="selected"> + 0.16.0 + (latest) + </option> + <option value="0.15.0" + > + 0.15.0 + </option> + <option value="0.14.0" + > + 0.14.0 + </option> + <option value="0.13.0" + > + 0.13.0 + </option> + <option value="0.12.0" + > + 0.12.0 + </option> + <option value="0.11.0" + > + 0.11.0 + </option> + <option value="0.10.0" + > + 0.10.0 + </option> + <option value="0.9.0" + > + 0.9.0 + </option> + <option value="0.8.0" + > + 0.8.0 + </option> + <option value="0.7.0-incubating" + > + 0.7.0-incubating + </option> + <option value="0.6.0-incubating" + > + 0.6.0-incubating + </option> + <option value="0.5.0-incubating" + > + 0.5.0-incubating + </option> +</select> +</h5> +<h1 id="client-cluster-configuration">Client Cluster Configuration</h1> + +<p>A cluster configuration file 
is used by the Aurora client to describe the Aurora clusters with +which it can communicate. Ultimately this allows client users to reference clusters with short names +like us-east and eu.</p> + +<p>A cluster configuration is formatted as JSON. The simplest cluster configuration is one that +communicates with a single (non-leader-elected) scheduler. For example:</p> +<pre class="highlight plaintext"><code>[{ + "name": "example", + "scheduler_uri": "http://localhost:55555" +}] +</code></pre> + +<p>A configuration for a leader-elected scheduler would contain something like:</p> +<pre class="highlight plaintext"><code>[{ + "name": "example", + "zk": "192.168.33.7", + "scheduler_zk_path": "/aurora/scheduler" +}] +</code></pre> + +<p>The following properties may be set:</p> + +<table><thead> +<tr> +<th style="text-align: left"><strong>Property</strong></th> +<th style="text-align: left"><strong>Type</strong></th> +<th style="text-align: left"><strong>Description</strong></th> +</tr> +</thead><tbody> +<tr> +<td style="text-align: left"><strong>name</strong></td> +<td style="text-align: left">String</td> +<td style="text-align: left">Cluster name (Required)</td> +</tr> +<tr> +<td style="text-align: left"><strong>slave_root</strong></td> +<td style="text-align: left">String</td> +<td style="text-align: left">Path to Mesos agent work dir (Required)</td> +</tr> +<tr> +<td style="text-align: left"><strong>slave_run_directory</strong></td> +<td style="text-align: left">String</td> +<td style="text-align: left">Name of Mesos agent run dir (Required)</td> +</tr> +<tr> +<td style="text-align: left"><strong>zk</strong></td> +<td style="text-align: left">String</td> +<td style="text-align: left">Hostname of ZooKeeper instance used to resolve Aurora schedulers.</td> +</tr> +<tr> +<td style="text-align: left"><strong>zk_port</strong></td> +<td style="text-align: left">Integer</td> +<td style="text-align: left">Port of ZooKeeper instance used to locate Aurora schedulers 
(Default: 2181)</td> +</tr> +<tr> +<td style="text-align: left"><strong>scheduler_zk_path</strong></td> +<td style="text-align: left">String</td> +<td style="text-align: left">ZooKeeper path under which scheduler instances are registered.</td> +</tr> +<tr> +<td style="text-align: left"><strong>scheduler_uri</strong></td> +<td style="text-align: left">String</td> +<td style="text-align: left">URI of Aurora scheduler instance.</td> +</tr> +<tr> +<td style="text-align: left"><strong>proxy_url</strong></td> +<td style="text-align: left">String</td> +<td style="text-align: left">Used by the client to format URLs for display.</td> +</tr> +<tr> +<td style="text-align: left"><strong>auth_mechanism</strong></td> +<td style="text-align: left">String</td> +<td style="text-align: left">The authentication mechanism to use when communicating with the scheduler. (Default: UNAUTHENTICATED)</td> +</tr> +</tbody></table> + +<h2 id="details">Details</h2> + +<h3 id="name"><code>name</code></h3> + +<p>The name of the Aurora cluster represented by this entry. This name will be the <code>cluster</code> portion of +any job keys identifying jobs running within the cluster.</p> + +<h3 id="slave_root"><code>slave_root</code></h3> + +<p>The path on the Mesos agents where executing tasks can be found. It is used in combination with the +<code>slave_run_directory</code> property by <code>aurora task run</code> and <code>aurora task ssh</code> to change into the sandbox +directory after connecting to the host. This value should match the value passed to <code>mesos-slave</code> +as <code>-work_dir</code>.</p> + +<h3 id="slave_run_directory"><code>slave_run_directory</code></h3> + +<p>The name of the directory where the task run can be found. This is used in combination with the +<code>slave_root</code> property by <code>aurora task run</code> and <code>aurora task ssh</code> to change into the sandbox +directory after connecting to the host. 
This should almost always be set to <code>latest</code>.</p> + +<h3 id="zk"><code>zk</code></h3> + +<p>The hostname of the ZooKeeper instance used to resolve the Aurora scheduler. Aurora uses ZooKeeper +to elect a leader. The client will connect to this ZooKeeper instance to determine the current +leader. This host should match the host passed to the scheduler as <code>-zk_endpoints</code>.</p> + +<h3 id="zk_port"><code>zk_port</code></h3> + +<p>The port on which the ZooKeeper instance is running. If not set, this defaults to the standard +ZooKeeper port of 2181. This port should match the port in the host passed to the scheduler as +<code>-zk_endpoints</code>.</p> + +<h3 id="scheduler_zk_path"><code>scheduler_zk_path</code></h3> + +<p>The path on the ZooKeeper instance under which the Aurora serverset is registered. This value should +match the value passed to the scheduler as <code>-serverset_path</code>.</p> + +<h3 id="scheduler_uri"><code>scheduler_uri</code></h3> + +<p>The URI of the scheduler. This would be used in place of the ZooKeeper-related configuration above +in circumstances where direct communication with a single scheduler is needed (e.g. testing +environments). It is strongly advised to <strong>never</strong> use this property for production deployments.</p> + +<h3 id="proxy_url"><code>proxy_url</code></h3> + +<p>If <code>proxy_url</code> is set, its value is used as the base URL instead of the hostname of the leading +scheduler. In that scenario, the value for <code>proxy_url</code> would be, for example, the +URL of your VIP in a load balancer or a round-robin DNS name.</p> + +<h3 id="auth_mechanism"><code>auth_mechanism</code></h3> + +<p>The identifier of an authentication mechanism that the client should use when communicating with the +scheduler. 
Support for values other than <code>UNAUTHENTICATED</code> requires a matching scheduler-side +<a href="../../operations/security/">security configuration</a>.</p> + +</div> + + </div> + </div> + <div class="container-fluid section-footer buffer"> + <div class="container"> + <div class="row"> + <div class="col-md-2 col-md-offset-1"><h3>Quick Links</h3> + <ul> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/community/">Mailing Lists</a></li> + <li><a href="http://issues.apache.org/jira/browse/AURORA">Issue Tracking</a></li> + <li><a href="/documentation/latest/contributing/">How To Contribute</a></li> + </ul> + </div> + <div class="col-md-2"><h3>The ASF</h3> + <ul> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </div> + <div class="col-md-6"> + <p class="disclaimer">© 2014-2016 <a href="http://www.apache.org/">Apache Software Foundation</a>. Licensed under the <a href="http://www.apache.org/licenses/">Apache License v2.0</a>. The <a href="https://www.flickr.com/photos/trondk/12706051375/">Aurora Borealis IX photo</a> displayed on the homepage is available under a <a href="https://creativecommons.org/licenses/by-nc-nd/2.0/">Creative Commons BY-NC-ND 2.0 license</a>. 
Apache, Apache Aurora, and the Apache feather logo are trademarks of The Apache Software Foundation.</p> + </div> + </div> + </div> + + </body> +</html> Added: aurora/site/publish/documentation/0.16.0/reference/client-commands/index.html URL: http://svn.apache.org/viewvc/aurora/site/publish/documentation/0.16.0/reference/client-commands/index.html?rev=1762695&view=auto ============================================================================== --- aurora/site/publish/documentation/0.16.0/reference/client-commands/index.html (added) +++ aurora/site/publish/documentation/0.16.0/reference/client-commands/index.html Wed Sep 28 18:23:53 2016 @@ -0,0 +1,456 @@ +<!DOCTYPE html> +<html lang="en"> + <head> + <meta charset="utf-8"> + <meta name="viewport" content="width=device-width, initial-scale=1"> + <title>Apache Aurora</title> + <link rel="stylesheet" href="https://maxcdn.bootstrapcdn.com/bootstrap/3.3.1/css/bootstrap.min.css"> + <link href="/assets/css/main.css" rel="stylesheet"> + <!-- Analytics --> + <script type="text/javascript"> + var _gaq = _gaq || []; + _gaq.push(['_setAccount', 'UA-45879646-1']); + _gaq.push(['_setDomainName', 'apache.org']); + _gaq.push(['_trackPageview']); + + (function() { + var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; + ga.src = ('https:' == document.location.protocol ? 
'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; + var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); + })(); + </script> + </head> + <body> + <div class="container-fluid section-header"> + <div class="container"> + <div class="nav nav-bar"> + <a href="/"><img src="/assets/img/aurora_logo_dkbkg.svg" width="300" alt="Transparent Apache Aurora logo with dark background"/></a> + <ul class="nav navbar-nav navbar-right"> + <li><a href="/documentation/latest/">Documentation</a></li> + <li><a href="/community/">Community</a></li> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/blog/">Blog</a></li> + </ul> + </div> + </div> +</div> + + <div class="container-fluid"> + <div class="container content"> + <div class="col-md-12 documentation"> +<h5 class="page-header text-uppercase">Documentation +<select onChange="window.location.href='/documentation/' + this.value + '/reference/client-commands/'" + value="0.16.0"> + <option value="0.16.0" + selected="selected"> + 0.16.0 + (latest) + </option> + <option value="0.15.0" + > + 0.15.0 + </option> + <option value="0.14.0" + > + 0.14.0 + </option> + <option value="0.13.0" + > + 0.13.0 + </option> + <option value="0.12.0" + > + 0.12.0 + </option> + <option value="0.11.0" + > + 0.11.0 + </option> + <option value="0.10.0" + > + 0.10.0 + </option> + <option value="0.9.0" + > + 0.9.0 + </option> + <option value="0.8.0" + > + 0.8.0 + </option> + <option value="0.7.0-incubating" + > + 0.7.0-incubating + </option> + <option value="0.6.0-incubating" + > + 0.6.0-incubating + </option> + <option value="0.5.0-incubating" + > + 0.5.0-incubating + </option> +</select> +</h5> +<h1 id="aurora-client-commands">Aurora Client Commands</h1> + +<ul> +<li><a href="#introduction">Introduction</a></li> +<li><a href="#cluster-configuration">Cluster Configuration</a></li> +<li><a href="#job-keys">Job Keys</a></li> +<li><a href="#modifying-aurora-client-commands">Modifying Aurora Client 
Commands</a></li> +<li><a href="#regular-jobs">Regular Jobs</a> + +<ul> +<li><a href="#creating-and-running-a-job">Creating and Running a Job</a></li> +<li><a href="#running-a-command-on-a-running-job">Running a Command On a Running Job</a></li> +<li><a href="#killing-a-job">Killing a Job</a></li> +<li><a href="#adding-instances">Adding Instances</a></li> +<li><a href="#updating-a-job">Updating a Job</a> + +<ul> +<li><a href="#coordinated-job-updates">Coordinated job updates</a></li> +</ul></li> +<li><a href="#renaming-a-job">Renaming a Job</a></li> +<li><a href="#restarting-jobs">Restarting Jobs</a></li> +</ul></li> +<li><a href="#cron-jobs">Cron Jobs</a></li> +<li><a href="#comparing-jobs">Comparing Jobs</a></li> +<li><a href="#viewingexamining-jobs">Viewing/Examining Jobs</a> + +<ul> +<li><a href="#listing-jobs">Listing Jobs</a></li> +<li><a href="#inspecting-a-job">Inspecting a Job</a></li> +<li><a href="#versions">Versions</a></li> +<li><a href="#checking-your-quota">Checking Your Quota</a></li> +<li><a href="#finding-a-job-on-web-ui">Finding a Job on Web UI</a></li> +<li><a href="#getting-job-status">Getting Job Status</a></li> +<li><a href="#opening-the-web-ui">Opening the Web UI</a></li> +<li><a href="#sshing-to-a-specific-task-machine">SSHing to a Specific Task Machine</a></li> +<li><a href="#templating-command-arguments">Templating Command Arguments</a></li> +</ul></li> +</ul> + +<h2 id="introduction">Introduction</h2> + +<p>Once you have written an <code>.aurora</code> configuration file that describes +your Job and its parameters and functionality, you interact with Aurora +using Aurora Client commands. This document describes all of these commands +and how and when to use them. All Aurora Client commands start with +<code>aurora</code>, followed by the name of the specific command and its +arguments.</p> + +<p><em>Job keys</em> are a very common argument to Aurora commands, as well as the +gateway to useful information about a Job. 
Before using Aurora, you +should read the next section which describes them in detail. The section +after that briefly describes how you can modify the behavior of certain +Aurora Client commands, linking to a detailed document about how to do +that.</p> + +<p>This is followed by the Regular Jobs section, which describes the basic +Client commands for creating, running, and manipulating Aurora Jobs. +After that are sections on Comparing Jobs and Viewing/Examining Jobs. In +other words, various commands for getting information and metadata about +Aurora Jobs.</p> + +<h2 id="cluster-configuration">Cluster Configuration</h2> + +<p>The client must be able to find a configuration file that specifies available clusters. This file +declares shorthand names for clusters, which are in turn referenced by job configuration files +and client commands.</p> + +<p>The client will load at most two configuration files, making both of their defined clusters +available. The first is intended to be a system-installed cluster, using the path specified in +the environment variable <code>AURORA_CONFIG_ROOT</code>, defaulting to <code>/etc/aurora/clusters.json</code> if the +environment variable is not set. The second is a user-installed file, located at +<code>~/.aurora/clusters.json</code>.</p> + +<p>For more details on cluster configuration see the +<a href="../client-cluster-configuration/">Client Cluster Configuration</a> documentation.</p> + +<h2 id="job-keys">Job Keys</h2> + +<p>A job key is a unique system-wide identifier for an Aurora-managed +Job, for example <code>cluster1/web-team/test/experiment204</code>. It is a 4-tuple +consisting of, in order, <em>cluster</em>, <em>role</em>, <em>environment</em>, and +<em>jobname</em>, separated by /s. Cluster is the name of an Aurora +cluster. Role is the Unix service account under which the Job +runs. 
Environment is a namespace component like <code>devel</code>, <code>test</code>, +<code>prod</code>, or <code>stagingN</code>. Jobname is the Job’s name.</p> + +<p>The combination of all four values uniquely specifies the Job. If any +one value is different from that of another job key, the two job keys +refer to different Jobs. For example, the job keys +<code>cluster1/tyg/prod/workhorse</code>, +<code>cluster1/tyg/prod/workcamel</code>, +<code>cluster2/tyg/prod/workhorse</code>, +<code>cluster2/foo/prod/workhorse</code>, and +<code>cluster1/tyg/test/workhorse</code> all refer to different Jobs.</p> + +<p>Role names are user accounts existing on the agent machines. If you don’t know what accounts +are available, contact your sysadmin.</p> + +<p>Environment names are namespaces; you can count on <code>prod</code>, <code>devel</code> and <code>test</code> existing.</p> + +<h2 id="modifying-aurora-client-commands">Modifying Aurora Client Commands</h2> + +<p>For certain Aurora Client commands, you can define hook methods that run +either before or after an action that takes place during the command’s +execution, or that run depending on whether the action succeeded or failed. +Basically, a hook is code that lets you extend the +command’s actions. 
The hook executes on the client side, specifically on +the machine executing Aurora commands.</p> + +<p>Hooks can be associated with these Aurora Client commands.</p> + +<ul> +<li><code>job create</code></li> +<li><code>job kill</code></li> +<li><code>job restart</code></li> +</ul> + +<p>The process for writing and activating them is complex enough +that we explain it in a devoted document, <a href="../client-hooks/">Hooks for Aurora Client API</a>.</p> + +<h2 id="regular-jobs">Regular Jobs</h2> + +<p>This section covers Aurora commands related to running, killing, +renaming, updating, and restarting a basic Aurora Job.</p> + +<h3 id="creating-and-running-a-job">Creating and Running a Job</h3> +<pre class="highlight plaintext"><code>aurora job create <job key> <configuration file> +</code></pre> + +<p>Creates and then runs a Job with the specified job key based on a <code>.aurora</code> configuration file. +The configuration file may also contain and activate hook definitions.</p> + +<h3 id="running-a-command-on-a-running-job">Running a Command On a Running Job</h3> +<pre class="highlight plaintext"><code>aurora task run CLUSTER/ROLE/ENV/NAME[/INSTANCES] <cmd> +</code></pre> + +<p>Runs a shell command on all machines currently hosting shards of a +single Job.</p> + +<p><code>run</code> supports the same command line wildcards used to populate a Job’s +commands; i.e. anything in the <code>{{mesos.*}}</code> and <code>{{thermos.*}}</code> +namespaces.</p> + +<h3 id="killing-a-job">Killing a Job</h3> +<pre class="highlight plaintext"><code>aurora job killall CLUSTER/ROLE/ENV/NAME +</code></pre> + +<p>Kills all Tasks associated with the specified Job, blocking until all +are terminated. Defaults to killing all instances in the Job.</p> + +<p>The <code><configuration file></code> argument for <code>kill</code> is optional. 
Use it only +if it contains hook definitions and activations that affect the +kill command.</p> + +<h3 id="adding-instances">Adding Instances</h3> +<pre class="highlight plaintext"><code>aurora job add CLUSTER/ROLE/ENV/NAME/INSTANCE <count> +</code></pre> + +<p>Adds <code><count></code> instances to the existing job. The configuration of the new instances is derived from +the active job instance identified by the <code>/INSTANCE</code> part of the job specification. This command is +a simpler way to scale out an existing job when an instance with the desired task configuration +already exists. Use <code>aurora update start</code> to add instances with a new (updated) configuration.</p> + +<h3 id="updating-a-job">Updating a Job</h3> + +<p>You can manage job updates using the <code>aurora update</code> command. Please see +<a href="../../features/job-updates/">the Job Update documentation</a> for more details.</p> + +<h3 id="renaming-a-job">Renaming a Job</h3> + +<p>Renaming is a tricky operation as downstream clients must be informed of +the new name. A conservative approach +to renaming suitable for production services is:</p> + +<ol> +<li>Modify the Aurora configuration file to change the role, +environment, and/or name as appropriate to the standardized naming +scheme.</li> +<li><p>Check that only these naming components have changed +with <code>aurora job diff</code>.</p> +<pre class="highlight plaintext"><code>aurora job diff CLUSTER/ROLE/ENV/NAME <job_configuration> +</code></pre></li> +<li><p>Create the (identical) job at the new key. You may need to request a +temporary quota increase.</p> +<pre class="highlight plaintext"><code>aurora job create CLUSTER/ROLE/ENV/NEW_NAME <job_configuration> +</code></pre></li> +<li><p>Migrate all clients over to the new job key. Update all links and +dashboards. 
Ensure that both job keys run identical versions of the +code while in this state.</p></li> +<li><p>After verifying that all clients have successfully moved over, kill +the old job.</p> +<pre class="highlight plaintext"><code>aurora job killall CLUSTER/ROLE/ENV/NAME +</code></pre></li> +<li><p>If you received a temporary quota increase, be sure to let the +powers that be know you no longer need the additional capacity.</p></li> +</ol> + +<h3 id="restarting-jobs">Restarting Jobs</h3> + +<p><code>restart</code> restarts all shards of the Job identified by the given job key:</p> +<pre class="highlight plaintext"><code>aurora job restart CLUSTER/ROLE/ENV/NAME[/INSTANCES] +</code></pre> + +<p>Restarts are controlled on the client side, so aborting +the <code>job restart</code> command halts the restart operation.</p> + +<p><strong>Note</strong>: <code>job restart</code> only applies its command line arguments; it neither +uses nor is affected by <code>update.config</code>. Restarting +does <strong><em>not</em></strong> involve a configuration change. To update the +configuration, use <code>update.config</code>.</p> + +<p>The <code>--config</code> argument for restart is optional. Use it only +if it contains hook definitions and activations that affect the +<code>job restart</code> command.</p> + +<h2 id="cron-jobs">Cron Jobs</h2> + +<p>You can manage cron jobs using the <code>aurora cron</code> command. Please see +<a href="../../features/cron-jobs/">the Cron Jobs Feature</a> for more details.</p> + +<h2 id="comparing-jobs">Comparing Jobs</h2> +<pre class="highlight plaintext"><code>aurora job diff CLUSTER/ROLE/ENV/NAME <job configuration> +</code></pre> + +<p>Compares a job configuration against a running job. 
By default the diff +is determined using <code>diff</code>, though you may choose an alternate +diff program by specifying the <code>DIFF_VIEWER</code> environment variable.</p> + +<h2 id="viewing-examining-jobs">Viewing/Examining Jobs</h2> + +<p>Above we discussed creating, killing, and updating Jobs. Here we discuss +how to view and examine Jobs.</p> + +<h3 id="listing-jobs">Listing Jobs</h3> +<pre class="highlight plaintext"><code>aurora config list <job configuration> +</code></pre> + +<p>Lists all Jobs registered with the Aurora scheduler in the named cluster for the named role.</p> + +<h3 id="inspecting-a-job">Inspecting a Job</h3> +<pre class="highlight plaintext"><code>aurora job inspect CLUSTER/ROLE/ENV/NAME <job configuration> +</code></pre> + +<p><code>inspect</code> verifies that the specified job can be parsed from a +configuration file, and displays the parsed configuration.</p> + +<h3 id="checking-your-quota">Checking Your Quota</h3> +<pre class="highlight plaintext"><code>aurora quota get CLUSTER/ROLE +</code></pre> + +<p>Prints the production quota allocated to the role at the given +cluster. Only non-<a href="../../features/constraints/#dedicated-attribute">dedicated</a> +<a href="../configuration/#job-objects">production</a> jobs consume quota.</p> + +<h3 id="finding-a-job-on-web-ui">Finding a Job on Web UI</h3> + +<p>When you create a job, part of the output response contains a URL that goes +to the job’s scheduler UI page. 
For example:</p> +<pre class="highlight plaintext"><code>vagrant@precise64:~$ aurora job create devcluster/www-data/prod/hello /vagrant/examples/jobs/hello_world.aurora +INFO] Creating job hello +INFO] Response from scheduler: OK (message: 1 new tasks pending for job www-data/prod/hello) +INFO] Job url: http://precise64:8081/scheduler/www-data/prod/hello +</code></pre> + +<p>You can go to the scheduler UI page for this job via <code>http://precise64:8081/scheduler/www-data/prod/hello</code>. +To reach the overall scheduler UI page, truncate that URL after <code>scheduler</code>: <code>http://precise64:8081/scheduler</code>.</p> + +<p>Once you click through to a role page, you see Jobs arranged +separately by pending jobs, active jobs and finished jobs. +Jobs are arranged by role, typically a service account for +production jobs and user accounts for test or development jobs.</p> + +<h3 id="getting-job-status">Getting Job Status</h3> +<pre class="highlight plaintext"><code>aurora job status <job_key> +</code></pre> + +<p>Returns the status of recent tasks associated with the +Job identified by <code>job_key</code>, in the cluster that the key names. Typically this includes +a mix of active tasks (running or assigned) and inactive tasks +(successful, failed, and lost).</p> + +<h3 id="opening-the-web-ui">Opening the Web UI</h3> + +<p>Use the Job’s web UI scheduler URL or the <code>aurora job status</code> command to find out on which +machines individual tasks are scheduled. You can open the web UI via the +<code>open</code> command if invoked from your machine:</p> +<pre class="highlight plaintext"><code>aurora job open [<cluster>[/<role>[/<env>/<job_name>]]] +</code></pre> + +<p>If only the cluster is specified, it goes directly to that cluster’s +scheduler main page. If the role is specified, it goes to the top-level +role page. 
If the full job key is specified, it goes directly to the job +page where you can inspect individual tasks.</p> + +<h3 id="sshing-to-a-specific-task-machine">SSHing to a Specific Task Machine</h3> +<pre class="highlight plaintext"><code>aurora task ssh <job_key> <shard number> +</code></pre> + +<p>You can have the Aurora client ssh directly to the machine that has been +assigned a particular Job/shard number. This may be useful for quickly +diagnosing issues such as performance issues or abnormal behavior on a +particular machine.</p> + +<h3 id="templating-command-arguments">Templating Command Arguments</h3> +<pre class="highlight plaintext"><code>aurora task run [-e] [-t THREADS] <job_key> -- <<command-line>> +</code></pre> + +<p>Given a job specification, run the supplied command on all hosts and +return the output. You may use the standard Mustache templating rules:</p> + +<ul> +<li><code>{{thermos.ports[name]}}</code> substitutes the specific named port of the +task assigned to this machine</li> +<li><code>{{mesos.instance}}</code> substitutes the shard id of the job’s task +assigned to this machine</li> +<li><code>{{thermos.task_id}}</code> substitutes the task id of the job’s task +assigned to this machine</li> +</ul> + +<p>For example, the following type of pattern can be a powerful diagnostic +tool:</p> +<pre class="highlight plaintext"><code>aurora task run -t5 cluster1/tyg/devel/seizure -- \ + 'curl -s -m1 localhost:{{thermos.ports[http]}}/vars | grep uptime' +</code></pre> + +<p>By default, the command runs in the Task’s sandbox. The <code>-e</code> option can +run the command in the executor’s sandbox. 
This is mostly useful for +Aurora administrators.</p> + +<p>You can parallelize the runs by using the <code>-t</code> option.</p> + +</div> + + </div> + </div> + <div class="container-fluid section-footer buffer"> + <div class="container"> + <div class="row"> + <div class="col-md-2 col-md-offset-1"><h3>Quick Links</h3> + <ul> + <li><a href="/downloads/">Downloads</a></li> + <li><a href="/community/">Mailing Lists</a></li> + <li><a href="http://issues.apache.org/jira/browse/AURORA">Issue Tracking</a></li> + <li><a href="/documentation/latest/contributing/">How To Contribute</a></li> + </ul> + </div> + <div class="col-md-2"><h3>The ASF</h3> + <ul> + <li><a href="http://www.apache.org/licenses/">License</a></li> + <li><a href="http://www.apache.org/foundation/sponsorship.html">Sponsorship</a></li> + <li><a href="http://www.apache.org/foundation/thanks.html">Thanks</a></li> + <li><a href="http://www.apache.org/security/">Security</a></li> + </ul> + </div> + <div class="col-md-6"> + <p class="disclaimer">© 2014-2016 <a href="http://www.apache.org/">Apache Software Foundation</a>. Licensed under the <a href="http://www.apache.org/licenses/">Apache License v2.0</a>. The <a href="https://www.flickr.com/photos/trondk/12706051375/">Aurora Borealis IX photo</a> displayed on the homepage is available under a <a href="https://creativecommons.org/licenses/by-nc-nd/2.0/">Creative Commons BY-NC-ND 2.0 license</a>. Apache, Apache Aurora, and the Apache feather logo are trademarks of The Apache Software Foundation.</p> + </div> + </div> + </div> + + </body> +</html>