Repository: aurora
Updated Branches:
  refs/heads/master b429612ef -> c4903d873


Introduce a flag to treat RAM as a revocable resource

We plan to open source a very simple Mesos ResourceEstimator and QosController 
that supports RAM and CPU oversubscription (ETA ~2 weeks). We have been using 
it internally with a patched Aurora version where the hardcoded 
`isMesosRevocable` flag of RAM has been set to `true`. This patch makes this 
behaviour configurable.
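
The effect of reading a flag from inside an enum constant can be sketched as follows. This is a minimal illustration, not Aurora code: `Flag`, `Settings` and `Resource` are hypothetical stand-ins for `Arg`, `ResourceSettings` and `ResourceType`. It assumes only standard Java semantics, namely that enum constants are constructed once, when the enum class is initialized, so the flag value has to be in place before the enum is first used.

```java
public class RevocableFlagSketch {

  // Hypothetical stand-in for a parsed command line flag with a default value.
  static final class Flag {
    private final boolean defaultValue;
    private Boolean parsed;

    Flag(boolean defaultValue) {
      this.defaultValue = defaultValue;
    }

    void set(boolean value) {
      this.parsed = value;
    }

    boolean get() {
      return parsed != null ? parsed : defaultValue;
    }
  }

  // Hypothetical settings holder with the same defaults as the patch:
  // CPUs revocable by default, RAM not.
  static final class Settings {
    static final Flag ENABLE_REVOCABLE_CPUS = new Flag(true);
    static final Flag ENABLE_REVOCABLE_RAM = new Flag(false);

    private Settings() { }
  }

  // The flag is read exactly once, when this enum class is initialized.
  enum Resource {
    CPUS(Settings.ENABLE_REVOCABLE_CPUS.get()),
    RAM_MB(Settings.ENABLE_REVOCABLE_RAM.get()),
    DISK_MB(false);

    private final boolean revocable;

    Resource(boolean revocable) {
      this.revocable = revocable;
    }

    boolean isRevocable() {
      return revocable;
    }
  }

  public static void main(String[] args) {
    // Simulates parsing "-enable_revocable_ram=true" before the enum is touched.
    Settings.ENABLE_REVOCABLE_RAM.set(true);
    System.out.println(Resource.RAM_MB.isRevocable());

    // Changing the flag after the enum has been initialized has no effect.
    Settings.ENABLE_REVOCABLE_RAM.set(false);
    System.out.println(Resource.RAM_MB.isRevocable());
  }
}
```

In this sketch both lines print `true`: the second `set` call comes too late because `RAM_MB` has already been constructed. That is the ordering constraint any design of this shape has to respect.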

Reviewed at https://reviews.apache.org/r/51807/


Project: http://git-wip-us.apache.org/repos/asf/aurora/repo
Commit: http://git-wip-us.apache.org/repos/asf/aurora/commit/c4903d87
Tree: http://git-wip-us.apache.org/repos/asf/aurora/tree/c4903d87
Diff: http://git-wip-us.apache.org/repos/asf/aurora/diff/c4903d87

Branch: refs/heads/master
Commit: c4903d873d090549ebdf9a07110851b5aad7d978
Parents: b429612
Author: Stephan Erb <[email protected]>
Authored: Tue Sep 13 00:09:29 2016 +0200
Committer: Stephan Erb <[email protected]>
Committed: Tue Sep 13 00:09:29 2016 +0200

----------------------------------------------------------------------
 RELEASE-NOTES.md                                |  2 ++
 docs/features/resource-isolation.md             |  6 ++--
 docs/operations/configuration.md                |  5 +++
 docs/reference/scheduler-configuration.md       | 24 ++++++++++---
 .../scheduler/resources/ResourceSettings.java   | 37 ++++++++++++++++++++
 .../scheduler/resources/ResourceType.java       |  6 ++--
 6 files changed, 70 insertions(+), 10 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/aurora/blob/c4903d87/RELEASE-NOTES.md
----------------------------------------------------------------------
diff --git a/RELEASE-NOTES.md b/RELEASE-NOTES.md
index bbf7198..4476d52 100644
--- a/RELEASE-NOTES.md
+++ b/RELEASE-NOTES.md
@@ -35,6 +35,8 @@
   schedulers up. A rolling upgrade would result in no leading scheduler for the duration of the
   roll which could be confusing to monitor and debug.
 - Add a new MTTS (Median Time To Starting) metric in addition to MTTA and MTTR.
+- In addition to CPU resources, RAM resources can now be treated as revocable via the scheduler
+  commandline flag `-enable_revocable_ram`.
 
 ### Deprecations and removals:
 

http://git-wip-us.apache.org/repos/asf/aurora/blob/c4903d87/docs/features/resource-isolation.md
----------------------------------------------------------------------
diff --git a/docs/features/resource-isolation.md b/docs/features/resource-isolation.md
index 01c5b40..503f2de 100644
--- a/docs/features/resource-isolation.md
+++ b/docs/features/resource-isolation.md
@@ -168,9 +168,9 @@ via the concept of revocable tasks. In contrast to non-revocable tasks, revocabl
 Mesos reserves the right to throttle or even kill them if they might affect existing high-priority
 user-facing services.
 
-As of today, the only revocable resource supported by Aurora are CPU resources. A job can opt-in to
-use those by specifying the `revocable` [Configuration Tier](../features/multitenancy.md#configuration-tiers).
-A revocable job will only be scheduled using revocable CPU resources, even if there are plenty of
+As of today, the only revocable resource supported by Aurora are CPU and RAM resources. A job can
+opt-in to use those by specifying the `revocable` [Configuration Tier](../features/multitenancy.md#configuration-tiers).
+A revocable job will only be scheduled using revocable resources, even if there are plenty of
 non-revocable resources available.
 
 The Aurora scheduler must be [configured to receive revocable offers](../operations/configuration.md#resource-isolation)

http://git-wip-us.apache.org/repos/asf/aurora/blob/c4903d87/docs/operations/configuration.md
----------------------------------------------------------------------
diff --git a/docs/operations/configuration.md b/docs/operations/configuration.md
index 90dde57..203f3be 100644
--- a/docs/operations/configuration.md
+++ b/docs/operations/configuration.md
@@ -126,6 +126,11 @@ and then set set this Aurora scheduler flag to allow receiving revocable Mesos o
 
     -receive_revocable_resources=true
 
+Both CPUs and RAM are supported as revocable resources. The former is enabled by the default,
+the latter needs to be enabled via:
+
+    -enable_revocable_ram=true
+
 Unless you want to use the [default](../../src/main/resources/org/apache/aurora/scheduler/tiers.json)
 tier configuration, you will also have to specify a file path:
 

http://git-wip-us.apache.org/repos/asf/aurora/blob/c4903d87/docs/reference/scheduler-configuration.md
----------------------------------------------------------------------
diff --git a/docs/reference/scheduler-configuration.md b/docs/reference/scheduler-configuration.md
index 87d2cde..31be714 100644
--- a/docs/reference/scheduler-configuration.md
+++ b/docs/reference/scheduler-configuration.md
@@ -22,6 +22,8 @@ Required flags:
        Max number of idle connections to the database via MyBatis
 -framework_authentication_file
        Properties file which contains framework credentials to authenticate with Mesosmaster. Must contain the properties 'aurora_authentication_principal' and 'aurora_authentication_secret'.
+-ip
+       The ip address to listen. If not set, the scheduler will listen on all interfaces.
 -mesos_master_address [not null]
        Address for the mesos master, can be a socket address or zookeeper path.
 -mesos_role
@@ -34,12 +36,16 @@ Required flags:
        Path to the thermos executor entry point.
 -tier_config [file must be readable]
        Configuration file defining supported task tiers, task traits and behaviors.
+-webhook_config [file must exist, file must be readable]
+       Path to webhook configuration file.
 -zk_endpoints [must have at least 1 item]
        Endpoint specification for the ZooKeeper servers.
 
 Optional flags:
 -allow_docker_parameters (default false)
        Allow to pass docker container parameters in the job.
+-allow_gpu_resource (default false)
+       Allow jobs to request Mesos GPU resource.
 -allowed_container_types (default [MESOS])
        Container types that are allowed to be used by jobs.
 -async_slot_stat_update_interval (default (1, mins))
@@ -76,10 +82,16 @@ Optional flags:
        List of domains for which CORS support should be enabled.
 -enable_h2_console (default false)
        Enable H2 DB management console.
+-enable_mesos_fetcher (default false)
+       Allow jobs to pass URIs to the Mesos Fetcher. Note that enabling this feature could pose a privilege escalation threat.
 -enable_preemptor (default true)
        Enable the preemptor and preemption
+-enable_revocable_cpus (default true)
+       Treat CPUs as a revocable resource.
+-enable_revocable_ram (default false)
+       Treat RAM as a revocable resource.
 -executor_user (default root)
-       User to start the executor. Defaults to "root". Set this to an unprivileged user if the mesos master was started with "--no-root_submissions". If set to anything other than "root", the executor will ignore the "role" setting for jobs since it can't use setuid() anymore. This means that all your jobs will run under the specified user and the user has to exist on the mesos slaves.
+       User to start the executor. Defaults to "root". Set this to an unprivileged user if the mesos master was started with "--no-root_submissions". If set to anything other than "root", the executor will ignore the "role" setting for jobs since it can't use setuid() anymore. This means that all your jobs will run under the specified user and the user has to exist on the Mesos agents.
 -first_schedule_delay (default (1, ms))
        Initial amount of time to wait before first attempting to schedule a PENDING task.
 -flapping_task_threshold (default (5, mins))
@@ -163,7 +175,7 @@ Optional flags:
 -offer_hold_jitter_window (default (1, mins))
        Maximum amount of random jitter to add to the offer hold time window.
 -offer_reservation_duration (default (3, mins))
-       Time to reserve a slave's offers while trying to satisfy a task preempting another.
+       Time to reserve a agent's offers while trying to satisfy a task preempting another.
 -populate_discovery_info (default false)
        If true, Aurora populates DiscoveryInfo field of Mesos TaskInfo.
 -preemption_delay (default (3, mins))
@@ -174,6 +186,10 @@ Optional flags:
        Time interval between pending task preemption slot searches.
 -receive_revocable_resources (default false)
        Allows receiving revocable resource offers from Mesos.
+-reconciliation_explicit_batch_interval (default (5, secs))
+       Interval between explicit batch reconciliation requests.
+-reconciliation_explicit_batch_size (default 1000) [must be > 0]
+       Number of tasks in a single batch request sent to Mesos for explicit reconciliation.
 -reconciliation_explicit_interval (default (60, mins))
        Interval on which scheduler will ask Mesos for status updates of all non-terminal tasks known to scheduler.
 -reconciliation_implicit_interval (default (60, mins))
@@ -186,7 +202,7 @@ Optional flags:
        If false, Docker tasks may run without an executor (EXPERIMENTAL)
 -shiro_ini_path
        Path to shiro.ini for authentication and authorization configuration.
--shiro_realm_modules (default [org.apache.aurora.scheduler.app.MoreModules$1@13c9d689])
+-shiro_realm_modules (default [org.apache.aurora.scheduler.app.MoreModules$1@158a8276])
        Guice modules for configuring Shiro Realms.
 -sla_non_prod_metrics (default [])
        Metric categories collected for non production tasks.
@@ -218,8 +234,6 @@ Optional flags:
        Whether to use the experimental database-backed task store.
 -viz_job_url_prefix (default )
        URL prefix for job container stats.
--webhook_config [file must be readable]
-    File to configure a HTTP webhook to receive task state change events.
 -zk_chroot_path
        chroot path to use for the ZooKeeper connections
 -zk_digest_credentials

http://git-wip-us.apache.org/repos/asf/aurora/blob/c4903d87/src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java b/src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java
new file mode 100644
index 0000000..c49fd06
--- /dev/null
+++ b/src/main/java/org/apache/aurora/scheduler/resources/ResourceSettings.java
@@ -0,0 +1,37 @@
+/**
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.aurora.scheduler.resources;
+
+import org.apache.aurora.common.args.Arg;
+import org.apache.aurora.common.args.CmdLine;
+
+/**
+ * Control knobs for how Aurora treats different resource types.
+ *
+ * The command line handling seen here is non-standard. Normally we declare them in modules
+ * and then inject them via 'settings' classes. Unfortunately, this does not work here as we
+ * would need to perform the injection into the ResourceType enum. Enums are picky in that regard.
+ */
+final class ResourceSettings {
+
+  @CmdLine(name = "enable_revocable_cpus", help = "Treat CPUs as a revocable resource.")
+  static final Arg<Boolean> ENABLE_REVOCABLE_CPUS = Arg.create(true);
+
+  @CmdLine(name = "enable_revocable_ram", help = "Treat RAM as a revocable resource.")
+  static final Arg<Boolean> ENABLE_REVOCABLE_RAM = Arg.create(false);
+
+  private ResourceSettings() {
+
+  }
+}

http://git-wip-us.apache.org/repos/asf/aurora/blob/c4903d87/src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java
----------------------------------------------------------------------
diff --git a/src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java b/src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java
index 4c102a3..e1a5dce 100644
--- a/src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java
+++ b/src/main/java/org/apache/aurora/scheduler/resources/ResourceType.java
@@ -36,6 +36,8 @@ import static org.apache.aurora.scheduler.resources.AuroraResourceConverter.STRI
 import static org.apache.aurora.scheduler.resources.MesosResourceConverter.RANGES;
 import static org.apache.aurora.scheduler.resources.MesosResourceConverter.SCALAR;
 import static org.apache.aurora.scheduler.resources.ResourceMapper.PORT_MAPPER;
+import static org.apache.aurora.scheduler.resources.ResourceSettings.ENABLE_REVOCABLE_CPUS;
+import static org.apache.aurora.scheduler.resources.ResourceSettings.ENABLE_REVOCABLE_RAM;
 
 /**
  * Describes Mesos resource types and their Aurora traits.
@@ -55,7 +57,7 @@ public enum ResourceType implements TEnum {
       "core(s)",
       16,
       false,
-      true),
+      ENABLE_REVOCABLE_CPUS.get()),
 
   /**
    * RAM resource.
@@ -70,7 +72,7 @@ public enum ResourceType implements TEnum {
       "MB",
       Amount.of(24, GB).as(MB),
       false,
-      false),
+      ENABLE_REVOCABLE_RAM.get()),
 
   /**
    * DISK resource.
