Repository: mesos
Updated Branches:
  refs/heads/master 0e101e266 -> fa976c22a
Added documentation for shared resources.

Review: https://reviews.apache.org/r/45967/

Project: http://git-wip-us.apache.org/repos/asf/mesos/repo
Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/fa976c22
Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/fa976c22
Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/fa976c22

Branch: refs/heads/master
Commit: fa976c22ac66ff5c905157a5a36bda1d21525b32
Parents: 0e101e2
Author: Anindya Sinha <anindya_si...@apple.com>
Authored: Thu Oct 13 08:34:21 2016 -0700
Committer: Jiang Yan Xu <xuj...@apple.com>
Committed: Thu Oct 13 08:34:21 2016 -0700

----------------------------------------------------------------------
 docs/home.md             |   1 +
 docs/shared-resources.md | 164 ++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 165 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/mesos/blob/fa976c22/docs/home.md
----------------------------------------------------------------------
diff --git a/docs/home.md b/docs/home.md
index ad59eb1..1c6b191 100644
--- a/docs/home.md
+++ b/docs/home.md
@@ -51,6 +51,7 @@ layout: documentation
 * [Quota](quota.md) for how to configure Mesos to provide guaranteed resource allocations for use by a role.
 * [Reservation](reservation.md) for how operators and frameworks can reserve resources on individual agents for use by a role.
 * [Replicated Log](replicated-log-internals.md) for information on the Mesos replicated log.
+* [Shared Resources](shared-resources.md) for how to allow tasks to set persistent volumes as shared.
 
 ## APIs
 * [Scheduler HTTP API](scheduler-http-api.md) describes the new HTTP API for communication between schedulers and the Mesos master.
http://git-wip-us.apache.org/repos/asf/mesos/blob/fa976c22/docs/shared-resources.md
----------------------------------------------------------------------
diff --git a/docs/shared-resources.md b/docs/shared-resources.md
new file mode 100644
index 0000000..29e4338
--- /dev/null
+++ b/docs/shared-resources.md
@@ -0,0 +1,164 @@
+---
+layout: documentation
+---
+
+# Shared Persistent Volumes
+
+## Overview
+
+Mesos already provides a mechanism to create persistent volumes. The persistent
+volumes that are created on a specific agent are offered to the framework(s)
+as resources. As a result, an executor/container which needs access to that
+volume can use that resource. While the executor/container using the
+persistent volume is running, the volume cannot be offered again to any
+framework and hence is not available to another executor/container.
+
+Currently, access to regular persistent volumes is exclusive to a single
+container/executor at a time. Shared persistent volumes allow multiple
+executors/containers access to the same persistent volume simultaneously.
+Simultaneous access to a single shared persistent volume is not isolated.
+
+The epic for this feature is
+[MESOS-3421](https://issues.apache.org/jira/browse/MESOS-3421).
+
+Please refer to the following documents:
+
+* [Persistent Volumes](persistent-volume.md): Documentation for details
+  regarding persistent volumes available in Mesos.
+* [Reservation](reservation.md): Documentation for details regarding
+  reservation mechanisms available in Mesos.
+
+Additional references:
+
+* Talk at MesosCon Europe 2016 in Amsterdam on August 31, 2016 entitled
+  ["Practical Persistent Volumes"]
+  (http://schd.ws/hosted_files/mesosconeu2016/08/MesosConEurope2016PPVv1.0.pdf).
+
+## Framework opt in for shared resources
+
+A new `FrameworkInfo::Capability`, viz. `SHARED_RESOURCES`, is added for a
+framework to indicate that the framework is willing to accept shared resources.
+If the framework registers itself with this capability, offers
+shall contain the shared persistent volumes.
+
+## Creation of shared Persistent Volume
+
+The framework can create a shared persistent volume using the existing
+persistent volume workflow. See usage examples for Scheduler API and
+Operator HTTP Endpoints in [Persistent Volumes](persistent-volume.md).
+
+Suppose a framework receives a resource offer with 2048 MB of dynamically
+reserved disk.
+
+```
+{
+  "id" : <offer_id>,
+  "framework_id" : <framework_id>,
+  "slave_id" : <slave_id>,
+  "hostname" : <hostname>,
+  "resources" : [
+    {
+      "name" : "disk",
+      "type" : "SCALAR",
+      "scalar" : { "value" : 2048 },
+      "role" : <framework_role>,
+      "reservation" : {
+        "principal" : <framework_principal>
+      }
+    }
+  ]
+}
+```
+
+When creating a new persistent volume, the framework can mark it as shared
+by setting the `shared` attribute.
+
+```
+{
+  "type" : Offer::Operation::CREATE,
+  "create": {
+    "volumes" : [
+      {
+        "name" : "disk",
+        "type" : "SCALAR",
+        "scalar" : { "value" : 2048 },
+        "role" : <framework_role>,
+        "reservation" : {
+          "principal" : <framework_principal>
+        },
+        "disk": {
+          "persistence": {
+            "id" : <persistent_volume_id>
+          },
+          "volume" : {
+            "container_path" : <container_path>,
+            "mode" : <mode>
+          }
+        },
+        "shared" : {
+        }
+      }
+    ]
+  }
+}
+```
+
+If this succeeds, a subsequent resource offer will contain the shared
+persistent volume:
+
+```
+{
+  "id" : <offer_id>,
+  "framework_id" : <framework_id>,
+  "slave_id" : <slave_id>,
+  "hostname" : <hostname>,
+  "resources" : [
+    {
+      "name" : "disk",
+      "type" : "SCALAR",
+      "scalar" : { "value" : 2048 },
+      "role" : <framework_role>,
+      "reservation" : {
+        "principal" : <framework_principal>
+      },
+      "disk": {
+        "persistence": {
+          "id" : <persistent_volume_id>
+        },
+        "volume" : {
+          "container_path" : <container_path>,
+          "mode" : <mode>
+        }
+      },
+      "shared": {
+      }
+    }
+  ]
+}
+```
+
+The rest of the basic workflow is identical to the regular persistent volumes.
+The same shared persistent volume is offered to frameworks
+of the same role in different offer cycles.
+
+## Unique Shared Persistent Volume Features
+
+### Launching multiple tasks on the same shared persistent volume
+
+Since a shared persistent volume is offered to frameworks even when that
+volume is being used by an executor/container, a framework can launch
+additional tasks using the same shared persistent volume. Moreover,
+multiple tasks using the same shared persistent volume can be launched
+in a single `ACCEPT` call (and do not necessarily need to be spread across
+multiple `ACCEPT` calls).
+
+### Destroying shared persistent volumes
+
+Since a shared persistent volume is offered to frameworks even if the
+volume is currently in use, presence of a shared volume in the offer
+does not automatically make that volume eligible to be destroyed. On
+receipt of a `DESTROY`, the shared volume is destroyed only if there is
+no running or pending task using that shared volume. If the volume is
+already assigned to one or more executors/containers, the `DESTROY` of
+the shared volume will not succeed, and that shared volume will
+be offered in the next cycle.
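
For readers of this commit, the `CREATE` payload and the `DESTROY` rule documented in the new shared-resources.md can be sketched in a few lines of Python. This is only an illustration under stated assumptions and is not part of the commit: it does not talk to Mesos, the function names `make_shared_create` and `can_destroy` are hypothetical, the string `"CREATE"` stands in for `Offer::Operation::CREATE`, and the task records (dicts with `volume_ids` and `state` keys) are an invented stand-in for whatever bookkeeping the master actually performs.

```python
def make_shared_create(role, principal, volume_id, container_path, mode, size_mb):
    """Build a CREATE operation (as a plain dict) for one shared volume.

    Hypothetical helper: mirrors the JSON layout shown in the documentation.
    """
    volume = {
        "name": "disk",
        "type": "SCALAR",
        "scalar": {"value": size_mb},
        "role": role,
        "reservation": {"principal": principal},
        "disk": {
            "persistence": {"id": volume_id},
            "volume": {"container_path": container_path, "mode": mode},
        },
        # The presence of an (empty) "shared" object marks the volume as shared.
        "shared": {},
    }
    return {"type": "CREATE", "create": {"volumes": [volume]}}


def can_destroy(volume_id, tasks):
    """Return True only if no running or pending task uses the volume.

    Assumed task shape: {"volume_ids": [...], "state": "PENDING" | "RUNNING" | ...}.
    """
    return not any(
        volume_id in task["volume_ids"] and task["state"] in ("PENDING", "RUNNING")
        for task in tasks
    )
```

Under these assumptions, `can_destroy("vol-1", tasks)` stays false while any pending or running task still lists `vol-1`, which matches the documented behaviour that a shared volume appearing in an offer is not by itself eligible for destruction.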