Repository: mesos Updated Branches: refs/heads/master 92fb8e7ee -> 6549194d0
Improved documentation for shared persistent volumes. Review: https://reviews.apache.org/r/52967/ Project: http://git-wip-us.apache.org/repos/asf/mesos/repo Commit: http://git-wip-us.apache.org/repos/asf/mesos/commit/6549194d Tree: http://git-wip-us.apache.org/repos/asf/mesos/tree/6549194d Diff: http://git-wip-us.apache.org/repos/asf/mesos/diff/6549194d Branch: refs/heads/master Commit: 6549194d034932ba9277cfa939b3ca0d752dfd5c Parents: 92fb8e7 Author: Neil Conway <neil.con...@gmail.com> Authored: Tue Oct 18 09:20:25 2016 -0700 Committer: Jiang Yan Xu <xuj...@apple.com> Committed: Tue Oct 18 09:20:25 2016 -0700 ---------------------------------------------------------------------- docs/home.md | 2 +- docs/persistent-volume.md | 11 ++- docs/shared-resources.md | 169 +++++++++++++++++------------------------ 3 files changed, 80 insertions(+), 102 deletions(-) ---------------------------------------------------------------------- http://git-wip-us.apache.org/repos/asf/mesos/blob/6549194d/docs/home.md ---------------------------------------------------------------------- diff --git a/docs/home.md b/docs/home.md index 1c6b191..f47f7f9 100644 --- a/docs/home.md +++ b/docs/home.md @@ -51,7 +51,7 @@ layout: documentation * [Quota](quota.md) for how to configure Mesos to provide guaranteed resource allocations for use by a role. * [Reservation](reservation.md) for how operators and frameworks can reserve resources on individual agents for use by a role. * [Replicated Log](replicated-log-internals.md) for information on the Mesos replicated log. -* [Shared Resources](shared-resources.md) for how to allow tasks to set persistent volumes as shared. +* [Shared Resources](shared-resources.md) for how to share persistent volumes between tasks managed by different executors on the same agent. ## APIs * [Scheduler HTTP API](scheduler-http-api.md) describes the new HTTP API for communication between schedulers and the Mesos master. http://git-wip-us.apache.org/repos/asf/mesos/blob/6549194d/docs/persistent-volume.md ---------------------------------------------------------------------- diff --git a/docs/persistent-volume.md b/docs/persistent-volume.md index 5dfbbf1..410993f 100644 --- a/docs/persistent-volume.md +++ b/docs/persistent-volume.md @@ -32,6 +32,14 @@ regarding reservation mechanisms available in Mesos. Persistent volumes can also be created on isolated and auxiliary disks by reserving [multiple disk resources](multiple-disk.md). +By default, a persistent volume cannot be shared between tasks running +being managed by different executors: that is, once a task is launched +using a persistent volume, that volume will not appear in any resource +offers until the task has finished running. _Shared_ volumes are a type +of persistent volumes that can be accessed by multiple tasks at the same +agent simultaneously; see the documentation on +[shared volumes](shared-resources.md) for more information. + Persistent volumes can be created by __operators__ and authorized __frameworks__. By default, frameworks and operators can create volumes for _any_ role and destroy _any_ persistent volume. [Authorization](authorization.md) @@ -485,4 +493,5 @@ endpoints for creating and destroying volumes. Mesos 0.28 introduced support for master endpoint to include detailed information about persistent volumes and dynamic reservations. Mesos 1.0 changed the semantics of destroying a volume: in previous releases, destroying a volume would remove the Mesos-level metadata, -but would not remove the volume's data from the agent's filesystem. +but would not remove the volume's data from the agent's filesystem. Mesos 1.1 +introduced support for [shared persistent volumes](shared-resources.md). http://git-wip-us.apache.org/repos/asf/mesos/blob/6549194d/docs/shared-resources.md ---------------------------------------------------------------------- diff --git a/docs/shared-resources.md b/docs/shared-resources.md index 29e4338..85cd3cf 100644 --- a/docs/shared-resources.md +++ b/docs/shared-resources.md @@ -6,49 +6,26 @@ layout: documentation ## Overview -Mesos already provides a mechanism to create persistent volumes. The persistent -volumes that are created on a specific agent is offered to the framework(s) -as resources. As a result, an executor/container which needs access to that -volume can use that resource. While the executor/container using the -persistent volume is running, it cannot be offered again to any framework -and hence would not be available to another executor/container. +By default, [persistent volumes](persistent-volume.md) provide +_exclusive_ access: once a task is launched using a persistent volume, +no other tasks can use that volume, and the volume will not appear in +any resource offers until the task that is using it has finished. -Currently, access to regular persistent volumes is exclusive to a single -container/executor at a time. Shared persistent volumes allow multiple -executor/containers access to the same persistent volume simultaneously. -Simulatenous access to a single shared persistent volume is not isolated. +In some cases, it can be useful to share a volume between multiple tasks +running on the same agent. For example, this could be used to +efficiently share a large data set between multiple data analysis tasks. -The epic for this feature is -[MESOS-3421](https://issues.apache.org/jira/browse/MESOS-3421). +## Creating Shared Volumes -Please refer to the following documents: +Shared persistent volumes are created using the same workflow as normal +persistent volumes: by starting with a +[reserved resource](reservation.md) and applying a `CREATE` operation, +either via the framework scheduler API or the +[/create-volumes](endpoints/master/create-volumes.md) HTTP endpoint. To +create a shared volume, set the `shared` field during volume creation. -* [Persistent Volumes](persistent-volume.md): Documentation for details - regarding persistent volumes available in Mesos. -* [Reservation](reservation.md): Documentation for details regarding - reservation mechanisms available in Mesos. - -Additional references: - -* Talk at MesosCon Europe 2016 at Amsterdam on August 31, 2016 entitled - ["Practical Persistent Volumes"] - (http://schd.ws/hosted_files/mesosconeu2016/08/MesosConEurope2016PPVv1.0.pdf). - -## Framework opt in for shared resources - -A new `FrameworkInfo::Capability`, viz. `SHARED_RESOURCES` is added for a -framework to indicate that the framework is willing to accept shared -resources. If the framework registers itself with this capability, offers -shall contain the shared persistent volumes. - -## Creation of shared Persistent Volume - -The framework can create a shared persistent volume using the existing -persistent volume workflow. See usage examples for Scheduler API and -Operator HTTP Endpoints in [Persistent Volumes](persistent-volume.md). - -Suppose a framework receives a resource offer with 2048 MB of dynamically -reserved disk. +For example, suppose a framework receives a resource offer containing +2048MB of dynamically reserved disk: ``` { @@ -70,8 +47,8 @@ reserved disk. } ``` -When creating a new persistent volume, the framework can marked it as shared -by setting the shared attribute. +The framework can create a shared persistent volume using this disk +resource via the following offer operation: ``` { @@ -103,62 +80,54 @@ by setting the shared attribute. } ``` -If this succeeds, a subsequent resource offer will contain the shared -persistent volume: - -``` -{ - "id" : <offer_id>, - "framework_id" : <framework_id>, - "slave_id" : <slave_id>, - "hostname" : <hostname>, - "resources" : [ - { - "name" : "disk", - "type" : "SCALAR", - "scalar" : { "value" : 2048 }, - "role" : <framework_role>, - "reservation" : { - "principal" : <framework_principal> - }, - "disk": { - "persistence": { - "id" : <persistent_volume_id> - }, - "volume" : { - "container_path" : <container_path>, - "mode" : <mode> - } - }, - "shared": { - } - } - ] -} -``` - -The rest of the basic workflow is identical to the regular persistent -volumes. The same shared persistent volume is offered to frameworks -of the same role in different offer cycles. - -## Unique Shared Persistent Volume Features - -### Launching multiple tasks on the same shared persistent volume - -Since a shared persisent volume is offered to frameworks even when that -volume is being used by an executor/container, a framework can launch -additional tasks using the same shared persistent volume. Moreover, -multiple tasks using the same shared persistent volume can be launched -in a single `ACCEPT` call (and does not necessarily need to spread across -multiple `ACCEPT` calls). - -### Destroying shared persistent volumes - -Since a shared persistent volume is offered to frameworks even if the -volume is currently in use, presence of a shared volume in the offer -does not automatically make that volume eligible to be destroyed. On -receipt of a `DESTROY`, the shared volume is destroyed only if there is -no running or pending task using that shared volume. If the volume is -already assigned to one or more executors/containers, the DESTROY of -the shared volume shall not be successful; and that shared volume will -be offered in the next cycle. +Note that the `shared` field has been set (to an empty JSON object), +which indicates that the `CREATE` operation will create a shared volume. + +## Using Shared Volumes + +To be eligible to receive resource offers that contain shared volumes, a +framework must enable the `SHARED_RESOURCES` capability in the +`FrameworkInfo` it provides when it registers with the master. +Frameworks that do _not_ enable this capability will not be offered +shared resources. + +When a framework receives a resource offer, it can determine whether a +volume is shared by checking if the `shared` field has been set. Unlike +normal persistent volumes, a shared volume that is in use by a task will +continue to be offered to the frameworks in the volume's role; this +gives those frameworks the opportunity to launch additional tasks that +can access the volume. A framework can also launch multiple tasks that +access the volume using a single `ACCEPT` call. + +Note that Mesos does not provide any isolation or concurrency control +between the tasks that are sharing a volume. Framework developers should +ensure that tasks that access the same volume do not conflict with one +another. This can be done via careful application-level concurrency +control, or by ensuring that the tasks access the volume in a read-only +manner. Mesos provides support for read-only access to volumes: as +described in the [persistent volume](persistent-volume.md) +documentation, tasks that are launched on a volume can specify a `mode` +of `"RO"` to use the volume in read-only mode. + +### Destroying Shared Volumes + +A persistent volume, whether shared or not, can only be destroyed if no +running or pending tasks have been launched using the volume. For +non-shared volumes, it is usually easy to determine when it is safe to +delete a volume. For shared volumes, the framework(s) that have launched +tasks using the volume typically need to coordinate to ensure (e.g., via +reference counting) that a volume is no longer being used before it is +destroyed. + +### Resource Allocation + +TODO: how do shared volumes influence resource allocation? + +## References + +* [MESOS-3421](https://issues.apache.org/jira/browse/MESOS-3421) + contains additional information about the implementation of this + feature. + +* Talk at MesosCon Europe 2016 on August 31, 2016 entitled + "[Practical Persistent Volumes](http://schd.ws/hosted_files/mesosconeu2016/08/MesosConEurope2016PPVv1.0.pdf)".