I'd suggest adding yourself as a watcher on https://issues.apache.org/jira/browse/MESOS-4424, as a design doc will be circulated shortly.
The current thinking, as Vikrama said, is to add a generic device-level GPU isolator with no vendor library dependencies. This would be an initial step, since it means no resource discovery (operators have to specify --resources) and no resource utilization monitoring. One thought I have is whether we can avoid the vendor-specific library dependencies by using the binary wrappers (e.g. nvidia-smi).

On Tue, Jan 19, 2016 at 10:00 AM, Charles Allen <[email protected]> wrote:

> Out of curiosity, is isolation something that will be guaranteed by
> "standard" kernel mechanisms (like cgroups), or is it functionality that
> requires using the driver/library directly?
>
> Internally I have some POCs where a standard set of services always runs
> in a cgroup, and another cgroup is set up for Mesos tasks (that was easier
> than writing a custom Mesos resource methodology). In such a scenario I'm
> curious how well Mesos and non-Mesos tasks on the same system would
> interact with the resource isolation.
>
> On Mon, Jan 18, 2016 at 8:23 AM Vikrama Ditya <[email protected]> wrote:
>
>> A clarification on isolation: Ben and we (Nvidia) are working to
>> introduce GPU as a first-class resource in Mesos.
>>
>> By default there is no isolation, but there will be an isolation module
>> for Nvidia GPU devices which can be linked at build time and provide
>> isolation for GPU tasks among GPU devices. Initially there will be
>> device-level isolation, assuming all tasks use the same device libraries
>> (hence no filesystem isolation). Our initial proposal does not expose
>> details of the GPU, but subsequently more detail of GPU resources
>> (topology, memory, cores, bandwidth, etc.) will be exposed to do a
>> better job of scheduling.
>>
>> As Ben indicated, very soon we will send out a design proposal to the
>> community for comments.
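The binary-wrapper idea above could look roughly like the following — a minimal sketch, assuming the agent shells out to something like `nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader` and parses its CSV output. The sample output string below is illustrative only, not captured from a real machine:

```python
import csv
import io

def parse_gpu_query(output):
    """Parse CSV output in the style of
    `nvidia-smi --query-gpu=index,name,memory.total --format=csv,noheader`
    into a list of per-device dicts."""
    gpus = []
    for row in csv.reader(io.StringIO(output)):
        index, name, memory = (field.strip() for field in row)
        gpus.append({"index": int(index), "name": name, "memory": memory})
    return gpus

# Illustrative sample of what such a query might print on a 4-GPU box:
sample = """\
0, Tesla K40m, 11519 MiB
1, Tesla K40m, 11519 MiB
2, Tesla K40m, 11519 MiB
3, Tesla K40m, 11519 MiB
"""

gpus = parse_gpu_query(sample)
print(len(gpus))  # the count an operator would otherwise hand-write as gpus:N
```

This would give resource discovery (and, with `utilization.gpu` in the query, utilization monitoring) without linking any vendor library at build time, at the cost of depending on the wrapper binary's output format.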
>>
>> Regards
>> --
>> Vikram
>>
>> *From:* Benjamin Mahler [mailto:[email protected]]
>> *Sent:* Saturday, January 16, 2016 4:31 PM
>> *To:* [email protected]
>> *Subject:* Re: Share GPU resources via attributes or as custom resources
>> (INTERNAL)
>>
>> There is a design proposal coming that will include guidance around
>> using GPUs and better GPU support in Mesos, so stay tuned.
>>
>> Mesos supports adding arbitrary resources, e.g.:
>>
>> --resources=cpus(*):4;gpus(*):4
>>
>> Mesos will then manage a scalar "gpus" resource with a value of 4. This
>> means "gpus" scalars will be offered to the framework, and the framework
>> may launch tasks/executors that are allocated a "gpus" scalar. Of
>> course, you'll need support from Marathon for custom resources when you
>> define your job; I'm not sure if that exists currently.
>>
>> Now, by default no isolation is going to take place. That may be OK for
>> you if you have tight control over the fact that tasks/executors only
>> try to consume the number of GPUs that have been allocated to them. If
>> not, you may run an isolator module for GPUs (e.g. using the device
>> whitelist controller cgroup). At the current time you would have to
>> write one, as I'm not sure whether one has been written/published.
>>
>> You'll need to make sure your containers have access to the necessary
>> GPU libraries. If you are running without filesystem isolation then
>> tasks can just reach out of the sandbox to use the necessary libraries.
>>
>> Hope that helps,
>> Ben
>>
>> On Thu, Jan 14, 2016 at 9:02 AM, <[email protected]> wrote:
>>
>> I have a machine with 4 GPUs and want to use Mesos+Marathon to schedule
>> jobs to run on the machine. Each job will use at most 1 GPU, and sharing
>> 1 GPU between small jobs would be OK.
>> I know Mesos does not directly support GPUs, but it seems I might use
>> custom resources or attributes to do what I want.
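To make the `--resources` semantics described above concrete, here is a minimal sketch of how a string like `cpus(*):4;gpus(*):4` decomposes into named scalar resources with roles. This is a simplified illustrative parser, not Mesos's actual code (it handles scalar values only):

```python
import re

def parse_resources(spec):
    """Split a Mesos-style --resources string, name(role):value;...,
    into a dict of {name: {"role": ..., "value": ...}}.
    Simplified: scalar values only, no ranges or sets."""
    resources = {}
    for item in spec.split(";"):
        match = re.fullmatch(r"(\w+)\(([^)]*)\):([\d.]+)", item)
        if not match:
            raise ValueError("unparseable resource: %r" % item)
        name, role, value = match.groups()
        resources[name] = {"role": role, "value": float(value)}
    return resources

parsed = parse_resources("cpus(*):4;gpus(*):4")
print(parsed["gpus"])  # a scalar "gpus" resource of 4 in the default (*) role
```

Each named scalar declared this way is what ends up in offers to frameworks; nothing here implies any isolation by itself, as Ben notes above.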
>> But how exactly should this be done?
>>
>> If I use --attributes="hasGpu:true", would a job be sent to the machine
>> when another job is already running on the machine (and only using 1
>> GPU)? I would say all jobs requesting a machine with a hasGpu attribute
>> would be sent to the machine (as long as it has free CPU and memory
>> resources). Then, if a job is sent to the machine when the 4 GPUs are
>> already busy, the job will fail to start, right? Could Marathon then be
>> used to re-send the job after some time, until it is accepted by the
>> machine?
>>
>> If I specify --resources="gpu(*):4", it is my understanding that once a
>> job is sent to the machine, all 4 GPUs will become busy in the eyes of
>> Mesos (even if this is not really true). If that is right, would this
>> work-around work: specify 4 different resources (gpu:A, gpu:B, gpu:C and
>> gpu:D) and use constraints in Marathon like this: "constraints":
>> [["gpu", "LIKE", "[A-D]"]]?
>>
>> Cheers
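Regarding the gpu:A..gpu:D work-around in the question above: Marathon's LIKE operator is a regular-expression match against a value, so the matching behaviour would look like this sketch. This is a plain Python re-creation of the check for illustration, not Marathon's actual code, and the A–D labels are the hypothetical ones from the question:

```python
import re

def like(value, pattern):
    """Approximation of a Marathon LIKE constraint check: the value
    must match the given regular expression in full."""
    return re.fullmatch(pattern, value) is not None

# The four hypothetical per-device labels from the proposed work-around:
labels = ["A", "B", "C", "D"]
print([like(label, "[A-D]") for label in labels])  # every label matches
print(like("E", "[A-D]"))                          # a fifth label would not
```

Note that constraints in Marathon apply to agent attributes rather than to resource names, so the gpu:A..gpu:D scheme would more naturally be expressed as an attribute; either way, the regex semantics are as shown.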

