[ https://issues.apache.org/jira/browse/MESOS-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Benjamin Mahler updated MESOS-4424:
-----------------------------------
    Description: 
Mesos already has generic mechanisms for expressing / isolating resources, and 
we'd like to expose GPUs as resources that can be consumed and isolated. 
However, GPUs present unique challenges:
* Users may rely on vendor-specific libraries to interact with the device (e.g.
CUDA, HSA, etc.); others may rely on portable libraries like OpenCL or OpenGL.
These libraries need to be available from within the container.
* GPU hardware has many attributes that may impose scheduling constraints (e.g.
core count, total memory, topology (via PCI-E, NVLink, etc.), driver versions,
etc.).
* Obtaining utilization information requires vendor-specific approaches.
* Isolated sharing of a GPU device requires vendor-specific approaches.

As such, the focus is on supporting a narrow initial use case: homogeneous
device-level GPU support:
* Fractional sharing of GPU devices across containers will not be supported
initially, unlike CPU cores.
* Heterogeneity will be supported via other means for now (e.g. using agent
attributes to differentiate hardware profiles, using portable libraries like
OpenCL, etc.).

Working group email list: https://groups.google.com/forum/#!forum/mesos-gpus

  was:
Mesos already has generic mechanisms for expressing / isolating resources, and 
we'd like to expose GPUs as resources that can be consumed and isolated. 
However, GPUs present unique challenges:
* Users may rely on vendor-specific libraries to interact with the device (e.g.
CUDA, HSA, etc.); others may rely on portable libraries like OpenCL or OpenGL.
These libraries need to be available from within the container.
* GPU hardware has many attributes that may impose scheduling constraints (e.g.
core count, total memory, topology (via PCI-E, NVLink, etc.), driver versions,
etc.).
* Obtaining utilization information requires vendor-specific approaches.
* Isolated sharing of a GPU device requires vendor-specific approaches.

As such, the focus is on supporting a narrow initial use case: homogeneous
device-level GPU support:
* Fractional sharing of GPU devices across containers will not be supported
initially, unlike CPU cores.
* Heterogeneity will be supported via other means for now (e.g. using agent
attributes to differentiate hardware profiles, using portable libraries like
OpenCL, etc.).



> Initial support for GPU resources.
> ----------------------------------
>
>                 Key: MESOS-4424
>                 URL: https://issues.apache.org/jira/browse/MESOS-4424
>             Project: Mesos
>          Issue Type: Epic
>          Components: isolation
>            Reporter: Benjamin Mahler
>
> Mesos already has generic mechanisms for expressing / isolating resources, 
> and we'd like to expose GPUs as resources that can be consumed and isolated. 
> However, GPUs present unique challenges:
> * Users may rely on vendor-specific libraries to interact with the device
> (e.g. CUDA, HSA, etc.); others may rely on portable libraries like OpenCL or
> OpenGL. These libraries need to be available from within the container.
> * GPU hardware has many attributes that may impose scheduling constraints
> (e.g. core count, total memory, topology (via PCI-E, NVLink, etc.), driver
> versions, etc.).
> * Obtaining utilization information requires vendor-specific approaches.
> * Isolated sharing of a GPU device requires vendor-specific approaches.
>
> As such, the focus is on supporting a narrow initial use case: homogeneous
> device-level GPU support:
> * Fractional sharing of GPU devices across containers will not be supported
> initially, unlike CPU cores.
> * Heterogeneity will be supported via other means for now (e.g. using agent
> attributes to differentiate hardware profiles, using portable libraries like
> OpenCL, etc.).
>
> Working group email list: https://groups.google.com/forum/#!forum/mesos-gpus



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
