[
https://issues.apache.org/jira/browse/MESOS-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14299384#comment-14299384
]
Adam B commented on MESOS-2262:
-------------------------------
We've also discussed turning the --resources flag into a more readable JSON
format. See MESOS-507.
As for attributes vs. resources: you would use resources if you want these
elements to be consumable by a task, such that if you have 1 GPU resource, only
one task can use it at a time. You would use an attribute if you want the
information about the presence of a GPU to be offered to all frameworks, and
even if one framework launches a task that may (or may not) use the GPU, other
frameworks still see it and can launch tasks on the same node based on the
presence of the GPU.
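A toy sketch of that difference, in Python rather than anything from the Mesos codebase. The `Node` class, its `launch` method, and the names "gpus" and "gpu" are all hypothetical and only model the semantics described above: a resource is consumed by a task, while an attribute stays visible no matter what is running.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy model of one node; hypothetical, not Mesos code."""
    resources: dict    # consumable scalars, e.g. {"gpus": 1.0}
    attributes: dict   # descriptive labels, e.g. {"gpu": "true"}
    allocated: dict = field(default_factory=dict)

    def available(self, name):
        """Amount of a resource not yet handed to a task."""
        return self.resources.get(name, 0.0) - self.allocated.get(name, 0.0)

    def launch(self, needs):
        """Reserve `needs` only if every requested amount is still free."""
        if any(self.available(n) < amt for n, amt in needs.items()):
            return False
        for n, amt in needs.items():
            self.allocated[n] = self.allocated.get(n, 0.0) + amt
        return True


# GPU modeled as a resource: the first task consumes it, the second is refused.
as_resource = Node(resources={"cpus": 4.0, "gpus": 1.0}, attributes={})
first = as_resource.launch({"cpus": 1.0, "gpus": 1.0})
second = as_resource.launch({"cpus": 1.0, "gpus": 1.0})

# GPU modeled as an attribute: tasks only consume cpus, so both launch,
# and the attribute is still advertised afterwards.
as_attribute = Node(resources={"cpus": 4.0}, attributes={"gpu": "true"})
third = as_attribute.launch({"cpus": 1.0})
fourth = as_attribute.launch({"cpus": 1.0})
still_advertised = as_attribute.attributes.get("gpu") == "true"
```

In the attribute case nothing stops both tasks from touching the GPU at once, which is why the choice matters for isolation.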
Isolation is another story, and it suggests that you want to use a resource
and then provide isolation so that only the task consuming that resource can
access the GPU. From what I've heard, it would be much easier to provide
all-or-nothing isolation/access to the GPU than to try to isolate individual
GPU cores.
> Adding GPGPU resource into Mesos framework, so we can know if any GPGPU
> resource are available for master
> ---------------------------------------------------------------------------------------------------------
>
> Key: MESOS-2262
> URL: https://issues.apache.org/jira/browse/MESOS-2262
> Project: Mesos
> Issue Type: Task
> Components: framework, slave
> Environment: OpenCL support env, such as OS X, Linux, Windows..
> Reporter: chester kuo
> Priority: Minor
>
> Extend Mesos to support heterogeneous resources such as GPGPU, FPGA, etc. as
> computing resources in the data center. OpenCL will be the first target added
> to Mesos (it is supported by all major GPU vendors); I will leave support for
> others, such as CUDA, for the future.
> In this feature, the slave will perform resource discovery, including but not
> limited to:
> (1) Heterogeneous computing protocol type: "OpenCL", "CUDA", "HSA"
> (2) Computing global memory (MB)
> (3) Computing runtime version, such as "1.2", "2.0"
> (4) Computing compute units (double)
> (5) Computing device type: GPGPU, CPU, accelerator device
> (6) Computing number of devices (double)
> Heterogeneous resource isolation will be supported in the framework rather
> than on the slave side. The main reason is that ecosystems such as OpenCL
> operate on top of private device drivers owned by vendors; only the runtime
> library (OpenCL) is a user-space application, so it's hard for us to provide
> the kind of CPU/memory resource isolation that Linux cgroups do. As a result,
> we may use the runtime library to do device isolation and memory allocation.
> (PS: if anyone knows how to do this for GPGPU drivers, please drop me a note.)
> Meanwhile, some runtime libraries (such as OpenCL) can also run on top of the
> CPU, so we need to use the isolator API to report this once it is allocated.
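For reference, the six discovery fields listed in the issue description could be carried in a record like the one below. This is only a sketch under assumptions: the `ComputeDevice` type, its field names, the `count_devices` helper, and the sample values are all hypothetical and do not come from Mesos or from the proposal itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ComputeDevice:
    """Hypothetical record for one discovered device (items 1-5 in the list)."""
    protocol: str          # (1) e.g. "OpenCL", "CUDA", "HSA"
    global_memory_mb: int  # (2) global memory in MB
    runtime_version: str   # (3) e.g. "1.2", "2.0"
    compute_units: float   # (4) compute units, a double in the proposal
    device_type: str       # (5) e.g. "GPGPU", "CPU", "Accelerator"


def count_devices(devices):
    """Item (6): number of devices per (protocol, device_type), as doubles."""
    counts = {}
    for d in devices:
        key = (d.protocol, d.device_type)
        counts[key] = counts.get(key, 0.0) + 1.0
    return counts


# Illustrative values only; no real hardware was queried.
devices = [
    ComputeDevice("OpenCL", 4096, "1.2", 16.0, "GPGPU"),
    ComputeDevice("OpenCL", 4096, "1.2", 16.0, "GPGPU"),
    ComputeDevice("OpenCL", 8192, "2.0", 8.0, "CPU"),
]
counts = count_devices(devices)
```

Keeping the CPU-backed OpenCL device as its own entry makes the last point of the description visible: a runtime that runs on the CPU shows up as a device that overlaps with the ordinary cpus resource.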