[ 
https://issues.apache.org/jira/browse/MESOS-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908706#comment-14908706
 ] 

Jonathan Calmels edited comment on MESOS-3521 at 9/25/15 9:39 PM:
------------------------------------------------------------------

Well, in conjunction with [https://issues.apache.org/jira/browse/MESOS-3366], it 
would allow agents to describe their hardware in a well-known format. This 
format would be standardized, so schedulers could act on it to make better 
decisions.

Use case scenario:
The agent provides a hook that uses libhwloc and exports CPU information (see 
the example above). A scheduler could look at the resources_info of each offer 
and launch a task on the agent providing the best clock rate.
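
A minimal Python sketch of that selection step, assuming each offer carries its 
resources_info already parsed into nested dictionaries. The offer layout, the 
agent names and the helpers `cpu_frequency` and `best_offer_by_clock` are 
hypothetical and not part of the Mesos API; the `frequency` key follows the 
`procs(cpus)` example in the issue description.

```python
from typing import Optional


def cpu_frequency(offer: dict) -> float:
    """Return the CPU frequency advertised by an offer, or 0.0 if absent."""
    procs = offer.get("resources_info", {}).get("procs(cpus)", {})
    # "*" holds the values shared by every CPU of the agent.
    return float(procs.get("*", {}).get("frequency", 0.0))


def best_offer_by_clock(offers: list) -> Optional[dict]:
    """Pick the offer whose agent advertises the highest CPU frequency."""
    if not offers:
        return None
    return max(offers, key=cpu_frequency)


# Hypothetical offers; agent-c exports no hardware description at all.
offers = [
    {"agent": "agent-a",
     "resources_info": {"procs(cpus)": {"*": {"frequency": 3.2}}}},
    {"agent": "agent-b",
     "resources_info": {"procs(cpus)": {"*": {"frequency": 4.0}}}},
    {"agent": "agent-c", "resources_info": {}},
]

best = best_offer_by_clock(offers)
print(best["agent"])
```

Agents that export no description fall back to a frequency of 0.0, so they are 
only chosen when no other offer carries CPU information.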

Following the same logic, we could export GPUs as a custom resource set and 
ask the scheduler to pick, say, two of them that are pairable for a 
multi-GPU-aware application. 
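
A rough Python sketch of the GPU case. What makes two GPUs "pairable" is not 
defined here, so the sketch assumes, purely for illustration, that two cards 
are pairable when they report the same model name; a real scheduler might look 
at bus topology instead. The helpers `pairable` and `pick_gpu_pair` and the 
`card2` entry are hypothetical.

```python
from itertools import combinations
from typing import Optional, Tuple


def pairable(a: dict, b: dict) -> bool:
    """Assumed criterion: both cards report the same, non-empty model name."""
    return a.get("name") is not None and a.get("name") == b.get("name")


def pick_gpu_pair(gpus: dict) -> Optional[Tuple[str, str]]:
    """Return the first pair of card ids that satisfies pairable(), if any."""
    # Skip "*", which describes all cards rather than a single one.
    card_ids = sorted(k for k in gpus if k != "*")
    for id_a, id_b in combinations(card_ids, 2):
        if pairable(gpus[id_a], gpus[id_b]):
            return (id_a, id_b)
    return None


# Hypothetical parsed content of an "nvidia(gpus)" info set.
gpus = {
    "card0": {"name": "GeForce_GTX_980", "path": "/dev/nvidia0"},
    "card1": {"name": "GeForce_GTX_970", "path": "/dev/nvidia1"},
    "card2": {"name": "GeForce_GTX_980", "path": "/dev/nvidia2"},
    "*": {"driver": 352.39},
}

print(pick_gpu_pair(gpus))
```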



was (Author: exxo):
Well, in conjunction with [https://issues.apache.org/jira/browse/MESOS-3366], it 
would allow agents to describe their hardware in a well-known format. This 
format would be standardized, so schedulers could act on it to make better 
decisions.

Use case scenario:
The agent provides a hook that uses libhwloc and exports CPU information (see 
the example above). A scheduler could look at the resources_info of each offer 
and launch a task on the agent providing the best clock rate.



> Support for hardware topology and resources description
> -------------------------------------------------------
>
>                 Key: MESOS-3521
>                 URL: https://issues.apache.org/jira/browse/MESOS-3521
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Jonathan Calmels
>
> In heterogeneous clusters, tasks sometimes have strong constraints on the 
> type of hardware they need to execute on. The current solution is to use 
> custom attributes to describe resources on the agents.
> While this solution works, the current attribute format is somewhat 
> constraining and requires workarounds on both the agent and the scheduler 
> (e.g. Base64 RFC 6920). In addition, we often need to encode the link between 
> an attribute and a resource, which is inherently error-prone and 
> implementation-defined, varying from one scheduler to another.
> I would like to propose a unified format to expose hardware topology and 
> resource information at the agent level.
> This format would effectively extend the following specification: 
> [http://mesos.apache.org/documentation/attributes-resources/]
> {code:title=specification}
> kv : text ":" ( range | kvSet | scalar | text )
> kvSet : "{" kv ( "," kv )* "}"
> info : ( range | text | "*" ) ":" kvSet
> infoSet : "{" info ( "," info )* "}"
> resourceName : text | "*"
> resourcesInfoValue : text "(" resourceName ")" ":" infoSet
> resourcesInfo : resourcesInfoValue ( ";" resourcesInfoValue )*
> {code}
> {code:javascript|title=example}
> --resources= gpus:{card0, card1};ports:[0-100];cpus:8
> --resources_info= nvidia(gpus): {             
>                       card0: {
>                               uuid: GPU-34e8d7ba-0e4d-ac00-6852-695d5d404f51,
>                               name: GeForce_GTX_980,
>                               path: /dev/nvidia0,
>                               clocks: {
>                                               graphic: 1392,
>                                               sm: 1392
>                               }
>                       },
>                       card1: {
>                               uuid: GPU-12e457ba-0f4e-bf01-3452-674a5b212c21,
>                               name: GeForce_GTX_970,
>                               path: /dev/nvidia1,
>                               clocks: {
>                                               graphic: 1392,
>                                               sm: 1392
>                               }
>                       },
>                       *: {
>                               driver: 352.39
>                       }
>                };
>                services(ports): {
>                       [0-79]: {
>                               type: daemon,
>                               user: root
>                       },
>                       [80-100]: {
>                               type: web_services,
>                               user: www-data
>                       }
>                };
>                procs(cpus): {
>                       *: {
>                               name: Intel_i7_6700K,
>                               frequency: 4,
>                               cache: 8
>                       }
>                }
> {code}
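
To make the grammar quoted above more concrete, here is a minimal 
recursive-descent parser sketch in Python for the value of `--resources_info`. 
It is an illustration under assumptions, not an existing Mesos parser: the 
class `ResourcesInfoParser` and its method names are hypothetical, ranges are 
kept as raw strings such as `[0-79]`, and a single `braced` rule handles both 
`infoSet` and `kvSet`, which makes it slightly more permissive than the 
specification.

```python
import re

# A range like [0-79], a punctuation character, or a run of other characters.
TOKEN = re.compile(r"\[[^\]]+\]|[{}():;,]|[^\s{}():;,\[\]]+")
SCALAR = re.compile(r"-?\d+(\.\d+)?")
PUNCTUATION = set("{}():;,")


class ResourcesInfoParser:
    def __init__(self, text: str):
        self.tokens = TOKEN.findall(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        tok = self.peek()
        if tok is None or (expected is not None and tok != expected):
            raise ValueError(f"expected {expected!r}, got {tok!r}")
        self.pos += 1
        return tok

    def atom(self):
        # text, "*", a range or a scalar, but never punctuation
        tok = self.take()
        if tok in PUNCTUATION:
            raise ValueError(f"unexpected {tok!r}")
        return tok

    def value(self):
        if self.peek() == "{":
            return self.braced()
        tok = self.atom()
        if SCALAR.fullmatch(tok):
            return float(tok) if "." in tok else int(tok)
        return tok

    def braced(self):
        # "{" key ":" value ( "," key ":" value )* "}"
        self.take("{")
        result = {}
        while True:
            key = self.atom()
            self.take(":")
            result[key] = self.value()
            if self.peek() != ",":
                break
            self.take(",")
        self.take("}")
        return result

    def parse(self):
        # resourcesInfo : resourcesInfoValue ( ";" resourcesInfoValue )*
        result = {}
        while True:
            name = self.atom()
            self.take("(")
            resource = self.atom()
            self.take(")")
            self.take(":")
            result[f"{name}({resource})"] = self.braced()
            if self.peek() != ";":
                break
            self.take(";")
        if self.peek() is not None:
            raise ValueError(f"trailing input at {self.peek()!r}")
        return result


sample = """
nvidia(gpus): {
    card0: { name: GeForce_GTX_980, path: /dev/nvidia0,
             clocks: { graphic: 1392, sm: 1392 } },
    *: { driver: 352.39 }
};
services(ports): {
    [0-79]: { type: daemon, user: root }
}
"""
parsed = ResourcesInfoParser(sample).parse()
print(parsed["nvidia(gpus)"]["card0"]["clocks"])
```

Because entries must be separated by commas, a key/value pair that follows 
another without a comma raises a `ValueError` instead of being silently merged.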



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
