[ https://issues.apache.org/jira/browse/MESOS-3521?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14908706#comment-14908706 ]
Jonathan Calmels edited comment on MESOS-3521 at 9/25/15 9:39 PM:
------------------------------------------------------------------

Well, in conjunction with [https://issues.apache.org/jira/browse/MESOS-3366], it would allow agents to describe their hardware in a well-known format.
This format would be standardized, so schedulers could act on it to make better decisions.

Use case scenario:
The agent provides a hook that uses libhwloc and exports CPU information (see the example above). A scheduler could look at the resources_info of each offer and launch a task on the agent providing the best clock rate.
Following the same logic, we could export GPUs as a custom resource set and ask the scheduler to pick, say, 2 of them that are pairable for a multi-GPU aware application.

was (Author: exxo):
Well, in conjunction with [https://issues.apache.org/jira/browse/MESOS-3366], it would allow agents to describe their hardware in a well-known format.
This format would be standardized, so schedulers could act on it to make better decisions.

Use case scenario:
The agent provides a hook that uses libhwloc and exports CPU information (see the example above). A scheduler could look at the resources_info of each offer and launch a task on the agent providing the best clock rate.

> Support for hardware topology and resources description
> -------------------------------------------------------
>
>                 Key: MESOS-3521
>                 URL: https://issues.apache.org/jira/browse/MESOS-3521
>             Project: Mesos
>          Issue Type: Improvement
>            Reporter: Jonathan Calmels
>
> In heterogeneous clusters, tasks sometimes have strong constraints on the
> type of hardware they need to execute on. The current solution is to use
> custom attributes to describe resources on the agents.
> While this solution works, the current attribute format is somewhat
> constraining and requires workarounds on both the agent and the scheduler
> (e.g. Base64 RFC 6920).
> In addition, we often need to encode the link between
> an attribute and a resource, which is inherently error-prone and
> implementation-defined from one scheduler to another.
> I would like to propose a unified format to expose hardware topology and
> resource information at the agent level.
> This format would effectively extend the following specification:
> [http://mesos.apache.org/documentation/attributes-resources/]
> {code:title=specification}
> kv                 : text ":" ( range | kvSet | scalar | text )
> kvSet              : "{" kv ( "," kv )* "}"
> info               : ( range | text | "*" ) ":" kvSet
> infoSet            : "{" info ( "," info )* "}"
> resourceName       : text | "*"
> resourcesInfoValue : text "(" resourceName ")" ":" infoSet
> resourcesInfo      : resourcesInfoValue ( ";" resourcesInfoValue )*
> {code}
> {code:javascript|title=example}
> --resources=
>   gpus:{card0, card1};ports:[0-100];cpus:8
> --resources_info=
>   nvidia(gpus): {
>     card0: {
>       uuid: GPU-34e8d7ba-0e4d-ac00-6852-695d5d404f51,
>       name: GeForce_GTX_980,
>       path: /dev/nvidia0,
>       clocks: {
>         graphic: 1392,
>         sm: 1392
>       }
>     },
>     card1: {
>       uuid: GPU-12e457ba-0f4e-bf01-3452-674a5b212c21,
>       name: GeForce_GTX_970,
>       path: /dev/nvidia1,
>       clocks: {
>         graphic: 1392,
>         sm: 1392
>       }
>     },
>     *: {
>       driver: 352.39
>     }
>   };
>   services(ports): {
>     [0-79]: {
>       type: daemon,
>       user: root
>     },
>     [80-100]: {
>       type: web_services,
>       user: www-data
>     }
>   };
>   procs(cpus): {
>     *: {
>       name: Intel_i7_6700K,
>       frequency: 4,
>       cache: 8
>     }
>   }
> {code}
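
To make the scheduler-side use case more concrete, here is a minimal Python sketch of how a single infoSet value from the proposed grammar might be parsed and queried. It is not part of Mesos or of the proposal: the names `parse_kv_set` and `best_by_clock` are hypothetical, values are assumed to contain no whitespace (as in the example), and the `*` entry is simply kept as an ordinary key rather than merged into the other entries.

```python
import re

# Punctuation of the proposed grammar; anything else is a text/scalar/range token.
TOKEN = re.compile(r"[{},:]|[^{},:\s]+")
RANGE = re.compile(r"\[(\d+)-(\d+)\]")
PUNCT = {"{", "}", ",", ":"}


def _tok(tokens, pos):
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    return tokens[pos]


def _expect(tokens, pos, want):
    got = _tok(tokens, pos)
    if got != want:
        raise ValueError(f"expected {want!r}, got {got!r}")


def _scalar(token):
    # range -> (low, high); number -> int or float; anything else stays text.
    m = RANGE.fullmatch(token)
    if m:
        return (int(m.group(1)), int(m.group(2)))
    for cast in (int, float):
        try:
            return cast(token)
        except ValueError:
            pass
    return token


def _parse_set(tokens, pos):
    # kvSet / infoSet: "{" key ":" value ( "," key ":" value )* "}"
    _expect(tokens, pos, "{")
    pos += 1
    result = {}
    while True:
        key = _tok(tokens, pos)
        if key in PUNCT:
            raise ValueError(f"unexpected {key!r} where a key was expected")
        _expect(tokens, pos + 1, ":")
        pos += 2
        if _tok(tokens, pos) == "{":
            value, pos = _parse_set(tokens, pos)
        else:
            if tokens[pos] in PUNCT:
                raise ValueError(f"unexpected {tokens[pos]!r} where a value was expected")
            value = _scalar(tokens[pos])
            pos += 1
        result[key] = value
        if _tok(tokens, pos) == ",":
            pos += 1
            continue
        _expect(tokens, pos, "}")
        return result, pos + 1


def parse_kv_set(text):
    """Parse one '{ ... }' block into nested dicts (keys stay strings)."""
    tokens = TOKEN.findall(text)
    value, pos = _parse_set(tokens, 0)
    if pos != len(tokens):
        raise ValueError("trailing tokens after closing brace")
    return value


def best_by_clock(info, clock="sm"):
    """Return the name of the device with the highest given clock value."""
    devices = {name: attrs for name, attrs in info.items() if name != "*"}
    return max(devices, key=lambda n: devices[n].get("clocks", {}).get(clock, 0))
```

A scheduler could then call `parse_kv_set` on the block following `nvidia(gpus):` and pass the result to `best_by_clock` to choose a device. Splitting the full `--resources_info` string on `;` and on the `text "(" resourceName ")" ":"` prefix is left out of this sketch.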