[
https://issues.apache.org/jira/browse/MESOS-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chris updated MESOS-5342:
-------------------------
Comment: was deleted
(was: For information about submodular functions (and why they were selected for
this problem), I strongly suggest reviewing this YouTube video:
https://youtu.be/6ThMzlHdKsI)
> CPU pinning/binding support for CgroupsCpushareIsolatorProcess
> --------------------------------------------------------------
>
> Key: MESOS-5342
> URL: https://issues.apache.org/jira/browse/MESOS-5342
> Project: Mesos
> Issue Type: Improvement
> Components: cgroups, containerization
> Affects Versions: 0.28.1
> Reporter: Chris
>
> The cgroups isolator currently lacks support for binding (also called
> pinning) containers to a set of cores. The Linux kernel scheduler can make
> sub-optimal core assignments for processes and threads. Poor assignments
> hurt program performance, particularly cache locality.
> Applications requiring GPU resources can benefit from this feature by getting
> access to the cores closest to the GPU hardware, which reduces CPU-GPU copy
> latency.
> Most cluster management systems from the HPC community (such as SLURM) provide
> both cgroup isolation and CPU binding. This feature would provide similar
> capabilities. The current interest in supporting Intel's Cache Allocation
> Technology, and the advent of Intel's Knights-series processors, will require
> making choices about where containers are going to run on the Mesos agent's
> processor cores. This feature is a step toward developing a robust
> solution.
> The improvement in this JIRA ticket will handle hardware topology detection,
> track container-to-core utilization in a histogram, and use a mathematical
> optimization technique to select cores for container assignment based on
> latency and the container-to-core utilization histogram.
> For GPU tasks, the improvement will prioritize selection of cores based on
> latency between the GPU and cores in an effort to minimize copy latency.
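The selection step described above could be sketched roughly as follows. This is a minimal illustration and not the ticket's implementation: it uses a simple greedy heuristic in place of whatever optimization technique the improvement ends up using, and the function name `select_cores`, the dictionary layouts, and the `gpu_weight` parameter are all hypothetical. The sketch assumes the topology detection step has already produced a core-to-core latency table and, for GPU tasks, a GPU-to-core latency table.

```python
from typing import Dict, List, Optional


def select_cores(
    utilization: Dict[int, int],
    latency: Dict[int, Dict[int, float]],
    count: int,
    gpu_latency: Optional[Dict[int, float]] = None,
    gpu_weight: float = 1.0,
) -> List[int]:
    """Greedily pick `count` cores for one container (hypothetical helper).

    utilization: histogram of containers currently assigned to each core.
    latency:     latency[a][b] is an assumed relative cost between cores.
    gpu_latency: optional per-core cost of reaching the GPU.
    """
    if count > len(utilization):
        raise ValueError("requested more cores than the agent has")

    chosen: List[int] = []
    remaining = set(utilization)
    while len(chosen) < count:
        def cost(core: int):
            load = utilization[core]
            # Prefer cores close to the ones already picked for this container.
            spread = sum(latency[core][c] for c in chosen)
            gpu = gpu_weight * gpu_latency[core] if gpu_latency is not None else 0.0
            # The core id breaks ties so the result is deterministic.
            return (load + spread + gpu, core)

        best = min(remaining, key=cost)
        chosen.append(best)
        remaining.remove(best)

    # Record the new assignment in the container-to-core histogram.
    for core in chosen:
        utilization[core] += 1
    return chosen


# Toy topology: cores 0-1 share one socket, cores 2-3 share another,
# and the GPU is assumed to be attached near the second socket.
socket = {0: 0, 1: 0, 2: 1, 3: 1}
lat = {
    a: {b: 0.0 if a == b else (1.0 if socket[a] == socket[b] else 3.0) for b in socket}
    for a in socket
}
util = {core: 0 for core in socket}
gpu = {0: 5.0, 1: 5.0, 2: 1.0, 3: 1.0}

gpu_task_cores = select_cores(util, lat, 2, gpu_latency=gpu)  # [2, 3]
cpu_task_cores = select_cores(util, lat, 2)                   # [0, 1]
```

In this toy run the GPU task lands on the two cores nearest the GPU, and the next container is steered to the idle socket because the histogram now records load on cores 2 and 3. The chosen core set would then be applied through whatever binding mechanism the isolator adopts.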