haosdent huang commented on MESOS-5342:

Hi [~klueska], [~ct.clmsn], I read the file briefly before and am going to read
it again tomorrow. CgroupsCpushareIsolatorProcess has been changed to
CpuSubsystem. Also, some Huawei guys have recently been adding NUMA/cpuset
support to Mesos; their implementation treats cpusets like network ports, which
is simpler than the proposal in
 . I will try to add some comments tomorrow and see whether we can merge the
work of the Huawei guys and [~ct.clmsn] into the proposal.
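
To illustrate what treating cpusets like network ports could look like: Mesos
exposes ports as a ranges resource (for example ports:[31000-32000]), and core
ids could be tracked the same way. The following is only a rough Python sketch
of that idea. It is not the Huawei implementation and not Mesos code; the
helper names parse_ranges, format_ranges and allocate are hypothetical.

```python
def parse_ranges(text):
    """Parse a ranges string such as "[0-3, 8-11]" into a set of core ids."""
    ids = set()
    for part in text.strip("[] ").split(","):
        part = part.strip()
        if not part:
            continue
        begin, _, end = part.partition("-")
        end = end or begin  # a single id like "5" is the range 5-5
        ids.update(range(int(begin), int(end) + 1))
    return ids


def format_ranges(ids):
    """Format a set of core ids back into a compact ranges string."""
    parts = []
    ordered = sorted(ids)
    i = 0
    while i < len(ordered):
        j = i
        # Extend the current run while the next id is consecutive.
        while j + 1 < len(ordered) and ordered[j + 1] == ordered[j] + 1:
            j += 1
        parts.append(f"{ordered[i]}-{ordered[j]}")
        i = j + 1
    return "[" + ", ".join(parts) + "]"


def allocate(available, count):
    """Take the `count` lowest free core ids; return (allocated, remaining)."""
    if count > len(available):
        raise ValueError("not enough free cores")
    allocated = set(sorted(available)[:count])
    return allocated, available - allocated
```

With this model, handing cores to a container is plain set subtraction on the
free ranges, which is what makes it simpler than a topology-aware optimizer.
The trade-off is that it ignores which cores share a cache or NUMA node.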

> CPU pinning/binding support for CgroupsCpushareIsolatorProcess
> --------------------------------------------------------------
>                 Key: MESOS-5342
>                 URL: https://issues.apache.org/jira/browse/MESOS-5342
>             Project: Mesos
>          Issue Type: Improvement
>          Components: cgroups, containerization
>    Affects Versions: 0.28.1
>            Reporter: Chris
>              Labels: cgroups, cpu, cpu-usage, gpu, isolation, isolator, 
> mentor, perfomance
> The cgroups isolator currently lacks support for binding (also called 
> pinning) containers to a set of cores. The GNU/Linux kernel is known to make 
> sub-optimal core assignments for processes and threads. Poor assignments 
> impact program performance, specifically in terms of cache locality. 
> Applications requiring GPU resources can benefit from this feature by getting 
> access to cores closest to the GPU hardware, which reduces cpu-gpu copy 
> latency.
> Most cluster management systems from the HPC community (e.g., SLURM) provide both 
> cgroup isolation and cpu binding. This feature would provide similar 
> capabilities. The current interest in supporting Intel's Cache Allocation 
> Technology, and the advent of Intel's Knights-series processors, will require 
> making choices about where containers are going to run on the mesos-agent's 
> processor cores - this feature is a step toward developing a robust 
> solution.
> The improvement in this JIRA ticket will handle hardware topology detection, 
> track container-to-core utilization in a histogram, and use a mathematical 
> optimization technique to select cores for container assignment based on 
> latency and the container-to-core utilization histogram.
> For GPU tasks, the improvement will prioritize selection of cores based on 
> latency between the GPU and cores in an effort to minimize copy latency.
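
The selection step described in the ticket can be sketched roughly as follows.
The ticket does not say which optimization technique is intended, so this
Python sketch substitutes a plain greedy choice over a weighted sum of
per-core utilization and latency. The names select_cores and
pin_current_process, the weight parameter, and the input dictionaries are all
assumptions made for illustration; topology detection is assumed to have
happened already and to have produced the latency values.

```python
import os


def select_cores(utilization, latency, count, weight=1.0):
    """Greedily pick `count` cores with the lowest combined cost.

    utilization: dict core id -> number of containers already on that core
                 (the container-to-core histogram).
    latency:     dict core id -> latency to the device of interest, e.g. a GPU.
    The histogram is updated in place for the chosen cores.
    """
    if count > len(utilization):
        raise ValueError("not enough cores")

    def cost(core):
        return utilization[core] + weight * latency[core]

    # Ties are broken by core id so the result is deterministic.
    ranked = sorted(utilization, key=lambda core: (cost(core), core))
    chosen = ranked[:count]
    for core in chosen:
        utilization[core] += 1
    return chosen


def pin_current_process(cores):
    """Bind the calling process to `cores` (Linux only)."""
    if not hasattr(os, "sched_setaffinity"):
        raise NotImplementedError("CPU affinity is not available here")
    os.sched_setaffinity(0, set(cores))
```

A caller would pass the chosen cores to pin_current_process, or write them to
the container's cpuset cgroup instead. A real implementation would likely
replace the greedy ranking with the optimization technique the ticket has in
mind.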

