[jira] [Commented] (MESOS-6035) Add non-recursive version of cgroups::get

2017-01-17 Thread haosdent huang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-6035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15825685#comment-15825685
 ] 

haosdent huang commented on MESOS-6035:
---

Hi [~vinodkone], Yan reverted the patches above because they failed the nested 
container test cases. We plan to discuss further how to refactor the cgroups 
tests. Because this issue was no longer resolved after the patches were 
reverted, I reopened it earlier.

> Add non-recursive version of cgroups::get
> -
>
> Key: MESOS-6035
> URL: https://issues.apache.org/jira/browse/MESOS-6035
> Project: Mesos
> Issue Type: Improvement
> Components: cgroups
> Reporter: haosdent
> Assignee: haosdent
> Priority: Minor
> Fix For: 1.2.0
>
>
> In some cases, we only need the top-level cgroups rather than all cgroups 
> recursively. Adding a non-recursive version would avoid traversing 
> unnecessary paths.
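
As a rough illustration of the difference the description draws, the sketch 
below lists cgroups under a hierarchy either recursively or one level deep. It 
is Python rather than Mesos code: `get_cgroups` and its `recursive` parameter 
are hypothetical names chosen for this example, and it assumes a cgroup 
v1-style layout in which every cgroup is a directory under the hierarchy 
mount point.

```python
import os


def get_cgroups(hierarchy, cgroup="", recursive=True):
    """Return cgroup paths, relative to `hierarchy`, found under `cgroup`.

    Hypothetical helper for illustration only; not the Mesos cgroups::get API.
    With recursive=False only the direct children are returned, so deeper
    paths are never visited.
    """
    root = os.path.join(hierarchy, cgroup)
    found = []
    # Each cgroup is a directory; control files are regular files and are skipped.
    for entry in sorted(os.scandir(root), key=lambda e: e.name):
        if not entry.is_dir(follow_symlinks=False):
            continue
        child = os.path.join(cgroup, entry.name)
        found.append(child)
        if recursive:
            # Descend into nested cgroups only when asked to.
            found.extend(get_cgroups(hierarchy, child, recursive=True))
    return found
```

With `recursive=False` the cost is a single directory scan, regardless of how 
many nested containers exist below the top level.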



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (MESOS-6921) Document posix isolators could not isolate resources in configuration.md

2017-01-13 Thread haosdent huang (JIRA)
haosdent huang created MESOS-6921:
-

 Summary: Document posix isolators could not isolate resources in configuration.md
 Key: MESOS-6921
 URL: https://issues.apache.org/jira/browse/MESOS-6921
 Project: Mesos
 Issue Type: Improvement
 Components: documentation
 Reporter: haosdent huang
 Priority: Trivial


POSIX isolators only report resource usage without doing any actual isolation. 
We should make this more obvious in {{slave/flags.cpp}} and configuration.md.





[jira] [Commented] (MESOS-5342) CPU pinning/binding support for CgroupsCpushareIsolatorProcess

2017-01-13 Thread haosdent huang (JIRA)

[ 
https://issues.apache.org/jira/browse/MESOS-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15821985#comment-15821985
 ] 

haosdent huang commented on MESOS-5342:
---

Hi [~klueska] and [~ct.clmsn], I read the file briefly before and am going to 
read it again tomorrow. CgroupsCpushareIsolatorProcess has been changed to 
CpuSubsystem. Also, some people from Huawei have recently been adding 
NUMA/cpuset support to Mesos; their implementation treats cpusets like network 
ports, which is simpler than the proposal in 
https://docs.google.com/document/d/1G3L1Tdulg5iW7hZ2WXbG-bqROILu7zdBh2aWYu3An6A/ . 
I will try to add some comments tomorrow and see whether we can merge both the 
Huawei work and [~ct.clmsn]'s work into the proposal.

> CPU pinning/binding support for CgroupsCpushareIsolatorProcess
> --
>
> Key: MESOS-5342
> URL: https://issues.apache.org/jira/browse/MESOS-5342
> Project: Mesos
> Issue Type: Improvement
> Components: cgroups, containerization
> Affects Versions: 0.28.1
> Reporter: Chris
> Labels: cgroups, cpu, cpu-usage, gpu, isolation, isolator, mentor, perfomance
>
> The cgroups isolator currently lacks support for binding (also called 
> pinning) containers to a set of cores. The GNU/Linux kernel is known to make 
> sub-optimal core assignments for processes and threads. Poor assignments 
> impact program performance, specifically in terms of cache locality. 
> Applications requiring GPU resources can benefit from this feature by getting 
> access to cores closest to the GPU hardware, which reduces cpu-gpu copy 
> latency.
> Most cluster management systems from the HPC community (SLURM) provide both 
> cgroup isolation and cpu binding. This feature would provide similar 
> capabilities. The current interest in supporting Intel's Cache Allocation 
> Technology, and the advent of Intel's Knights-series processors, will require 
> making choices about where containers are going to run on the mesos-agent's 
> processor cores - this feature is a step toward developing a robust 
> solution.
> The improvement in this JIRA ticket will handle hardware topology detection, 
> track container-to-core utilization in a histogram, and use a mathematical 
> optimization technique to select cores for container assignment based on 
> latency and the container-to-core utilization histogram.
> For GPU tasks, the improvement will prioritize selection of cores based on 
> latency between the GPU and cores in an effort to minimize copy latency.
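
The ticket does not say which optimization technique it has in mind, so the 
following is only a sketch under stated assumptions: a greedy selection that 
scores each core by its current container count (the utilization histogram) 
plus a weighted latency to the target device, and picks the lowest-scoring 
cores. It is Python rather than Mesos code, and `select_cores`, 
`latency_weight`, and the dictionary inputs are all hypothetical names 
introduced for this example.

```python
def select_cores(utilization, latency, count, latency_weight=1.0):
    """Greedily pick `count` cores with the lowest combined score.

    utilization: dict of core id -> number of containers already assigned
                 (the container-to-core histogram); updated in place.
    latency:     dict of core id -> relative latency to the target device,
                 for example a GPU; cores missing from it count as 0.
    Hypothetical helper for illustration; a greedy stand-in, not the
    optimization technique proposed in the ticket.
    """
    if count > len(utilization):
        raise ValueError("requested more cores than are available")

    def score(core):
        return utilization[core] + latency_weight * latency.get(core, 0.0)

    # Ties are broken by core id so the result is deterministic.
    chosen = sorted(utilization, key=lambda core: (score(core), core))[:count]
    for core in chosen:
        utilization[core] += 1  # record the new assignment in the histogram
    return chosen
```

On Linux, the chosen set could then be applied with 
`os.sched_setaffinity(pid, chosen)` or by writing the core list to a cpuset 
cgroup's `cpuset.cpus` file.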


