Alright, with that in mind: would morning, midday, or evening PST work best for your time zone?
On Thu, Apr 7, 2016 at 10:35 PM, Du, Fan <[email protected]> wrote:
>
> On 2016/4/7 9:18, Niklas Nielsen wrote:
>
>> On Tue, Apr 5, 2016 at 7:51 PM, Du, Fan <[email protected]> wrote:
>>
>>> Thanks for the heads up, Kevin and Niklas!
>>>
>>> Exposing LLC as a resource needs special modification to the current
>>> resource managing and offering behavior.
>>> Here are my early thoughts:
>>> 1) The 'cpu' resource is essentially a cpu-share resource, while LLC is
>>>    a per-processor resource.
>>>    This will require:
>>> 1a) Resource offers for cpu and LLC have to be NUMA-node aware.
>>>     For a two-NUMA-node agent with 2x40 logical cpu cores, suppose LLC
>>>     has 20 subsets.
>>>     The master will make two resource offers:
>>>     Offer1: cpu 40; LLC 20 with NUMA 1
>>>     Offer2: cpu 40; LLC 20 with NUMA 2
>>>
>>>     From a high-level point of view, all the RDT-related features
>>>     require Mesos to be aware of hardware topology when managing
>>>     resources; e.g. memory bandwidth will also be one type of resource.
>>>     Anyway, it's a long-term goal to make this happen eventually.
>>>
>>
>> We have been discussing this at length: whether or not to expose NUMA
>> topology to the user, and to which degree we automate by having the user
>> express intent instead of explicit NUMA settings.
>> Let's discuss it during the performance isolation meeting. Ian has been
>> preparing the latest proposal on core pinning.
>>
>
> It's good for us to discuss/work on this together and manage all the
> human resources in one working group. Looking forward to the next public
> performance isolation meeting.
>
>> We would have to have something like that in place before aiming to
>> enable CAT, IMO. Don't get me wrong, we definitely should have CAT in
>> mind when we design core isolation.
>> The thing I would like to avoid is conflicting abstractions and
>> duplicate effort, when the two isolation mechanisms are so tightly
>> bound.
>>
>
> I agree with that.
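The per-NUMA-node offer split Fan sketches above could look roughly like the following. This is purely illustrative pseudologic, not Mesos code; the `make_numa_offers` helper, the dict layout, and the resource names are all assumptions for the sake of the example.

```python
# Hypothetical sketch of NUMA-aware offer generation; not actual Mesos code.
# The helper name and the offer/resource structure are illustrative only.

def make_numa_offers(topology):
    """Build one offer per NUMA node so cpu and LLC resources stay colocated."""
    offers = []
    for node in topology:
        offers.append({
            "numa_node": node["id"],
            "cpus": node["logical_cpus"],    # cpu shares local to this node
            "llc_ways": node["llc_subsets"]  # LLC partitions on this socket
        })
    return offers

# A two-node agent with 2x40 logical cores and 20 LLC subsets per socket:
topology = [
    {"id": 1, "logical_cpus": 40, "llc_subsets": 20},
    {"id": 2, "logical_cpus": 40, "llc_subsets": 20},
]
offers = make_numa_offers(topology)
# Offer1: cpu 40; LLC 20 with NUMA 1
# Offer2: cpu 40; LLC 20 with NUMA 2
```

The point of splitting the offers this way is that a framework accepting one offer gets cpus and LLC ways that are guaranteed to sit on the same socket, which is exactly the colocation constraint the flat resource model cannot express today.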
>>> 1b) The agent will apply cpu-share isolation along with cpuset.
>>>     We might need to revisit MESOS-314 to support this partially.
>>>
>>> Actually, CAT kernel support could/should still handle the scenario
>>> where a task migrates between NUMA nodes, but right now it does not.
>>> This is why I filed the ticket and drafted the initial design doc to
>>> track this.
>>>
>>> 2) All the monitoring support (CMT and MBM) is almost ready in Mesos;
>>>    that's all perf stuff.
>>>    Check MESOS-4955 and MESOS-4595 for details.
>>
>> Yep - Bartek had a patch for this.
>>
>>> On 2016/4/5 23:13, Niklas Nielsen wrote:
>>>
>>>> My gut feeling is that it won't be very useful to expose LLC as a
>>>> first-class resource type at this point.
>>>> It's very hard for the user to pick, and it requires framework support
>>>> everywhere.
>>>> Also, as Kevin mentioned, Mesos doesn't pin your tasks, so you don't
>>>> know which cores your tasks will be running on.
>>>>
>>>> We have been talking about QoS isolators in the performance isolation
>>>> working group, where more low-level decisions are made on the agent
>>>> itself.
>>>> Both core pinning and CAT would be controls which those isolators
>>>> could adjust to uphold higher-level notions of task performance tiers.
>>>>
>>>> Let's discuss this in the performance isolation working group. We can
>>>> schedule a call at the end of this week or the start of next week.
>>>>
>>>> Niklas
>>>>
>>>> On Mon, Apr 4, 2016 at 10:39 PM, Kevin Klues <[email protected]> wrote:
>>>>
>>>>> Hi Fan,
>>>>>
>>>>> Thanks for putting this together. I have been looking into this quite
>>>>> a bit myself recently, and have been slowly preparing a design doc for
>>>>> both CAT and CMT support in Mesos. One of the biggest things I have
>>>>> been trying to figure out (which is why I haven't pushed my design doc
>>>>> out yet) is how to combine CAT support with the existing resource
>>>>> model.
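The "cpu-share isolation along with cpuset" idea in 1b could be sketched as below. This assumes the common cgroup-v1 convention of roughly 1024 shares per cpu (with a small floor); the function name, return shape, and the exact floor value are illustrative assumptions, not what MESOS-314 or the Mesos isolators actually implement.

```python
# Sketch of combining cgroup cpu.shares with a cpuset, per item 1b above.
# The 1024-shares-per-cpu scaling and the floor of 2 follow common
# cgroup-v1 convention; names and structure here are illustrative only.

def cgroup_settings(cpus, node_cpulist, node_mems):
    """Return the cgroup knobs for a task pinned to one NUMA node."""
    return {
        "cpu/cpu.shares": max(int(cpus * 1024), 2),  # fractional cpus -> shares
        "cpuset/cpuset.cpus": node_cpulist,          # restrict to node-local cores
        "cpuset/cpuset.mems": node_mems,             # keep memory on the same node
    }

# A task asking for 2.5 cpus, pinned to NUMA node 0 (cores 0-39):
settings = cgroup_settings(2.5, "0-39", "0")
```

With both knobs set, the relative cpu-share guarantee is preserved while the cpuset confines the task to one node, which is the precondition for a per-socket LLC allocation to mean anything.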
>>>>> Specifically, Mesos currently gives out fractional cores using the
>>>>> cgroups cpu.shares mechanism and doesn't allow tasks to choose
>>>>> specific cores to run on (even more than this, there is no way for a
>>>>> task to even see which specific cores might be available).
>>>>> Furthermore, when a resource offer goes out, it's just a collection
>>>>> of SCALARs, SETs, and RANGEs, and there's no way to tie one
>>>>> particular resource to another (e.g. you can't say "give me cores and
>>>>> memory that are close together" to mitigate NUMA effects).
>>>>>
>>>>> Given these limitations, it's not clear how to take immediate
>>>>> advantage of CAT, since it relies on specifying a specific core to
>>>>> allocate the cache from. That is, some mechanism must exist to ensure
>>>>> that both the CPU and the cache are colocated. This is a problem with
>>>>> the current resource model in general, and applies to properly
>>>>> supporting NUMA as well.
>>>>>
>>>>> You seem to propose simply adding cache partitions as a first-class
>>>>> resource on par with CPUs and memory, with no mention of their
>>>>> dependence on particular cores. What are your thoughts on this?
>>>>>
>>>>> Kevin
>>>>>
>>>>> On Mon, Apr 4, 2016 at 7:36 PM, Du, Fan <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> MESOS-5076 is filed to investigate how Intel Cache Allocation
>>>>>> Technology (CAT) [1] could be used in Mesos. Some introduction and
>>>>>> early thoughts are documented here [2].
>>>>>>
>>>>>> The motivation is to:
>>>>>> a) Add CAT isolation support for Mesos containerization
>>>>>> b) Expose Last Level Cache (LLC) as a scalar resource
>>>>>> c) Bridge the interface gap for Docker containerization;
>>>>>>    CAT support for Docker [3] has been submitted to the Docker OCI
>>>>>>    with positive feedback.
>>>>>>
>>>>>> The ultimate goal is to provide operators a CAT isolator for better
>>>>>> colocation of cluster resources.
>>>>>> I'm looking forward to any comments from the community to move this
>>>>>> forward.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> [1]:
>>>>>> http://www.intel.com/content/www/us/en/communications/cache-monitoring-cache-allocation-technologies.html
>>>>>> [2]:
>>>>>> https://docs.google.com/document/d/130ay0e2DZ9S61SC3tGcik5wQaC8L40t5tWj3K3GJxTg/edit?usp=sharing
>>>>>> [3]: https://github.com/opencontainers/runtime-spec/pull/267
>>>>>>      https://github.com/opencontainers/runc/pull/447
>>>>>
>>>>> --
>>>>> ~Kevin

--
Niklas
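For readers new to CAT as proposed in the thread above: the hardware partitions the last-level cache by giving each class of service a capacity bitmask (CBM), and the set bits in a CBM must be contiguous. A minimal allocator for handing out cache ways might look like the sketch below; it is illustrative only and not taken from the linked design doc.

```python
# Minimal sketch of contiguous cache-way allocation for CAT.
# CAT capacity bitmasks (CBMs) must have contiguous set bits, so the
# allocator searches for a contiguous run of free ways. Illustrative only.

def allocate_ways(free_mask, n_ways, total_ways):
    """Find n_ways contiguous free ways; return (cbm, new_free_mask) or None."""
    want = (1 << n_ways) - 1            # n_ways contiguous bits
    for shift in range(total_ways - n_ways + 1):
        cbm = want << shift
        if free_mask & cbm == cbm:      # every way in the run is free
            return cbm, free_mask & ~cbm
    return None                         # no contiguous run available

# An LLC with 20 ways, all free; carve out 4 ways for one task:
cbm, free = allocate_ways((1 << 20) - 1, 4, 20)
# cbm is 0b1111 (ways 0-3); those ways are removed from the free pool
```

Note that fragmentation matters here in a way it does not for scalar resources like memory: two free ways that are not adjacent cannot satisfy a 2-way request, which is one more reason a plain SCALAR "llc" resource undersells what the scheduler actually needs to know.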
