Alright, with that in mind: would morning, midday, or evening PST work best
in your time zone?

On Thu, Apr 7, 2016 at 10:35 PM, Du, Fan <[email protected]> wrote:

>
>
> On 2016/4/7 9:18, Niklas Nielsen wrote:
>
>> On Tue, Apr 5, 2016 at 7:51 PM, Du, Fan <[email protected]> wrote:
>>
>>
>>> Thanks for the heads up, Kevin and Niklas.
>>>
>>> Exposing LLC as a resource needs special modifications to the current
>>> resource management and offering behavior.
>>> Here are my early thoughts:
>>> 1) The 'cpu' resource is essentially a CPU-share resource, while LLC is a
>>> per-processor resource. This will require:
>>>    1a): Resource offers for cpu and LLC have to be NUMA-node aware.
>>>         For a two-NUMA-node agent with 2x40 logical CPU cores, suppose
>>>         the LLC has 20 subsets. The master will make two resource offers:
>>>         Offer1: cpu 40; LLC 20 with NUMA 1
>>>         Offer2: cpu 40; LLC 20 with NUMA 2
>>>
>>>         From a high-level point of view, all the RDT-related features
>>>         require Mesos to be aware of hardware topology when managing
>>>         resources; e.g., Memory Bandwidth will also be one type of
>>>         resource. In any case, it's a long-term goal to make this happen
>>>         eventually.
>>>
>>>
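[For concreteness, a rough sketch of the per-NUMA-node offers described
above. This is an illustration only, not the actual Mesos `Resource`
protobuf; the field names and the 20-subset LLC size are hypothetical,
taken from the example agent in the thread.]

```python
# Hypothetical sketch of NUMA-aware offers for the example agent:
# 2 NUMA nodes, 40 logical cores each, 20 LLC subsets per socket.
# NOT the real Mesos offer format -- an illustration of the idea only.
offers = [
    {"offer": "Offer1", "numa_node": 1, "cpus": 40, "llc_subsets": 20},
    {"offer": "Offer2", "numa_node": 2, "cpus": 40, "llc_subsets": 20},
]

# A framework wanting cache isolation would accept cpus and llc_subsets
# from the SAME offer, so both resources are guaranteed to come from
# one socket.
for o in offers:
    print(f'{o["offer"]}: cpu {o["cpus"]}; '
          f'LLC {o["llc_subsets"]} with NUMA {o["numa_node"]}')
```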
>> We have been discussing this at length: whether or not to expose NUMA
>> topology to the user, and to what degree we automate by having the user
>> express intent instead of explicit NUMA settings.
>> Let's discuss it during the performance isolation meeting. Ian has been
>> preparing the latest proposal on core pinning.
>>
>
> It would be good for us to discuss and work on this together and pool our
> efforts in one working group. Looking forward to the next public
> performance isolation meeting.
>
>> We would have to have something like that in place before aiming to enable
>> CAT, IMO. Don't get me wrong, we definitely should have CAT in mind when
>> we design core isolation.
>> The thing I would like to avoid is conflicting abstractions and duplicate
>> effort, when the two isolation mechanisms are so tightly bound.
>>
>
> I agree with that.
>
>
>
>>
>>
>>>    1b): The agent will apply CPU-share isolation along with cpuset.
>>>         We might need to revisit MESOS-314 to support this partially.
>>>
>>> Actually, CAT kernel support could/should still handle the scenario where
>>> a task migrates between NUMA nodes, but right now it does not. This is
>>> why I filed the ticket and drafted the initial design doc to track this.
>>>
>>> 2) All the monitoring support (CMT and MBM) is almost ready in Mesos;
>>> it's all perf-based.
>>> Check MESOS-4955 and MESOS-4595 for details.
>>>
>>
>>
>> Yep, Bartek had a patch for this.
>>
>>
>>
>>>
>>>
>>>
>>>
>>> On 2016/4/5 23:13, Niklas Nielsen wrote:
>>>
>>>> My gut feeling is that it won't be very useful to expose LLC as a
>>>> first-class resource type at this point.
>>>> It's very hard for the user to pick, and it requires framework support
>>>> everywhere.
>>>> Also, as Kevin mentioned, Mesos doesn't pin your tasks, so you don't
>>>> know which cores your tasks will be running on.
>>>>
>>>> We have been talking about QoS isolators in the performance isolation
>>>> working group, where more low-level decisions are made on the agent
>>>> itself.
>>>> Both core pinning and CAT would be controls which those isolators could
>>>> adjust to uphold higher level notions of task performance tiers.
>>>>
>>>> Let's discuss this in the performance isolation working group. We can
>>>> schedule a call at the end of this week or the start of next week.
>>>>
>>>> Niklas
>>>>
>>>>
>>>> On Mon, Apr 4, 2016 at 10:39 PM, Kevin Klues <[email protected]> wrote:
>>>>
>>>>> Hi Fan,
>>>>>
>>>>> Thanks for putting this together. I have been looking into this quite
>>>>> a bit myself recently, and have been slowly preparing a design doc for
>>>>> both CAT and CMT support in Mesos. One of the biggest things I have
>>>>> been trying to figure out (which is why I haven't pushed my design doc
>>>>> out yet) is how to combine CAT support with the existing resource
>>>>> model.
>>>>>
>>>>> Specifically, Mesos currently gives out fractional cores using the
>>>>> cgroups cpu.shares mechanism and doesn't allow tasks to choose
>>>>> specific cores to run on (even more than this, there is no way for a
>>>>> task to even see which specific cores might be available).
>>>>> Furthermore, when a resource offer goes out, it's just a collection of
>>>>> SCALARS, SETS, and RANGES, and there's no way to tie one particular
>>>>> resource to another (e.g. you can't say give me cores and memory that
>>>>> are close together to mitigate NUMA effects).
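[As a sketch of the fractional-core mechanism Kevin describes: Mesos's
cgroups/cpu isolator maps the scalar "cpus" resource to a cpu.shares weight
at 1024 shares per CPU. The conversion below is a simplified illustration,
not the Mesos source; the clamping detail reflects the kernel's lower
bound of 2 shares.]

```python
# Simplified illustration of how a fractional "cpus" scalar becomes a
# cgroup cpu.shares value. 1024 shares per CPU matches the Linux cgroup
# convention Mesos uses; exact edge-case handling here is an assumption.
CPU_SHARES_PER_CPU = 1024
MIN_CPU_SHARES = 2  # kernel-enforced lower bound on cpu.shares

def cpus_to_shares(cpus: float) -> int:
    """Relative CPU weight for a task asking for `cpus` cores."""
    return max(MIN_CPU_SHARES, int(cpus * CPU_SHARES_PER_CPU))

print(cpus_to_shares(0.5))  # 512: half a core's relative weight
print(cpus_to_shares(2.0))  # 2048: a relative weight, NOT a pin to any
                            # two particular cores -- which is exactly the
                            # gap Kevin points out for CAT
```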
>>>>>
>>>>> Given these limitations, it's not clear how to take immediate
>>>>> advantage of CAT, since it relies on specifying a specific core to
>>>>> allocate the cache from. That is, some mechanism must exist to ensure
>>>>> that both the CPU and the cache are colocated.  This is a problem with
>>>>> the current resource model in general, and applies to properly
>>>>> supporting NUMA as well.
>>>>>
>>>>> You seem to propose simply adding cache partitions as a first class
>>>>> resource on par with CPUs and memory, with no mention of its
>>>>> dependence on particular cores.  What are your thoughts on this?
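[To make that dependence concrete: a CAT allocation is expressed as a
capacity bitmask (CBM) over the cache ways of one particular socket's LLC,
so a partition only helps if the task's cores actually sit on that socket.
A toy model; the 20-way cache size is hypothetical.]

```python
# Toy model of CAT capacity bitmasks (CBMs): each bit selects one LLC way.
# Intel requires the set bits in a CBM to be contiguous; the way count
# below is hypothetical, not read from any real CPU.
CACHE_WAYS = 20

def cbm(first_way: int, num_ways: int) -> int:
    """Build a contiguous bitmask of num_ways ways starting at first_way."""
    assert num_ways > 0 and first_way + num_ways <= CACHE_WAYS
    return ((1 << num_ways) - 1) << first_way

task_a = cbm(0, 5)    # ways 0-4
task_b = cbm(5, 15)   # ways 5-19
assert task_a & task_b == 0  # disjoint masks -> no LLC interference
print(hex(task_a), hex(task_b))
```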
>>>>>
>>>>> Kevin
>>>>>
>>>>> On Mon, Apr 4, 2016 at 7:36 PM, Du, Fan <[email protected]> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> MESOS-5076 is filed to investigate how Intel Cache Allocation
>>>>>> Technology (CAT) [1] could be used in Mesos. An introduction and some
>>>>>> early thoughts are documented here [2].
>>>>>
>>>>>
>>>>>> The motivation is to:
>>>>>> a) Add CAT isolation support for Mesos containerization
>>>>>> b) Expose Last Level Cache (LLC) as a scalar resource
>>>>>> c) Bridge the interface gap for Docker containerization;
>>>>>>    CAT support for Docker [3] has been submitted to the Docker OCI
>>>>>>    with positive feedback.
>>>>>>
>>>>>> The ultimate goal is to provide operators a CAT isolator for better
>>>>>> colocation of cluster resources.
>>>>>> I'm looking forward to any comments from the community to move this
>>>>>> forward.
>>>>>>
>>>>>> Thanks!
>>>>>>
>>>>>> [1]: http://www.intel.com/content/www/us/en/communications/cache-monitoring-cache-allocation-technologies.html
>>>>>> [2]: https://docs.google.com/document/d/130ay0e2DZ9S61SC3tGcik5wQaC8L40t5tWj3K3GJxTg/edit?usp=sharing
>>>>>> [3]: https://github.com/opencontainers/runtime-spec/pull/267
>>>>>>      https://github.com/opencontainers/runc/pull/447
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> --
>>>>> ~Kevin
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>
>>
>>


-- 
Niklas
