Since "Isolation" applies broadly beyond latency-sensitive workloads (e.g. user/pid/network namespacing, or resource limits such as CPU quota, memory limits, and GPU device visibility), it would be great to choose a more specific name for the working group. Some suggestions: interference, performance-related isolation, colocation, latency sensitivity.

Thoughts? Looking forward to seeing the discussions here!

Ben

On Friday, January 22, 2016, Nielsen, Niklas <[email protected]> wrote:

> Hi everyone,
>
> We have been talking about core affinity in Mesos for a while, and Ian D.
> has recently been giving this topic thought in his ‘exclusive resources’
> proposal [1].
> Trying to avoid too conservative placements, latency critical workloads
> are at risk without it.
> We are interested in the topic through our work on oversubscription in
> Serenity [2], as oversubscription was exactly to be able to colocate
> latency critical and best-effort batch jobs.
> We had an informal meeting yesterday, going over the proposal and trying
> to get some cadence behind the capability.
>
> It is a tricky but exciting topic:
> - How do we avoid making task launch even more complex? How do we express
>   the topology and acquire parts of it. Do we use hints on the affinity
>   properties instead?
> - How do we mix pinned with normal ‘floating’ tasks.
> - How do we convey information to the resource estimator about the task
>   sensitivity.
>
> Note, above list not meant for inlined discussion or answers. Let’s
> collect feedback on the proposals themselves.
>
> Here are our proposed next steps:
> - We are going to use the ‘Isolation Working Group’ as an umbrella for
>   this. I will fill in details and members.
> - We will schedule an online meeting within the Wednesday 9AM PST next
>   week discussing next steps. I will share a hangout link when we get closer.
> - Plan being, getting to designs (maybe more than one) we agree on and
>   then scope out and distribute the work needed to be done.
>
> Who ever is interested, join us. The use cases for this work are critical.
> Maybe we can even work on some representative workloads we can verify our
> proposal against.
>
> Cheers,
> Niklas
>
> PS For comments on the proposal itself, please refer to Ian’s thread for
> the dev list [3].
>
> [1] https://issues.apache.org/jira/browse/MESOS-4138
> [2] https://github.com/mesosphere/serenity
> [3] https://www.mail-archive.com/dev%40mesos.apache.org/msg33892.html
