Echoing Ben Mahler's comment. I still don't find the ThrottleInfo very intuitive. Did you discuss the general notion of resource quality further?

On Mon, Mar 21, 2016 at 11:50 PM, Klaus Ma <[email protected]> wrote:

> @benm/joris,
>
> here's the user scenario in my mind:
>
> 1. master offers resources to the framework, e.g. 2 cpu
> 2. framework launches a task (2 cpu) and *marks the task/executors as
>    throttleable*
> 3. in ResourceEstimator, it should only consider the throttleable
>    task/executors:
>    - keep enough resources for the tasks/executors *without* throttleable
>      flag/attribute
>    - report allocated but not used resources by task/executor *with*
>      throttleable flag/attribute; for example, report 1 cpu as
>      "*Revocable.Throttleable*" resources to framework in this case
> 4. it's up to the framework which resources to use;
>    "*Revocable.Throttleable*" means it'll share compressible resources
>    with the resource owner, "*Revocable*" (without ThrottleableInfo)
>    means it'll be evicted when the resource owner reclaims it
> 5. QoS Controller makes sure:
>    - enough resources for the tasks/executors *without* throttleable
>      flag/attribute
>    - if used resources exceed allocated resources *with* throttleable
>      flag/attribute, evict the task/executor on revocable resource
>
> So to @connor's question, maybe a flag/attribute to task/executor when
> launching it. Regarding the name, "ScavengeInfo", "BestEffortInfo" and
> "ThrottleableInfo" are all OK for me; maybe "ScavengeInfo" is better.
>
> Any comments?
>
> For this scenario, I think there are still open questions:
> 1. Can a framework launch a task with the throttleable flag/attribute on
>    revocable resources?
> 2. For ResourceEstimator/QoS Controller, should the Agent double check
>    its report?
> 3. What's the behaviour between the two containers: the container on
>    the original resources & the container on revocable resources?
> 4. Who handles compressible/incompressible resources? Maybe
>    ResourceEstimator/QoSController; it should not report incompressible
>    resources as Revocable.Throttleable.
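
To check that I read step 3 of the scenario above correctly, here is a minimal Python sketch of the estimator rule as I understand it. It is not Mesos code: the Task class, its fields and estimate_throttleable_slack are names I made up for illustration, and the real ResourceEstimator interface looks different.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # Hypothetical task record; these field names are assumptions,
    # not part of any Mesos API.
    name: str
    allocated_cpus: float
    used_cpus: float
    throttleable: bool


def estimate_throttleable_slack(tasks):
    """Return the CPUs that could be reported as revocable + throttleable.

    Only tasks marked throttleable contribute their allocated-but-unused
    CPUs. Tasks without the flag keep their full allocation.
    """
    slack = 0.0
    for task in tasks:
        if not task.throttleable:
            # Step 3, first bullet: leave these resources alone.
            continue
        # Step 3, second bullet: allocated but not used, never negative.
        slack += max(0.0, task.allocated_cpus - task.used_cpus)
    return slack


tasks = [
    Task("batch", allocated_cpus=2.0, used_cpus=1.0, throttleable=True),
    Task("serving", allocated_cpus=2.0, used_cpus=0.5, throttleable=False),
]
print(estimate_throttleable_slack(tasks))  # 1.0
```

With the 2 cpu throttleable task from the scenario using 1 cpu, this reports 1 cpu, which matches the example in step 3. The idle 1.5 cpu of the non-throttleable task is deliberately not reported.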
>
> Thanks
> Klaus
>
> On Tuesday, March 22, 2016 at 4:13:10 AM UTC+8, Benjamin Mahler wrote:
>>
>> Yeah that's definitely a question I've been asking myself, and we synced
>> on that with Niklas during the last meeting. The thought currently is that
>> we should choose a better name than ThrottleInfo. ThrottleInfo seems to
>> carry too strong of an implication about what the resources will
>> experience. Rather, we could pick a name like "ScavengeInfo" /
>> "BestEffortInfo" / etc that indicates that these resources are running
>> within the un-utilized portion of the machine and _may_ experience
>> degradation.
>>
>> On Mon, Mar 21, 2016 at 1:26 AM, Joris Van Remoortere <[email protected]> wrote:
>>
>>> @klaus:
>>> I think @connor's question is whether we are absolutely sure we never
>>> want to support throttle-able but non-revocable resources.
>>> It's clear from the protos that this is not supported; the question is
>>> whether we are sure that is what we want. If so, can you elaborate as to
>>> *why* we would never want that concept in Mesos?
>>>
>>> --
>>> *Joris Van Remoortere*
>>> Mesosphere
>>>
>>> On Sun, Mar 20, 2016 at 8:33 PM, Klaus Ma <[email protected]> wrote:
>>>
>>>> Here's some input :).
>>>>
>>>> If throttling is tolerable but preemption is not, how would that be
>>>> expressed? (Is that supported?)
>>>> [Klaus]: It's not supported; only revocable resources have this
>>>> attribute: non-throttleable or throttleable. The throttleable revocable
>>>> resources are reported by ResourceEstimator, which means the resources
>>>> may be throttled by their original owner.
>>>>
>>>> How does this work with the QoS controller? Will there be a new
>>>> correction type to indicate throttling, or does throttling happen "behind
>>>> the agent's back"?
>>>> [Klaus]: The QoSController/ResourceEstimator only manages throttleable
>>>> revocable resources; the other resources (regular resources and
>>>> non-throttleable revocable resources) are managed by the allocator.
>>>> "Manage" here means generation and destruction/eviction. Regarding how
>>>> "throttling happens", good question. I think the throttling will depend
>>>> on the containers; let me double check it :).
>>>>
>>>> If any comments, please let me know.
>>>>
>>>> ----
>>>> Da (Klaus), Ma (马达) | PMP® | Advisory Software Engineer
>>>> Platform OpenSource Technology, STG, IBM GCG
>>>> +86-10-8245 4084 | [email protected] | http://k82.me
>>>>
>>>> On Sat, Mar 19, 2016 at 11:15 PM, <[email protected]> wrote:
>>>>
>>>>> Thanks for the good explanations so far Ben and Klaus. Apologies if
>>>>> you guys already covered these questions in the meeting:
>>>>>
>>>>> If throttling is tolerable but preemption is not, how would that be
>>>>> expressed? (Is that supported?)
>>>>>
>>>>> How does this work with the QoS controller? Will there be a new
>>>>> correction type to indicate throttling, or does throttling happen "behind
>>>>> the agent's back"?
>>>>>
>>>>> Thanks,
>>>>> --
>>>>> Connor
>>>>>
>>>>> > On Mar 19, 2016, at 04:01, Klaus Ma <[email protected]> wrote:
>>>>> >
>>>>> > @team, in the latest meeting, we agreed to keep the current name
>>>>> > ThrottleInfo.
>>>>> >
>>>>> > If any more comments, please let me know.
>>>>> >
>>>>> >> On Wednesday, March 16, 2016 at 9:32:37 PM UTC+8, Guangya Liu wrote:
>>>>> >> Also, please share any comments on the name here. The current name
>>>>> >> is ThrottleInfo; the Kubernetes resource QoS design document uses
>>>>> >> "scavenging" as the keyword for such behaviour, so a possible name
>>>>> >> here could be ScavengeInfo. Please comment on those two names, or
>>>>> >> propose a new one.
>>>>> >>
>>>>> >> message RevocableInfo {
>>>>> >>   message ThrottleInfo {}
>>>>> >>
>>>>> >>   // If set, indicates that the resources may be throttled at
>>>>> >>   // any time. Throttle-able resources can be used for tasks
>>>>> >>   // that do not have strict performance requirements and are
>>>>> >>   // capable of handling being throttled.
>>>>> >>   optional ThrottleInfo throttle_info = 1;
>>>>> >> }
>>>>> >>
>>>>> >> On Wednesday, March 16, 2016 at 10:24:14 AM UTC+8, Klaus Ma wrote:
>>>>> >>>
>>>>> >>> The patches are updated accordingly; JIRA: MESOS-3888, RR:
>>>>> >>> https://reviews.apache.org/r/40375/ .
>>>>> >>>
>>>>> >>> Thanks
>>>>> >>> klaus
>>>>> >>>
>>>>> >>>> On Saturday, March 12, 2016 at 11:09:46 AM UTC+8, Benjamin Mahler wrote:
>>>>> >>>> Hey folks,
>>>>> >>>>
>>>>> >>>> In the resource allocation working group we've been looking into
>>>>> >>>> a few projects that will make the allocator able to offer out
>>>>> >>>> resources as revocable. For example:
>>>>> >>>>
>>>>> >>>> - We'll want to eventually allocate resources as revocable _by
>>>>> >>>>   default_, only allowing non-revocable when there are guarantees
>>>>> >>>>   put in place (static reservations or quota).
>>>>> >>>>
>>>>> >>>> - On the path to revocable by default, we can incrementally start
>>>>> >>>>   to offer certain resources as revocable. Consider when quota is
>>>>> >>>>   set but the role isn't using all of the quota. The unallocated
>>>>> >>>>   quota can be offered to other roles, but it should be revocable
>>>>> >>>>   because we may revoke it should the quota'ed role want to use
>>>>> >>>>   the resources. Unused reservations fall into a similar category.
>>>>> >>>>
>>>>> >>>> - Going revocable by default also allows us to enforce fairness in
>>>>> >>>>   a dynamically changing cluster by revoking resources as weights
>>>>> >>>>   are changed, frameworks are added or removed, etc.
>>>>> >>>>
>>>>> >>>> In this context, "revocable" means that the resources may be
>>>>> >>>> taken away and the container will be destroyed.
>>>>> >>>> The meaning of "revocable" in the context of usage oversubscription
>>>>> >>>> includes this, but the container may also experience throttling
>>>>> >>>> (e.g. lower cpu shares, less network priority, etc).
>>>>> >>>>
>>>>> >>>> For this reason, and because we internally need to distinguish
>>>>> >>>> between revocable resources that are generated by usage
>>>>> >>>> oversubscription and those that are generated by the allocator,
>>>>> >>>> we're thinking of the following change to the API:
>>>>> >>>>
>>>>> >>>> - message RevocableInfo {}
>>>>> >>>> + message RevocableInfo {
>>>>> >>>> +   message ThrottleInfo {}
>>>>> >>>> +
>>>>> >>>> +   // If set, indicates that the resources may be throttled at
>>>>> >>>> +   // any time. Throttle-able resources can be used for tasks
>>>>> >>>> +   // that do not have strict performance requirements and are
>>>>> >>>> +   // capable of handling being throttled.
>>>>> >>>> +   optional ThrottleInfo throttle_info = 1;
>>>>> >>>> + }
>>>>> >>>>
>>>>> >>>>   // If this is set, the resources are revocable, i.e., any tasks or
>>>>> >>>> - // executors launched using these resources could get preempted or
>>>>> >>>> - // throttled at any time. This could be used by frameworks to run
>>>>> >>>> - // best effort tasks that do not need strict uptime or performance
>>>>> >>>> + // executors launched using these resources could be terminated at
>>>>> >>>> + // any time. This could be used by frameworks to run
>>>>> >>>> + // best effort tasks that do not need strict uptime
>>>>> >>>>   // guarantees. Note that if this is set, 'disk' or 'reservation'
>>>>> >>>>   // cannot be set.
>>>>> >>>>   optional RevocableInfo revocable = 9;
>>>>> >>>>
>>>>> >>>> Essentially we want to distinguish between revocable and
>>>>> >>>> revocable + throttle-able. This is because usage-oversubscription
>>>>> >>>> generates throttle-able revocable resources, whereas the allocator
>>>>> >>>> does not.
>>>>> >>>> This also solves our problem of distinguishing between these two
>>>>> >>>> kinds of revocable resources internally.
>>>>> >>>>
>>>>> >>>> Feedback welcome!
>>>>> >>>>
>>>>> >>>> Ben
>>>>> >>>>
>>>> --
>>>> You received this message because you are subscribed to the Google
>>>> Groups "Mesos Resource Allocation Working Group" group.
>>>> To unsubscribe from this group and stop receiving emails from it, send
>>>> an email to [email protected].
>>>> To post to this group, send email to [email protected].
>>>> To view this discussion on the web visit
>>>> https://groups.google.com/d/msgid/mesos-allocation/CAGmKMfU4NJpY8NP6PVu9g-ebfi7Q9UseEdPUc0XkL1qqqqm75g%40mail.gmail.com
>>>> For more options, visit https://groups.google.com/d/optout.
>>>>

--
Niklas
