In addition, this technique could be implemented in the allocator with an understanding of global demand: https://www.youtube.com/watch?v=BkBMYUe76oI
That would allow for tunable fair-sharing based on DRF principles.

On Wed, Sep 23, 2015 at 10:59 AM haosdent <[email protected]> wrote:

> Feel free to open a story in JIRA if you think your ideas are awesome. :-)
> On Sep 23, 2015 10:54 PM, "Sharma Podila" <[email protected]> wrote:
>
>> Ah, OK, thanks. Yes, Fenzo is a Java library.
>>
>> It might be a nice addition to Mesos master to get a global view of
>> contention for resources. In addition to autoscaling, it would be useful in
>> the allocator.
>>
>> On Wed, Sep 23, 2015 at 7:29 AM, Aaron Carey <[email protected]> wrote:
>>
>>> Thanks Sharma,
>>>
>>> I was in the audience for a talk you did about Fenzo at MesosCon :) It
>>> looked great but we're a Python shop primarily so the Java requirement
>>> would be a problem for us.
>>>
>>> The scaling in the scheduler makes total sense (obvious when you think
>>> about it!). I was naively hoping for some sort of knowledge of that back in
>>> the Mesos master as we were hoping to have scaling be independent of
>>> schedulers. I think this'll need a re-think!
>>>
>>> Thanks for your help!
>>>
>>> Aaron
>>>
>>> ------------------------------
>>> *From:* Sharma Podila [[email protected]]
>>> *Sent:* 23 September 2015 15:22
>>>
>>> *To:* [email protected]
>>> *Subject:* Re: Metric for tasks queued/waiting?
>>>
>>> Jobs/tasks wait in framework schedulers, not in the Mesos master.
>>> Autoscaling triggers must come from schedulers, not only because that's who
>>> knows the pending task set size, but also because it knows how many of them
>>> need to be launched right away, and on what kind of machines.
>>>
>>> We built such an autoscaling capability in our framework schedulers. The
>>> autoscaling is achieved by our library Fenzo
>>> <https://github.com/Netflix/Fenzo> which we open sourced recently. Also
>>> read about Fenzo autoscaling here
>>> <https://github.com/Netflix/Fenzo/wiki/Autoscaling>. You should look
>>> into using that if you are developing your own scheduler. Or, have your
>>> scheduler team pick up Fenzo for autoscaling.
>>>
>>> Also, note that scaling up is temptingly easy by watching the pending
>>> task queue. But scaling down requires bin packing, etc. Other issues pop
>>> up as well, for example:
>>>
>>> - What if a user submits tasks that cannot be satisfied? Will autoscale
>>> keep increasing the cluster size unbounded?
>>> - What if you would like to have a heterogeneous mix of hosts and tasks?
>>> Which kind of hosts do you need to autoscale based on which tasks are
>>> pending?
>>>
>>> These are automatically addressed in Fenzo.
>>>
>>> Sharma
>>>
>>> On Wed, Sep 23, 2015 at 4:56 AM, Aaron Carey <[email protected]> wrote:
>>>
>>>> No, I basically had the same question as Jim (but maybe didn't word it
>>>> so well ;))
>>>>
>>>> I'll have a look at your response there :)
>>>>
>>>> ------------------------------
>>>> *From:* haosdent [[email protected]]
>>>> *Sent:* 23 September 2015 10:12
>>>> *To:* [email protected]
>>>> *Subject:* Re: Metric for tasks queued/waiting?
>>>>
>>>> Does /metrics/snapshot not satisfy your requirement?
>>>>
>>>> On Wed, Sep 23, 2015 at 4:50 PM, Aaron Carey <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Is there any way to get a metric of all tasks currently waiting/queued
>>>>> in Mesos (across all schedulers)? The snapshot metrics seem to cover every
>>>>> other kind of task state. This would be quite useful for auto-scaling
>>>>> purposes.
>>>>>
>>>>> Thanks,
>>>>> Aaron
>>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Haosdent Huang
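
The scale-up pitfalls raised in the quoted thread (tasks that can never be satisfied, and unbounded cluster growth) can be illustrated with a minimal Python sketch of a scheduler-side scale-up decision. This is not Fenzo's API or algorithm; `Task`, `HostType`, `hosts_to_add`, and the `max_hosts` cap are hypothetical names chosen for illustration, and the sketch assumes a single host type and only CPU and memory resources.

```python
import math
from dataclasses import dataclass


@dataclass
class Task:
    cpus: float
    mem_mb: float


@dataclass
class HostType:
    cpus: float
    mem_mb: float


def hosts_to_add(pending, host, current_hosts, max_hosts):
    """Return how many hosts of type `host` to request (hypothetical helper)."""
    # Skip tasks that no single host of this type could ever run, so an
    # unsatisfiable request cannot drive the cluster size up forever.
    satisfiable = [
        t for t in pending
        if t.cpus <= host.cpus and t.mem_mb <= host.mem_mb
    ]
    if not satisfiable:
        return 0

    need_cpus = sum(t.cpus for t in satisfiable)
    need_mem = sum(t.mem_mb for t in satisfiable)

    # Lower bound on hosts needed; real bin packing may need more because
    # of fragmentation.
    wanted = max(
        math.ceil(need_cpus / host.cpus),
        math.ceil(need_mem / host.mem_mb),
    )

    # Never grow beyond the configured cluster size cap.
    headroom = max(0, max_hosts - current_hosts)
    return min(wanted, headroom)
```

Scaling down is deliberately left out: as noted in the thread, it needs bin packing to free up whole hosts, and a heterogeneous cluster would need one such decision per host type.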

