I'd love to see this solved in a general way: "How does the framework communicate (insert intent, metric, hint, etc.) to Mesos?"

In one way, the 'webui_url' in the framework info conveys "This is how you get to my web UI", since providing a web UI was a common pattern for frameworks. This could be expanded so the framework can report an 'apiui_url', or maybe an even more specific 'metrics_url', where the Mesos master (or other frameworks and 3rd party tooling) can get insights into queue depths, resource preferences, etc.

We can start discussing this further in a JIRA ticket :)

Niklas

On 23 September 2015 at 13:54, Alex Gaudio <[email protected]> wrote:

> Hi Aaron,
>
> You might consider trying to solve the autoscaling problem with Relay, a
> Python tool I use to solve this problem. Feel free to shoot me an email if
> you are interested.
>
> github.com/sailthru/relay
>
> Alex
>
> On Wed, Sep 23, 2015, 11:03 AM David Greenberg <[email protected]>
> wrote:
>
>> In addition, this technique could be implemented in the allocator with an
>> understanding of global demand:
>> https://www.youtube.com/watch?v=BkBMYUe76oI
>>
>> That would allow for tunable fair-sharing based on DRF-principles.
>>
>> On Wed, Sep 23, 2015 at 10:59 AM haosdent <[email protected]> wrote:
>>
>>> Feel free to open a story in jira if you think your ideas are awesome. :-)
>>>
>> On Sep 23, 2015 10:54 PM, "Sharma Podila" <[email protected]> wrote:
>>>
>>>> Ah, OK, thanks. Yes, Fenzo is a Java library.
>>>>
>>>> It might be a nice addition to Mesos master to get a global view of
>>>> contention for resources. In addition to autoscaling, it would be useful in
>>>> the allocator.
>>>>
>>>> On Wed, Sep 23, 2015 at 7:29 AM, Aaron Carey <[email protected]> wrote:
>>>>
>>>>> Thanks Sharma,
>>>>>
>>>>> I was in the audience for a talk you did about Fenzo at MesosCon :) It
>>>>> looked great but we're a python shop primarily so the Java requirement
>>>>> would be a problem for us.
>>>>>
>>>>> The scaling in the scheduler makes total sense, (obvious when you
>>>>> think about it!), I was naively hoping for some sort of knowledge of that
>>>>> back in the Mesos master as we were hoping to have scaling be independent
>>>>> of schedulers. I think this'll need a re-think!
>>>>>
>>>>> Thanks for your help!
>>>>>
>>>>> Aaron
>>>>>
>>>>> ------------------------------
>>>>> *From:* Sharma Podila [[email protected]]
>>>>> *Sent:* 23 September 2015 15:22
>>>>> *To:* [email protected]
>>>>> *Subject:* Re: Metric for tasks queued/waiting?
>>>>>
>>>>> Jobs/tasks wait in framework schedulers, not mesos master. Autoscaling
>>>>> triggers must come from schedulers, not only because that's who knows the
>>>>> pending task set size, but, also because it knows how many of them need to
>>>>> be launched right away, on what kind of machines.
>>>>>
>>>>> We built such an autoscaling capability in our framework schedulers.
>>>>> The autoscaling is achieved by our library Fenzo
>>>>> <https://github.com/Netflix/Fenzo> which we open sourced recently.
>>>>> Also read about Fenzo autoscaling here
>>>>> <https://github.com/Netflix/Fenzo/wiki/Autoscaling>. You should look
>>>>> into using that if you are developing your own scheduler. Or, have your
>>>>> scheduler team pick up Fenzo for autoscaling.
>>>>>
>>>>> Also, note that scaling up is temptingly easy by watching the pending
>>>>> task queue. But, scaling down requires bin packing, etc. Other issues pop
>>>>> up as well, for example:
>>>>>
>>>>> - what if a user submits tasks that cannot be satisfied? Will
>>>>> autoscale keep increasing the cluster size unbounded?
>>>>> - what if you would like to have a heterogeneous mix of hosts and
>>>>> tasks? which kind of hosts do you need to autoscale based on which tasks
>>>>> are pending?
>>>>>
>>>>> These are automatically addressed in Fenzo.
>>>>>
>>>>> Sharma
>>>>>
>>>>> On Wed, Sep 23, 2015 at 4:56 AM, Aaron Carey <[email protected]> wrote:
>>>>>
>>>>>> No, I basically had the same question as Jim (but maybe didn't word
>>>>>> it so well ;))
>>>>>>
>>>>>> I'll have a look at your response there :)
>>>>>>
>>>>>> ------------------------------
>>>>>> *From:* haosdent [[email protected]]
>>>>>> *Sent:* 23 September 2015 10:12
>>>>>> *To:* [email protected]
>>>>>> *Subject:* Re: Metric for tasks queued/waiting?
>>>>>>
>>>>>> Does /metrics/snapshot not satisfy your requirement?
>>>>>>
>>>>>> On Wed, Sep 23, 2015 at 4:50 PM, Aaron Carey <[email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Is there any way to get a metric of all tasks currently
>>>>>>> waiting/queued in Mesos (across all schedulers)? The snapshot metrics
>>>>>>> seem to cover every other kind of task state? This would be quite
>>>>>>> useful for auto-scaling purposes.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Aaron
>>>>>>
>>>>>> --
>>>>>> Best Regards,
>>>>>> Haosdent Huang
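
To make the two scale-up pitfalls raised in the thread concrete (tasks that can never be satisfied, and unbounded cluster growth), here is a minimal Python sketch of a scheduler-side scale-up estimate. It is not Fenzo's algorithm and it does no bin packing. The names Task, HostType, hosts_to_add and max_hosts are all hypothetical, and it assumes a single host type and that the tasks are pending because the current cluster has no free capacity for them.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    cpus: float
    mem_mb: float


@dataclass(frozen=True)
class HostType:
    cpus: float
    mem_mb: float


def hosts_to_add(pending, host, current_hosts, max_hosts):
    """Estimate how many hosts of one type to add for the pending tasks.

    Hypothetical helper: a coarse lower bound from total demand,
    not a bin-packing placement.
    """
    # A task larger than a whole host can never be placed. Ignoring it
    # keeps an unsatisfiable request from growing the cluster forever.
    fits = [t for t in pending
            if t.cpus <= host.cpus and t.mem_mb <= host.mem_mb]
    if not fits:
        return 0

    # Hosts needed per resource dimension; take the larger of the two.
    by_cpu = math.ceil(sum(t.cpus for t in fits) / host.cpus)
    by_mem = math.ceil(sum(t.mem_mb for t in fits) / host.mem_mb)
    wanted = max(by_cpu, by_mem)

    # Never scale past the configured ceiling.
    headroom = max(0, max_hosts - current_hosts)
    return min(wanted, headroom)
```

A scheduler would call this with its own pending queue, which is the information the Mesos master does not have. Handling a heterogeneous mix of hosts, or scaling down, needs considerably more than this.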

