Re: Metric for tasks queued/waiting?

David Greenberg Wed, 23 Sep 2015 08:06:44 -0700

In addition, this technique could be implemented in the allocator with an
understanding of global demand: https://www.youtube.com/watch?v=BkBMYUe76oI


That would allow for tunable fair sharing based on DRF principles.
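
For anyone unfamiliar with DRF (Dominant Resource Fairness), here is a
minimal Python sketch of the core idea. It is not the Mesos allocator's
code, and it is a simplified progressive-filling variant. The framework
names, cluster totals, and per-task demands are made-up illustration
values. Each round, the framework with the smallest dominant share gets
its next task, as long as that task still fits.

```python
def dominant_share(allocated, total):
    """Largest fraction of any single resource held by one framework."""
    return max(allocated[r] / total[r] for r in total)


def drf_allocate(total, demands):
    """Hand out one task at a time to the framework with the lowest
    dominant share, until no framework's next task fits.

    total   -- hypothetical cluster capacity, e.g. {"cpus": 9, "mem": 18}
    demands -- hypothetical per-task demand for each framework
    """
    used = {r: 0 for r in total}
    allocated = {f: {r: 0 for r in total} for f in demands}
    tasks = {f: 0 for f in demands}
    active = set(demands)

    while active:
        # Pick the framework that is currently furthest behind
        # (ties are broken by name so the result is deterministic).
        f = min(sorted(active),
                key=lambda name: dominant_share(allocated[name], total))
        need = demands[f]
        if any(used[r] + need[r] > total[r] for r in total):
            active.discard(f)  # its next task no longer fits
            continue
        for r in total:
            used[r] += need[r]
            allocated[f][r] += need[r]
        tasks[f] += 1
    return tasks
```

With a 9-CPU, 18-GB cluster, framework "A" asking for 1 CPU and 4 GB per
task, and "B" asking for 3 CPUs and 1 GB per task, this returns
{"A": 3, "B": 2}. A "tunable" version could, for instance, divide each
dominant share by a per-framework weight before comparing.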

On Wed, Sep 23, 2015 at 10:59 AM haosdent <[email protected]> wrote:

> Feel free to open a story in JIRA if you think your ideas are awesome. :-)
> On Sep 23, 2015 10:54 PM, "Sharma Podila" <[email protected]> wrote:
>
>> Ah, OK, thanks. Yes, Fenzo is a Java library.
>>
>> It might be a nice addition to Mesos master to get a global view of
>> contention for resources. In addition to autoscaling, it would be useful in
>> the allocator.
>>
>>
>>
>> On Wed, Sep 23, 2015 at 7:29 AM, Aaron Carey <[email protected]> wrote:
>>
>>> Thanks Sharma,
>>>
>>> I was in the audience for a talk you did about Fenzo at MesosCon :) It
>>> looked great, but we're primarily a Python shop, so the Java requirement
>>> would be a problem for us.
>>>
>>> The scaling in the scheduler makes total sense (obvious when you think
>>> about it!). I was naively hoping for some sort of knowledge of that back in
>>> the Mesos master, as we were hoping to have scaling be independent of
>>> schedulers. I think this'll need a re-think!
>>>
>>> Thanks for your help!
>>>
>>> Aaron
>>>
>>> ------------------------------
>>> *From:* Sharma Podila [[email protected]]
>>> *Sent:* 23 September 2015 15:22
>>>
>>> *To:* [email protected]
>>> *Subject:* Re: Metric for tasks queued/waiting?
>>>
>>> Jobs/tasks wait in framework schedulers, not in the Mesos master. Autoscaling
>>> triggers must come from schedulers, not only because they know the
>>> pending task set size, but also because they know how many of those tasks
>>> need to be launched right away, and on what kind of machines.
>>>
>>> We built such an autoscaling capability in our framework schedulers. The
>>> autoscaling is achieved by our library Fenzo
>>> <https://github.com/Netflix/Fenzo> which we open sourced recently. Also
>>> read about Fenzo autoscaling here
>>> <https://github.com/Netflix/Fenzo/wiki/Autoscaling>. You should look
>>> into using that if you are developing your own scheduler. Or, have your
>>> scheduler team pick up Fenzo for autoscaling.
>>>
>>> Also, note that scaling up is temptingly easy: just watch the pending
>>> task queue. Scaling down, however, requires bin packing, etc. Other issues
>>> pop up as well, for example:
>>>
>>> - What if a user submits tasks that cannot be satisfied? Will autoscaling
>>> keep increasing the cluster size without bound?
>>> - What if you would like to have a heterogeneous mix of hosts and tasks?
>>> Which kind of hosts do you need to scale up, based on which tasks are
>>> pending?
>>>
>>> These are automatically addressed in Fenzo.
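
The two pitfalls listed above can be illustrated with a small Python
sketch. This is not Fenzo's algorithm or API; the host types, their
sizes, and the max_hosts cap are hypothetical, and every host type is
assumed to declare a "cpus" capacity. It estimates how many hosts of each
type the pending tasks need, refuses to scale up for tasks that no host
type could ever run, and caps total growth.

```python
import math


def hosts_to_add(pending, host_types, idle_hosts, current_hosts, max_hosts):
    """Return ({host_type: hosts_to_add}, unsatisfiable_tasks).

    pending       -- list of task demands, e.g. {"cpus": 2, "mem": 4096}
    host_types    -- hypothetical capacity per host type
    idle_hosts    -- idle host count per host type
    current_hosts -- current total cluster size
    max_hosts     -- hard cap on total cluster size
    """
    needed = {name: 0.0 for name in host_types}
    unsatisfiable = []
    for task in pending:
        # Host types on which this task could ever fit.
        fits = [name for name, cap in host_types.items()
                if all(task[r] <= cap.get(r, 0) for r in task)]
        if not fits:
            unsatisfiable.append(task)  # never scale up for these
            continue
        # Crude placement: the smallest host type (by CPUs) that fits.
        name = min(fits, key=lambda n: (host_types[n]["cpus"], n))
        # Fraction of one host consumed on the task's tightest resource.
        # Summing fractions is only a lower bound on real bin packing.
        needed[name] += max(
            (task[r] / host_types[name][r] for r in task if task[r] > 0),
            default=0.0,
        )

    budget = max(0, max_hosts - current_hosts)
    plan = {}
    for name in sorted(needed):
        want = max(0, math.ceil(needed[name]) - idle_hosts.get(name, 0))
        grant = min(want, budget)  # never grow past the cap
        budget -= grant
        if grant:
            plan[name] = grant
    return plan, unsatisfiable
```

Scaling down is deliberately left out: deciding which hosts can be
drained is where the bin-packing concern mentioned above comes in.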
>>>
>>> Sharma
>>>
>>>
>>> On Wed, Sep 23, 2015 at 4:56 AM, Aaron Carey <[email protected]> wrote:
>>>
>>>> No, I basically had the same question as Jim (but maybe didn't word it
>>>> so well ;))
>>>>
>>>> I'll have a look at your response there :)
>>>>
>>>> ------------------------------
>>>> *From:* haosdent [[email protected]]
>>>> *Sent:* 23 September 2015 10:12
>>>> *To:* [email protected]
>>>> *Subject:* Re: Metric for tasks queued/waiting?
>>>>
>>>> Does /metrics/snapshot not satisfy your requirement?
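
For reference, a minimal Python sketch of reading the per-state task
metrics from the master's /metrics/snapshot endpoint. The host name is a
placeholder and the default master port 5050 is assumed. The snapshot
only reflects tasks the master knows about, so anything still queued
inside a framework scheduler will not appear in it.

```python
import json
from urllib.request import urlopen

TASK_METRIC_PREFIX = "master/tasks_"


def task_state_counts(snapshot):
    """Pick the per-state task metrics out of a parsed snapshot dict."""
    return {key[len(TASK_METRIC_PREFIX):]: int(value)
            for key, value in snapshot.items()
            if key.startswith(TASK_METRIC_PREFIX)}


def fetch_snapshot(master="http://mesos-master.example.com:5050"):
    """Fetch and parse the snapshot; `master` is a placeholder URL."""
    with urlopen(master + "/metrics/snapshot", timeout=5) as resp:
        return json.load(resp)
```

Usage would be task_state_counts(fetch_snapshot()), which yields a dict
keyed by state name such as "running" or "staging".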
>>>>
>>>> On Wed, Sep 23, 2015 at 4:50 PM, Aaron Carey <[email protected]> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Is there any way to get a metric of all tasks currently waiting/queued
>>>>> in Mesos (across all schedulers)? The snapshot metrics seem to cover every
>>>>> other kind of task state. This would be quite useful for auto-scaling
>>>>> purposes.
>>>>>
>>>>> Thanks,
>>>>> Aaron
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Best Regards,
>>>> Haosdent Huang
>>>>
>>>
>>>
>>
