RE: Metric for tasks queued/waiting?

Aaron Carey Thu, 24 Sep 2015 02:28:52 -0700

Thanks Alex,

The problem here is more along the lines of getting the metrics to feed into 
the algorithm, rather than the algorithm itself. Relay looks very cool though, 
thanks :)

Aaron

________________________________
From: Alex Gaudio [[email protected]]
Sent: 23 September 2015 21:54
To: [email protected]
Subject: Re: Metric for tasks queued/waiting?

Hi Aaron,

You might consider tackling the autoscaling problem with Relay, a Python tool 
I use for exactly this.  Feel free to shoot me an email if you are interested.

http://github.com/sailthru/relay

Alex

On Wed, Sep 23, 2015, 11:03 AM David Greenberg <[email protected]> wrote:
In addition, this technique could be implemented in the allocator with an 
understanding of global demand: https://www.youtube.com/watch?v=BkBMYUe76oI

That would allow for tunable fair sharing based on DRF principles.

On Wed, Sep 23, 2015 at 10:59 AM haosdent <[email protected]> wrote:

Feel free to open a story in JIRA if you think your ideas are awesome. :-)

On Sep 23, 2015 10:54 PM, "Sharma Podila" <[email protected]> wrote:
Ah, OK, thanks. Yes, Fenzo is a Java library.

It might be a nice addition to the Mesos master to get a global view of 
contention for resources. In addition to autoscaling, it would be useful in 
the allocator.

On Wed, Sep 23, 2015 at 7:29 AM, Aaron Carey <[email protected]> wrote:
Thanks Sharma,

I was in the audience for a talk you did about Fenzo at MesosCon :) It looked 
great, but we're primarily a Python shop, so the Java requirement would be a 
problem for us.

Scaling in the scheduler makes total sense (obvious when you think about it!). 
I was naively hoping for some sort of knowledge of that back in the Mesos 
master, as we were hoping to have scaling be independent of schedulers. I think 
this'll need a re-think!

Thanks for your help!

Aaron

________________________________
From: Sharma Podila [[email protected]]
Sent: 23 September 2015 15:22
To: [email protected]
Subject: Re: Metric for tasks queued/waiting?

Jobs/tasks wait in framework schedulers, not in the Mesos master. Autoscaling 
triggers must come from the schedulers, not only because they know the size of 
the pending task set, but also because they know how many of those tasks need 
to be launched right away and on what kind of machines.

We built such an autoscaling capability in our framework schedulers. The 
autoscaling is achieved by our library Fenzo (https://github.com/Netflix/Fenzo), 
which we open sourced recently. You can also read about Fenzo autoscaling at 
https://github.com/Netflix/Fenzo/wiki/Autoscaling. You should look into 
using that if you are developing your own scheduler. Or, have your scheduler 
team pick up Fenzo for autoscaling.

Also, note that scaling up looks temptingly easy: just watch the pending task 
queue. Scaling down, however, requires bin packing, etc. Other issues pop up as 
well, for example:

- What if a user submits tasks that cannot be satisfied? Will autoscaling keep 
increasing the cluster size without bound?
- What if you would like to have a heterogeneous mix of hosts and tasks? Which 
kind of hosts do you need to autoscale based on which tasks are pending?

These are automatically addressed in Fenzo.
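
As an illustration of the first pitfall above, here is a minimal Python sketch of a scale-up decision driven by the pending task queue. It is not Fenzo's algorithm or API; the `Task` type, the `hosts_to_add` helper, and all of its parameters are hypothetical, and it assumes a single host type with only CPU and memory as resources. Tasks that could never fit on a host are ignored, and the result is capped by a maximum cluster size, so an unsatisfiable request cannot grow the cluster without bound.

```python
from collections import namedtuple

# Hypothetical description of a pending task's resource demand.
Task = namedtuple("Task", ["cpus", "mem"])


def hosts_to_add(pending, host_cpus, host_mem, current_hosts, max_hosts):
    """Estimate how many new hosts the pending queue justifies.

    Assumes every new host offers host_cpus and host_mem (one host type).
    """
    # A task larger than a whole host can never be placed, so it must not
    # drive the cluster size up.
    fits = [t for t in pending if t.cpus <= host_cpus and t.mem <= host_mem]

    # First-fit packing onto hypothetical new hosts, largest tasks first.
    free = []  # remaining [cpus, mem] on each new host
    for task in sorted(fits, key=lambda t: (t.cpus, t.mem), reverse=True):
        for slot in free:
            if slot[0] >= task.cpus and slot[1] >= task.mem:
                slot[0] -= task.cpus
                slot[1] -= task.mem
                break
        else:
            # No existing new host has room, so open another one.
            free.append([host_cpus - task.cpus, host_mem - task.mem])

    # Never grow past the configured upper bound.
    headroom = max(0, max_hosts - current_hosts)
    return min(len(free), headroom)
```

A heterogeneous cluster would need one such calculation per host type plus a rule for matching tasks to types, which is the second pitfall listed above and is not covered by this sketch.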

Sharma

On Wed, Sep 23, 2015 at 4:56 AM, Aaron Carey <[email protected]> wrote:
No, I basically had the same question as Jim (but maybe didn't word it so 
well ;))

I'll have a look at your response there :)

________________________________
From: haosdent [[email protected]]
Sent: 23 September 2015 10:12
To: [email protected]
Subject: Re: Metric for tasks queued/waiting?

Does /metrics/snapshot not satisfy your requirement?
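
For reference, a minimal Python sketch of reading that endpoint. It assumes the master listens on the default port 5050 and reports task counters under keys prefixed with `master/tasks_`; the host name and both helper names are hypothetical. As discussed elsewhere in this thread, tasks still queued inside a framework scheduler are not known to the master, so they will not show up in these counters.

```python
import json
from urllib.request import urlopen


def fetch_snapshot(master="http://mesos-master.example.com:5050"):
    """Fetch the metrics snapshot from a Mesos master as a dict.

    The default URL is a placeholder, not a real host.
    """
    with urlopen(master + "/metrics/snapshot", timeout=5) as resp:
        return json.loads(resp.read().decode("utf-8"))


def task_state_counts(snapshot):
    """Pick out the per-state task counters from a snapshot dict."""
    prefix = "master/tasks_"
    return {
        key[len(prefix):]: int(value)
        for key, value in snapshot.items()
        if key.startswith(prefix)
    }
```

Calling `task_state_counts(fetch_snapshot())` against a live master returns one integer per task state the master reports.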

On Wed, Sep 23, 2015 at 4:50 PM, Aaron Carey <[email protected]> wrote:
Hi all,

Is there any way to get a metric of all tasks currently waiting/queued in Mesos 
(across all schedulers)? The snapshot metrics seem to cover every other kind of 
task state. This would be quite useful for auto-scaling purposes.

Thanks,
Aaron

--
Best Regards,
Haosdent Huang
