Re: Would you recommend Aurora?

2016-06-19 Thread Ziliang Chen
Thank you, Erb!

Re: Would you recommend Aurora?

2016-06-17 Thread Erb, Stephan
1) It is really hard to answer that question, especially given that there is a
huge difference between a scheduled cron job and a running job. Your best bet is
probably to do some load testing for your particular use case, and to evaluate
other design choices as necessary.

2) The link I provided for the 40 tasks per second is actually a config option. 
So you could change this, if absolutely necessary.

From: Ziliang Chen <zlchen@gmail.com>
Reply-To: "user@aurora.apache.org" <user@aurora.apache.org>
Date: Monday 13 June 2016 at 03:56
To: "user@aurora.apache.org" <user@aurora.apache.org>
Subject: Re: Would you recommend Aurora?

Thanks Erb for the great details.
1) Assume we have 1000 customers, and each customer has 1000 periodic cron
jobs. I would like to schedule the total of 1M jobs across a pool of machines.
If Aurora can't take this load, is there any other candidate you would suggest?
2) 40 tasks per second: is there a way to change the default through
configuration instead of modifying the code?

Thank you very much!

On Mon, Jun 13, 2016 at 1:50 AM, Erb, Stephan <stephan@blue-yonder.com> wrote:

Could you clarify your cron use case? Millions of cron jobs that run as often as
every minute sound more like you want a couple of long-running processes that do
the actual work with a little sleep in between, rather than doing task spawning
and distribution in Mesos & Aurora for each of them.
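
To make the "long-running process with a little sleep in between" idea concrete, here is a minimal sketch in plain Python. It is not Aurora or Mesos code; the names `run_due_jobs` and `worker_loop`, the `(next_run, period, job_id)` tuple layout, and the one-second poll interval are all assumptions made for illustration. One worker keeps many periodic jobs in a heap and wakes up regularly to run whatever is due, so no task has to be spawned per cron run.

```python
import heapq
import time
from typing import Callable, List, Optional, Tuple

# Hypothetical schedule entry: (next_run_time, period_seconds, job_id).
Entry = Tuple[float, float, str]


def run_due_jobs(schedule: List[Entry], now: float, run: Callable[[str], None]) -> int:
    """Run every job whose next-run time has passed and re-queue it.

    `schedule` must be a heap ordered by next-run time. Each job is
    rescheduled at `now + period`, so it runs at most once per call.
    Returns the number of jobs that were run.
    """
    ran = 0
    while schedule and schedule[0][0] <= now:
        _, period, job_id = heapq.heappop(schedule)
        run(job_id)
        heapq.heappush(schedule, (now + period, period, job_id))
        ran += 1
    return ran


def worker_loop(
    schedule: List[Entry],
    run: Callable[[str], None],
    poll_interval: float = 1.0,
    max_iterations: Optional[int] = None,
) -> None:
    """Long-running loop: do the due work, sleep a little, repeat."""
    iterations = 0
    while max_iterations is None or iterations < max_iterations:
        run_due_jobs(schedule, time.time(), run)
        time.sleep(poll_interval)
        iterations += 1
```

With this shape, the cluster scheduler would only need to keep a handful of such workers alive as ordinary services, and the per-minute work happens inside them.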



Regarding Aurora's scale: Twitter has recently disclosed that they have 250,000
containers/tasks running, with the largest cluster being in the range of 30,000
nodes [1]. By default, Aurora does not try to schedule more than 40 tasks per
second [2]. You can probably try to adjust that value, but this could have
other downsides.

[1] https://youtu.be/FU7wrqsRj3o?t=21m11s

[2] 
https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java#L39-L41
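
As a rough back-of-envelope check, the numbers in this thread can be put side by side. The sketch below assumes that every cron run costs exactly one scheduling attempt and that load is spread perfectly evenly; both are simplifications. The function names are made up for illustration, and only the 1M job count, the 1, 2 and 5 minute periods, and the 40 per second default come from the thread.

```python
def required_launch_rate(num_jobs: int, period_seconds: float) -> float:
    """Task launches per second needed if every job runs once per period."""
    return num_jobs / period_seconds


def max_jobs_at_rate(rate_per_second: float, period_seconds: float) -> int:
    """How many periodic jobs fit under a given launch rate."""
    return int(rate_per_second * period_seconds)


NUM_JOBS = 1_000_000          # 1000 customers x 1000 cron jobs
DEFAULT_RATE_PER_SECOND = 40  # default scheduling rate quoted above

for minutes in (1, 2, 5):
    period = minutes * 60
    needed = required_launch_rate(NUM_JOBS, period)
    fits = max_jobs_at_rate(DEFAULT_RATE_PER_SECOND, period)
    print(
        f"every {minutes} min: need {needed:,.0f} launches/s "
        f"({needed / DEFAULT_RATE_PER_SECOND:,.0f}x the default); "
        f"default rate covers about {fits:,} jobs"
    )
```

Under these assumptions, 1M jobs on a one-minute period would need on the order of 16,667 launches per second, while the default rate covers roughly 2,400 such jobs. This only suggests the size of the gap; it does not replace load testing.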


From: Ziliang Chen <zlchen@gmail.com>
Sent: Saturday, June 11, 2016 17:15
To: user@aurora.apache.org
Subject: Re: Would you recommend Aurora?

Hi,

Great discussion here.
May I extend the question a little bit? I am wondering how Aurora scales: can
Aurora schedule millions of cron jobs (running periodically, say every 1, 2 or
5 minutes) or service jobs? Is there any documentation or performance benchmark
for Aurora I can refer to? I heard that Aurora can schedule several thousand
jobs per second. I have never tested that, so it would be good to confirm.

Thanks a lot!

On Thu, May 26, 2016 at 1:01 AM, Jillian Cocklin
<jillian.cock...@danalinc.com> wrote:
Thanks Brian & Maxim, those are great leads.  Awesome that Heron has gone open 
source!  Definitely glad to have learned more about Aurora – for the right 
situation it seems like a really great solution.

Thanks,
J.

From: Brian Hatfield [mailto:bhatfi...@twitter.com]
Sent: Wednesday, May 25, 2016 9:57 AM
To: user@aurora.apache.org
Subject: Re: Would you recommend Aurora?

I mentioned Heron yesterday in this thread - you might like to know that as of 
this morning, it's now open source: 
https://blog.twitter.com/2016/open-sourcing-twitter-heron


Re: Would you recommend Aurora?

2016-05-25 Thread Maxim Khutornenko
Hi Jillian,

You may still consider Aurora if you want a more complex (à la Heron)
orchestration around your batch processing workloads.

That said, there are plenty of alternatives for batch processing if you
feel that'll be too much to load:
http://mesos.apache.org/documentation/latest/frameworks/

There is also a young but promising framework specifically targeting large
batch job counts that you may want to explore:
https://github.com/twosigma/Cook.

On Wed, May 25, 2016 at 8:12 AM, Jillian Cocklin <
jillian.cock...@danalinc.com> wrote:

> Thanks Brian and Rick - that's what I was starting to think too.  I
> appreciate your input and the quick responses.
>
> Best,
> J.
>
> _
> From: r...@chartbeat.com
> Sent: Wednesday, May 25, 2016 4:47 AM
> Subject: Re: Would you recommend Aurora?
> To: <user@aurora.apache.org>
>
>
>
> Sounds to me like you want something like Spark or a traditional MapReduce
> framework.
>
> On May 24, 2016, at 9:36 PM, Brian Hatfield <bhatfi...@twitter.com> wrote:
>
> It seems like Aurora would not be the solution to your problem entirely.
>
> It sounds like you either want a stream processor with a way to stream in
> the chunked batch (see also: Storm or Heron (which runs on Aurora)
> <https://blog.twitter.com/2015/flying-faster-with-twitter-heron>), or a
> way to process batch jobs (see also: Hadoop, which can run on Mesos
> <https://github.com/mesos/hadoop> and possibly Aurora).
>
> I'm not sure which fits your use case better based upon your description,
> but I hope that this is at least a seed of information in the right
> direction.
>
> Brian
>
> On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin <
> jillian.cock...@danalinc.com> wrote:
>
>> I’m analyzing Aurora as a potential candidate for a new project.  While
>> the high-level architecture seems to be a good fit, I’m not seeing a lot of
>> documentation that matches our use case.
>>
>>  On an ongoing basis, we’ll receive batch files of records (~5 million
>> records per batch), and based on record types we need to “process” them
>> against our services.  We’d break up the records into small chunks,
>> instantiate a job for each chunk, and have each job be automatically queued
>> up to run on available resources (which can be auto scaled up/down as
>> needed).
>>
>>
>>
>> At first glance it looked like Aurora could create jobs, but I can’t
>> tell whether those can be made as templates so that they can be dynamically
>> instantiated, passed data, and run simultaneously.  Are there any best
>> practices or code examples for this?  Most of what I’ve found fits better
>> with the use case of having different static jobs (like cron jobs or IT
>> services) that each need to be run on a periodic basis or continue running
>> indefinitely.
>>
>>
>>
>> Can anyone let me know whether this is worth pursuing with Aurora?
>>
>>
>>
>> Thanks!
>>
>> J.
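
The chunk-per-job workflow described in the original question (split a batch into small chunks, create one unit of work per chunk, let the units queue up on available workers) can be sketched in plain Python. This is only an illustration of the pattern, not an Aurora API: a local thread pool stands in for the cluster, and `chunked`, `process_chunk`, and `process_batch` are hypothetical names. `process_chunk` is a placeholder for whatever "processing against our services" actually means.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Iterable, Iterator, List


def chunked(records: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_chunk(chunk: List) -> int:
    """Placeholder for the real per-chunk work; here it just counts records."""
    return len(chunk)


def process_batch(records: Iterable, chunk_size: int, max_workers: int = 4) -> List[int]:
    """Submit one unit of work per chunk; units queue until a worker is free."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves chunk order in the returned results.
        return list(pool.map(process_chunk, chunked(records, chunk_size)))
```

For example, `process_batch(range(10), 4)` would split ten records into chunks of 4, 4 and 2. Whether a cluster scheduler or a batch framework such as the ones suggested in the replies is the better home for the queueing part depends on chunk count and runtime.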