Re: Would you recommend Aurora?
Thank you, Erb!

On Sat, Jun 18, 2016 at 12:40 AM, Erb, Stephan <stephan@blue-yonder.com> wrote:
> 1) It is really hard to answer that question, especially given that there
> is a huge difference between a scheduled cron job and a running job. Your
> best bet is probably to do some load testing for your particular use case,
> and to evaluate other design choices as necessary.
>
> 2) The link I provided for the 40 tasks per second is actually a config
> option. So you could change this, if absolutely necessary.
Re: Would you recommend Aurora?
1) It is really hard to answer that question, especially given that there is a huge difference between a scheduled cron job and a running job. Your best bet is probably to do some load testing for your particular use case, and to evaluate other design choices as necessary.

2) The link I provided for the 40 tasks per second is actually a config option. So you could change this, if absolutely necessary.

From: Ziliang Chen <zlchen@gmail.com>
Reply-To: "user@aurora.apache.org" <user@aurora.apache.org>
Date: Monday 13 June 2016 at 03:56
To: "user@aurora.apache.org" <user@aurora.apache.org>
Subject: Re: Would you recommend Aurora?

Thanks Erb for the great details.

1) Assume we have 1000 customers, and each customer has 1000 periodic cron jobs. I would like to schedule the total of 1M jobs across a pool of machines. If Aurora can't take this load, is there another candidate you would suggest?

2) 40 tasks per second: is there a way to change the default by configuration instead of modifying the code?

Thank you very much!

On Mon, Jun 13, 2016 at 1:50 AM, Erb, Stephan <stephan@blue-yonder.com> wrote:

Could you clarify your cron use case? Millions of cron jobs that run as often as every minute sound more like you want a couple of long-running processes that do the actual work with a little sleep in between, rather than doing task spawning and distribution in Mesos & Aurora for each of them.

Regarding Aurora's scale: Twitter has recently disclosed that they have 250,000 containers/tasks running, with the largest cluster being in the range of 30,000 nodes [1]. By default, Aurora does not try to schedule more than 40 tasks per second [2]. You can probably try to adjust that value, but this could bring other downsides.

[1] https://youtu.be/FU7wrqsRj3o?t=21m11s
[2] https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java#L39-L41

From: Ziliang Chen <zlchen@gmail.com>
Sent: Saturday, June 11, 2016 17:15
To: user@aurora.apache.org
Subject: Re: Would you recommend Aurora?

Hi,

Great discussion here.

May I extend the question a little bit? I am wondering how Aurora scales: can Aurora schedule millions of cron/service jobs (the cron jobs run periodically, say every 1, 2 or 5 minutes)? Is there any documentation or performance benchmark for Aurora I can refer to? I heard that Aurora can schedule several thousand jobs per second. I have never tested that, but it would be good to confirm.

Thanks a lot!

On Thu, May 26, 2016 at 1:01 AM, Jillian Cocklin <jillian.cock...@danalinc.com> wrote:

Thanks Brian & Maxim, those are great leads. Awesome that Heron has gone open source! Definitely glad to have learned more about Aurora; for the right situation it seems like a really great solution.

Thanks,
J.

From: Brian Hatfield [mailto:bhatfi...@twitter.com]
Sent: Wednesday, May 25, 2016 9:57 AM
To: user@aurora.apache.org
Subject: Re: Would you recommend Aurora?

I mentioned Heron yesterday in this thread - you might like to know that as of this morning, it's now open source: https://blog.twitter.com/2016/open-sourcing-twitter-heron
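
For a rough sense of the numbers in this message, here is a minimal Python sketch. It is not Aurora code. The 40 tasks per second cap and the 1000 x 1000 cron jobs come from the thread; the helper names (`required_launch_rate`, `run_periodic`) and the assumption that every cron run costs exactly one task launch are purely illustrative. The second function sketches the "long-running process with a little sleep in between" idea: one process keeps a heap of due times and runs many periodic jobs itself instead of asking the scheduler to launch a task for each run.

```python
import heapq
import time
from typing import Callable, Sequence, Tuple

DEFAULT_RATE_PER_SEC = 40      # default scheduling cap cited in [2]
TOTAL_CRON_JOBS = 1000 * 1000  # 1000 customers x 1000 cron jobs (from the question)


def required_launch_rate(total_jobs: int, period_minutes: float) -> float:
    """Average launches per second if every job fires once per period.

    Assumption: each cron run needs exactly one task launch.
    """
    return total_jobs / (period_minutes * 60.0)


def run_periodic(
    jobs: Sequence[Tuple[float, Callable[[], None]]],
    run_for_seconds: float,
    now: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run (period_seconds, callback) pairs inside one long-running process.

    Hypothetical helper: sleeps until the next job is due, runs it, and
    re-queues it one period later. Returns the number of callback runs.
    """
    start = now()
    deadline = start + run_for_seconds
    # Heap of (next_due_time, job_index); every job is due immediately.
    due = [(start, index) for index in range(len(jobs))]
    heapq.heapify(due)
    runs = 0
    while due:
        next_due, index = due[0]
        if next_due >= deadline:
            break
        wait = next_due - now()
        if wait > 0:
            sleep(wait)  # the "little sleep in between"
        heapq.heappop(due)
        period, callback = jobs[index]
        callback()
        runs += 1
        heapq.heappush(due, (next_due + period, index))
    return runs


# Compare the launch rate needed for the cron periods mentioned in the thread
# with the default cap.
for period in (1, 2, 5):
    needed = required_launch_rate(TOTAL_CRON_JOBS, period)
    factor = needed / DEFAULT_RATE_PER_SEC
    print(f"every {period} min: ~{needed:,.0f} launches/s, ~{factor:,.0f}x the default cap")
```

Under these assumptions, 1M jobs firing every minute would need roughly 16,667 launches per second, several hundred times the default cap, which is why a small number of long-running workers looks more attractive than one scheduled task per run. Load testing is still the only way to get real numbers.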
Re: Would you recommend Aurora?
Hi Jillian,

You may still consider Aurora if you want a more complex (à la Heron) orchestration around your batch processing workloads.

That said, there are plenty of alternatives for batch processing if you feel that'll be too much to take on:
http://mesos.apache.org/documentation/latest/frameworks/

There is also a young but promising framework specifically targeting large batch job counts that you may want to explore:
https://github.com/twosigma/Cook

On Wed, May 25, 2016 at 8:12 AM, Jillian Cocklin <jillian.cock...@danalinc.com> wrote:
> Thanks Brian and Rick - that's what I was starting to think too. I
> appreciate your input and the quick responses.
>
> Best,
> J.
>
> From: r...@chartbeat.com
> Sent: Wednesday, May 25, 2016 4:47 AM
> Subject: Re: Would you recommend Aurora?
> To: <user@aurora.apache.org>
>
> Sounds to me like you want something like Spark or a traditional
> MapReduce framework.
>
> On May 24, 2016, at 9:36 PM, Brian Hatfield <bhatfi...@twitter.com> wrote:
>
> It seems like Aurora would not be the solution to your problem entirely.
>
> It sounds like you either want a stream processor with a way to stream in
> the chunked batch (see also: Storm or Heron (which runs on Aurora)
> <https://blog.twitter.com/2015/flying-faster-with-twitter-heron>), or a
> way to process batch jobs (see also: Hadoop, which can run on Mesos
> <https://github.com/mesos/hadoop> and possibly Aurora).
>
> I'm not sure which fits your use case better based upon your description,
> but I hope that this is at least a seed of information in the right
> direction.
>
> Brian
>
> On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin <jillian.cock...@danalinc.com> wrote:
>
>> I’m analyzing Aurora as a potential candidate for a new project. While
>> the high-level architecture seems to be a good fit, I’m not seeing a lot of
>> documentation that matches our use case.
>>
>> On an ongoing basis, we’ll receive batch files of records (~5 million
>> records per batch), and based on record types we need to “process” them
>> against our services. We’d break up the records into small chunks,
>> instantiate a job for each chunk, and have each job be automatically queued
>> up to run on available resources (which can be auto scaled up/down as
>> needed).
>>
>> At first glance it looked like Aurora could create jobs - but I can’t
>> tell whether those can be made as templates so that they can be dynamically
>> instantiated, passed data, and run simultaneously. Are there any best
>> practices or code examples for this? Most of what I’ve found fits better
>> with the use case of having different static jobs (like cron jobs or IT
>> services) that each need to be run on a periodic basis or continue running
>> indefinitely.
>>
>> Can anyone let me know whether this is worth pursuing with Aurora?
>>
>> Thanks!
>>
>> J.
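
To make the "one job per chunk" idea in the original question concrete, here is a minimal Python sketch of the chunking step only. It does not use any Aurora API: the `job_specs` helper, the job naming scheme, and the chunk size of 10,000 records are all hypothetical. Only the ~5 million records per batch comes from the thread. Each generated spec is the kind of parameter set (name, offset, count) that a templated job would need to be passed, whichever framework ends up running it.

```python
from typing import Dict, Iterator, Union

JobSpec = Dict[str, Union[str, int]]


def job_specs(batch_id: str, total_records: int, chunk_size: int) -> Iterator[JobSpec]:
    """Yield one parameter set per chunk of a batch.

    Hypothetical helper: "name", "offset" and "count" are illustrative
    field names, not part of any scheduler's job schema.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for index, offset in enumerate(range(0, total_records, chunk_size)):
        yield {
            "name": f"{batch_id}-chunk-{index:05d}",
            "offset": offset,
            # The last chunk may be smaller than chunk_size.
            "count": min(chunk_size, total_records - offset),
        }


# ~5 million records per batch (from the thread), 10,000 per chunk (assumed).
specs = list(job_specs("batch-1", 5_000_000, 10_000))
print(len(specs), "jobs; first:", specs[0], "last:", specs[-1])
```

With these assumed sizes a batch turns into 500 jobs, so the chunk size directly controls how many tasks the scheduler has to place per batch. That trade-off is worth measuring before committing to a framework.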