Thanks, Erb, for the great details.

1) Assume we have 1000 customers, and each customer has 1000 periodic cron jobs. I would like to schedule all 1M jobs across a pool of machines. If Aurora can't handle this load, are there any suggestions or alternative candidates?

2) Regarding the limit of 40 tasks per second: is there a way to change the default through configuration instead of modifying the code?
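To put rough numbers on point 1, here is a back-of-the-envelope sketch in Python. It makes two simplifying assumptions that are not confirmed anywhere in this thread: every job fires exactly once per interval (using the 1, 2 and 5 minute intervals mentioned earlier), and the 40 tasks per second figure acts as a hard ceiling on task launches. The function names are made up for illustration.

```python
# Back-of-the-envelope estimate; all inputs are figures quoted in this thread.
CUSTOMERS = 1000
JOBS_PER_CUSTOMER = 1000
SCHEDULE_LIMIT_PER_SEC = 40  # assumption: treated here as a hard cap on launches


def required_launch_rate(total_jobs: int, interval_sec: int) -> float:
    """Average launches per second if every job fires once per interval."""
    return total_jobs / interval_sec


def drain_time_sec(total_jobs: int, rate_per_sec: float) -> float:
    """Seconds needed to launch every job once at a fixed rate."""
    return total_jobs / rate_per_sec


total_jobs = CUSTOMERS * JOBS_PER_CUSTOMER

# Launch rate needed for each cron interval mentioned in the thread.
for minutes in (1, 2, 5):
    needed = required_launch_rate(total_jobs, minutes * 60)
    print(f"every {minutes} min: about {needed:,.0f} launches/s needed")

# Time to launch each job a single time at the assumed cap.
hours = drain_time_sec(total_jobs, SCHEDULE_LIMIT_PER_SEC) / 3600
print(f"one pass at {SCHEDULE_LIMIT_PER_SEC}/s: about {hours:.1f} hours")
```

Under these assumptions, a 1-minute interval would need roughly 16,667 launches per second, while a single pass over all 1M jobs at 40 per second would take about 6.9 hours, so the gap is several orders of magnitude.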
Thank you very much!

On Mon, Jun 13, 2016 at 1:50 AM, Erb, Stephan <stephan....@blue-yonder.com> wrote:

> Could you clarify your cron usecase? Millions of cron jobs that run up to
> every minute sounds more like you want a couple of long running processes
> that do the actual work with a little sleep in between, rather than doing
> task spawning and distribution in Mesos & Aurora for each of them.
>
> Regarding Aurora's scale: Twitter has recently disclosed that they have
> 250,000 containers/tasks running, with the largest cluster being in the
> range of 30,000 nodes [1]. Aurora is by default not trying to schedule
> more than 40 tasks per second [2]. You can probably try to adjust that
> value, but this could bring other downsides.
>
> [1] https://youtu.be/FU7wrqsRj3o?t=21m11s
> [2]
> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java#L39-L41
>
> ------------------------------
> *From:* Ziliang Chen <zlchen....@gmail.com>
> *Sent:* Saturday, June 11, 2016 17:15
> *To:* user@aurora.apache.org
> *Subject:* Re: Would you recommend Aurora?
>
> Hi,
>
> Great discussion here.
> May I extend the question a little bit ? I am wondering how Aurora scales:
> can Aurora schedule millions of cron (for cron, the jobs run periodically
> say every 1, 2 or 5 minutes) /service jobs ? Is there any
> documentation/perf benchmark for Aurora i can refer to ? I heard that
> Aurora can schedule several thousands jobs per second. Never tested that,
> but good to confirm.
>
> Thanks a lot !
>
> On Thu, May 26, 2016 at 1:01 AM, Jillian Cocklin <
> jillian.cock...@danalinc.com> wrote:
>
>> Thanks Brian & Maxim, those are great leads. Awesome that Heron has gone
>> open source! Definitely glad to have learned more about Aurora – for the
>> right situation it seems like a really great solution.
>>
>> Thanks,
>>
>> J.
>>
>> *From:* Brian Hatfield [mailto:bhatfi...@twitter.com]
>> *Sent:* Wednesday, May 25, 2016 9:57 AM
>> *To:* user@aurora.apache.org
>> *Subject:* Re: Would you recommend Aurora?
>>
>> I mentioned Heron yesterday in this thread - you might like to know that
>> as of this morning, it's now open source:
>> https://blog.twitter.com/2016/open-sourcing-twitter-heron
>>
>> On Wed, May 25, 2016 at 12:22 PM, Maxim Khutornenko <ma...@apache.org>
>> wrote:
>>
>> Hi Jillian,
>>
>> You may still consider Aurora if you want a more complex (ala
>> Heron-style) orchestration around your batch processing workloads.
>>
>> That said, there are plenty of alternatives for batch processing if you
>> feel that'll be too much to load:
>> http://mesos.apache.org/documentation/latest/frameworks/
>>
>> There is also a young but promising framework specifically targeting
>> large batch job counts that you may want to explore:
>> https://github.com/twosigma/Cook.
>>
>> On Wed, May 25, 2016 at 8:12 AM, Jillian Cocklin <
>> jillian.cock...@danalinc.com> wrote:
>>
>> Thanks Brian and Rick - that's what I was starting to think too. I
>> appreciate your input and the quick responses.
>>
>> Best,
>>
>> J.
>>
>> Get Outlook for iOS <https://aka.ms/o0ukef>
>>
>> _____________________________
>> From: r...@chartbeat.com
>> Sent: Wednesday, May 25, 2016 4:47 AM
>> Subject: Re: Would you recommend Aurora?
>> To: <user@aurora.apache.org>
>>
>> Sounds to me like you want something like spark or a traditional map
>> reduce framework.
>>
>> On May 24, 2016, at 9:36 PM, Brian Hatfield <bhatfi...@twitter.com>
>> wrote:
>>
>> It seems like Aurora would not be the solution to your problem entirely.
>>
>> It sounds like you either want a stream processor with a way to stream in
>> the chunked batch (see also: Storm or Heron (which runs on Aurora)
>> <https://blog.twitter.com/2015/flying-faster-with-twitter-heron>), or a
>> way to process batch jobs (see also: Hadoop, which can run on Mesos
>> <https://github.com/mesos/hadoop> and possibly Aurora).
>>
>> I'm not sure which fits your use case better based upon your description,
>> but I hope that this is at least a seed of information in the right
>> direction.
>>
>> Brian
>>
>> On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin <
>> jillian.cock...@danalinc.com> wrote:
>>
>> I’m analyzing Aurora as a potential candidate for a new project. While
>> the high-level architecture seems to be a good fit, I’m not seeing a lot of
>> documentation that matches our use case.
>>
>> On an ongoing basis, we’ll receive batch files of records (~5 million
>> records per batch), and based on record types we need to “process” them
>> against our services. We’d break up the records into small chunks,
>> instantiate a job for each chunk, and have each job be automatically queued
>> up to run on available resources (which can be auto scaled up/down as
>> needed).
>>
>> At first glance it looked like Aurora could create jobs - but I can’t
>> tell whether those can be made as templates so that they can be dynamically
>> instantiated, passed data, and run simultaneously. Are there any best
>> practices or code examples for this? Most of what I’ve found fits better
>> with the use case of having different static jobs (like chron jobs or IT
>> services) that each need to be run on a periodic basis or continue running
>> indefinitely.
>>
>> Can anyone let me know whether this is worth pursuing with Aurora?
>>
>> Thanks!
>>
>> J.
>
> --
> Regards, Zi-Liang
>
> Mail:zlchen....@gmail.com

--
Regards, Zi-Liang
Mail:zlchen....@gmail.com