Re: Would you recommend Aurora?

Ziliang Chen Sun, 12 Jun 2016 18:57:35 -0700

Thanks Erb for the great details.
1) Assume we have 1000 customers, and each customer has 1000 periodic
cron jobs. I would like to schedule the total of 1M jobs across a pool of
machines. If Aurora can't take this load, are there any suggestions or
other candidates?
2) 40 tasks per second: is there a way to change the default through
configuration instead of modifying the code?
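
For a rough sense of the numbers in question 1, here is a back-of-the-envelope
sketch in Python. It assumes every one of the 1M jobs fires once per period
(1, 2 or 5 minutes, the periods mentioned earlier in this thread) and treats
the 40 tasks per second default as a hard ceiling. Both are simplifying
assumptions, and the function name is only illustrative.

```python
# Back-of-the-envelope load estimate. The run periods are assumptions taken
# from the 1, 2 and 5 minute examples mentioned earlier in this thread.
CUSTOMERS = 1000
JOBS_PER_CUSTOMER = 1000
DEFAULT_TASKS_PER_SECOND = 40  # Aurora default quoted in this thread


def required_launch_rate(total_jobs: int, period_seconds: int) -> float:
    """Average task launches per second if every job fires once per period."""
    return total_jobs / period_seconds


total_jobs = CUSTOMERS * JOBS_PER_CUSTOMER
for minutes in (1, 2, 5):
    rate = required_launch_rate(total_jobs, minutes * 60)
    factor = rate / DEFAULT_TASKS_PER_SECOND
    print(f"every {minutes} min: {rate:,.0f} launches/s, {factor:,.0f}x the default")
```

Under these assumptions, even the 5 minute period needs thousands of launches
per second on average, far above the default of 40.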


Thank you very much!

On Mon, Jun 13, 2016 at 1:50 AM, Erb, Stephan <[email protected]>
wrote:

> Could you clarify your cron use case? Millions of cron jobs that run as
> often as every minute sounds more like you want a couple of long-running
> processes that do the actual work with a little sleep in between, rather
> than spawning and distributing a task in Mesos & Aurora for each of them.
>
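
A minimal Python sketch of the long-running-process idea described above,
assuming each cron job can be expressed as a plain callable. The names
run_worker, jobs and period_seconds are hypothetical and not part of Aurora
or Mesos. A real worker would loop forever rather than for a fixed number of
iterations; the bound is only there to keep the sketch easy to try out.

```python
import time
from typing import Callable, Dict


def run_worker(jobs: Dict[str, Callable[[], None]],
               period_seconds: float,
               iterations: int,
               sleep: Callable[[float], None] = time.sleep) -> int:
    """Run every job once per period; return the number of successful runs."""
    successful_runs = 0
    for _ in range(iterations):
        started = time.monotonic()
        for name, job in jobs.items():
            try:
                job()
                successful_runs += 1
            except Exception as exc:
                # One failing job must not stop the others in this round.
                print(f"job {name} failed: {exc}")
        # Sleep only for whatever is left of the period after the work.
        elapsed = time.monotonic() - started
        sleep(max(0.0, period_seconds - elapsed))
    return successful_runs


if __name__ == "__main__":
    # Hypothetical job: stands in for one customer's periodic task.
    run_worker({"demo": lambda: print("tick")}, period_seconds=1.0, iterations=3)
```

A handful of such workers, each launched as an ordinary long-running service
task, would keep the number of scheduled tasks small no matter how many
periodic jobs each worker handles internally.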
>
> Regarding Aurora's scale: Twitter has recently disclosed that they have
> 250,000 containers/tasks running, with the largest cluster being in the
> range of 30,000 nodes [1]. By default, Aurora does not try to schedule
> more than 40 tasks per second [2]. You can probably adjust that value,
> but doing so could bring other downsides.
>
> 
>
> [1] https://youtu.be/FU7wrqsRj3o?t=21m11s
>
> [2]
> https://github.com/apache/aurora/blob/master/src/main/java/org/apache/aurora/scheduler/scheduling/SchedulingModule.java#L39-L41
>
> ------------------------------
> *From:* Ziliang Chen <[email protected]>
> *Sent:* Saturday, June 11, 2016 17:15
>
> *To:* [email protected]
> *Subject:* Re: Would you recommend Aurora?
>
> Hi,
>
> Great discussion here.
> May I extend the question a little bit? I am wondering how Aurora scales:
> can Aurora schedule millions of cron jobs (jobs that run periodically,
> say every 1, 2 or 5 minutes) or service jobs? Is there any documentation
> or performance benchmark for Aurora I can refer to? I heard that Aurora
> can schedule several thousand jobs per second. I have never tested that,
> so it would be good to confirm.
>
> Thanks a lot !
>
> On Thu, May 26, 2016 at 1:01 AM, Jillian Cocklin <
> [email protected]> wrote:
>
>> Thanks Brian & Maxim, those are great leads.  Awesome that Heron has gone
>> open source!  Definitely glad to have learned more about Aurora – for the
>> right situation it seems like a really great solution.
>>
>>
>>
>> Thanks,
>>
>> J.
>>
>>
>>
>> *From:* Brian Hatfield [mailto:[email protected]]
>> *Sent:* Wednesday, May 25, 2016 9:57 AM
>> *To:* [email protected]
>>
>> *Subject:* Re: Would you recommend Aurora?
>>
>>
>>
>> I mentioned Heron yesterday in this thread - you might like to know that
>> as of this morning, it's now open source:
>> https://blog.twitter.com/2016/open-sourcing-twitter-heron
>>
>>
>>
>> On Wed, May 25, 2016 at 12:22 PM, Maxim Khutornenko <[email protected]>
>> wrote:
>>
>> Hi Jillian,
>>
>>
>>
>> You may still consider Aurora if you want more complex (à la Heron)
>> orchestration around your batch processing workloads.
>>
>>
>>
>> That said, there are plenty of alternatives for batch processing if you
>> feel that'll be too much to load:
>> http://mesos.apache.org/documentation/latest/frameworks/
>>
>>
>>
>> There is also a young but promising framework specifically targeting
>> large batch job counts that you may want to explore:
>> https://github.com/twosigma/Cook.
>>
>>
>>
>> On Wed, May 25, 2016 at 8:12 AM, Jillian Cocklin <
>> [email protected]> wrote:
>>
>> Thanks Brian and Rick - that's what I was starting to think too.  I
>> appreciate your input and the quick responses.
>>
>>
>>
>> Best,
>>
>> J.
>>
>>
>>
>>
>> _____________________________
>> From: [email protected]
>> Sent: Wednesday, May 25, 2016 4:47 AM
>> Subject: Re: Would you recommend Aurora?
>> To: <[email protected]>
>>
>>
>>
>> Sounds to me like you want something like Spark or a traditional
>> MapReduce framework.
>>
>>
>> On May 24, 2016, at 9:36 PM, Brian Hatfield <[email protected]>
>> wrote:
>>
>> It seems like Aurora would not be the solution to your problem entirely.
>>
>>
>>
>> It sounds like you either want a stream processor with a way to stream in
>> the chunked batch (see also: Storm or Heron (which runs on Aurora)
>> <https://blog.twitter.com/2015/flying-faster-with-twitter-heron>), or a
>> way to process batch jobs (see also: Hadoop, which can run on Mesos
>> <https://github.com/mesos/hadoop> and possibly Aurora).
>>
>>
>>
>> I'm not sure which fits your use case better based upon your description,
>> but I hope that this is at least a seed of information in the right
>> direction.
>>
>>
>>
>> Brian
>>
>>
>>
>> On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin <
>> [email protected]> wrote:
>>
>> I’m analyzing Aurora as a potential candidate for a new project.  While
>> the high-level architecture seems to be a good fit, I’m not seeing a lot of
>> documentation that matches our use case.
>>
>> On an ongoing basis, we’ll receive batch files of records (~5 million
>> records per batch), and based on record types we need to “process” them
>> against our services.  We’d break up the records into small chunks,
>> instantiate a job for each chunk, and have each job be automatically queued
>> up to run on available resources (which can be auto scaled up/down as
>> needed).
>>
>>
>>
>> At first glance it looked like Aurora could create jobs, but I can’t
>> tell whether those can be made as templates so that they can be dynamically
>> instantiated, passed data, and run simultaneously.  Are there any best
>> practices or code examples for this?  Most of what I’ve found fits better
>> with the use case of having different static jobs (like cron jobs or IT
>> services) that each need to be run on a periodic basis or continue running
>> indefinitely.
>>
>>
>>
>> Can anyone let me know whether this is worth pursuing with Aurora?
>>
>>
>>
>> Thanks!
>>
>> J.
>>
>
>
>
> --
> Regards, Zi-Liang
>
> Mail:[email protected]
>



-- 
Regards, Zi-Liang

Mail:[email protected]
