I mentioned Heron yesterday in this thread - you might like to know that as
of this morning, it's now open source:
https://blog.twitter.com/2016/open-sourcing-twitter-heron

On Wed, May 25, 2016 at 12:22 PM, Maxim Khutornenko <[email protected]>
wrote:

> Hi Jillian,
>
> You may still consider Aurora if you want more complex (à la Heron)
> orchestration around your batch processing workloads.
>
> That said, there are plenty of alternatives for batch processing if you
> feel that'll be too much to load:
> http://mesos.apache.org/documentation/latest/frameworks/
>
> There is also a young but promising framework specifically targeting large
> batch job counts that you may want to explore:
> https://github.com/twosigma/Cook.
>
> On Wed, May 25, 2016 at 8:12 AM, Jillian Cocklin <
> [email protected]> wrote:
>
>> Thanks Brian and Rick - that's what I was starting to think too.  I
>> appreciate your input and the quick responses.
>>
>> Best,
>> J.
>>
>> _____________________________
>> From: [email protected]
>> Sent: Wednesday, May 25, 2016 4:47 AM
>> Subject: Re: Would you recommend Aurora?
>> To: <[email protected]>
>>
>>
>>
>> Sounds to me like you want something like Spark or a traditional
>> MapReduce framework.
>>
>> On May 24, 2016, at 9:36 PM, Brian Hatfield <[email protected]>
>> wrote:
>>
>> It seems like Aurora would not be the solution to your problem entirely.
>>
>> It sounds like you either want a stream processor with a way to stream in
>> the chunked batch (see also: Storm or Heron (which runs on Aurora)
>> <https://blog.twitter.com/2015/flying-faster-with-twitter-heron>), or a
>> way to process batch jobs (see also: Hadoop, which can run on Mesos
>> <https://github.com/mesos/hadoop> and possibly Aurora).
>>
>> I'm not sure which fits your use case better based upon your description,
>> but I hope that this is at least a seed of information in the right
>> direction.
>>
>> Brian
>>
>> On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin <
>> [email protected]> wrote:
>>
>>> I’m analyzing Aurora as a potential candidate for a new project.  While
>>> the high-level architecture seems to be a good fit, I’m not seeing a lot of
>>> documentation that matches our use case.
>>>
>>> On an ongoing basis, we’ll receive batch files of records (~5 million
>>> records per batch), and based on record types we need to “process” them
>>> against our services.  We’d break up the records into small chunks,
>>> instantiate a job for each chunk, and have each job be automatically queued
>>> up to run on available resources (which can be auto scaled up/down as
>>> needed).
>>>
>>> At first glance it looked like Aurora could create jobs, but I can’t
>>> tell whether those can be made as templates so that they can be dynamically
>>> instantiated, passed data, and run simultaneously.  Are there any best
>>> practices or code examples for this?  Most of what I’ve found fits better
>>> with the use case of having different static jobs (like cron jobs or IT
>>> services) that each need to be run on a periodic basis or continue running
>>> indefinitely.
>>>
>>> Can anyone let me know whether this is worth pursuing with Aurora?
>>>
>>> Thanks!
>>>
>>> J.
>
