Sounds to me like you want something like Spark or a traditional MapReduce
framework.
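
For the chunked-batch part specifically, here is a minimal PySpark sketch
(the file name, the "type,payload" record layout, and process_record are all
made up for illustration) showing a batch file split across partitions and
processed in parallel:

    # Minimal sketch, not a drop-in implementation: assumes each line of
    # the batch file is "record_type,payload", and process_record stands
    # in for a real call against one of your services.
    from pyspark.sql import SparkSession

    def process_record(line):
        record_type, payload = line.split(",", 1)
        # Dispatch on record_type here; returning a pair lets us count
        # processed records per type below.
        return (record_type, len(payload))

    spark = SparkSession.builder.appName("batch-chunks").getOrCreate()
    # minPartitions plays the role of the "small chunks": each partition
    # is scheduled onto whatever executors are available.
    records = spark.sparkContext.textFile("batch.csv", minPartitions=100)
    counts = records.map(process_record).countByKey()
    print(dict(counts))
    spark.stop()

(There is also a sketch of Aurora's config templating below the quoted
thread, since that was the other half of the question.)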

> On May 24, 2016, at 9:36 PM, Brian Hatfield <[email protected]> wrote:
> 
> It seems like Aurora on its own would not entirely solve your problem.
> 
> It sounds like you either want a stream processor with a way to stream in the 
> chunked batches (see also: Storm, or Heron, which runs on Aurora), or a way to 
> process batch jobs (see also: Hadoop, which can run on Mesos and possibly 
> Aurora). 
> 
> I'm not sure which fits your use case better based on your description, but 
> I hope this is at least a seed of information in the right direction.
> 
> Brian
> 
>> On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin 
>> <[email protected]> wrote:
>> I’m analyzing Aurora as a potential candidate for a new project.  While the 
>> high-level architecture seems to be a good fit, I’m not seeing a lot of 
>> documentation that matches our use case. 
>> 
>> On an ongoing basis, we’ll receive batch files of records (~5 million 
>> records per batch), and based on record types we need to “process” them 
>> against our services.  We’d break the records up into small chunks, 
>> instantiate a job for each chunk, and have each job automatically queued 
>> to run on available resources (which can be auto-scaled up/down as 
>> needed). 
>> 
>> At first glance it looked like Aurora could create jobs, but I can’t tell 
>> whether those can be made into templates so that they can be dynamically 
>> instantiated, passed data, and run simultaneously.  Are there any best 
>> practices or code examples for this?  Most of what I’ve found fits better 
>> with the use case of different static jobs (like cron jobs or IT 
>> services) that each need to run on a periodic basis or continue running 
>> indefinitely.
>> 
>> Can anyone let me know whether this is worth pursuing with Aurora?
>> 
>> Thanks!
>> 
>> J.
>> 
> 
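
On the templating question itself: Aurora configs are plain Python, and the
client supports mustache-style {{...}} variables that can be filled in with
--bind at job-creation time, so one template file could stamp out a job per
chunk. A rough sketch (the cluster/role names, the command line, and the
chunk_id variable are all hypothetical):

    # chunk.aurora: one parameterized config; {{chunk_id}} is bound per
    # invocation, so many chunk jobs can run from this single template.
    process_chunk = Process(
      name = 'process_chunk',
      cmdline = 'python process.py --chunk={{chunk_id}}'
    )

    task = Task(
      processes = [process_chunk],
      resources = Resources(cpu = 1.0, ram = 512*MB, disk = 512*MB)
    )

    jobs = [Job(
      cluster = 'devcluster',
      role = 'batch',
      environment = 'prod',
      name = 'chunk_{{chunk_id}}',
      task = task
    )]

and then, per chunk, something like:

    aurora job create devcluster/batch/prod/chunk_42 chunk.aurora --bind chunk_id=42

Tasks that cannot be placed immediately should sit in PENDING until capacity
frees up, which may give you some of the queuing behavior you describe.
Still, as noted at the top, a batch framework may be the better fit at
5 million records per file.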
