Thanks Brian and Rick - that's what I was starting to think too. I appreciate your input and the quick responses.
Best,
J.

_____________________________
From: [email protected]
Sent: Wednesday, May 25, 2016 4:47 AM
Subject: Re: Would you recommend Aurora?
To: <[email protected]>

Sounds to me like you want something like Spark or a traditional MapReduce framework.

On May 24, 2016, at 9:36 PM, Brian Hatfield <[email protected]> wrote:

It seems like Aurora would not be the entire solution to your problem. It sounds like you want either a stream processor with a way to stream in the chunked batch (see also: Storm, or Heron (which runs on Aurora)<https://blog.twitter.com/2015/flying-faster-with-twitter-heron>), or a way to process batch jobs (see also: Hadoop, which can run on Mesos<https://github.com/mesos/hadoop> and possibly Aurora).

I'm not sure which fits your use case better based upon your description, but I hope that this is at least a seed of information in the right direction.

Brian

On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin <[email protected]> wrote:

I'm analyzing Aurora as a potential candidate for a new project. While the high-level architecture seems to be a good fit, I'm not seeing a lot of documentation that matches our use case.

On an ongoing basis, we'll receive batch files of records (~5 million records per batch), and based on record types we need to "process" them against our services. We'd break up the records into small chunks, instantiate a job for each chunk, and have each job be automatically queued up to run on available resources (which can be auto-scaled up/down as needed).

At first glance it looked like Aurora could create jobs, but I can't tell whether those can be made as templates so that they can be dynamically instantiated, passed data, and run simultaneously. Are there any best practices or code examples for this?
Most of what I've found fits better with the use case of having different static jobs (like cron jobs or IT services) that each need to be run on a periodic basis or continue running indefinitely. Can anyone let me know whether this is worth pursuing with Aurora?

Thanks!
J.
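
For illustration, the chunk-and-dispatch pattern described in the original question (split a batch into small chunks, run one job per chunk on whatever workers are available) might be sketched roughly as below. This is a minimal sketch and is not Aurora-specific: it uses no Aurora, Mesos, Spark, or Hadoop API, and a local thread pool only stands in for a cluster scheduler. The names `chunked`, `process_chunk`, and `run_batch`, as well as the chunk size and worker count, are hypothetical and chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import islice
from typing import Callable, Iterable, Iterator, List


def chunked(records: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` records."""
    it = iter(records)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def process_chunk(chunk: List) -> int:
    """Hypothetical per-chunk job; a stand-in for calls to the real services.

    Here it only reports how many records it was given.
    """
    return len(chunk)


def run_batch(
    records: Iterable,
    chunk_size: int = 1000,
    max_workers: int = 4,
    job: Callable[[List], int] = process_chunk,
) -> List[int]:
    """Split `records` into chunks and run `job` on each chunk concurrently.

    Results come back in chunk order. The thread pool is an assumption made
    for the sketch; in a real deployment each chunk would be handed to a
    scheduler or batch framework instead.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(job, chunked(records, chunk_size)))


if __name__ == "__main__":
    batch = ({"id": i} for i in range(2500))
    print(run_batch(batch, chunk_size=1000))
```

With 2,500 records and a chunk size of 1,000, this produces three jobs of 1,000, 1,000, and 500 records. Whether each chunk should become a scheduler job, a Spark or Hadoop task, or a message on a stream is the design question the replies above address.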
