It seems like Aurora alone would not entirely solve your problem. It sounds like you want either a stream processor that can ingest the chunked batches (see: Storm, or Heron, which runs on Aurora <https://blog.twitter.com/2015/flying-faster-with-twitter-heron>), or a batch-processing framework (see: Hadoop, which can run on Mesos <https://github.com/mesos/hadoop> and possibly on Aurora).
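As for the question of dynamically instantiating templated jobs: Aurora config files are plain Python (built on the pystachio library), so a job definition can carry mustache-style {{...}} variables that are bound at submission time with the client's --bind flag. A minimal sketch from memory, where the processing script, cluster/role names, paths, and resource sizes are all made-up placeholders:

    # chunk_processor.aurora -- a rough sketch, not a tested config.

    process = Process(
      name = 'process_chunk',
      # {{batch_id}} and {{chunk_id}} are left unbound here; they are
      # filled in per invocation via --bind when the job is created.
      cmdline = 'python process_records.py --batch {{batch_id}} --chunk {{chunk_id}}'
    )

    task = Task(
      name = 'process_chunk',
      processes = [process],
      resources = Resources(cpu = 1.0, ram = 512*MB, disk = 1*GB)
    )

    jobs = [
      Job(
        cluster = 'devcluster',       # placeholder cluster name
        role = 'batch',               # placeholder role
        environment = 'prod',
        name = 'chunk_{{chunk_id}}',  # unique job key per chunk
        task = task
      )
    ]

    # Each chunk then becomes its own job, created with something like:
    #   aurora job create devcluster/batch/prod/chunk_0042 \
    #       chunk_processor.aurora \
    #       --bind batch_id=2016-05-24 --bind chunk_id=0042

If the chunks are simply numbered, it may be cleaner to create one Job with instances = N and reference the built-in {{mesos.instance}} variable in the cmdline so each instance picks up its own chunk; the scheduler then fans the N tasks out across whatever resources are available.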
Between streaming and batch, I'm not sure which fits your use case better based on your description, but I hope this is at least a seed of information in the right direction.

Brian

On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin <[email protected]> wrote:

> I’m analyzing Aurora as a potential candidate for a new project. While the
> high-level architecture seems to be a good fit, I’m not seeing a lot of
> documentation that matches our use case.
>
> On an ongoing basis, we’ll receive batch files of records (~5 million
> records per batch), and based on record types we need to “process” them
> against our services. We’d break up the records into small chunks,
> instantiate a job for each chunk, and have each job be automatically
> queued up to run on available resources (which can be auto-scaled up/down
> as needed).
>
> At first glance it looked like Aurora could create jobs - but I can’t tell
> whether those can be made as templates so that they can be dynamically
> instantiated, passed data, and run simultaneously. Are there any best
> practices or code examples for this? Most of what I’ve found fits better
> with the use case of having different static jobs (like cron jobs or IT
> services) that each need to be run on a periodic basis or continue running
> indefinitely.
>
> Can anyone let me know whether this is worth pursuing with Aurora?
>
> Thanks!
>
> J.
