Sounds to me like you want something like Spark or a traditional MapReduce framework.
> On May 24, 2016, at 9:36 PM, Brian Hatfield <[email protected]> wrote:
>
> It seems like Aurora would not be the solution to your problem entirely.
>
> It sounds like you either want a stream processor with a way to stream in the
> chunked batch (see also: Storm or Heron (which runs on Aurora)), or a way to
> process batch jobs (see also: Hadoop, which can run on Mesos and possibly
> Aurora).
>
> I'm not sure which fits your use case better based upon your description, but
> I hope that this is at least a seed of information in the right direction.
>
> Brian
>
>> On Tue, May 24, 2016 at 9:14 PM, Jillian Cocklin
>> <[email protected]> wrote:
>>
>> I’m analyzing Aurora as a potential candidate for a new project. While the
>> high-level architecture seems to be a good fit, I’m not seeing a lot of
>> documentation that matches our use case.
>>
>> On an ongoing basis, we’ll receive batch files of records (~5 million
>> records per batch), and based on record types we need to “process” them
>> against our services. We’d break the records up into small chunks,
>> instantiate a job for each chunk, and have each job automatically queued
>> to run on available resources (which can be auto-scaled up/down as needed).
>>
>> At first glance it looked like Aurora could create jobs - but I can’t tell
>> whether those can be made as templates so that they can be dynamically
>> instantiated, passed data, and run simultaneously. Are there any best
>> practices or code examples for this? Most of what I’ve found fits better
>> with the use case of having different static jobs (like cron jobs or IT
>> services) that each need to run on a periodic basis or continue running
>> indefinitely.
>>
>> Can anyone let me know whether this is worth pursuing with Aurora?
>>
>> Thanks!
>>
>> J.
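For what it's worth, Aurora job configurations are Python-based and support template variables that can be bound at job-creation time, which may address the "dynamically instantiated, passed data" part of the question. A minimal sketch of a parameterized config, assuming a hypothetical processing script and chunk identifier (the file name, script name, resource sizes, and cluster/role/environment values are all placeholders, not from this thread):

```python
# chunk_processor.aurora -- hypothetical sketch of a parameterized Aurora job.
# Process, Task, Job, Resources, and MB are provided by Aurora's config DSL
# (pystachio); {{chunk_id}} is a template variable bound at submit time.
process = Process(
    name = 'process_chunk',
    # Run the (hypothetical) record processor against one chunk of the batch.
    cmdline = 'python process_records.py --chunk {{chunk_id}}')

task = Task(
    processes = [process],
    resources = Resources(cpu = 1, ram = 256*MB, disk = 128*MB))

jobs = [Job(
    cluster = 'devcluster',
    role = 'batch',
    environment = 'prod',
    name = 'chunk_{{chunk_id}}',  # one uniquely named job per chunk
    task = task,
    instances = 1)]
```

One job per chunk could then be created with something like `aurora job create devcluster/batch/prod/chunk_42 chunk_processor.aurora --bind chunk_id=42`. Whether launching thousands of short-lived jobs per batch this way is a good fit, versus handing the batch to Spark/Hadoop/Storm as suggested above, is exactly the open question in the thread.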
