I’m really not a fan of Airflow. I’d prefer Aurora itself to handle the DAG scheduling, fronted by a thin Python wrapper around Aurora’s DSL, which I really like. I think batch workflow support is a missing feature in Aurora, and it’s the only reason we’re hesitating to replace Chronos with it.
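
Roughly what I have in mind for that wrapper (a pure sketch: everything below is today’s Aurora DSL except depends_on, which is a made-up field standing in for whatever ordering primitive the scheduler would grow):

    # Sketch only: Job/Task/Process/Resources and GB are Aurora's existing DSL;
    # 'depends_on' is a made-up field illustrating the kind of ordering
    # primitive I'd like the scheduler to understand natively.
    extract = Job(
      cluster = 'devcluster', role = 'batch', environment = 'prod', name = 'extract',
      task = Task(
        name = 'extract',
        processes = [Process(name = 'extract', cmdline = './run_extract.sh')],
        resources = Resources(cpu = 1.0, ram = 2*GB, disk = 4*GB)))

    train = Job(
      cluster = 'devcluster', role = 'batch', environment = 'prod', name = 'train',
      task = Task(
        name = 'train',
        processes = [Process(name = 'train', cmdline = './run_train.sh')],
        resources = Resources(cpu = 4.0, ram = 8*GB, disk = 16*GB)),
      # Hypothetical: only start once 'extract' has terminated successfully.
      depends_on = ['devcluster/batch/prod/extract'])

    jobs = [extract, train]

The wrapper itself would stay thin; the actual DAG evaluation would live in the scheduler.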
What would be the basic workflow if we plan to implement this feature in Aurora? (Rough sketches of the AuroraOperator idea and the plain-script workaround are at the bottom of this mail, below the quoted thread.)

> On 03 Feb 2016, at 00:06, Erb, Stephan <[email protected]> wrote:
>
> FWIW, the guys from Oscar Health have built this one:
> http://dna.hioscar.com/2015/12/09/running-job-pipelines-in-aurora/
> Unfortunately, it does not seem to be open source. At least, I cannot find it
> on their GitHub page https://github.com/oscarhealth.
>
> In addition, have you thought about keeping the dependency management outside
> of Aurora in a different tool and using Aurora just for the execution? For
> example, you could use Airflow (https://github.com/airbnb/airflow) to do the
> entire dependency management, time tracking etc. But when it comes to doing
> the actual work, you use an AuroraOperator (tbd :-) in Airflow that schedules
> your job on Aurora. Writing a custom operator is not that hard
> (https://pythonhosted.org/airflow/code.html?highlight=operator#basesensoroperator).
>
> I guess this would give you the best of both worlds. If you are fancy, you
> can also use Aurora to spawn Airflow itself.
>
> Regards,
> Stephan
>
>
> From: Krisztian Szucs <[email protected]>
> Sent: Tuesday, February 2, 2016 10:22 PM
> To: [email protected]
> Subject: Re: Explicit job execution order
>
>> On 02 Feb 2016, at 22:01, Bill Farner <[email protected]> wrote:
>>
>> My mistake, I skimmed past Chronos and was thinking services rather than
>> batch. I think this is a legitimate use case, but nobody has yet seemed to
>> have the requirement + commitment to add the feature. I will happily guide
>> anyone willing to put forth effort!
>
> We have both, especially if you can provide a quick way to define primitive
> Job dependencies so that we can start migrating workflows from Chronos.
> We'll dig into the details during the migration.
>
>> On Tue, Feb 2, 2016 at 12:58 PM, Krisztian Szucs <[email protected]> wrote:
>> We need to implement hybrid workflows, including batch processing (Spark).
>> Many of the jobs run unique Docker images with very different dependencies
>> and resources, so we can't use Process-level ordering as a substitute for
>> Job-level ordering.
>>
>> I've seen that the resolution of
>> https://issues.apache.org/jira/browse/AURORA-735 is Later :)
>>
>>> On 02 Feb 2016, at 21:44, Bill Farner <[email protected]> wrote:
>>>
>>> In general, I've assumed that job dependencies create more problems than
>>> they solve (e.g. scheduling behavior when a parent job is removed,
>>> parent/child relationships that span auth groups, etc.). Dependencies seem
>>> handy for setting up and tearing down groups of jobs for things like
>>> development environments, but that should be easily replaceable by a small
>>> script. Is this contrary to your experience?
>>
>> Through API calls?
>>
>>> On Tue, Feb 2, 2016 at 12:34 PM, Krisztian Szucs <[email protected]> wrote:
>>> Hi Everyone!
>>>
>>> We'd like to migrate our jobs from Chronos to Aurora.
>>> AFAIK Aurora doesn't support dependent jobs.
>>> Could you recommend any tools or a workaround to specify e.g. parent jobs?
>>>
>>> - Krisztian
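
To make Stephan's AuroraOperator suggestion concrete, at least as I understand it, here is an untested sketch. The operator name was Stephan's placeholder and job_key/config_path are made up; it just shells out to the aurora client:

    # Untested sketch of the AuroraOperator idea: Airflow owns the DAG and the
    # ordering; Aurora only runs the individual batch jobs.
    import subprocess

    from airflow.models import BaseOperator


    class AuroraOperator(BaseOperator):
        """Creates an Aurora job via the 'aurora' client and fails the Airflow
        task if the client exits non-zero."""

        def __init__(self, job_key, config_path, *args, **kwargs):
            super(AuroraOperator, self).__init__(*args, **kwargs)
            self.job_key = job_key          # e.g. 'devcluster/batch/prod/extract'
            self.config_path = config_path  # the .aurora config defining the job

        def execute(self, context):
            # I believe the client supports --wait-until=FINISHED to block until
            # the job reaches a terminal state; if not, the operator would have
            # to poll 'aurora job status' itself before returning.
            subprocess.check_call([
                'aurora', 'job', 'create', '--wait-until=FINISHED',
                self.job_key, self.config_path])

Ordering between AuroraOperator tasks would then be expressed with Airflow's usual set_upstream()/set_downstream() calls, so Airflow owns the DAG and Aurora only runs the jobs.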

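And to answer my own "through API calls?" question about Bill's small-script suggestion: the aurora client already wraps the scheduler API, so a minimal version is just sequenced client calls. Sketch below; the job keys and config file are placeholders:

    # Sketch of the "small script" workaround: bring a group of jobs up in
    # dependency order and tear them down in reverse, by shelling out to the
    # aurora client (which talks to the scheduler API for us).
    import subprocess

    CONFIG = 'pipeline.aurora'

    # Parent jobs listed before the jobs that depend on them.
    JOBS_IN_ORDER = [
        'devcluster/batch/prod/extract',
        'devcluster/batch/prod/transform',
        'devcluster/batch/prod/load',
    ]


    def create_all():
        for key in JOBS_IN_ORDER:
            subprocess.check_call(['aurora', 'job', 'create', key, CONFIG])


    def kill_all():
        for key in reversed(JOBS_IN_ORDER):
            subprocess.check_call(['aurora', 'job', 'killall', key])


    if __name__ == '__main__':
        create_all()

That covers setting up and tearing down a group of jobs, but for real batch pipelines the script would also need to wait for each job to reach a terminal state before creating the next one - which is exactly the part we'd like Aurora to do for us.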