> On 13 May 2016, at 23:06, harish singh <[email protected]> wrote:
>
> We are seeing this in production. I won't be able to update the version
> right now, but I will try to test this out over the weekend.
> But if I consider 1.7.0, am I doing something incorrect? Or did something
> change in 1.7.1.rc6?

No, from an initial analysis I wouldn't say you are doing anything wrong. However, we made a lot of stability fixes in 1.7.1. For your scenario in particular, you might want to try out AIRFLOW-20 (see Jira). If you can supply an example DAG, it will be easier to help out.

>
> One thing I forgot to mention: we do run a backfill before we
> turn on the DAG.
> So if I have to turn the DAG on right now, I will first run a backfill for
> the last 24 hours and then turn it on (from the UI) so that it gets scheduled
> by the scheduler.
>
> Nevertheless, I am going to try this scenario on 1.7.1.rc6.
>
> Thanks!
>
>
> On Fri, May 13, 2016 at 1:54 PM, Bolke de Bruin <[email protected]> wrote:
>
>>
>>> On 13 May 2016, at 22:51, harish singh <[email protected]> wrote:
>>>
>>> Bolke, it's 1.7.0
>>>
>>>
>>> On Fri, May 13, 2016 at 1:35 PM, Bolke de Bruin <[email protected]> wrote:
>>>
>>>>
>>>>> On 13 May 2016, at 22:19, harish singh <[email protected]> wrote:
>>>>>
>>>>> Hi guys,
>>>>>
>>>>> I am having an issue with making 'depends_on_past=true' work.
>>>>>
>>>>> This is my pipeline:
>>>>>
>>>>> a -> b -> c -> d -> e
>>>>>
>>>>> a -> x -> e
>>>>>
>>>>> a -> y -> e
>>>>>
>>>>> I have default args for all tasks:
>>>>>
>>>>> scheduling_start_date = (datetime.utcnow() -
>>>>> datetime.timedelta(hours=1)).replace(minute=0, second=0,
>>>>> microsecond=0)
>>>>>
>>>>> default_args = {
>>>>>     'owner': 'airflow',
>>>>>     'depends_on_past': False,
>>>>>     'start_date': scheduling_start_date,
>>>>>     'email': ['[email protected]'],
>>>>>     'email_on_failure': False,
>>>>>     'email_on_retry': False,
>>>>>     'retries': 2,
>>>>>     'retry_delay': default_retries_delay,
>>>>>     # 'queue': 'bash_queue',
>>>>>     # 'pool': 'backfill',
>>>>>     # 'priority_weight': 10,
>>>>>     # 'end_date': datetime(2016, 1, 1),
>>>>> }
>>>>>
>>>>>
>>>>> But specifically for tasks d, x and y, I have depends_on_past = true:
>>>>>
>>>>> depends_on_past=True
>>>>>
>>>>>
>>>>> So now:
>>>>> For the first hour, d, x and y failed.
>>>>> So I am assuming that in the next hour these jobs should not even be tried,
>>>>> right?
>>>>> But I see that in the next hour and subsequent hours, these tasks are
>>>>> getting triggered (and failing) ...
>>>>> Should the behavior be that if a task's previous execution failed, no
>>>>> attempt is made during the next run of the DAG?
>>>>> Or am I doing something very "bad" here?
>>>>
>>>>
>>>> What version are you on, Harish?
>>>>
>>>>
>>
>> Can you try 1.7.1.rc6 before we dive in?
>>
>>
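
For reference, the behaviour Harish expects from depends_on_past can be sketched in plain Python. This is only an illustration of the intended rule, not Airflow's scheduler code: the function `can_run`, the `task_states` dictionary and the state strings are hypothetical names made up for this sketch, and it assumes a fixed hourly schedule in which the very first instance (execution date equal to the start date) is always allowed to run.

```python
from datetime import datetime, timedelta


def can_run(task_states, execution_date, start_date, interval, depends_on_past):
    """Decide whether a task instance may be attempted for execution_date.

    task_states maps earlier execution dates of the SAME task to a state
    string such as "success" or "failed" (hypothetical representation).
    """
    if not depends_on_past:
        # Without the flag, earlier runs of the task are irrelevant.
        return True
    if execution_date == start_date:
        # Assumption: the first instance has no predecessor to wait for.
        return True
    previous = execution_date - interval
    # With the flag set, the previous instance must have succeeded.
    return task_states.get(previous) == "success"


start = datetime(2016, 5, 13, 20)
hour = timedelta(hours=1)
states = {start: "failed"}  # task d failed in the first hour

# Next hour: blocked when depends_on_past is set, attempted otherwise.
print(can_run(states, start + hour, start, hour, depends_on_past=True))
print(can_run(states, start + hour, start, hour, depends_on_past=False))
```

Under these assumptions the first call prints False and the second prints True, which matches the expectation in the original question: after a failure in the first hour, d, x and y should not be attempted in the next hour.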
