> On 13 May 2016, at 23:06, harish singh <[email protected]> wrote:
> 
> We are seeing this in production. I won't be able to update the version
> right now, but I will try to test this out over the weekend.
> But if I consider 1.7.0, am I doing something incorrect? Or did something
> change in 1.7.1.rc6?

No, from an initial analysis I don't think you are doing anything wrong.
However, we made a lot of stability fixes in 1.7.1. Especially in your scenario
you might want to try out AIRFLOW-20 (see Jira). If you can supply an example
DAG, it will make it easier to help out.
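
For reference, here is a minimal sketch of the behaviour I would expect from depends_on_past as it is documented: a task instance only starts if the previous instance of the same task succeeded, and the very first run is exempt. This is plain Python, not Airflow code; the names may_run, simulate and the state strings are made up for illustration, and it assumes a blocked instance counts as "not succeeded" for the run after it.

```python
from typing import List, Optional

# Hypothetical state labels for this sketch (not Airflow's State constants).
SUCCESS = "success"
FAILED = "failed"
BLOCKED = "not attempted"


def may_run(depends_on_past: bool, previous_state: Optional[str]) -> bool:
    """Return True if a task instance is allowed to start."""
    if not depends_on_past:
        return True
    # The first run has no previous instance and is allowed to start.
    if previous_state is None:
        return True
    return previous_state == SUCCESS


def simulate(depends_on_past: bool, outcomes: List[str]) -> List[str]:
    """Replay consecutive runs of one task.

    outcomes[i] is what the task would do in run i if it were attempted.
    """
    states: List[str] = []
    previous: Optional[str] = None
    for outcome in outcomes:
        state = outcome if may_run(depends_on_past, previous) else BLOCKED
        states.append(state)
        previous = state
    return states


if __name__ == "__main__":
    # Task fails in the first hour; later hours should not be attempted.
    print(simulate(True, [FAILED, SUCCESS, SUCCESS]))
    # prints ['failed', 'not attempted', 'not attempted']
```

With depends_on_past=False the same sequence is attempted every hour, which matches what you describe seeing for d, x and y.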


> 
> One thing I forgot to mention: we do run a backfill before we
> turn on the DAG.
> So if I have to turn the DAG on right now, I will first run a backfill for
> the last 24 hours and then turn it on (from the UI) so that it gets
> scheduled by the scheduler.
> 
> Nevertheless, I am going to try this scenario on 1.7.1.rc6.
> 
> Thanks!
> 
> 
> On Fri, May 13, 2016 at 1:54 PM, Bolke de Bruin <[email protected]> wrote:
> 
>> 
>>> On 13 May 2016, at 22:51, harish singh <[email protected]> wrote:
>>> 
>>> Bolke, it's 1.7.0
>>> 
>>> 
>>> On Fri, May 13, 2016 at 1:35 PM, Bolke de Bruin <[email protected]> wrote:
>>> 
>>>> 
>>>>> On 13 May 2016, at 22:19, harish singh <[email protected]> wrote:
>>>>> 
>>>>> Hi guys,
>>>>> 
>>>>> I am having an issue making 'depends_on_past=true' work.
>>>>> 
>>>>> This is my pipeline:
>>>>> 
>>>>> a -> b -> c -> d -> e
>>>>> 
>>>>> a -> x -> e
>>>>> 
>>>>> a -> y -> e
>>>>> 
>>>>> I have default args for all Tasks:
>>>>> 
>>>>> from datetime import datetime, timedelta
>>>>> 
>>>>> # Start of the previous full hour, in UTC.
>>>>> scheduling_start_date = (datetime.utcnow() - timedelta(hours=1)).replace(
>>>>>     minute=0, second=0, microsecond=0)
>>>>> 
>>>>> default_args = {
>>>>>  'owner': 'airflow',
>>>>>  'depends_on_past': False,
>>>>>  'start_date': scheduling_start_date,
>>>>>  'email': ['[email protected]'],
>>>>>  'email_on_failure': False,
>>>>>  'email_on_retry': False,
>>>>>  'retries': 2,
>>>>>  'retry_delay': default_retries_delay,
>>>>>  # 'queue': 'bash_queue',
>>>>>  # 'pool': 'backfill',
>>>>>  # 'priority_weight': 10,
>>>>>  # 'end_date': datetime(2016, 1, 1),
>>>>> }
>>>>> 
>>>>> 
>>>>> But specifically for tasks d, x and y, I have depends_on_past set to true:
>>>>> 
>>>>> depends_on_past=True
>>>>> 
>>>>> 
>>>>> So now:
>>>>> For the first hour, d, x and y failed.
>>>>> So I am assuming that in the next hour these jobs should not even be
>>>>> tried, right?
>>>>> But I see that in the next hour and subsequent hours, these tasks are
>>>>> getting triggered (and failing) ...
>>>>> Should the behavior be that if a task's previous execution failed, no
>>>>> attempt is made during the next run of the DAG?
>>>>> Or am I doing something very "bad" here?
>>>> 
>>>> 
>>>> What version are you on, Harish?
>>>> 
>>>> 
>> 
>> Can you try 1.7.1.rc6 before we dive in?
>> 
>> 
