Hi guys,

Since we have the "dag_concurrency" restriction, I tried playing with
dagrun_timeout, so that after some interval the DAG runs are marked
failed and the pipeline progresses.
But this is not happening.

I have this DAG (@hourly):

A -> B -> C -> D -> E

C: depends_on_past=True

My dagrun_timeout is 60 minutes, set in default_args:

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': scheduling_start_date,
    'email': ['airf...@airflow.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 2,
    'retry_delay': default_retries_delay,
    'dagrun_timeout': datetime.timedelta(minutes=60)
}
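
For reference, here is a stripped-down sketch of how the DAG is wired up
(DummyOperator and the dag_id are just placeholders; the real tasks do
actual work):

from airflow import DAG
from airflow.operators.dummy_operator import DummyOperator

# Sketch only: real operators and dag_id differ
dag = DAG('hourly_pipeline',
          default_args=default_args,
          schedule_interval='@hourly')

a = DummyOperator(task_id='A', dag=dag)
b = DummyOperator(task_id='B', dag=dag)
# C is the only task that depends on the previous run's outcome
c = DummyOperator(task_id='C', depends_on_past=True, dag=dag)
d = DummyOperator(task_id='D', dag=dag)
e = DummyOperator(task_id='E', dag=dag)

a.set_downstream(b)
b.set_downstream(c)
c.set_downstream(d)
d.set_downstream(e)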


Parallelism settings in airflow.cfg:

parallelism = 8
dag_concurrency = 8
max_active_runs_per_dag = 8


For hour 1, all the tasks completed.
Now in hour 2, say task C failed.

From hour 3 onwards, tasks A and B keep running.
Task C never triggers because it depends on past (and the previous hour's run failed).

Since dag_concurrency is 8, my pipeline progresses from hour 3 to hour 10
(that's the next 8 hours) for tasks A and B. After this, the pipeline stalls.

"dagrun_timeout" was 60 minutes.  This should mean that after 60 minutes,
from hour 3 onwards, the DAG runs that has been up for more than 60 minutes
should be marked FAILED  and the pipeline should progress?

But this is not happening. So I am guessing my understanding here is not
correct.
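
(In case the placement matters: my understanding is that dagrun_timeout can
also be passed directly to the DAG constructor rather than through
default_args; a rough sketch, reusing the names from above:

# Sketch only: dagrun_timeout given to the DAG object itself
dag = DAG('hourly_pipeline',
          default_args=default_args,
          schedule_interval='@hourly',
          dagrun_timeout=datetime.timedelta(minutes=60))
)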

What should the behavior be when we use "dagrun_timeout"?
Also, how can I make sure that the DAG proceeds in this situation?

In the example above, tasks A and B should keep running every hour
(since they don't depend on past).
Why do they run 8 (dag_concurrency) instances and then stall?


Thanks,
Harish
