[ https://issues.apache.org/jira/browse/AIRFLOW-6469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
t oo updated AIRFLOW-6469: -------------------------- Description: dagrun_timeout is very limited right now see https://github.com/apache/airflow/pull/4782/files https://stackoverflow.com/questions/57110885/how-to-define-a-timeout-for-apache-airflow-dags https://stackoverflow.com/questions/57051453/what-is-the-difference-between-execution-timeout-and-dagrun-timeout-in-airflow this is about making dagrun_timeout work for non-scheduled (ie externally triggered) dags without '# of active DagRuns == max_active_runs' limitation challenge seems to be dag_run table's start_date does not get updated when 'clear' command is run... https://github.com/apache/airflow/blob/d7499b11d2bb2dc78dcee4b02c6388dfea6a2c3a/airflow/models/dag.py https://github.com/apache/airflow/blob/1a52182ea0fabb5c941e5e8c990db71097eb87a4/airflow/jobs/scheduler_job.py so this will not work as start_date is not the latest start_date... # if dagrun_timeout reached, the run failed elif ( self.start_date and dag.dagrun_timeout and self.start_date < time_now - dag.dagrun_timeout ): self.log.info('dagrun_timeout reached; marking run %s failed', self) self.set_state(State.FAILED) dag.handle_callback(self, success=False, reason='dagrun_timeout', session=session) was: dagrun_timeout is very limited right now see https://github.com/apache/airflow/pull/4782/files https://stackoverflow.com/questions/57110885/how-to-define-a-timeout-for-apache-airflow-dags https://stackoverflow.com/questions/57051453/what-is-the-difference-between-execution-timeout-and-dagrun-timeout-in-airflow this is about making dagrun_timeout work for non-scheduled (ie externally triggered) dags without '# of active DagRuns == max_active_runs' limitation challenge seems to be dag_run table's start_date does not get updated when 'clear' command is run... https://github.com/apache/airflow/blob/d7499b11d2bb2dc78dcee4b02c6388dfea6a2c3a/airflow/models/dag.py https://github.com/apache/airflow/blob/1a52182ea0fabb5c941e5e8c990db71097eb87a4/airflow/jobs/scheduler_job.py > Enforce dagrun_timeout for non-scheduled (ie externally triggered) dags > without # of active DagRuns == max_active_runs limitation > --------------------------------------------------------------------------------------------------------------------------------- > > Key: AIRFLOW-6469 > URL: https://issues.apache.org/jira/browse/AIRFLOW-6469 > Project: Apache Airflow > Issue Type: Improvement > Components: scheduler > Affects Versions: 1.10.7 > Reporter: t oo > Priority: Major > > dagrun_timeout is very limited right now > see > https://github.com/apache/airflow/pull/4782/files > https://stackoverflow.com/questions/57110885/how-to-define-a-timeout-for-apache-airflow-dags > https://stackoverflow.com/questions/57051453/what-is-the-difference-between-execution-timeout-and-dagrun-timeout-in-airflow > this is about making dagrun_timeout work for non-scheduled (ie externally > triggered) dags without '# of active DagRuns == max_active_runs' limitation > challenge seems to be dag_run table's start_date does not get updated when > 'clear' command is run... > https://github.com/apache/airflow/blob/d7499b11d2bb2dc78dcee4b02c6388dfea6a2c3a/airflow/models/dag.py > https://github.com/apache/airflow/blob/1a52182ea0fabb5c941e5e8c990db71097eb87a4/airflow/jobs/scheduler_job.py > so this will not work as start_date is not the latest start_date... > # if dagrun_timeout reached, the run failed > elif ( > self.start_date and dag.dagrun_timeout and > self.start_date < time_now - dag.dagrun_timeout > ): > self.log.info('dagrun_timeout reached; marking run %s failed', > self) > self.set_state(State.FAILED) > dag.handle_callback(self, success=False, reason='dagrun_timeout', > session=session) -- This message was sent by Atlassian Jira (v8.3.4#803005)