How topical! I was surprised the other day to learn that this was still a
thing. When triggering a DAG run via the API, I was under the impression that a
unique run_id was enough. It would be nice to drop the constraint.

James Coder
________________________________
From: Hussein Awala <huss...@awala.fr>
Sent: Thursday, August 24, 2023 7:13:40 PM
To: dev@airflow.apache.org <dev@airflow.apache.org>
Subject: [DISCUSS] Allowing Multiple DAG Runs with the Same Execution Date

Is the assumption that we should have only one DAG run per execution date
still valid?

In recent years, Airflow DAGs have gained various features making them more
dynamic, including branching operators, DAG run configuration (DAG params),
and dynamic task mapping. While the scheduler creates most Airflow DAG runs
based on the defined `schedule`, users also trigger runs externally using the
Webserver, the REST API, and the Python API.
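To make the external-trigger case concrete, here is a minimal sketch of the request body sent to the stable REST API's dagRuns endpoint (POST /api/v1/dags/{dag_id}/dagRuns). The run IDs and dates are illustrative, and no server is contacted; the point is that two runs can carry unique dag_run_id values while sharing the same logical_date, which is exactly the combination the current unique constraint rejects:

```python
import json

def build_trigger_payload(run_id, logical_date, conf=None):
    """Build the JSON body for POST /api/v1/dags/{dag_id}/dagRuns.

    run_id must be unique; logical_date is what the unique
    constraint discussed in this thread applies to.
    """
    return json.dumps({
        "dag_run_id": run_id,
        "logical_date": logical_date,
        "conf": conf or {},
    })

# Two event-driven runs for the same data interval: distinct run IDs,
# identical logical dates. Today the second request fails even though
# its dag_run_id is unique.
first = build_trigger_payload("manual__event-1", "2023-08-24T00:00:00Z")
second = build_trigger_payload("manual__event-2", "2023-08-24T00:00:00Z")
```
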

As users employ Airflow for diverse use cases, they sometimes need more
than one DAG run for the same data interval (for example, event-based DAGs
and asynchronous request processing), but they are blocked by the unique
constraint on the execution date column. To work around this, users often
implement hacks such as introducing a new logical date param and using it
instead of the standard variable provided in the context.
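The workaround above can be sketched as follows. The param name and helper are hypothetical, not an Airflow API: tasks consult a user-supplied override param and fall back to the scheduler-provided logical date from the context only when no override is given, so a second "run" for the same interval smuggles in its own date via params:

```python
from datetime import datetime, timezone

def effective_logical_date(context_logical_date, params):
    """Return the user-supplied override date when present in params,
    otherwise the logical date taken from the task context.

    'logical_date_override' is an illustrative param name, not a
    built-in Airflow key.
    """
    override = params.get("logical_date_override")
    if override is not None:
        return datetime.fromisoformat(override)
    return context_logical_date

scheduled = datetime(2023, 8, 24, tzinfo=timezone.utc)

# Without an override, the context value wins:
normal = effective_logical_date(scheduled, {})

# A second run for the same interval passes its own date via params,
# sidestepping the uniqueness check on the real logical date:
rerun = effective_logical_date(
    scheduled, {"logical_date_override": "2023-08-24T00:00:01+00:00"}
)
```

The cost of this hack is that the "real" date now lives outside the scheduler's bookkeeping, which is part of why the thread asks whether the constraint is still worth keeping.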

Sometimes, a scheduled DAG's results need to be overridden by manually
triggered DAG runs that use different parameter values. Unfortunately, the
current system doesn't accommodate this without deleting the original DAG
run, leading to a loss of some of the historical execution records.

How essential is it to maintain a unique execution/logical date, and why
isn't the uniqueness of the run ID sufficient to address this?
