yxiao1996 opened a new issue, #29479: URL: https://github.com/apache/airflow/issues/29479
### Description My team has an airflow stack orchestrating many of our ETL jobs. Recently, we noticed a bug in one of our hourly job's code causing a large number of hourly datasets to be malformed(more than 500). Thus, after fixing the ETL job, we have the need to rerun all the impacted DAG runs to correct the data. I learnt that it's possible to batch initiate DAG reruns through the "Browse" -> "Task Instances" view, where I can filter out the task instances I want to rerun through execution date and dag Id and simply clear their states. However, since we set a concurrency limit for DAG runs, after we initiate the reruns, it appears that airflow is prioritizing reruns over latest runs, so out latest DAG runs got delayed. As we have a agreement with our consumers on data processing time, we don't want the latest DAG runs to get delayed. I'm wondering if it's possible for Airflow to support a more intuitive and safe way to rerun DAG's in batch, where reruns only start if they don't impact latest runs. ### Use case/motivation Backfill/rerun is a common use case for ETL workflows. I feel it could be valuable to make it safe and more intuitive. ### Related issues Didn't find related issues after some simple searches. Please let me know if this is discussed somewhere else before. ### Are you willing to submit a PR? - [ ] Yes I am willing to submit a PR! ### Code of Conduct - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
