GeryDeKocliko opened a new issue, #59043:
URL: https://github.com/apache/airflow/issues/59043
### Apache Airflow version
3.1.3
### If "Other Airflow 2/3 version" selected, which one?
_No response_
### What happened?
When I run a backfill with parameters on a DAG that already has a DAG run
for the same logical date, the parameters from the backfill are not used.
Instead, Airflow reuses the parameters from the existing DAG run (the first
run that was created for that logical date). The parameters I specify in the
backfill form are ignored.
My use case is that I parallelize work by client:
• I have a DAG that runs once per day and can process one
specific client based on a parameter (e.g. client_id or data).
• I want to be able to backfill only one client for a given day,
even if the DAG has already run that day for all clients (or a different
client).
Currently, if a DAG run already exists for that date, and I backfill with
different parameters, the existing run’s parameters are kept and tasks see the
old parameters rather than the new ones.
So effectively:
• For new logical dates (no existing DAG run), backfill
parameters work as expected.
• For existing logical dates, backfill parameters do not override
the existing DAG run configuration.
### What you think should happen instead?
When I trigger a backfill for a date range and provide parameters (via “Run
Parameters” or JSON configuration), the parameters for each DAG run should:
• Be taken from what I provide in the backfill form.
• Override any existing parameters or configuration for that
logical date.
• Be visible inside tasks through:
  • `{{ params.my_param }}` in templates
  • `context["params"]["my_param"]` in Python code
  • or a typed `ParamsDict` argument in a task function.
In other words, if a DAG run already exists for a given logical date and I
explicitly launch a backfill with new parameters, I expect those new parameters
to be applied to that run (or to a new run), so I can effectively “re-run” that
date with different params.
This is important for use cases where:
• A single DAG processes multiple clients, selected by a
parameter.
• We need to re-run only one client for a specific date, even if
the date already has a DAG run.
### How to reproduce
Create the following DAG:
```python
from datetime import datetime

from airflow.models.param import Param, ParamsDict
from airflow.decorators import dag, task


@task
def print_data(params: ParamsDict):
    # Read the "data" run parameter and print it to the task log.
    data = params.get("data")
    print(f"DATA --> {data}")


@dag(
    dag_id='test_backfill_with_params',
    start_date=datetime(2023, 1, 1),
    params={
        "data": Param(
            title="My custom data",
            type="string",
            default="Hello World!",
        )
    },
    schedule='@daily',
    catchup=False,
)
def test_backfill_with_params_dag():
    print_data()


test_backfill_with_params_dag()
```
Steps:
1. Go to the Airflow 3 UI.
2. Trigger a manual run for test_backfill_with_params on a given
date (e.g. 2025-06-01) with default parameters (or with data = "Hello World!").
3. Confirm that the task prints DATA --> Hello World!.
4. Now open the Backfill modal for the same DAG.
5. Set a date range that includes the same logical date, for
example:
   • Start: 2025-06-01 00:00
   • End: 2025-06-02 00:00
6. In the Run Parameters (or configuration JSON), set:
   • data = "Overridden Value"
7. Trigger the backfill.
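For anyone reproducing this without the UI, steps 4 to 7 can be sketched as a request body. This is only a sketch: the field names (`dag_id`, `from_date`, `to_date`, `dag_run_conf`) are my assumption about what the backfill form submits and are not taken from any log or trace in this report, and `build_backfill_body` is a hypothetical helper. The block only builds and prints JSON; it does not send anything.

```python
import json
from datetime import datetime, timezone


def build_backfill_body(dag_id, start, end, conf):
    """Hypothetical helper: assemble the backfill request as a plain dict.

    The key names are assumed, not confirmed against the Airflow API.
    """
    return {
        "dag_id": dag_id,
        "from_date": start.isoformat(),
        "to_date": end.isoformat(),
        "dag_run_conf": conf,
    }


body = build_backfill_body(
    "test_backfill_with_params",
    datetime(2025, 6, 1, tzinfo=timezone.utc),
    datetime(2025, 6, 2, tzinfo=timezone.utc),
    {"data": "Overridden Value"},
)
print(json.dumps(body, indent=2))
```

The point of the sketch is that the new value for `data` is part of the request itself, so the question is what happens to it when a run already exists for 2025-06-01.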
Expected:
• For the run corresponding to 2025-06-01, the task should
receive params["data"] == "Overridden Value".
• The log should show something like:
DATA --> Overridden Value
Actual:
• For the 2025-06-01 run, the task still sees the original
parameter value (e.g. "Hello World!"), i.e. the parameters from the first run
that was created for that logical date.
• The value provided in the backfill form is ignored for dates
that already have a DAG run.
### Operating System
Mac OS 14.6 (23G80)
### Versions of Apache Airflow Providers
No specific providers involved / default set.
### Deployment
Docker-Compose
### Deployment details
_No response_
### Anything else?
This behavior makes it difficult to use a single DAG to process multiple
clients in parallel via a parameter, because backfills cannot effectively
re-run a single client for a day if a run already exists.
It seems that when a DAG run already exists for a logical date, Airflow
reuses that run (including its original params) instead of applying the new
configuration provided in the backfill request.
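To make the suspected behavior concrete, here is a toy model in plain Python. It is not Airflow code and does not reflect the actual implementation; `ToyRun`, `backfill_reusing_existing`, and `backfill_overriding` are hypothetical names used only to contrast what I observe with what I expect.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRun:
    """Stand-in for a DAG run: a logical date plus its run parameters."""
    logical_date: str
    conf: dict = field(default_factory=dict)


def backfill_reusing_existing(runs, logical_date, new_conf):
    """Observed behavior (as I understand it): an existing run keeps its conf."""
    run = runs.get(logical_date)
    if run is None:
        run = ToyRun(logical_date, dict(new_conf))
        runs[logical_date] = run
    return run


def backfill_overriding(runs, logical_date, new_conf):
    """Expected behavior: the backfill parameters replace the existing conf."""
    run = runs.get(logical_date)
    if run is None:
        run = ToyRun(logical_date, dict(new_conf))
        runs[logical_date] = run
    else:
        run.conf = dict(new_conf)
    return run


def existing_runs():
    # One run already exists for 2025-06-01 with the default parameter.
    return {"2025-06-01": ToyRun("2025-06-01", {"data": "Hello World!"})}


new_conf = {"data": "Overridden Value"}
observed = backfill_reusing_existing(existing_runs(), "2025-06-01", new_conf)
expected = backfill_overriding(existing_runs(), "2025-06-01", new_conf)
fresh = backfill_reusing_existing(existing_runs(), "2025-06-02", new_conf)

print("observed:", observed.conf["data"])
print("expected:", expected.conf["data"])
print("new date:", fresh.conf["data"])
```

In this model the date without an existing run picks up the new parameters in both variants, which matches what I see for new logical dates.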
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)