hkc-8010 opened a new issue, #66905:
URL: https://github.com/apache/airflow/issues/66905
### Apache Airflow version
main branch
### What happened?
`TriggerDagRunOperator` can fail with `DagRunAlreadyExists` even though the
child Dag run was created successfully.
In the Airflow 3 task-sdk path, `DagRunOperations.trigger()` sends `POST
/execution/dag-runs/{dag_id}/{run_id}` through the generic execution API retry
layer. If the server creates the Dag run but the client sees an ambiguous
transport or request error, the retry can POST the same run ID again and
receive `409 Conflict`.
The task runner then treats that as a real pre-existing run, marks the
parent task failed, and does not write the `trigger_run_id` XCom.
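The sequence above can be simulated without Airflow. This is a minimal sketch, not the task-sdk code: `FakeServer`, `AmbiguousTransportError`, `Conflict`, and `trigger_with_naive_retry` are all hypothetical stand-ins for the execution API, `httpx.RequestError`, the `409 Conflict` response, and the generic retry layer.

```python
class AmbiguousTransportError(Exception):
    """Hypothetical stand-in for a transport error such as httpx.RequestError."""


class Conflict(Exception):
    """Hypothetical stand-in for a 409 Conflict response."""


class FakeServer:
    """In-memory server that creates the run but loses the first response."""

    def __init__(self):
        self.runs = set()
        self.drop_next_response = True

    def post_dag_run(self, dag_id, run_id):
        key = (dag_id, run_id)
        if key in self.runs:
            raise Conflict(f"{dag_id}/{run_id} already exists")
        # The write succeeds on the server side...
        self.runs.add(key)
        if self.drop_next_response:
            # ...but the client never sees the response.
            self.drop_next_response = False
            raise AmbiguousTransportError("connection lost after the write")
        return {"dag_id": dag_id, "run_id": run_id}


def trigger_with_naive_retry(server, dag_id, run_id, attempts=3):
    """Retry on transport errors only; a Conflict propagates to the caller."""
    for _ in range(attempts):
        try:
            return server.post_dag_run(dag_id, run_id)
        except AmbiguousTransportError:
            continue
    raise AmbiguousTransportError("retries exhausted")


server = FakeServer()
try:
    trigger_with_naive_retry(server, "child", "manual__1")
    outcome = "success"
except Conflict:
    outcome = "conflict"

print(outcome, ("child", "manual__1") in server.runs)
```

In this simulation the run is present on the server, yet the caller ends up with a conflict, which mirrors the failure described in this report.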
### What you think should happen instead?
When a trigger POST fails with an ambiguous transport error and the requested
Dag run exists afterwards, the operation should succeed for that run ID
rather than report a duplicate-run failure.
### How to reproduce
1. Mock `POST /dag-runs/{dag_id}/{run_id}` so that the run is created on the
server side but the client sees `httpx.RequestError`.
2. Have `GET /dag-runs/{dag_id}/{run_id}` return the existing Dag run.
3. Run the trigger operation. Expected: it treats this as success for that
run ID. Actual: it surfaces `DAGRUN_ALREADY_EXISTS`.
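One possible shape for the expected behaviour, sketched under assumptions: `FakeApi`, `TransportError`, `Conflict`, and `trigger` are hypothetical names for illustration and do not reflect the actual `client.py` implementation. The idea is to treat a conflict as success only when an earlier attempt in the same call ended ambiguously and a follow-up GET confirms that the run exists.

```python
class TransportError(Exception):
    """Hypothetical stand-in for httpx.RequestError."""


class Conflict(Exception):
    """Hypothetical stand-in for a 409 Conflict response."""


class FakeApi:
    """In-memory API; can lose the first POST response after the write."""

    def __init__(self, drop_first_response=False, preexisting=()):
        self.runs = set(preexisting)
        self._drop = drop_first_response

    def post_dag_run(self, dag_id, run_id):
        key = (dag_id, run_id)
        if key in self.runs:
            raise Conflict(f"{dag_id}/{run_id} already exists")
        self.runs.add(key)
        if self._drop:
            self._drop = False
            raise TransportError("connection lost after the write")
        return {"dag_id": dag_id, "run_id": run_id}

    def get_dag_run(self, dag_id, run_id):
        if (dag_id, run_id) in self.runs:
            return {"dag_id": dag_id, "run_id": run_id}
        return None


def trigger(api, dag_id, run_id, attempts=3):
    saw_ambiguous_error = False
    for _ in range(attempts):
        try:
            return api.post_dag_run(dag_id, run_id)
        except TransportError:
            # The server may or may not have created the run.
            saw_ambiguous_error = True
        except Conflict:
            # Only reinterpret the conflict if an earlier attempt was ambiguous
            # and the run can be confirmed with a GET.
            if saw_ambiguous_error:
                existing = api.get_dag_run(dag_id, run_id)
                if existing is not None:
                    return existing
            raise
    raise TransportError("retries exhausted")


# Ambiguous first attempt: the retry conflicts, the GET confirms, trigger succeeds.
result = trigger(FakeApi(drop_first_response=True), "child", "manual__1")

# A genuinely pre-existing run still surfaces as a conflict.
try:
    trigger(FakeApi(preexisting={("child", "manual__1")}), "child", "manual__1")
    preexisting_outcome = "success"
except Conflict:
    preexisting_outcome = "conflict"
```

A caveat with this approach: the GET alone cannot prove that this client created the run, so a run created concurrently by another caller with the same run ID would also be accepted after an ambiguous error.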
### Code pointers
`task-sdk/src/airflow/sdk/api/client.py`
`task-sdk/src/airflow/sdk/execution_time/task_runner.py`
### Are you willing to submit PR?
Yes.