ephraimbuddy commented on issue #57618: URL: https://github.com/apache/airflow/issues/57618#issuecomment-3729237417
I have created another PR to deal specifically with the log you got. Can you try it: https://github.com/apache/airflow/pull/60330. If it fixes the problem and there is a 3.1.6rc2, we will include it there.

From the log `scheduler_race_condition_example.json`:

- 01:21:46 Scheduler pod A enqueues the TI as try 1: `"2026-01-09T01:21:46.380591Z [info ] Add task TaskInstanceKey(dag_id='sadp_dag_196', task_id='task_1',...`
- 01:22:54 Scheduler pod B enqueues the same TI again as try 2: `"2026-01-09T01:22:54.802854Z [info ] Add task TaskInstanceKey(dag_id='sadp_dag_196', task_id='task_1', ...`

This second enqueue happens before any worker starts (the first worker "Executing workload" line is at 01:24:48), so the try bump happens purely in scheduler logic. Then:

- 01:24:48 → 01:25:03 the worker runs try 2 and the Execution API marks the TI success.
- 01:26:30 a worker pod later starts try 1, but the API rejects it: `Cannot start Task Instance in invalid state ... previous_state=success /run ... 409 Conflict`

This happened because `DagRun.schedule_tis()` updated rows using only `TI.id.in_(...)`. In HA, scheduler B can have a stale in-memory view that still treats the TI as schedulable, run `schedule_tis()`, and increment `try_number` even though scheduler A has already advanced the TI (see the first sketch at the end of this comment). I reproduced it in unit tests, but please also try the PR in your deployment.

> how should I check it?

The case you saw is different from adoption, so it won't be recorded in TI history. The adoption case, which was already fixed, happens only when a scheduler is marked failed and another scheduler, unable to adopt the task, resets it. If you want to check whether there are cases like that in your deployment, you can run `select * from task_instance_history where dag_id=...` (a small snippet for running that query is included after the sketch below).
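To make the stale-scheduler failure mode concrete, here is a minimal, self-contained sketch. It is not Airflow's real models or the actual change in the PR above; the `TI` class, the state names, and the guard condition are simplified stand-ins. It contrasts a bulk UPDATE filtered on `id` alone with one that also checks the row's current state, so a scheduler working from a stale in-memory view becomes a no-op instead of bumping `try_number`:

```python
# Illustrative sketch only, not Airflow's actual code or the fix in the PR.
# It shows why filtering a bulk UPDATE on id alone lets a stale scheduler
# bump try_number, and how a state guard turns that stale UPDATE into a no-op.
from sqlalchemy import Column, Integer, String, create_engine, update
from sqlalchemy.orm import Session, declarative_base

Base = declarative_base()


class TI(Base):
    """Simplified stand-in for a task instance row."""
    __tablename__ = "ti"
    id = Column(Integer, primary_key=True)
    state = Column(String)
    try_number = Column(Integer, default=0)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)

with Session(engine) as session:
    # Scheduler A has already run this TI to completion as try 1.
    session.add(TI(id=1, state="success", try_number=1))
    session.commit()

    # Scheduler B's stale in-memory view still considers id=1 schedulable.
    stale_ids = [1]

    # Guarded UPDATE: also require the row to still be in a schedulable
    # state, so the stale scheduler matches zero rows and changes nothing.
    guarded = session.execute(
        update(TI)
        .where(TI.id.in_(stale_ids), TI.state.in_(["none", "scheduled"]))
        .values(state="scheduled", try_number=TI.try_number + 1)
    )
    print("guarded rows matched:", guarded.rowcount)  # 0

    # Unguarded UPDATE (the problematic pattern): filters on id only, so the
    # already-successful TI gets re-queued as try 2.
    unguarded = session.execute(
        update(TI)
        .where(TI.id.in_(stale_ids))
        .values(state="scheduled", try_number=TI.try_number + 1)
    )
    print("unguarded rows matched:", unguarded.rowcount)  # 1
```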

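And for the adoption-case check, a minimal way to run the suggested `task_instance_history` query from Python (assumptions: direct read access to the metadata database, a placeholder connection URL, and the `dag_id` taken from the log above; adjust both for your deployment):

```python
# Hedged sketch: runs the suggested task_instance_history query with a bound
# dag_id. The connection URL is a placeholder for your metadata database.
from sqlalchemy import create_engine, text

engine = create_engine("postgresql+psycopg2://user:password@host/airflow")  # placeholder URL

query = text("SELECT * FROM task_instance_history WHERE dag_id = :dag_id")

with engine.connect() as conn:
    for row in conn.execute(query, {"dag_id": "sadp_dag_196"}):  # dag_id from the log above
        print(dict(row._mapping))
```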