ephraimbuddy commented on code in PR #59650:
URL: https://github.com/apache/airflow/pull/59650#discussion_r2655216920
##########
airflow-core/src/airflow/api_fastapi/execution_api/routes/task_instances.py:
##########
@@ -384,6 +384,17 @@ def ti_update_state(
)
if previous_state != TaskInstanceState.RUNNING:
+ # In HA, it's possible to receive a "late" finish/state update after
another
+ # component already moved the TI to a terminal state. Treat this as an
idempotent no-op to avoid
+ # crashing the process.
Review Comment:
I'm going to verify this more but here's what I thought was going on:
Scheduler A tried to start X but the scheduler was marked failed. Scheduler
B picked up the task(couldn't adopt) after resetting it and start X in another
pod. At this point we now have two pods running. So I think one of the pods
received an update state before the other.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]