rjh-yext opened a new issue, #59378:
URL: https://github.com/apache/airflow/issues/59378
### Apache Airflow version
Other Airflow 2/3 version (please specify below)
### If "Other Airflow 2/3 version" selected, which one?
apache/airflow:3.1.3 Docker image
### What happened?
We're seeing an issue where a task is started, then stopped because Airflow
thinks it should not be running, and then restarted multiple times. This
results in the task starting execution multiple times, but it appears that
Airflow loses track of (or ignores) the execution result. Note that the
requests to restart occur within (milli)seconds of the task first starting. In
some cases, there are several retries (11x), and the dag is marked as `Failed`
though the offending tasks are marked as `Skipped`, when they clearly have been
attempted multiple times.
Our deployment of Airflow has two instances of the Scheduler running, and
we've seen this error occur both when the task is re/started from the same
instance, and when it has been re/started from different instances of the
Scheduler.
One example sequence of Scheduler log entries is as follows. There
does not appear to be any other relevant or associated logs within the time
frame, but I can provide any further logs if requested. In this case, there are
no indications of any error or restart of the tasks in the Dag run logs.
`[2025-12-08T08:00:01.522+0000] {{_client.py:1026}} INFO - HTTP Request:
PATCH
http://myinstance/execution/task-instances/019afcf9-7ee5-713b-be96-758c026e7d15/run
"HTTP/1.1 200 OK"`
`2025-12-08 08:00:01 [debug ] Sending [supervisor]
msg=StartupDetails(ti=TaskInstance(id=UUID('019afcf9-7ee5-713b-be96-758c026e7d15'),
task_id='MyTask', dag_id='MyDag',
run_id='scheduled__2025-12-07T08:00:00+00:00', try_number=1, ... `
`[2025-12-08T08:01:22.494+0000] {{_client.py:1026}} INFO - HTTP Request:
PATCH
http://myinstance/execution/task-instances/019afcf9-7ee5-713b-be96-758c026e7d15/run
"HTTP/1.1 200 OK"`
`2025-12-08 08:01:22 [debug ] Sending [supervisor]
msg=StartupDetails(ti=TaskInstance(id=UUID('019afcf9-7ee5-713b-be96-758c026e7d15'),
task_id='MyTask', dag_id='MyDag',
run_id='scheduled__2025-12-07T08:00:00+00:00', try_number=2`
`[2025-12-08T08:01:22.239+0000] {{_client.py:1026}} INFO - HTTP Request: PUT
http://myinstance/execution/task-instances/019afcf9-7ee5-713b-be96-758c026e7d15/heartbeat
"HTTP/1.1 409 Conflict"`
`2025-12-08 08:01:22 [error ] Server indicated the task shouldn't be
running anymore [supervisor] detail={'detail': {'reason': 'not_running',
'message': 'TI is no longer in the running state and task should terminate',
'current_state': 'scheduled'}} status_code=409
ti_id=UUID('019afcf9-7ee5-713b-be96-758c026e7d15')`
`[2025-12-08T08:01:22.642+0000] {{_client.py:1026}} INFO - HTTP Request: PUT
http://airflowwebserver.service.nj1.consul:6002/execution/task-instances/019afcf9-7ee5-713b-be96-758c026e7d15/rtif
"HTTP/1.1 201 Created"`
Occasionally, the dag logs will output something like the following before
restarting the task:
`2025-12-08 01:18:04.771 | Server indicated the task shouldn't be running
anymore. Terminating process`
`2025-12-08 01:18:04.771 | Task killed!`
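
The Scheduler excerpt above is not listed in strict timestamp order, so it may help to sort the HTTP request lines before reading them. The following is a minimal sketch, not part of Airflow: the regular expression and the `parse_http_events` helper are hypothetical and assume the `_client.py` log format is exactly as quoted in this issue.

```python
import re
from datetime import datetime

# Hypothetical pattern for the "HTTP Request" lines quoted in this issue.
# It tolerates both "{_client.py:N}" and "{{_client.py:N}}".
HTTP_LINE = re.compile(
    r"\[(?P<ts>[^\]]+)\]\s+\{\{?_client\.py:\d+\}\}?\s+INFO - HTTP Request:\s+"
    r"(?P<method>[A-Z]+)\s+(?P<url>\S+)\s+\"HTTP/1\.1 (?P<status>\d{3})"
)

# The four HTTP request lines from the excerpt, in the order they were pasted.
SAMPLE_LOG = """
[2025-12-08T08:00:01.522+0000] {{_client.py:1026}} INFO - HTTP Request: PATCH
http://myinstance/execution/task-instances/019afcf9-7ee5-713b-be96-758c026e7d15/run "HTTP/1.1 200 OK"
[2025-12-08T08:01:22.494+0000] {{_client.py:1026}} INFO - HTTP Request: PATCH
http://myinstance/execution/task-instances/019afcf9-7ee5-713b-be96-758c026e7d15/run "HTTP/1.1 200 OK"
[2025-12-08T08:01:22.239+0000] {{_client.py:1026}} INFO - HTTP Request: PUT
http://myinstance/execution/task-instances/019afcf9-7ee5-713b-be96-758c026e7d15/heartbeat "HTTP/1.1 409 Conflict"
[2025-12-08T08:01:22.642+0000] {{_client.py:1026}} INFO - HTTP Request: PUT
http://airflowwebserver.service.nj1.consul:6002/execution/task-instances/019afcf9-7ee5-713b-be96-758c026e7d15/rtif "HTTP/1.1 201 Created"
"""


def parse_http_events(log_text):
    """Return (timestamp, method, endpoint, status) tuples sorted by timestamp."""
    # Collapse hard-wrapped lines so each log entry can be matched in one pass.
    flat = " ".join(log_text.split())
    events = []
    for m in HTTP_LINE.finditer(flat):
        ts = datetime.strptime(m["ts"], "%Y-%m-%dT%H:%M:%S.%f%z")
        endpoint = m["url"].rsplit("/", 1)[-1]  # last path segment, e.g. "run"
        events.append((ts, m["method"], endpoint, int(m["status"])))
    return sorted(events)


if __name__ == "__main__":
    for ts, method, endpoint, status in parse_http_events(SAMPLE_LOG):
        print(ts.isoformat(timespec="milliseconds"), method, endpoint, status)
```

Sorted this way, the `409 Conflict` heartbeat response (08:01:22.239) precedes the second `PATCH .../run` (08:01:22.494) by about 255 ms. This only reorders what was logged; it does not by itself show what moved the task instance back to `scheduled`.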
### What you think should happen instead?
_No response_
### How to reproduce
Seems to occur sporadically, and not in any consistent manner. The dags in
which this occurs also vary.
### Operating System
Debian GNU/Linux 12 (bookworm)
### Versions of Apache Airflow Providers
apache-airflow-providers-fab == 3.0.2
apache-airflow-providers-google == 15.1.0
apache-airflow-providers-slack == 9.5.0
apache-airflow-providers-standard == 1.9.0
### Deployment
Other Docker-based deployment
### Deployment details
Two instances of the Airflow Scheduler are deployed.
### Anything else?
_No response_
### Are you willing to submit PR?
- [ ] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)