wolvery opened a new issue, #58562:
URL: https://github.com/apache/airflow/issues/58562
### Apache Airflow version
3.1.3
### If "Other Airflow 2/3 version" selected, which one?
_No response_
### What happened?
On some workers, task subprocesses are intermittently killed with exit code -9
(SIGKILL). The error occurs during the SDK client's PATCH call to the API server
(`task-instances/{id}/run`) and is accompanied by `ServerResponseError: Server
returned error`.
From the API Server, we can see this error:
10.80.85.20:49928 - "PATCH
/execution/task-instances/019a5dd0-200b-7ef9-9c6e-4ce858d2a12c/run HTTP/1.1"
409 Conflict
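To correlate these responses with worker failures, the API server access log can be scanned for 409 responses to the `/run` endpoint. The following is only a sketch: it assumes each access-log entry sits on a single line in the format quoted above, and the helper name `conflicting_task_instances` is hypothetical, not part of Airflow.

```python
import re
from collections import Counter

# Matches access-log entries for the task-instance "run" endpoint.
# Assumption: one entry per line, formatted like the line quoted above.
RUN_CALL = re.compile(
    r'"PATCH\s+/execution/task-instances/(?P<ti_id>[0-9a-f-]{36})/run\s+'
    r'HTTP/[\d.]+"\s+(?P<status>\d{3})'
)


def conflicting_task_instances(lines):
    """Count 409 responses to the /run endpoint per task instance id."""
    conflicts = Counter()
    for line in lines:
        match = RUN_CALL.search(line)
        if match and match.group("status") == "409":
            conflicts[match.group("ti_id")] += 1
    return conflicts


# The access-log entry from this report, joined onto one line.
sample = [
    '10.80.85.20:49928 - "PATCH '
    "/execution/task-instances/019a5dd0-200b-7ef9-9c6e-4ce858d2a12c/run "
    'HTTP/1.1" 409 Conflict',
]

for ti_id, hits in conflicting_task_instances(sample).items():
    print(ti_id, hits)
```

A task instance id that appears more than once would suggest the same task instance was asked to start several times.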
This is the log of the Worker:
```
[info [] [Metric Exporter[] Connecting to OpenTelemetry Collector at ...
{"timestamp":"2025-11-07T10:20:36.718192Z","level":"info","event":"Executing
workload","workload":"ExecuteTask(token='exxxxx',
ti=TaskInstance(id=UUID('019a5dd0-200b-7ef9-9c6e-4ce858d2a12c'),
dag_version_id=UUID('019a5dbb-f809-73fe-aaec-c80c3c901047'),
task_id='log_tasks_specs', dag_id='nonendemiccampaigncreatedeventv3__r',
run_id='scheduled__2025-11-07T09:15:00+00:00', try_number=1, map_index=-1,
pool_slots=1, queue='default', priority_weight=4, executor_config=None,
parent_context_carrier={}, context_carrier={}),
dag_rel_path=PurePosixPath('revision_dags/process.py'),
bundle_info=BundleInfo(name='dags-folder', version=None),
log_path='dag_id=nonendemiccampaigncreatedeventv3__r/run_id=scheduled__2025-11-07T09:15:00+00:00/task_id=log_tasks_specs/attempt=1.log',
type='ExecuteTask')","logger":"__main__","filename":"execute_workload.py","lineno":56}
{"timestamp":"2025-11-07T10:20:36.719045Z","level":"info","event":"Connecting
to
server:","server":"http://workflow-manager-priority-api-server:8080/execution/","logger":"__main__","filename":"execute_workload.py","lineno":64}
{"timestamp":"2025-11-07T10:20:36.809907Z","level":"info","event":"Secrets
backends loaded for
worker","count":1,"backend_classes":["EnvironmentVariablesBackend"],"logger":"supervisor","filename":"supervisor.py","lineno":1870}
{"timestamp":"2025-11-07T10:20:36.888747Z","level":"info","event":"Process
exited","pid":18,"exit_code":-9,"signal_sent":"SIGKILL","logger":"supervisor","filename":"supervisor.py","lineno":709}
Traceback (most recent call last):
File "/usr/python/lib/python3.10/runpy.py", line 196, in
_run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/python/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/execute_workload.py",
line 125, in <module>
main()
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/execute_workload.py",
line 121, in main
execute_workload(workload)
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/execute_workload.py",
line 66, in execute_workload
supervise(
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/supervisor.py",
line 1878, in supervise
process = ActivitySubprocess.start(
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/supervisor.py",
line 940, in start
proc._on_child_started(ti=what, dag_rel_path=dag_rel_path,
bundle_info=bundle_info)
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/supervisor.py",
line 951, in _on_child_started
ti_context = self.client.task_instances.start(ti.id, self.pid,
start_date)
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/api/client.py",
line 210, in start
resp = self.client.patch(f"task-instances/{id}/run",
content=body.model_dump_json())
File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py",
line 1218, in patch
return self.request(
File
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line
338, in wrapped_f
return copy(f, *args, **kw)
File
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line
477, in __call__
do = self.iter(retry_state=retry_state)
File
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line
378, in iter
result = action(retry_state)
File
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line
400, in <lambda>
self._add_action_func(lambda rs: rs.outcome.result())
File "/usr/python/lib/python3.10/concurrent/futures/_base.py", line 451,
in result
return self.__get_result()
File "/usr/python/lib/python3.10/concurrent/futures/_base.py", line 403,
in __get_result
raise self._exception
File
"/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line
480, in __call__
result = fn(*args, **kwargs)
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/api/client.py",
line 861, in request
return super().request(*args, **kwargs)
File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py",
line 825, in request
return self.send(request, auth=auth, follow_redirects=follow_redirects)
File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py",
line 914, in send
response = self._send_handling_auth(
File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py",
line 942, in _send_handling_auth
response = self._send_handling_redirects(
File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py",
line 999, in _send_handling_redirects
raise exc
File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py",
line 982, in _send_handling_redirects
hook(response)
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/api/client.py",
line 175, in raise_on_4xx_5xx
return get_json_error(response) or response.raise_for_status()
File
"/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/api/client.py",
line 171, in get_json_error
raise err
airflow.sdk.api.client.ServerResponseError: Server returned error
```
### What you think should happen instead?
It should execute without issues.
### How to reproduce
It happens randomly throughout the day. We had 957 occurrences.
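For anyone trying to quantify this on their own deployment, here is a minimal sketch of tallying supervisor exit events from structured worker logs. It assumes one JSON record per line with the `event` and `exit_code` fields seen in the worker log above; `count_exits` is a hypothetical helper and is not how the number above was obtained.

```python
import json
from collections import Counter


def count_exits(lines):
    """Tally supervisor "Process exited" events by exit code."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip tracebacks and other non-JSON lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or malformed records
        if record.get("event") == "Process exited":
            counts[record.get("exit_code")] += 1
    return counts


# The "Process exited" record from this report, joined onto one line.
sample = [
    '{"timestamp":"2025-11-07T10:20:36.888747Z","level":"info",'
    '"event":"Process exited","pid":18,"exit_code":-9,'
    '"signal_sent":"SIGKILL","logger":"supervisor",'
    '"filename":"supervisor.py","lineno":709}',
    "Traceback (most recent call last):",
]

sigkill_exits = count_exits(sample)[-9]
print(f"SIGKILL exits: {sigkill_exits}")
```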
### Operating System
K8s
### Versions of Apache Airflow Providers
We follow the constraints file for this Airflow version, using the Kubernetes
executor / Kubernetes operator.
### Deployment
Official Apache Airflow Helm Chart
### Deployment details
_No response_
### Anything else?
Similar in spirit to issue
[#57961](https://github.com/apache/airflow/issues/57961?utm_source=chatgpt.com),
but this one includes the SIGKILL / subprocess exit.
### Are you willing to submit PR?
- [x] Yes I am willing to submit a PR!
### Code of Conduct
- [x] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)