wolvery opened a new issue, #58562:
URL: https://github.com/apache/airflow/issues/58562

   ### Apache Airflow version
   
   3.1.3
   
   ### If "Other Airflow 2/3 version" selected, which one?
   
   _No response_
   
   ### What happened?
   
   On some workers, task subprocesses are intermittently killed with exit code -9
   (SIGKILL). This happens when the SDK client's PATCH call to the API server
   (task-instances/{id}/run) fails with ServerResponseError: Server returned error.
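
For readers less familiar with the -9 exit code: on POSIX, Python reports a child terminated by signal N as return code -N, so -9 corresponds to SIGKILL. The sketch below is not Airflow code. It only reproduces that convention with the standard library, assuming a POSIX system. The `signal_sent` field in the worker log suggests the supervisor itself sent the signal, but that is a reading of the log, not something confirmed here.

```python
import signal
import subprocess
import sys

# Start a child that would otherwise sleep for a minute.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

# Kill it the way a supervising parent might; SIGKILL cannot be caught.
child.send_signal(signal.SIGKILL)
returncode = child.wait(timeout=10)

# On POSIX, a negative return code -N means "terminated by signal N".
killed_by = signal.Signals(-returncode).name if returncode < 0 else None
print(returncode, killed_by)
```

An exit code of -9 by itself does not say who sent the signal: a kernel OOM kill and a deliberate kill by the parent look the same from the return code alone.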
   
   On the API server, the matching request is logged with a 409 Conflict:
   10.80.85.20:49928 - "PATCH /execution/task-instances/019a5dd0-200b-7ef9-9c6e-4ce858d2a12c/run HTTP/1.1" 409 Conflict
   
   This is the log of the Worker:
   ```
    [info     ] [Metric Exporter] Connecting to OpenTelemetry Collector at ...
   {"timestamp":"2025-11-07T10:20:36.718192Z","level":"info","event":"Executing workload","workload":"ExecuteTask(token='exxxxx', ti=TaskInstance(id=UUID('019a5dd0-200b-7ef9-9c6e-4ce858d2a12c'), dag_version_id=UUID('019a5dbb-f809-73fe-aaec-c80c3c901047'), task_id='log_tasks_specs', dag_id='nonendemiccampaigncreatedeventv3__r', run_id='scheduled__2025-11-07T09:15:00+00:00', try_number=1, map_index=-1, pool_slots=1, queue='default', priority_weight=4, executor_config=None, parent_context_carrier={}, context_carrier={}), dag_rel_path=PurePosixPath('revision_dags/process.py'), bundle_info=BundleInfo(name='dags-folder', version=None), log_path='dag_id=nonendemiccampaigncreatedeventv3__r/run_id=scheduled__2025-11-07T09:15:00+00:00/task_id=log_tasks_specs/attempt=1.log', type='ExecuteTask')","logger":"__main__","filename":"execute_workload.py","lineno":56}
   {"timestamp":"2025-11-07T10:20:36.719045Z","level":"info","event":"Connecting to server:","server":"http://workflow-manager-priority-api-server:8080/execution/","logger":"__main__","filename":"execute_workload.py","lineno":64}
   {"timestamp":"2025-11-07T10:20:36.809907Z","level":"info","event":"Secrets backends loaded for worker","count":1,"backend_classes":["EnvironmentVariablesBackend"],"logger":"supervisor","filename":"supervisor.py","lineno":1870}
   {"timestamp":"2025-11-07T10:20:36.888747Z","level":"info","event":"Process exited","pid":18,"exit_code":-9,"signal_sent":"SIGKILL","logger":"supervisor","filename":"supervisor.py","lineno":709}
   Traceback (most recent call last):
     File "/usr/python/lib/python3.10/runpy.py", line 196, in _run_module_as_main
       return _run_code(code, main_globals, None,
     File "/usr/python/lib/python3.10/runpy.py", line 86, in _run_code
       exec(code, run_globals)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/execute_workload.py", line 125, in <module>
       main()
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/execute_workload.py", line 121, in main
       execute_workload(workload)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/execute_workload.py", line 66, in execute_workload
       supervise(
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/supervisor.py", line 1878, in supervise
       process = ActivitySubprocess.start(
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/supervisor.py", line 940, in start
       proc._on_child_started(ti=what, dag_rel_path=dag_rel_path, bundle_info=bundle_info)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/execution_time/supervisor.py", line 951, in _on_child_started
       ti_context = self.client.task_instances.start(ti.id, self.pid, start_date)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/api/client.py", line 210, in start
       resp = self.client.patch(f"task-instances/{id}/run", content=body.model_dump_json())
     File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py", line 1218, in patch
       return self.request(
     File "/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 338, in wrapped_f
       return copy(f, *args, **kw)
     File "/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 477, in __call__
       do = self.iter(retry_state=retry_state)
     File "/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 378, in iter
       result = action(retry_state)
     File "/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 400, in <lambda>
       self._add_action_func(lambda rs: rs.outcome.result())
     File "/usr/python/lib/python3.10/concurrent/futures/_base.py", line 451, in result
       return self.__get_result()
     File "/usr/python/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
       raise self._exception
     File "/home/airflow/.local/lib/python3.10/site-packages/tenacity/__init__.py", line 480, in __call__
       result = fn(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/api/client.py", line 861, in request
       return super().request(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py", line 825, in request
       return self.send(request, auth=auth, follow_redirects=follow_redirects)
     File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py", line 914, in send
       response = self._send_handling_auth(
     File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py", line 942, in _send_handling_auth
       response = self._send_handling_redirects(
     File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py", line 999, in _send_handling_redirects
       raise exc
     File "/home/airflow/.local/lib/python3.10/site-packages/httpx/_client.py", line 982, in _send_handling_redirects
       hook(response)
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/api/client.py", line 175, in raise_on_4xx_5xx
       return get_json_error(response) or response.raise_for_status()
     File "/home/airflow/.local/lib/python3.10/site-packages/airflow/sdk/api/client.py", line 171, in get_json_error
       raise err
   airflow.sdk.api.client.ServerResponseError: Server returned error
   ```
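
To gauge how often this happens, one option is to scan worker logs for the supervisor's "Process exited" record with exit code -9. The following is a minimal sketch: the helper name `count_sigkill_exits` is hypothetical, and it assumes the JSON field names (`event`, `exit_code`) match the log lines shown above and that each record sits on a single line.

```python
import json


def count_sigkill_exits(lines):
    """Count JSON log records reporting a subprocess exit with code -9."""
    count = 0
    for line in lines:
        line = line.strip()
        # Skip non-JSON lines such as traceback frames.
        if not line.startswith("{"):
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue
        if record.get("event") == "Process exited" and record.get("exit_code") == -9:
            count += 1
    return count


# Sample lines copied from the worker log in this report.
sample = [
    '{"timestamp":"2025-11-07T10:20:36.809907Z","level":"info","event":"Secrets backends loaded for worker","count":1,"backend_classes":["EnvironmentVariablesBackend"],"logger":"supervisor","filename":"supervisor.py","lineno":1870}',
    '{"timestamp":"2025-11-07T10:20:36.888747Z","level":"info","event":"Process exited","pid":18,"exit_code":-9,"signal_sent":"SIGKILL","logger":"supervisor","filename":"supervisor.py","lineno":709}',
    "Traceback (most recent call last):",
]
print(count_sigkill_exits(sample))
```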
   
   ### What you think should happen instead?
   
   The task should start and execute without the subprocess being killed.
   
   ### How to reproduce
   
   It happens randomly throughout the day. We have had 957 occurrences.
   
   ### Operating System
   
   K8s
   
   ### Versions of Apache Airflow Providers
   
   We follow the constraints file for this Airflow version, with the Kubernetes
   executor and the Kubernetes operator.
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   Similar in spirit to issue
   [#57961](https://github.com/apache/airflow/issues/57961?utm_source=chatgpt.com),
   but this one also involves a SIGKILL and the subprocess exiting.
   
   ### Are you willing to submit PR?
   
   - [x] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [x] I agree to follow this project's [Code of 
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

