paramjeet01 opened a new issue, #38288:
URL: https://github.com/apache/airflow/issues/38288

   ### Apache Airflow version
   
   Other Airflow 2 version (please specify below)
   
   ### If "Other Airflow 2 version" selected, which one?
   
   2.7.3
   
   ### What happened?
   
   Random Kubernetes API exceptions are thrown in the Airflow scheduler:
   ```
   [2024-03-19T14:05:19.836+0000] {kubernetes_executor.py:239} INFO - Found 0 queued task instances
   [2024-03-19T14:05:23.984+0000] {kubernetes_executor_utils.py:121} ERROR - Unknown error in KubernetesJobWatcher. Failing
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py", line 112, in run
       self.resource_version = self._run(
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/executors/kubernetes_executor_utils.py", line 168, in _run
       for event in self._pod_events(kube_client=kube_client, query_kwargs=kwargs):
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/watch/watch.py", line 195, in stream
       raise client.rest.ApiException(
   kubernetes.client.exceptions.ApiException: (410)
   ```
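
For context, HTTP 410 (Gone) on a Kubernetes watch generally means the `resourceVersion` the watcher asked for is too old, and the usual client response is to restart the watch from a fresh version. A minimal sketch of that pattern follows; it is not the Airflow implementation. `ApiError`, `watch_with_restart`, and the `stream` callable are hypothetical stand-ins (the real exception is `kubernetes.client.exceptions.ApiException`), and resetting to `"0"` is an assumption made for illustration.

```python
from typing import Callable, Iterable, Iterator, Tuple


class ApiError(Exception):
    """Hypothetical stand-in for kubernetes.client.exceptions.ApiException."""

    def __init__(self, status: int) -> None:
        super().__init__(f"({status})")
        self.status = status


def watch_with_restart(
    stream: Callable[[str], Iterable[Tuple[str, str]]],
    resource_version: str = "0",
    max_restarts: int = 3,
) -> Iterator[str]:
    """Yield event names, restarting the watch from "0" on HTTP 410.

    `stream` is assumed to take a resource version and yield
    (new_resource_version, event_name) pairs.
    """
    restarts = 0
    while True:
        try:
            for version, event in stream(resource_version):
                # Remember the last version seen so a restart could resume from it.
                resource_version = version
                yield event
            return
        except ApiError as err:
            # Anything other than 410, or too many restarts, is re-raised unchanged.
            if err.status != 410 or restarts >= max_restarts:
                raise
            restarts += 1
            # The requested version is gone on the server; start over.
            resource_version = "0"
```

With this shape, a stale version costs one extra watch call instead of failing the watcher; `max_restarts` bounds the loop if the server keeps answering 410.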
   The task itself succeeded, but Airflow marks it as failed with this error:
   ```
   [2024-03-19, 14:00:57 UTC] {pod_manager.py:798} INFO - Running command... if [ -s /airflow/xcom/return.json ]; then cat /airflow/xcom/return.json; else echo __airflow_xcom_result_empty__; fi
   [2024-03-19, 14:01:42 UTC] {taskinstance.py:1937} ERROR - Task failed with exception
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/stream/ws_client.py", line 523, in websocket_call
       client = WSClient(configuration, url, headers, capture_all)
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/stream/ws_client.py", line 65, in __init__
       self.sock = create_websocket(configuration, url, headers)
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/stream/ws_client.py", line 489, in create_websocket
       websocket.connect(url, **connect_opt)
     File "/home/airflow/.local/lib/python3.9/site-packages/websocket/_core.py", line 255, in connect
       self.handshake_response = handshake(self.sock, url, *addrs, **options)
     File "/home/airflow/.local/lib/python3.9/site-packages/websocket/_handshake.py", line 57, in handshake
       status, resp = _get_resp_headers(sock)
     File "/home/airflow/.local/lib/python3.9/site-packages/websocket/_handshake.py", line 150, in _get_resp_headers
       raise WebSocketBadStatusException("Handshake status {status} {message} -+-+- {headers} -+-+- {body}".format(status=status, message=status_message, headers=resp_headers, body=response_body), status, status_message, resp_headers, response_body)
   websocket._exceptions.WebSocketBadStatusException: Handshake status 404 Not Found -+-+- {'audit-id': 'f06dc16c-c88c-41fb-8bea-1c993d4c0ef4', 'cache-control': 'no-cache, private', 'content-type': 'application/json', 'date': 'Tue, 19 Mar 2024 14:01:19 GMT', 'content-length': '214'} -+-+- b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"download-parse-0833b34s\\" not found","reason":"NotFound","details":{"name":"download-parse-0833b34s","kind":"pods"},"code":404}\n'
   During handling of the above exception, another exception occurred:
   Traceback (most recent call last):
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 730, in extract_xcom
       result = self.extract_xcom_json(pod)
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
       return self(f, *args, **kw)
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
       do = self.iter(retry_state=retry_state)
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 325, in iter
       raise retry_exc.reraise()
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 158, in reraise
       raise self.last_attempt.result()
     File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
       return self.__get_result()
     File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
       raise self._exception
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __call__
       result = fn(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 743, in extract_xcom_json
       kubernetes_stream(
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/stream/stream.py", line 35, in _websocket_request
       return api_method(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", line 994, in connect_get_namespaced_pod_exec
       return self.connect_get_namespaced_pod_exec_with_http_info(name, namespace, **kwargs)  # noqa: E501
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", line 1101, in connect_get_namespaced_pod_exec_with_http_info
       return self.api_client.call_api(
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 348, in call_api
       return self.__call_api(resource_path, method,
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
       response_data = self.request(
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/stream/ws_client.py", line 529, in websocket_call
       raise ApiException(status=0, reason=str(e))
   kubernetes.client.exceptions.ApiException: (0)
   Reason: Handshake status 404 Not Found -+-+- {'audit-id': 'f06dc16c-c88c-41fb-8bea-1c993d4c0ef4', 'cache-control': 'no-cache, private', 'content-type': 'application/json', 'date': 'Tue, 19 Mar 2024 14:01:19 GMT', 'content-length': '214'} -+-+- b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"pods \\"download-parse-0833b34s\\" not found","reason":"NotFound","details":{"name":"download-parse-0833b34s","kind":"pods"},"code":404}\n'
   During handling of the above exception, another exception occurred:
   Traceback (most recent call last):
     File "/opt/airflow/plugins/operators/kubernetes_pod_operator.py", line 207, in execute
       result = self.extract_xcom(pod=self.pod)
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/operators/pod.py", line 557, in extract_xcom
       result = self.pod_manager.extract_xcom(pod)
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 733, in extract_xcom
       self.extract_xcom_kill(pod)
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 289, in wrapped_f
       return self(f, *args, **kw)
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 379, in __call__
       do = self.iter(retry_state=retry_state)
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 325, in iter
       raise retry_exc.reraise()
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 158, in reraise
       raise self.last_attempt.result()
     File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
       return self.__get_result()
     File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
       raise self._exception
     File "/home/airflow/.local/lib/python3.9/site-packages/tenacity/__init__.py", line 382, in __call__
       result = fn(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/utils/pod_manager.py", line 779, in extract_xcom_kill
       kubernetes_stream(
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/stream/stream.py", line 35, in _websocket_request
       return api_method(*args, **kwargs)
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", line 994, in connect_get_namespaced_pod_exec
       return self.connect_get_namespaced_pod_exec_with_http_info(name, namespace, **kwargs)  # noqa: E501
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/api/core_v1_api.py", line 1101, in connect_get_namespaced_pod_exec_with_http_info
       return self.api_client.call_api(
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 348, in call_api
       return self.__call_api(resource_path, method,
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
       response_data = self.request(
     File "/home/airflow/.local/lib/python3.9/site-packages/kubernetes/stream/ws_client.py", line 529, in websocket_call
       raise ApiException(status=0, reason=str(e)
   ```
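
Note that the `ApiException` raised from `ws_client.py` above carries `status=0`, with the real 404 visible only inside the `reason` string, so a handler that checks `e.status == 404` would not match it. Below is a minimal sketch of recovering the status from the reason text, assuming the `Handshake status NNN` prefix shown in the log. `effective_status` and `pod_is_gone` are hypothetical helpers, not part of Airflow or the Kubernetes client, and whether a missing pod should count as success is a separate decision.

```python
import re
from typing import Optional

# Matches the prefix seen in the log: "Handshake status 404 Not Found ..."
_HANDSHAKE_STATUS = re.compile(r"Handshake status (\d{3})")


def effective_status(status: int, reason: Optional[str]) -> int:
    """Return the HTTP status, falling back to the one embedded in `reason`.

    The websocket path in the traceback raises ApiException(status=0, ...),
    so the original HTTP code has to be parsed out of the message.
    """
    if status:
        return status
    match = _HANDSHAKE_STATUS.search(reason or "")
    return int(match.group(1)) if match else 0


def pod_is_gone(status: int, reason: Optional[str]) -> bool:
    """True when the error indicates the pod no longer exists (HTTP 404)."""
    return effective_status(status, reason) == 404
```

A caller could use `pod_is_gone(e.status, e.reason)` inside an `except ApiException as e:` block to tell "pod already deleted" apart from other exec failures.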
   
   
   ### What you think should happen instead?
   
   The task should be marked as successful.
   
   ### How to reproduce
   
   Run a task with the XCom sidecar enabled.
   
   ### Operating System
   
   Amazon Linux 2
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Official Apache Airflow Helm Chart
   
   ### Deployment details
   
   _No response_
   
   ### Anything else?
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   

