MatthewRBruce opened a new issue #15456:
URL: https://github.com/apache/airflow/issues/15456


   **Apache Airflow version**: 2.0.1
   
   
   **Kubernetes version (if you are using kubernetes)** (use `kubectl version`): 1.18
   
   **Environment**: GKE
   
   - **Cloud provider or hardware configuration**: GKE on GCP
   - **OS** (e.g. from /etc/os-release): Debian 10
   - **Kernel** (e.g. `uname -a`): 5.4.89+
   
   **What happened**:
   
   When executing a KubernetesPodOperator with `is_delete_operator_pod=True`, 
if the Pod doesn't complete successfully, a 404 error is raised when the 
operator attempts to get the final pod status. This doesn't cause any major 
operational issues for us, since the task fails anyway, but it does confuse 
our users when they look at the logs for their failed runs.
   
   ```
   Traceback (most recent call last):
     File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1112, in _run_raw_task
       self._prepare_and_execute_task_with_callbacks(context, task)
     File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1285, in _prepare_and_execute_task_with_callbacks
       result = self._execute_task(context, task_copy)
     File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1310, in _execute_task
       result = task_copy.execute(context=context)
     File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 341, in execute
       status = self.client.read_namespaced_pod(self.pod.metadata.name, self.namespace)
     File "/usr/local/lib/python3.8/site-packages/kubernetes/client/apis/core_v1_api.py", line 18446, in read_namespaced_pod
       (data) = self.read_namespaced_pod_with_http_info(name, namespace, **kwargs)
     File "/usr/local/lib/python3.8/site-packages/kubernetes/client/apis/core_v1_api.py", line 18524, in read_namespaced_pod_with_http_info
       return self.api_client.call_api('/api/v1/namespaces/{namespace}/pods/{name}', 'GET',
     File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 330, in call_api
       return self.__call_api(resource_path, method,
     File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 163, in __call_api
       response_data = self.request(method, url,
     File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 351, in request
       return self.rest_client.GET(url,
     File "/usr/local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 227, in GET
       return self.request("GET", url,
     File "/usr/local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 222, in request
       raise ApiException(http_resp=r)
   kubernetes.client.rest.ApiException: (404)
   Reason: Not Found
   ```
   
   **What you expected to happen**:
   
   A 404 error should not be raised - the pod should either be deleted after 
the state is retrieved, or the final_state returned from 
`create_new_pod_for_operator` should be used.
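
The first option (read the status, then delete the pod) can be sketched roughly as follows. This is only an illustration of the ordering problem, not the operator's actual code: `FakeClient`, `PodNotFound`, `finish_buggy` and `finish_fixed` are hypothetical names, and the in-memory client merely imitates the `read_namespaced_pod` / `delete_namespaced_pod` calls so the sketch runs without a cluster. It also assumes, based on the traceback above, that the pod is currently deleted before its status is read.

```python
class PodNotFound(Exception):
    """Hypothetical stand-in for kubernetes.client.rest.ApiException (404)."""


class FakeClient:
    """Hypothetical in-memory stand-in for the Kubernetes CoreV1Api client."""

    def __init__(self, pods):
        # Maps (namespace, name) -> pod phase, e.g. "Failed".
        self.pods = dict(pods)

    def read_namespaced_pod(self, name, namespace):
        try:
            return self.pods[(namespace, name)]
        except KeyError:
            raise PodNotFound(f"(404) pod {namespace}/{name} not found") from None

    def delete_namespaced_pod(self, name, namespace):
        self.pods.pop((namespace, name), None)


def finish_buggy(client, name, namespace, final_state, delete_pod=True):
    # Assumed current order: the pod is deleted first, so the later status
    # read hits a pod that no longer exists and surfaces a 404.
    if delete_pod:
        client.delete_namespaced_pod(name, namespace)
    if final_state != "SUCCESS":
        status = client.read_namespaced_pod(name, namespace)
        raise RuntimeError(f"Pod returned a failure: {status}")


def finish_fixed(client, name, namespace, final_state, delete_pod=True):
    # Proposed order: read the status while the pod still exists, and
    # delete the pod afterwards regardless of the outcome.
    status = None
    try:
        if final_state != "SUCCESS":
            status = client.read_namespaced_pod(name, namespace)
    finally:
        if delete_pod:
            client.delete_namespaced_pod(name, namespace)
    if final_state != "SUCCESS":
        raise RuntimeError(f"Pod returned a failure: {status}")
```

With this ordering the task still fails, but the log shows the pod failure itself rather than an unrelated 404.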
   
   **How to reproduce it**:
   Run a KubernetesPodOperator with `is_delete_operator_pod=True` whose Pod 
does not finish in the SUCCESS state.
   
   **Anything else we need to know**:
   This appears to have been introduced in 
https://github.com/apache/airflow/pull/11369, which added the following call 
for the case where the pod state != SUCCESS:
   ```
   status = self.client.read_namespaced_pod(self.pod.metadata.name, self.namespace)
   ```
   


