MatthewRBruce opened a new issue #15456:
URL: https://github.com/apache/airflow/issues/15456
**Apache Airflow version**: 2.0.1
**Kubernetes version (if you are using kubernetes)** (use `kubectl version`): 1.18
**Environment**: GKE
- **Cloud provider or hardware configuration**: GKE on GCP
- **OS** (e.g. from /etc/os-release): Debian 10
- **Kernel** (e.g. `uname -a`): 5.4.89+
**What happened**:
When executing a KubernetesPodOperator with `is_delete_operator_pod=True`,
if the Pod doesn't complete successfully, a 404 error is raised when
attempting to get the final pod status. This doesn't cause any major
operational issues for us, as the task fails anyway, but it does confuse
our users when they look at the logs for their failed runs.
```
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1112, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1285, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1310, in _execute_task
    result = task_copy.execute(context=context)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 341, in execute
    status = self.client.read_namespaced_pod(self.pod.metadata.name, self.namespace)
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/apis/core_v1_api.py", line 18446, in read_namespaced_pod
    (data) = self.read_namespaced_pod_with_http_info(name, namespace, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/apis/core_v1_api.py", line 18524, in read_namespaced_pod_with_http_info
    return self.api_client.call_api('/api/v1/namespaces/{namespace}/pods/{name}', 'GET',
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 330, in call_api
    return self.__call_api(resource_path, method,
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 163, in __call_api
    response_data = self.request(method, url,
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/api_client.py", line 351, in request
    return self.rest_client.GET(url,
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 227, in GET
    return self.request("GET", url,
  File "/usr/local/lib/python3.8/site-packages/kubernetes/client/rest.py", line 222, in request
    raise ApiException(http_resp=r)
kubernetes.client.rest.ApiException: (404)
Reason: Not Found
```
**What you expected to happen**:
A 404 error should not be raised - the pod should either be deleted after
the state is retrieved, or the final_state returned from
`create_new_pod_for_operator` should be used.
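
To illustrate the first option (retrieve the state, then delete), here is a minimal sketch of the ordering. It does not use the Airflow or Kubernetes code: `FakePodClient`, `PodNotFound`, `read_pod_phase`, `delete_pod`, and `finish_pod` are all hypothetical names invented for this example, and the in-memory dict only stands in for the cluster.

```python
class PodNotFound(Exception):
    """Hypothetical stand-in for the 404 raised by the Kubernetes client."""


class FakePodClient:
    """Hypothetical in-memory client; NOT the real Kubernetes API."""

    def __init__(self):
        # One pod that finished unsuccessfully.
        self.pods = {"demo-pod": "Failed"}

    def read_pod_phase(self, name):
        if name not in self.pods:
            raise PodNotFound(name)
        return self.pods[name]

    def delete_pod(self, name):
        self.pods.pop(name, None)


def finish_pod(client, name, delete_pod=True):
    """Read the final phase first, then delete the pod if requested."""
    try:
        phase = client.read_pod_phase(name)
    finally:
        # Cleanup runs even if the read fails, but always after the read.
        if delete_pod:
            client.delete_pod(name)
    return phase


client = FakePodClient()
final_phase = finish_pod(client, "demo-pod")
```

With this ordering the failed state is still available for the task log, and the pod is removed afterwards.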
**How to reproduce it**:
Run a KubernetesPodOperator whose Pod does not finish in the SUCCESS state,
with `is_delete_operator_pod=True`.
**Anything else we need to know**:
This appears to have been introduced here:
https://github.com/apache/airflow/pull/11369 by adding:
```
status = self.client.read_namespaced_pod(self.pod.metadata.name, self.namespace)
```
which runs when the pod state is not SUCCESS.
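
As a rough sketch of the other option, the read could be guarded so that a 404 falls back to the state that is already known. This is only an assumption about how a fix might look, not the change made in Airflow. `FakeApiException`, `read_status_or_fallback`, and `read_deleted_pod` are hypothetical names; the stand-in exception only mirrors the `status` attribute that `kubernetes.client.rest.ApiException` carries.

```python
class FakeApiException(Exception):
    """Hypothetical stand-in for kubernetes.client.rest.ApiException."""

    def __init__(self, status):
        super().__init__(f"({status})")
        self.status = status


def read_status_or_fallback(read_pod, final_state):
    """Return the pod status, or final_state if the pod is already gone."""
    try:
        return read_pod()
    except FakeApiException as exc:
        if exc.status == 404:
            # The pod was already deleted; reuse the state we already have.
            return final_state
        raise  # Any other API error is still surfaced.


def read_deleted_pod():
    # Simulates reading a pod that has already been deleted.
    raise FakeApiException(404)


result = read_status_or_fallback(read_deleted_pod, "failed")
```

Only the 404 case is swallowed, so unrelated API failures would still appear in the task log.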