Jane created AIRFLOW-4544:
-----------------------------
Summary: Intermittent communication issues can cause "Failed to
extract xcom from pod"
Key: AIRFLOW-4544
URL: https://issues.apache.org/jira/browse/AIRFLOW-4544
Project: Apache Airflow
Issue Type: Bug
Components: operators
Affects Versions: 1.10.1
Reporter: Jane
When the Airflow worker runs on one Kubernetes cluster and the task pods created
by KubernetesPodOperator run on another, a number of tasks have been seen to fail
with "Failed to extract xcom from pod".
On inspection, the XCom file definitely exists in the sidecar container.
It appears that there is a communication issue (root cause as yet undiagnosed),
and the Airflow worker has no logic to identify this as a retryable failure.
The task fails even though I would expect success to still be possible.
https://github.com/apache/airflow/blob/1.10.1/airflow/contrib/kubernetes/pod_launcher.py#L155
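One way the missing retry could look, as a minimal sketch and not the actual
pod_launcher.py code. The names extract_with_retry and XComExtractionError are
hypothetical, and extract stands in for whatever call reads the XCom file from
the sidecar container; it is assumed to return None or raise a connection error
when communication with the pod fails.

```python
import time


class XComExtractionError(Exception):
    """Hypothetical error type for "Failed to extract xcom from pod"."""


def extract_with_retry(extract, attempts=3, base_delay=1.0,
                       retry_on=(OSError,), sleep=time.sleep):
    """Call extract() until it yields a result or the attempts run out.

    extract is a hypothetical zero-argument callable that returns the XCom
    payload as a string, or None when nothing came back from the sidecar.
    Exceptions listed in retry_on are treated the same as an empty result.
    """
    for attempt in range(1, attempts + 1):
        try:
            result = extract()
        except retry_on:
            # Assumed to be a transient communication failure.
            result = None
        if result is not None:
            return result
        if attempt < attempts:
            # Exponential backoff: base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
    raise XComExtractionError(
        "Failed to extract xcom from pod after {} attempts".format(attempts)
    )
```

With a wrapper of this kind, the task would only fail once every attempt has
come back empty, so a single dropped connection between the two clusters would
no longer be fatal. The attempt count and delays are illustrative defaults.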
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)