[
https://issues.apache.org/jira/browse/AIRFLOW-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17035040#comment-17035040
]
Andrew Cleland commented on AIRFLOW-6742:
-----------------------------------------
This issue was occurring due to incorrect permissions on the Kubernetes service
account that I'd given to Airflow.
This was my original ClusterRole:
{code:java}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
# labels:
# app.kubernetes.io/name: airflow
# app.kubernetes.io/version: v1.8.0
name: airflow
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]{code}
I also needed to add permissions for pods/log and pods/status resources:
{code:java}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
# labels:
# app.kubernetes.io/name: airflow
# app.kubernetes.io/version: v1.8.0
name: airflow
rules:
- apiGroups: [""]
resources: ["pods"]
verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
resources: ["pods/log"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["pods/status"]
verbs: ["get", "list", "watch"]
{code}
> Task instance state set to failed even though the Pod succeeded when using
> KubernetesExecutor
> ---------------------------------------------------------------------------------------------
>
> Key: AIRFLOW-6742
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6742
> Project: Apache Airflow
> Issue Type: Bug
> Components: executor-kubernetes
> Affects Versions: 1.10.7, 1.10.8
> Environment: Kubernetes (1.16), Airflow (1.10.7, 10.10.8rc1)
> Reporter: Andrew Cleland
> Assignee: Daniel Imberman
> Priority: Major
> Labels: State, executor, kubernetes, taskinstance
> Attachments: airflow_scheduler_logs.txt,
> airflow_scheduler_logs_full.txt, failed_dag_run.png,
> failed_task_instance.png, k8s_pods.png, kubernetes_executor_logs.txt
>
>
> When running a KubernetesPodOperator task with the KubernetesExecutor, the
> Pod succeeds but Airflow sets the task instance state to Failed.
> Attached files:
> * k8s_pods.png - KubernetesExecutor pod and KubernetesPodOperator pod both
> succeeded
> * kubernetes_executor_logs - Launched the KubernetesPodOperator successfully
> * airflow_scheduler_logs - "Found matching task with current state failed"
> * failed_task_instance - The failed task instance in the airflow UI
> * failed_dag_run - The failed dag run in the airflow UI
> It seems that the database is being updated with task state of failed, but
> I'm not sure whereabouts this state is being changed.
> [Here|https://github.com/apache/airflow/blob/1.10.7/airflow/contrib/executors/kubernetes_executor.py#L628]
> is the line where the KubernetesExecutor queries the database and finds a
> failed task.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)