[ 
https://issues.apache.org/jira/browse/AIRFLOW-6742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17035040#comment-17035040
 ] 

Andrew Cleland commented on AIRFLOW-6742:
-----------------------------------------

This issue was caused by insufficient permissions on the Kubernetes service 
account that I'd given to Airflow.

This was my original ClusterRole:
{code:java}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # labels:
  #   app.kubernetes.io/name: airflow
  #   app.kubernetes.io/version: v1.8.0
  name: airflow
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
{code}
To fix this, I also needed to grant read access to the pods/log and pods/status subresources:
{code:java}
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  # labels:
  #   app.kubernetes.io/name: airflow
  #   app.kubernetes.io/version: v1.8.0
  name: airflow
rules:
- apiGroups: [""]
  resources: ["pods"]
  verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
- apiGroups: [""]
  resources: ["pods/log"]
  verbs: ["get", "list", "watch"]
- apiGroups: [""]
  resources: ["pods/status"]
  verbs: ["get", "list", "watch"]
{code}
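
The difference between the two ClusterRoles is easy to miss, because a rule on "pods" does not extend to subresources such as "pods/log". The sketch below is only an illustration of that gap, not Airflow or Kubernetes code: the helper names (is_allowed, missing_permissions) and the REQUIRED set are hypothetical, and the matching is a simplified version of RBAC that ignores resourceNames and patterns like "pods/*".

```python
from typing import Dict, List, Set, Tuple

Rule = Dict[str, List[str]]

# Rules copied from the original ClusterRole above (pods only).
ORIGINAL_RULES: List[Rule] = [
    {"apiGroups": [""], "resources": ["pods"],
     "verbs": ["get", "list", "watch", "create", "update", "patch", "delete"]},
]

# Rules copied from the corrected ClusterRole above.
FIXED_RULES: List[Rule] = ORIGINAL_RULES + [
    {"apiGroups": [""], "resources": ["pods/log"], "verbs": ["get", "list", "watch"]},
    {"apiGroups": [""], "resources": ["pods/status"], "verbs": ["get", "list", "watch"]},
]

# Assumption: a representative set of (resource, verb) pairs, chosen to
# mirror the fix described in this comment. Not taken from Airflow itself.
REQUIRED: Set[Tuple[str, str]] = {
    ("pods", "get"),
    ("pods/log", "get"),
    ("pods/status", "get"),
}


def is_allowed(rules: List[Rule], resource: str, verb: str, api_group: str = "") -> bool:
    """Return True if any single rule grants `verb` on `resource`.

    Simplified matching: exact names or the "*" wildcard only.
    """
    for rule in rules:
        groups = rule.get("apiGroups", [])
        resources = rule.get("resources", [])
        verbs = rule.get("verbs", [])
        if ((api_group in groups or "*" in groups)
                and (resource in resources or "*" in resources)
                and (verb in verbs or "*" in verbs)):
            return True
    return False


def missing_permissions(rules: List[Rule], required: Set[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """List the required (resource, verb) pairs that no rule grants."""
    return sorted(pair for pair in required if not is_allowed(rules, *pair))


if __name__ == "__main__":
    print("original:", missing_permissions(ORIGINAL_RULES, REQUIRED))
    print("fixed:   ", missing_permissions(FIXED_RULES, REQUIRED))
```

Against a live cluster, kubectl auth can-i (run as the service account) is the authoritative way to answer the same question.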

> Task instance state set to failed even though the Pod succeeded when using 
> KubernetesExecutor
> ---------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-6742
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6742
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: executor-kubernetes
>    Affects Versions: 1.10.7, 1.10.8
>         Environment: Kubernetes (1.16), Airflow (1.10.7, 1.10.8rc1)
>            Reporter: Andrew Cleland
>            Assignee: Daniel Imberman
>            Priority: Major
>              Labels: State, executor, kubernetes, taskinstance
>         Attachments: airflow_scheduler_logs.txt, 
> airflow_scheduler_logs_full.txt, failed_dag_run.png, 
> failed_task_instance.png, k8s_pods.png, kubernetes_executor_logs.txt
>
>
> When running a KubernetesPodOperator task with the KubernetesExecutor, the 
> Pod succeeds but Airflow sets the task instance state to Failed.
> Attached files:
>  * k8s_pods.png - KubernetesExecutor pod and KubernetesPodOperator pod both 
> succeeded
>  * kubernetes_executor_logs - Launched the KubernetesPodOperator successfully
>  * airflow_scheduler_logs - "Found matching task with current state failed"
>  * failed_task_instance - The failed task instance in the airflow UI
>  * failed_dag_run - The failed dag run in the airflow UI
> It seems that the database is being updated with task state of failed, but 
> I'm not sure whereabouts this state is being changed. 
> [Here|https://github.com/apache/airflow/blob/1.10.7/airflow/contrib/executors/kubernetes_executor.py#L628]
>  is the line where the KubernetesExecutor queries the database and finds a 
> failed task.
>  



