[
https://issues.apache.org/jira/browse/AIRFLOW-6014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16978151#comment-16978151
]
afusr commented on AIRFLOW-6014:
--------------------------------
I've created a PR which should catch these deleted pods and mark them as up
for reschedule.
We have looked at taints; it seems the ones applied to the k8s node by the GKE
autoscaler when the node spins up don't prevent Airflow pods from being
scheduled there before all system pods have started.
You could perhaps create some kind of watch process to look for newly created
nodes, apply a taint, and wait for the system pods to start. You would then
have to ensure any system pods you want on the node have a toleration added to
their spec so they are able to start. Once the system pods are up, you could
remove the taint and allow Airflow pods to be placed there.
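As a rough sketch of that workaround (the taint key `startup-pending` and the kubectl commands in the comments are illustrative assumptions, not anything GKE or Airflow actually provides):

```yaml
# Hypothetical taint the watch process would apply to a newly created node
# (key/value are assumptions chosen for illustration):
#   kubectl taint nodes <new-node> startup-pending=true:NoSchedule
#
# Toleration that would need to be added to each system pod's spec so it can
# still be scheduled onto the tainted node:
tolerations:
  - key: "startup-pending"
    operator: "Equal"
    value: "true"
    effect: "NoSchedule"
# Once the system pods are Running, the watch process removes the taint,
# allowing Airflow pods onto the node:
#   kubectl taint nodes <new-node> startup-pending:NoSchedule-
```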
It's interesting to consider why k8s creates a state where this can happen in
the first place. My guess is that whilst the new node is starting, multiple
Airflow tasks back up and are waiting to be scheduled. Once the node is ready,
the k8s scheduler selects a number of Airflow pods and, looking at the pod
memory request value, decides they will all fit on the new node. Then perhaps
it also tries to schedule any daemon sets on there; as these must be present
and have a higher priority, they force a random Airflow pod to be preempted,
and it is then deleted from the node.
There is a similar issue described in this openshift bug report, particularly
this comment [https://bugzilla.redhat.com/show_bug.cgi?id=1701046#c13]
The most straightforward approach, I think, is to just ensure that if a pod is
pending and it is then deleted, it is marked as up for reschedule, as the
linked PR should do. Airflow then appears (from testing) to relaunch the pod
without affecting the retry limit for the task.
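The check described above can be sketched as follows; the function name and the literal event/phase strings are illustrative of the idea, not the actual KubernetesExecutor internals:

```python
# Illustrative sketch of the "deleted while pending" check; the name
# should_reschedule is hypothetical, not an actual Airflow function.

def should_reschedule(event_type: str, pod_phase: str) -> bool:
    """Return True when a pod was deleted while still Pending.

    A pod deleted before it ever ran (e.g. preempted by a daemon-set pod
    on a freshly autoscaled node) should be put back up for reschedule
    rather than marked failed, so the task's retry limit is not consumed.
    """
    return event_type == "DELETED" and pod_phase == "Pending"


# A pod deleted while still pending is rescheduled:
print(should_reschedule("DELETED", "Pending"))    # True
# A pod deleted after completing is handled as before:
print(should_reschedule("DELETED", "Succeeded"))  # False
# An ordinary status update on a pending pod is not a reschedule:
print(should_reschedule("MODIFIED", "Pending"))   # False
```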
> Kubernetes executor - handle preempted deleted pods - queued tasks
> ------------------------------------------------------------------
>
> Key: AIRFLOW-6014
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6014
> Project: Apache Airflow
> Issue Type: Improvement
> Components: executor-kubernetes
> Affects Versions: 1.10.6
> Reporter: afusr
> Assignee: Daniel Imberman
> Priority: Minor
>
> We have encountered an issue whereby, when using the Kubernetes executor
> with autoscaling, Airflow pods are preempted and Airflow never attempts to
> rerun them.
> This is partly as a result of having the following set on the pod spec:
> restartPolicy: Never
> This makes sense, as if a pod fails when running a task, we don't want
> Kubernetes to retry it; that should be controlled by Airflow.
> What we believe happens is that when a new node is added by autoscaling,
> Kubernetes schedules a number of Airflow pods onto the new node, as well as
> any pods required by k8s/daemon sets. As these are higher priority, the
> Airflow pods are preempted and deleted. You see messages such as:
>
> Preempted by kube-system/ip-masq-agent-xz77q on node
> gke-some--airflow-00000000-node-1ltl
>
> Within the Kubernetes executor, these pods end up in a status of Pending,
> and a deleted event is received but not handled.
> The end result is that tasks remain in a queued state forever.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)