[ 
https://issues.apache.org/jira/browse/AIRFLOW-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Pete DeJoy updated AIRFLOW-6843:
--------------------------------
    Description: 
If you're using the Kubernetes executor, Amazon EKS deletes pods very quickly 
after the tasks complete. This can be an issue if you're scraping Airflow logs 
from a service like FluentD, as it means that the task pod gets deleted before 
FluentD can pick up logs for fast-running (or fast-failing) tasks.

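The 
[`kube_client_request_args`|https://airflow.apache.org/docs/stable/configurations-ref.html#kube-client-request-args]
 configuration option is passed to the 
[`delete_namespaced_pod`|https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]
 client request. However, setting the necessary 
[`grace_period_seconds`|https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#delete_namespaced_pod]
 as a 
[`post_param`|https://github.com/kubernetes-client/python/blob/master/kubernetes/client/rest.py#L108]
 there causes other client requests to fail, because 
`kube_client_request_args` is passed as kwargs to every client request, 
including those that have no `post_params` option for `grace_period_seconds`.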
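
The failure mode described above can be sketched without a cluster. This is 
only an illustration: `delete_pod` and `list_pods` are hypothetical stand-ins 
with simplified signatures, not the real Kubernetes client methods, and the 
dict values are made up.

```python
def delete_pod(name, namespace, grace_period_seconds=None, _request_timeout=None):
    """Stand-in for a delete request, which accepts grace_period_seconds."""
    return {"deleted": name, "grace_period_seconds": grace_period_seconds}


def list_pods(namespace, _request_timeout=None):
    """Stand-in for a request that has no grace_period_seconds argument."""
    return {"listed": namespace}


# One shared dict is applied to every request, as the issue describes.
shared_request_args = {"_request_timeout": 60, "grace_period_seconds": 30}

# The delete request accepts the extra argument.
delete_result = delete_pod("task-pod", "airflow", **shared_request_args)

try:
    list_pods("airflow", **shared_request_args)
    list_failed = False
except TypeError:
    # The request that lacks the argument rejects the unexpected keyword.
    list_failed = True
```

Because the same dict reaches both calls, a value that only the delete 
request understands cannot safely live in it.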

To provide a config option that fixes this issue, we need a new configurable 
environment variable, `AIRFLOW__KUBERNETES__DELETE_POD_GRACE_PERIOD_SECONDS`, 
that is passed to the 
[`delete_namespaced_pod`|https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]
 client request as its own argument. It will default to 0.
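
A minimal sketch of the proposal, assuming the variable follows Airflow's 
usual `AIRFLOW__SECTION__KEY` naming. The helper names 
`get_delete_grace_period` and `delete_pod`, and the `delete_call` parameter, 
are hypothetical and exist only for this illustration; this is not the 
executor's actual code.

```python
import os

# Assumed name of the proposed setting.
ENV_VAR = "AIRFLOW__KUBERNETES__DELETE_POD_GRACE_PERIOD_SECONDS"


def get_delete_grace_period(environ=os.environ):
    """Read the grace period from the environment, defaulting to 0."""
    return int(environ.get(ENV_VAR, "0"))


def delete_pod(delete_call, pod_id, namespace, request_args, environ=os.environ):
    """Pass the grace period as its own argument, apart from the shared args.

    delete_call is whatever callable performs the delete request.
    """
    return delete_call(
        pod_id,
        namespace,
        grace_period_seconds=get_delete_grace_period(environ),
        **request_args,
    )
```

Keeping the grace period out of the shared request arguments means the other 
client requests never see it.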

  was:
If you're using the Kubernetes executor, Amazon EKS deletes pods very quickly 
after the tasks complete. This can be an issue if you're scraping Airflow logs 
from a service like FluentD, as it means that the task pod gets deleted before 
FluentD can pick up logs for fast-running (or fast-failing) tasks.

The 
[`kube_client_request_args`|[https://airflow.apache.org/docs/stable/configurations-ref.html#kube-client-request-args]]
 environment variable is passed to the 
[`delete_namespaced_pod`|[https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]]
 client request, but passing the necessary 
[`grace_period_seconds`|[https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#delete_namespaced_pod]]
 as a 
[`post_param`|[https://github.com/kubernetes-client/python/blob/master/kubernetes/client/rest.py#L108]]
 on that object causes other client requests to fail, as that 
`kube_client_request_args` is passed as a kwarg to all client requests, even 
those that don't have a `post_params` option for `grace_period_seconds`.

In order to provide a config option that fixes this issue, there needs to be a 
new configurable environment variable 
`AIRFLOW__KUBERNETES__DELETE_POD_GRACE_PERIOD_SECONDS` option that we pass to 
the 
[`delete_namespaced_pod`|[https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]]
 client request as its own argument. This will be defaulted to 0.


> Add grace_period_seconds config option for delete_namespaced_pod kube client 
> request
> ------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-6843
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6843
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: executor-kubernetes
>    Affects Versions: 1.10.9
>            Reporter: Pete DeJoy
>            Assignee: Pete DeJoy
>            Priority: Major
>
> If you're using the Kubernetes executor, Amazon EKS deletes pods very quickly 
> after the tasks complete. This can be an issue if you're scraping Airflow 
> logs from a service like FluentD, as it means that the task pod gets deleted 
> before FluentD can pick up logs for fast-running (or fast-failing) tasks.
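> The 
> [`kube_client_request_args`|https://airflow.apache.org/docs/stable/configurations-ref.html#kube-client-request-args]
>  configuration option is passed to the 
> [`delete_namespaced_pod`|https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]
>  client request. However, setting the necessary 
> [`grace_period_seconds`|https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#delete_namespaced_pod]
>  as a 
> [`post_param`|https://github.com/kubernetes-client/python/blob/master/kubernetes/client/rest.py#L108]
>  there causes other client requests to fail, because 
> `kube_client_request_args` is passed as kwargs to every client request, 
> including those that have no `post_params` option for 
> `grace_period_seconds`.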
> To provide a config option that fixes this issue, we need a new 
> configurable environment variable, 
> `AIRFLOW__KUBERNETES__DELETE_POD_GRACE_PERIOD_SECONDS`, that is passed to the 
> [`delete_namespaced_pod`|https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]
>  client request as its own argument. It will default to 0.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
