[
https://issues.apache.org/jira/browse/AIRFLOW-6843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Pete DeJoy updated AIRFLOW-6843:
--------------------------------
Description:
If you're using the Kubernetes executor, Amazon EKS deletes pods very quickly
after the tasks complete. This can be an issue if you're scraping Airflow logs
with a service like FluentD, because the task pod gets deleted before FluentD
can pick up logs for fast-running (or fast-failing) tasks.
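The
[`kube_client_request_args`|https://airflow.apache.org/docs/stable/configurations-ref.html#kube-client-request-args]
environment variable is passed to the
[`delete_namespaced_pod`|https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]
client request, but passing the necessary
[`grace_period_seconds`|https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#delete_namespaced_pod]
as a
[`post_param`|https://github.com/kubernetes-client/python/blob/master/kubernetes/client/rest.py#L108]
on that object causes other client requests to fail, because
`kube_client_request_args` is passed as kwargs to all client requests, even
those that don't have a `post_params` option for `grace_period_seconds`.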
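The failure mode described above can be reproduced in miniature. The sketch below does not use the real Kubernetes client: `StubKubeClient`, its `_check` helper, the allowed-argument sets, and the pod and namespace names are all hypothetical stand-ins, chosen only to show what happens when one shared dict of request arguments is splatted into every call.

```python
class StubKubeClient:
    """Hypothetical stand-in for a Kubernetes API client (not the real library)."""

    def _check(self, method, allowed, kwargs):
        # Reject any keyword argument the given method does not know about
        unexpected = set(kwargs) - allowed
        if unexpected:
            raise TypeError(
                f"{method} got unexpected argument(s): {sorted(unexpected)}"
            )

    def list_namespaced_pod(self, namespace, **kwargs):
        # Assumed for illustration: listing has no grace_period_seconds option
        self._check("list_namespaced_pod", {"_request_timeout"}, kwargs)
        return []

    def delete_namespaced_pod(self, name, namespace, **kwargs):
        self._check(
            "delete_namespaced_pod",
            {"_request_timeout", "grace_period_seconds"},
            kwargs,
        )
        return {
            "deleted": name,
            "grace_period_seconds": kwargs.get("grace_period_seconds"),
        }


# One dict shared by every request, in the spirit of kube_client_request_args
shared_request_args = {"_request_timeout": 60, "grace_period_seconds": 30}

client = StubKubeClient()

# The delete call understands the extra argument...
delete_result = client.delete_namespaced_pod(
    "task-pod", "airflow", **shared_request_args
)

# ...but a request that does not understand it fails
try:
    client.list_namespaced_pod("airflow", **shared_request_args)
    list_error = None
except TypeError as exc:
    list_error = str(exc)
```

Because the dict is shared, an argument that is valid for one request type ends up being sent to all of them.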
To fix this issue, there needs to be a new config option, settable through the
environment variable `AIRFLOW__KUBERNETES__DELETE_POD_GRACE_PERIOD_SECONDS`,
that we pass to the
[`delete_namespaced_pod`|https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]
client request as its own argument. It will default to 0.
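A minimal sketch of the proposed behaviour, under stated assumptions: the helper names `get_delete_grace_period`, `delete_pod`, and `stub_delete` are hypothetical, and reading `os.environ` directly is a simplification (a real change would presumably go through Airflow's configuration layer). It only illustrates passing the grace period to the delete call as its own argument, with a default of 0, while the shared request arguments stay untouched.

```python
import os

# Environment variable name following Airflow's AIRFLOW__{SECTION}__{KEY} pattern
ENV_VAR = "AIRFLOW__KUBERNETES__DELETE_POD_GRACE_PERIOD_SECONDS"


def get_delete_grace_period(environ=os.environ):
    """Return the configured grace period in seconds, defaulting to 0."""
    return int(environ.get(ENV_VAR, "0"))


def delete_pod(delete_fn, name, namespace, request_args, environ=os.environ):
    """Call delete_fn with the grace period as a dedicated argument.

    delete_fn stands in for the client's delete_namespaced_pod method.
    """
    return delete_fn(
        name,
        namespace,
        grace_period_seconds=get_delete_grace_period(environ),
        **request_args,
    )


def stub_delete(name, namespace, **kwargs):
    # Hypothetical stub: echoes back the keyword arguments it received
    return kwargs


shared_request_args = {"_request_timeout": 60}

# With the variable set, only the delete call receives the grace period
configured = delete_pod(
    stub_delete, "task-pod", "airflow", shared_request_args,
    environ={ENV_VAR: "30"},
)

# Without it, the grace period falls back to 0
defaulted = delete_pod(
    stub_delete, "task-pod", "airflow", shared_request_args, environ={}
)
```

Keeping the grace period out of the shared dict means other client requests never see an argument they cannot accept.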
was:
If you're using the Kubernetes executor, Amazon EKS deletes pods very quickly
after the tasks complete. This can be an issue if you're scraping Airflow logs
from a service like FluentD, as it means that the task pod gets deleted before
FluentD can pick up logs for fast-running (or fast-failing) tasks.
The
[`kube_client_request_args`|[https://airflow.apache.org/docs/stable/configurations-ref.html#kube-client-request-args]]
environment variable is passed to the
[`delete_namespaced_pod`|[https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]]
client request, but passing the necessary
[`grace_period_seconds`|[https://github.com/kubernetes-client/python/blob/master/kubernetes/docs/CoreV1Api.md#delete_namespaced_pod]]
as a
[`post_param`|[https://github.com/kubernetes-client/python/blob/master/kubernetes/client/rest.py#L108]]
on that object causes other client requests to fail, as that
`kube_client_request_args` is passed as a kwarg to all client requests, even
those that don't have a `post_params` option for `grace_period_seconds`.
In order to provide a config option that fixes this issue, there needs to be a
new configurable environment variable
`AIRFLOW__KUBERNETES__DELETE_POD_GRACE_PERIOD_SECONDS` option that we pass to
the
[`delete_namespaced_pod`|[https://github.com/apache/airflow/blob/master/airflow/executors/kubernetes_executor.py#L443]]
client request as its own argument. This will be defaulted to 0.
> Add grace_period_seconds config option for delete_namespaced_pod kube client
> request
> ------------------------------------------------------------------------------------
>
> Key: AIRFLOW-6843
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6843
> Project: Apache Airflow
> Issue Type: Improvement
> Components: executor-kubernetes
> Affects Versions: 1.10.9
> Reporter: Pete DeJoy
> Assignee: Pete DeJoy
> Priority: Major