[
https://issues.apache.org/jira/browse/AIRFLOW-4424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16830556#comment-16830556
]
Brian Nutt commented on AIRFLOW-4424:
-------------------------------------
Thanks [~higrys], I tried building off master with this change and running it
in Minikube, since I did not want to upgrade Airflow to the master version in
our real environments. Here is the result of setting num_runs to 5:
{noformat}
[2019-04-30 17:48:17,531] {{jobs.py:1643}} INFO - Exiting scheduler loop as all files have been processed 5 times
[2019-04-30 17:48:17,531] {{dag_processing.py:646}} INFO - Sending termination message to manager.
[2019-04-30 17:48:17,531] {{jobs.py:1662}} INFO - Deactivating DAGs that haven't been touched since 2019-04-30T17:47:04.362360+00:00
[2019-04-30 17:48:17,540] {{kubernetes_executor.py:812}} INFO - Shutting down Kubernetes executor
[2019-04-30 17:48:17,559] {{dag_processing.py:661}} INFO - Manager process not running.
[2019-04-30 17:48:17,559] {{jobs.py:1526}} INFO - Exited execute loop
[2019-04-30 17:48:17,566] {{cli_action_loggers.py:81}} DEBUG - Calling callbacks: []
[2019-04-30 17:48:17,566] {{settings.py:168}} DEBUG - Disposing DB connection pool (PID 1)
{noformat}
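(For anyone reproducing this: in 1.10.x, num_runs can be passed to the
scheduler as a CLI flag, airflow scheduler -n 5 / --num_runs 5, or set as
num_runs under the [scheduler] section of airflow.cfg; -1, the default,
means loop forever.)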
After emitting these log lines, the scheduler process just hangs and never
exits.
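For what it's worth, this looks like the classic Python shutdown trap where a
non-daemonic child process keeps the interpreter alive at exit. A minimal
sketch of that mechanism (illustrative only, not Airflow's actual shutdown
code; the watcher function here is hypothetical, standing in for the
executor's Kubernetes event watcher):
{noformat}
# Sketch: a non-daemonic multiprocessing child keeps the parent
# interpreter alive at exit unless it is terminated and joined,
# e.g. during executor.end().
import multiprocessing
import time


def watcher():
    # Stands in for the Kubernetes event watcher: blocks forever.
    while True:
        time.sleep(1)


if __name__ == "__main__":
    proc = multiprocessing.Process(target=watcher)  # daemon=False by default
    proc.start()
    print("Shutting down Kubernetes executor")  # mirrors the last INFO line
    # Without the two calls below, the interpreter blocks at shutdown
    # waiting on the child -- matching a scheduler that logs
    # "Exited execute loop" but never returns control to the shell.
    proc.terminate()
    proc.join()
{noformat}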
> Scheduler does not terminate after num_runs when executor is
> KubernetesExecutor
> -------------------------------------------------------------------------------
>
> Key: AIRFLOW-4424
> URL: https://issues.apache.org/jira/browse/AIRFLOW-4424
> Project: Apache Airflow
> Issue Type: Bug
> Components: kubernetes, scheduler
> Affects Versions: 1.10.3
> Environment: EKS, deployed with stable airflow helm chart
> Reporter: Brian Nutt
> Priority: Blocker
> Fix For: 1.10.4
>
>
> When using an executor such as the CeleryExecutor with num_runs set on the
> scheduler, the scheduler pod restarts after that many runs have completed.
> After switching to the KubernetesExecutor, the scheduler logs:
> [2019-04-26 19:20:43,562] {{kubernetes_executor.py:770}} INFO - Shutting
> down Kubernetes executor
> However, the scheduler process never completes, so the scheduler pod is
> never restarted to run num_runs again. This forced a rollback to the
> CeleryExecutor, because with num_runs set to -1 the scheduler builds up
> large numbers of defunct processes, eventually leaving tasks unschedulable
> as the underlying nodes run out of file descriptors.
>
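On the defunct-process buildup mentioned in the description: a minimal,
Unix-only sketch of the underlying mechanism, under the assumption that the
scheduler's child processes exit without being reaped by their parent (the
fork loop below is hypothetical, not Airflow code):
{noformat}
# Sketch: exited children whose parent never calls os.wait*() remain
# <defunct> (zombies), each holding a process-table entry until reaped.
import os
import time

if __name__ == "__main__":
    for _ in range(5):
        if os.fork() == 0:   # child
            os._exit(0)      # exits immediately, becoming <defunct>
    time.sleep(5)            # run `ps -ef | grep defunct` now to see zombies
    # Reaping clears them; a parent that never reaches this point (e.g. a
    # scheduler looping forever with num_runs = -1) leaks zombies instead.
    while True:
        try:
            os.waitpid(-1, 0)
        except ChildProcessError:
            break            # no children left to reap
{noformat}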