careduz commented on issue #14261:
URL: https://github.com/apache/airflow/issues/14261#issuecomment-784574696
We are facing the same issue: the scheduler's liveness probe keeps failing,
so Kubernetes keeps restarting the scheduler. Details:
**Airflow: Version 1.10.14**
**Kubernetes: Version 1.20.2** (DigitalOcean)
**Helm airflow-stable/airflow: Version 7.16.0**
```
Events:
  Type     Reason     Age                 From               Message
  ----     ------     ----                ----               -------
  Normal   Scheduled  27m                 default-scheduler  Successfully assigned airflow/airflow-scheduler-75c6c96d68-r9j4m to apollo-kaon3thg1-882c2
  Normal   Pulled     27m                 kubelet            Container image "alpine/git:latest" already present on machine
  Normal   Created    27m                 kubelet            Created container git-clone
  Normal   Started    27m                 kubelet            Started container git-clone
  Normal   Pulled     26m                 kubelet            Container image "alpine/git:latest" already present on machine
  Normal   Created    26m                 kubelet            Created container git-sync
  Normal   Started    26m                 kubelet            Started container git-sync
  Normal   Killing    12m (x2 over 19m)   kubelet            Container airflow-scheduler failed liveness probe, will be restarted
  Normal   Pulled     11m (x3 over 26m)   kubelet            Container image "apache/airflow:1.10.14-python3.7" already present on machine
  Normal   Created    11m (x3 over 26m)   kubelet            Created container airflow-scheduler
  Normal   Started    11m (x3 over 26m)   kubelet            Started container airflow-scheduler
  Warning  Unhealthy  6m (x12 over 21m)   kubelet            Liveness probe failed:
```
The scheduler logs just repeat the same cycle:
```
1] {scheduler_job.py:280} DEBUG - Waiting for <ForkProcess(DagFileProcessor409-Process, stopped)>
[2021-02-23 22:58:35,578] {scheduler_job.py:1435} DEBUG - Starting Loop...
[2021-02-23 22:58:35,578] {scheduler_job.py:1446} DEBUG - Harvesting DAG parsing results
[2021-02-23 22:58:35,579] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:35,579] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:35,580] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:35,580] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:35,580] {scheduler_job.py:1448} DEBUG - Harvested 0 SimpleDAGs
[2021-02-23 22:58:35,581] {scheduler_job.py:1514} DEBUG - Heartbeating the executor
[2021-02-23 22:58:35,581] {base_executor.py:122} DEBUG - 0 running task instances
[2021-02-23 22:58:35,582] {base_executor.py:123} DEBUG - 0 in queue
[2021-02-23 22:58:35,582] {base_executor.py:124} DEBUG - 32 open slots
[2021-02-23 22:58:35,582] {base_executor.py:133} DEBUG - Calling the <class 'airflow.executors.kubernetes_executor.KubernetesExecutor'> sync method
[2021-02-23 22:58:35,587] {scheduler_job.py:1469} DEBUG - Ran scheduling loop in 0.01 seconds
[2021-02-23 22:58:35,587] {scheduler_job.py:1472} DEBUG - Sleeping for 1.00 seconds
[2021-02-23 22:58:36,589] {scheduler_job.py:1484} DEBUG - Sleeping for 0.99 seconds to prevent excessive logging
[2021-02-23 22:58:36,729] {settings.py:310} DEBUG - Disposing DB connection pool (PID 6719)
[2021-02-23 22:58:36,930] {settings.py:310} DEBUG - Disposing DB connection pool (PID 6717)
[2021-02-23 22:58:37,258] {scheduler_job.py:280} DEBUG - Waiting for <ForkProcess(DagFileProcessor410-Process, stopped)>
[2021-02-23 22:58:37,259] {scheduler_job.py:280} DEBUG - Waiting for <ForkProcess(DagFileProcessor411-Process, stopped)>
[2021-02-23 22:58:37,582] {scheduler_job.py:1435} DEBUG - Starting Loop...
[2021-02-23 22:58:37,583] {scheduler_job.py:1446} DEBUG - Harvesting DAG parsing results
[2021-02-23 22:58:37,584] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:37,586] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:37,588] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:37,589] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:37,591] {scheduler_job.py:1448} DEBUG - Harvested 0 SimpleDAGs
[2021-02-23 22:58:37,592] {scheduler_job.py:1514} DEBUG - Heartbeating the executor
[2021-02-23 22:58:37,593] {base_executor.py:122} DEBUG - 0 running task instances
[2021-02-23 22:58:37,602] {base_executor.py:123} DEBUG - 0 in queue
[2021-02-23 22:58:37,604] {base_executor.py:124} DEBUG - 32 open slots
[2021-02-23 22:58:37,605] {base_executor.py:133} DEBUG - Calling the <class 'airflow.executors.kubernetes_executor.KubernetesExecutor'> sync method
[2021-02-23 22:58:37,607] {scheduler_job.py:1460} DEBUG - Heartbeating the scheduler
[2021-02-23 22:58:37,620] {base_job.py:197} DEBUG - [heartbeat]
[2021-02-23 22:58:37,630] {scheduler_job.py:1469} DEBUG - Ran scheduling loop in 0.05 seconds
[2021-02-23 22:58:37,631] {scheduler_job.py:1472} DEBUG - Sleeping for 1.00 seconds
[2021-02-23 22:58:38,165] {settings.py:310} DEBUG - Disposing DB connection pool (PID 6769)
[2021-02-23 22:58:38,268] {settings.py:310} DEBUG - Disposing DB connection pool (PID 6765)
[2021-02-23 22:58:38,276] {scheduler_job.py:280} DEBUG - Waiting for <ForkProcess(DagFileProcessor412-Process, started)>
[2021-02-23 22:58:38,284] {scheduler_job.py:280} DEBUG - Waiting for <ForkProcess(DagFileProcessor413-Process, stopped)>
[2021-02-23 22:58:38,633] {scheduler_job.py:1484} DEBUG - Sleeping for 0.95 seconds to prevent excessive logging
[2021-02-23 22:58:39,331] {settings.py:310} DEBUG - Disposing DB connection pool (PID 6797)
[2021-02-23 22:58:39,361] {settings.py:310} DEBUG - Disposing DB connection pool (PID 6801)
[2021-02-23 22:58:39,589] {scheduler_job.py:1435} DEBUG - Starting Loop...
[2021-02-23 22:58:39,589] {scheduler_job.py:1446} DEBUG - Harvesting DAG parsing results
[2021-02-23 22:58:39,590] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:39,590] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:39,590] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:39,590] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:39,591] {scheduler_job.py:1448} DEBUG - Harvested 0 SimpleDAGs
[2021-02-23 22:58:39,591] {scheduler_job.py:1514} DEBUG - Heartbeating the executor
[2021-02-23 22:58:39,591] {base_executor.py:122} DEBUG - 0 running task instances
[2021-02-23 22:58:39,592] {base_executor.py:123} DEBUG - 0 in queue
[2021-02-23 22:58:39,593] {base_executor.py:124} DEBUG - 32 open slots
[2021-02-23 22:58:39,594] {base_executor.py:133} DEBUG - Calling the <class 'airflow.executors.kubernetes_executor.KubernetesExecutor'> sync method
[2021-02-23 22:58:39,596] {scheduler_job.py:1469} DEBUG - Ran scheduling loop in 0.01 seconds
[2021-02-23 22:58:39,597] {scheduler_job.py:1472} DEBUG - Sleeping for 1.00 seconds
[2021-02-23 22:58:40,305] {scheduler_job.py:280} DEBUG - Waiting for <ForkProcess(DagFileProcessor414-Process, stopped)>
[2021-02-23 22:58:40,306] {scheduler_job.py:280} DEBUG - Waiting for <ForkProcess(DagFileProcessor415-Process, stopped)>
[2021-02-23 22:58:40,599] {scheduler_job.py:1484} DEBUG - Sleeping for 0.99 seconds to prevent excessive logging
[2021-02-23 22:58:41,349] {settings.py:310} DEBUG - Disposing DB connection pool (PID 6829)
[2021-02-23 22:58:41,386] {settings.py:310} DEBUG - Disposing DB connection pool (PID 6831)
[2021-02-23 22:58:41,595] {scheduler_job.py:1435} DEBUG - Starting Loop...
[2021-02-23 22:58:41,595] {scheduler_job.py:1446} DEBUG - Harvesting DAG parsing results
[2021-02-23 22:58:41,596] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:41,597] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:41,598] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:41,599] {dag_processing.py:658} DEBUG - Received message of type DagParsingStat
[2021-02-23 22:58:41,600] {scheduler_job.py:1448} DEBUG - Harvested 0 SimpleDAGs
[2021-02-23 22:58:41,601] {scheduler_job.py:1514} DEBUG - Heartbeating the executor
[2021-02-23 22:58:41,602] {base_executor.py:122} DEBUG - 0 running task instances
[2021-02-23 22:58:41,602] {base_executor.py:123} DEBUG - 0 in queue
[2021-02-23 22:58:41,604] {base_executor.py:124} DEBUG - 32 open slots
[2021-02-23 22:58:41,604] {base_executor.py:133} DEBUG - Calling the <class 'airflow.executors.kubernetes_executor.KubernetesExecutor'> sync method
[2021-02-23 22:58:41,607] {scheduler_job.py:1469} DEBUG - Ran scheduling loop in 0.01 seconds
[2021-02-23 22:58:41,608] {scheduler_job.py:1472} DEBUG - Sleeping for 1.00 seconds
```
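For anyone comparing their own logs, here is a rough sketch of a helper that counts scheduling loops and pulls out the timestamps of the `Heartbeating the scheduler` entries from a debug log like the one above. It is not part of Airflow or of the Helm chart's probe; the names `join_wrapped` and `summarize_scheduler_log` are made up for illustration, and it assumes the log format shown here (a `[YYYY-MM-DD HH:MM:SS,mmm]` prefix per entry). It also tolerates entries that have been hard-wrapped across lines.

```python
import re
from datetime import datetime

# Matches the "[2021-02-23 22:58:37,607]" prefix seen in the log excerpt.
TIMESTAMP = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),(\d{3})\]")


def join_wrapped(text):
    """Merge continuation lines (no timestamp prefix) into the previous entry."""
    entries = []
    for line in text.splitlines():
        if TIMESTAMP.match(line) or not entries:
            entries.append(line.strip())
        else:
            entries[-1] += " " + line.strip()
    return entries


def summarize_scheduler_log(text):
    """Return (loop_count, heartbeat_timestamps) for a scheduler debug log."""
    loops = 0
    heartbeats = []
    for entry in join_wrapped(text):
        match = TIMESTAMP.match(entry)
        if not match:
            continue  # truncated entry without a usable timestamp
        stamp = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        stamp = stamp.replace(microsecond=int(match.group(2)) * 1000)
        if "Starting Loop..." in entry:
            loops += 1
        if "Heartbeating the scheduler" in entry:
            heartbeats.append(stamp)
    return loops, heartbeats
```

On the excerpt above it counts four loop starts and one scheduler heartbeat (at 22:58:37,607). Whether that cadence has anything to do with the failing probe is not something the log alone establishes.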