ptran32 commented on issue #31200:
URL: https://github.com/apache/airflow/issues/31200#issuecomment-1545580001
> As @ptran32 noted, we are also seeing unnecessary restarts for the
scheduler because a livenessprobe failed. Interestingly, this was not a problem
today, but something that started late last night - several hours after the
upgrade.
>
> But, I really cannot see any issues from the scheduler logs that would
cause the probe to fail. As a temporary solution, I am trying out increasing
the following values:
>
> * scheduler.livenessProbe.timeoutSeconds to 30
> * scheduler.livenessProbe.failureThreshold to 10
>
> Hopefully, this results in no/fewer restarts.
Hi,
if you are using Helm to deploy Airflow, the 3.6.0 version will cause the
scheduler to restart in loop, not because of the timeout but because the
liveness check is not compatible with the new release
Scheduler logs after upgrade to 3.6.0
```
Liveness probe failed: Traceback (most recent call last): File "<string>",
line 2, in <module> ModuleNotFoundError: No module named
'airflow.jobs.scheduler_job'
```
We fixed it by adjusting the liveness check command in the helm chart:
```
livenessProbe:
exec:
command:
- python
- -Wignore
- -c
- |
from typing import List
from airflow.jobs.scheduler_job_runner import
SchedulerJobRunner
```
I hope it helps.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]