ptran32 commented on issue #31200:
URL: https://github.com/apache/airflow/issues/31200#issuecomment-1545580001

   > As @ptran32 noted, we are also seeing unnecessary restarts for the 
scheduler because a livenessprobe failed. Interestingly, this was not a problem 
today, but something that started late last night - several hours after the 
upgrade.
   > 
   > But, I really cannot see any issues from the scheduler logs that would 
cause the probe to fail. As a temporary solution, I am trying out increasing 
the following values:
   > 
   > * scheduler.livenessProbe.timeoutSeconds to 30
   > * scheduler.livenessProbe.failureThreshold to 10
   > 
   > Hopefully, this results in no/fewer restarts.
   
   Hi,
   
    if you are using Helm to deploy Airflow, the 3.6.0 version will cause the 
scheduler to restart in loop, not because of the timeout but because the 
liveness check is not compatible with the new release
   
   
   Scheduler logs after upgrade to 3.6.0
   
   ```
   Liveness probe failed: Traceback (most recent call last): File "<string>", 
line 2, in <module> ModuleNotFoundError: No module named 
'airflow.jobs.scheduler_job'
   ``` 
   
   We fixed it by adjusting the liveness check command in the helm chart:
   
   ```
         livenessProbe:
             exec:
               command:
               - python
               - -Wignore
               - -c
               - |
                 from typing import List
                 from airflow.jobs.scheduler_job_runner import 
SchedulerJobRunner
   ```
   
   I hope it helps.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to