Thanks @Greg, that was exactly what I was looking for.
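
For anyone finding this thread in the archives, here is roughly what we
ended up with. On the webserver side it is a standard Kubernetes
livenessProbe pointed at Airflow's built-in /health endpoint. This is a
minimal sketch; the port and timing values are illustrative defaults, not
taken from Greg's chart, so adjust them for your deployment:

livenessProbe:
  httpGet:
    path: /health    # Airflow's built-in health check endpoint
    port: 8080       # default webserver port; adjust if overridden
  initialDelaySeconds: 60    # give gunicorn time to come up
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 5        # ~50s of failures before the kubelet restarts it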
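The scheduler has nothing listening on a port, so it needs an exec probe
along the lines Greg describes: run a short script inside the container
that reads the latest scheduler heartbeat from the metadata database and
exits non-zero when it is stale. Again a sketch, assuming Airflow 1.10.x
module paths (airflow.jobs.scheduler_job and friends), not Greg's exact
chart code:

livenessProbe:
  initialDelaySeconds: 300   # the scheduler can take a while to boot
  periodSeconds: 30
  failureThreshold: 5
  exec:
    command:
      - python
      - -Wignore
      - -c
      - |
        import sys
        from airflow.jobs.scheduler_job import SchedulerJob
        from airflow.utils.db import create_session
        from airflow.utils.net import get_hostname

        # Fetch the most recent scheduler job recorded for this host and
        # check its heartbeat while the session is still open.
        with create_session() as session:
            job = session.query(SchedulerJob).filter_by(
                hostname=get_hostname()
            ).order_by(SchedulerJob.latest_heartbeat.desc()).first()
            alive = job is not None and job.is_alive()

        # A non-zero exit tells the kubelet the container is unhealthy.
        sys.exit(0 if alive else 1)

One caveat: Kubernetes allows a single livenessProbe per container, so this
split only works if the webserver and scheduler run in separate containers
(or separate deployments, as in Greg's chart). With both processes in one
container you would fold both checks into a single exec probe.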
On Tue, 21 Apr 2020 at 14:55, Greg Neiheisel <[email protected]> wrote:

> Hey Sergio, here's an example of how we use the built-in healthcheck for
> the webserver:
> https://github.com/astronomer/airflow-chart/blob/master/templates/webserver/webserver-deployment.yaml#L94-L117
> This just ensures that the webserver can return a 200 on this request,
> rather than examining the output of the response.
>
> We do something a little different on the scheduler:
> https://github.com/astronomer/airflow-chart/blob/master/templates/scheduler/scheduler-deployment.yaml#L94-L112
> This execs into the scheduler and checks the last heartbeat recorded in
> the database. If it's been too long, the scheduler gets rebooted.
>
> Hope that helps.
>
> On Tue, Apr 21, 2020 at 3:37 AM Sergio Kef <[email protected]> wrote:
>
> > Hi folks,
> >
> > We currently deploy Airflow on Kubernetes (using a custom image;
> > migrating to the official image is planned) and we use the Local
> > executor (switching to the Kubernetes executor is also planned).
> > We have run into the following problem:
> > For cost efficiency, our testing cluster is scaled down every night.
> > Then every morning the pod running Airflow is up, but not healthy. The
> > issue comes from the way we start the scheduler and webserver. Since
> > they are two processes, we need something like supervisord to manage
> > them.
> >
> > Now my question is: given that Airflow has a health check
> > <https://airflow.apache.org/docs/stable/howto/check-health.html>, how
> > could it be used in a liveness/readiness probe in Kubernetes, so that
> > it understands the pod is no longer healthy and redeploys it?
> >
> > Have others run into similar issues? If so, how did you approach them?
> >
> > Sergio.
>
> --
> *Greg Neiheisel* / Chief Architect Astronomer.io
