dstandish commented on pull request #19003:
URL: https://github.com/apache/airflow/pull/19003#issuecomment-944652256


   So what's slow about the liveness probe is the import of airflow, which 
takes around 5 seconds it seems.
   
   Is the "right" way to do this to add a `scheduler-health` endpoint?  I guess 
from a separation-of-concerns standpoint maybe that's not possible, since the API 
probably runs on the webserver, and maybe we don't want scheduler health to 
depend on webserver health.
   
   Looking at [this contrived 
example](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-command),
 I thought of an alternative.  It seems that possibly another way would be to 
run a sidecar that runs the health check in a loop in a long-running python 
process, and after success it ensures a `scheduler-health` file exists; on 
failure, it removes the file (or we could put the status in an always-there file).  Then 
the scheduler liveness probe could be `cat /scheduler-health`, with no need to 
import airflow.  This would seem to be a lighter weight solution overall (and 
therefore more predictable).  But more hacky than a scheduler-health endpoint, 
and it requires a long-running sidecar.
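
A rough sketch of what that sidecar loop might look like, assuming the health check is passed in as a plain callable. The names `update_health_file` and `run_forever`, the interval, and the file path are all hypothetical, and the actual scheduler check is deliberately left out:

```python
import time
from pathlib import Path
from typing import Callable


def update_health_file(check: Callable[[], bool], path: Path) -> bool:
    """Run one health check and make the marker file match the result."""
    try:
        healthy = bool(check())
    except Exception:
        # A check that raises is treated the same as a failed check.
        healthy = False
    if healthy:
        path.touch()  # create the file, or refresh its mtime
    else:
        path.unlink(missing_ok=True)  # requires Python 3.8+
    return healthy


def run_forever(
    check: Callable[[], bool],
    path: Path,
    interval_seconds: float = 10.0,  # hypothetical default
) -> None:
    """Long-running loop: the expensive import happens once, not per probe."""
    while True:
        update_health_file(check, path)
        time.sleep(interval_seconds)


# Hypothetical usage inside the sidecar:
#   run_forever(my_scheduler_check, Path("/scheduler-health"))
```

For `cat /scheduler-health` to work from the scheduler container, the file would have to live on a volume shared by both containers (an `emptyDir`, for example). One gap in this sketch: if the sidecar itself dies, the file stays in place, so the probe might also need to check how recently the file was touched.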
   
   Here's the referenced example:
   
   ```yaml
   apiVersion: v1
   kind: Pod
   metadata:
     labels:
       test: liveness
     name: liveness-exec
   spec:
     containers:
     - name: liveness
       image: k8s.gcr.io/busybox
       args:
       - /bin/sh
       - -c
       - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
       livenessProbe:
         exec:
           command:
           - cat
           - /tmp/healthy
         initialDelaySeconds: 5
         periodSeconds: 5
   ```
   
   From that page:
   
   > For the first 30 seconds of the container's life, there is a 
`/tmp/healthy` file. So during the first 30 seconds, the command 
`cat /tmp/healthy` returns a success code. After 30 seconds, `cat /tmp/healthy` 
returns a failure code.
   
   Thoughts?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


