dstandish commented on pull request #19003: URL: https://github.com/apache/airflow/pull/19003#issuecomment-944652256
So what's slow about the liveness probe is the import of airflow, which seems to take around 5 seconds. Is the "right" way to do this to add a `scheduler-health` endpoint? I guess from a separation-of-concerns standpoint maybe that's not possible, since the API probably runs on the webserver, and maybe we don't want scheduler health to depend on webserver health.

Looking at [this contrived example](https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/#define-a-liveness-command), I thought of an alternative. Another way would be to run a sidecar that runs the health check in a loop in a long-running python process: after each successful check it ensures a `scheduler-health` file exists, and on failure it removes the file (or we could put the status in an always-there file). Then the scheduler liveness probe could just be `cat /scheduler-health`, with no need to import airflow. This seems like a lighter-weight solution overall (and therefore more predictable), but it's hackier than a scheduler-health endpoint, and it requires a long-running sidecar (a rough sketch follows below).

Here's the referenced example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-exec
spec:
  containers:
  - name: liveness
    image: k8s.gcr.io/busybox
    args:
    - /bin/sh
    - -c
    - touch /tmp/healthy; sleep 30; rm -rf /tmp/healthy; sleep 600
    livenessProbe:
      exec:
        command:
        - cat
        - /tmp/healthy
      initialDelaySeconds: 5
      periodSeconds: 5
```

From that page:

> For the first 30 seconds of the container's life, there is a `/tmp/healthy` file. So during the first 30 seconds, the command `cat /tmp/healthy` returns a success code. After 30 seconds, `cat /tmp/healthy` returns a failure code.

Thoughts?
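To make the sidecar idea above concrete, here is a minimal, untested sketch. Everything in it is an assumption for illustration: the container and volume names, the shared `emptyDir`, the 30-second loop interval, and the use of `airflow jobs check` (with a plain shell loop standing in for the long-running python process) as the check command. The point it demonstrates is that the slow check runs off the probe path, so the probe itself is just `cat`.

```yaml
# Hypothetical sketch only -- names, image, and the check command are
# illustrative, not the chart's actual configuration.
apiVersion: v1
kind: Pod
metadata:
  name: airflow-scheduler
spec:
  volumes:
    - name: health              # shared between scheduler and sidecar
      emptyDir: {}
  containers:
    - name: scheduler
      image: apache/airflow
      args: ["scheduler"]
      volumeMounts:
        - name: health
          mountPath: /health
      livenessProbe:
        exec:
          # cheap probe: no airflow import on the probe path
          command: ["cat", "/health/scheduler-health"]
        initialDelaySeconds: 10
        periodSeconds: 30
    - name: scheduler-health-sidecar
      image: apache/airflow
      command: ["/bin/sh", "-c"]
      args:
        - |
          # run the (slow) check in a loop, off the probe path;
          # touch the marker file on success, remove it on failure
          while true; do
            if airflow jobs check --job-type SchedulerJob --hostname "$(hostname)"; then
              touch /health/scheduler-health
            else
              rm -f /health/scheduler-health
            fi
            sleep 30
          done
      volumeMounts:
        - name: health
          mountPath: /health
```

Both containers share the pod's hostname, so the sidecar's `--hostname "$(hostname)"` matches the scheduler's registered hostname; a long-running python process in the sidecar would avoid re-importing airflow on every iteration, but either way the probe stays cheap.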
