This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v2-1-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 9bb26b7d51fdd21576b21b79b2f9f3efd3db398c
Author: Jarek Potiuk <[email protected]>
AuthorDate: Fri Sep 10 20:13:31 2021 +0200

    Fixes warm shutdown for celery worker. (#18068)

    The way dumb-init propagated signals by default prevented the celery
    worker from handling termination well.

    The default behaviour of dumb-init is to propagate signals to the whole
    process group rather than only to the single child it spawns. This is
    protective behaviour for the case where a user runs a 'bash -c' command
    without 'exec': signals must then reach not only bash but also the
    process(es) it creates, otherwise bash exits without propagating the
    signal and a second signal is needed to kill all processes.

    However, some airflow processes (in particular the airflow celery
    worker) behave responsibly and handle signals appropriately: when the
    first signal is received, the worker switches to offline mode and lets
    all workers terminate (until the grace period expires), resulting in a
    Warm Shutdown.

    Therefore we can disable this protection in the Helm Chart and let
    dumb-init propagate the signal only to the single child it spawns.

    Documentation of the image was also updated to include an explanation
    of signal propagation. For explicitness, the DUMB_INIT_SETSID variable
    has been set to 1 in the image as well.

    Fixes #18066

    (cherry picked from commit 9e13e450032f4c71c54d091e7f80fe685204b5b4)
---
 Dockerfile                                     |  1 +
 chart/templates/workers/worker-deployment.yaml |  3 ++
 docs/docker-stack/entrypoint.rst               | 41 ++++++++++++++++++++++++++
 3 files changed, 45 insertions(+)

diff --git a/Dockerfile b/Dockerfile
index e08a050..de9248c 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -479,6 +479,7 @@ LABEL org.apache.airflow.distro="debian" \
   org.opencontainers.image.title="Production Airflow Image" \
   org.opencontainers.image.description="Reference, production-ready Apache Airflow image"
 
+ENV DUMB_INIT_SETSID="1"
 ENTRYPOINT ["/usr/bin/dumb-init", "--", "/entrypoint"]
 CMD []

diff --git a/chart/templates/workers/worker-deployment.yaml b/chart/templates/workers/worker-deployment.yaml
index 38e4e6d..7ae2627 100644
--- a/chart/templates/workers/worker-deployment.yaml
+++ b/chart/templates/workers/worker-deployment.yaml
@@ -169,6 +169,9 @@ spec:
           envFrom:
           {{- include "custom_airflow_environment_from" . | default "\n []" | indent 10 }}
           env:
+            # Only signal the main process, not the process group, to make Warm Shutdown work properly
+            - name: DUMB_INIT_SETSID
+              value: "0"
           {{- include "custom_airflow_environment" . | indent 10 }}
           {{- include "standard_airflow_environment" . | indent 10 }}
           {{- if .Values.workers.kerberosSidecar.enabled }}
diff --git a/docs/docker-stack/entrypoint.rst b/docs/docker-stack/entrypoint.rst
index a999892..4b64904 100644
--- a/docs/docker-stack/entrypoint.rst
+++ b/docs/docker-stack/entrypoint.rst
@@ -161,6 +161,47 @@ If there are any other arguments - they are simply passed to the "airflow" comma
   > docker run -it apache/airflow:2.1.0-python3.6 version
   2.1.0
 
+Signal propagation
+------------------
+
+Airflow uses ``dumb-init`` to run as "init" in the entrypoint. This is in order to propagate
+signals and reap child processes properly. This means that the process that you run does not have
+to install signal handlers to work properly and be killed when the container is gracefully terminated.
+The behaviour of signal propagation is configured by ``DUMB_INIT_SETSID`` variable which is set to
+``1`` by default - meaning that the signals will be propagated to the whole process group, but you can
+set it to ``0`` to enable ``single-child`` behaviour of ``dumb-init`` which only propagates the
+signals to only single child process.
+
+The table below summarizes ``DUMB_INIT_SETSID`` possible values and their use cases.
+
++----------------+----------------------------------------------------------------------+
+| Variable value | Use case                                                             |
++----------------+----------------------------------------------------------------------+
+| 1 (default)    | Propagates signals to all processes in the process group of the main |
+|                | process running in the container.                                    |
+|                |                                                                      |
+|                | If you run your processes via ``["bash", "-c"]`` command and bash    |
+|                | spawn new processes without ``exec``, this will help to terminate    |
+|                | your container gracefully as all processes will receive the signal.  |
++----------------+----------------------------------------------------------------------+
+| 0              | Propagates signals to the main process only.                         |
+|                |                                                                      |
+|                | This is useful if your main process handles signals gracefully.      |
+|                | A good example is warm shutdown of Celery workers. The ``dumb-init`` |
+|                | in this case will only propagate the signals to the main process,    |
+|                | but not to the processes that are spawned in the same process        |
+|                | group as the main one. For example in case of Celery, the main       |
+|                | process will put the worker in "offline" mode, and will wait         |
+|                | until all running tasks complete, and only then it will              |
+|                | terminate all processes.                                             |
+|                |                                                                      |
+|                | For Airflow's Celery worker, you should set the variable to 0        |
+|                | and either use ``["celery", "worker"]`` command.                     |
+|                | If you are running it through ``["bash", "-c"]`` command,            |
+|                | you need to start the worker via ``exec airflow celery worker``      |
+|                | as the last command executed.                                        |
++----------------+----------------------------------------------------------------------+
+
 Additional quick test options
 -----------------------------
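
The two ``DUMB_INIT_SETSID`` modes described in this change can be illustrated with a small model. This is only a sketch of the decision, not dumb-init's own code: the function names ``use_setsid`` and ``signal_target`` are hypothetical, and it assumes the documented rule that only the exact value ``0`` switches to single-child mode. It relies on the POSIX convention that a negative pid passed to ``kill`` addresses a whole process group.

```python
import os
from typing import Mapping


def use_setsid(env: Mapping[str, str] = os.environ) -> bool:
    """Hypothetical helper: process-group mode stays on unless the
    variable is exactly "0" (assumed rule, per the docs in this change)."""
    return env.get("DUMB_INIT_SETSID") != "0"


def signal_target(child_pid: int, env: Mapping[str, str] = os.environ) -> int:
    """Return the pid argument a kill(2)-style call would be given.

    Negative pid: signal every process in the child's process group.
    Positive pid: signal only the single child process.
    """
    return -child_pid if use_setsid(env) else child_pid


if __name__ == "__main__":
    # Image default (variable unset or "1"): the whole group is signalled.
    print(signal_target(1234, {}))
    # Helm chart worker setting ("0"): only the main process is signalled,
    # which leaves the Celery worker free to perform its Warm Shutdown.
    print(signal_target(1234, {"DUMB_INIT_SETSID": "0"}))
```

With the worker deployment's ``DUMB_INIT_SETSID=0``, the model returns the child's own pid, so only the main Celery process would receive the termination signal.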
