This is an automated email from the ASF dual-hosted git repository.

kaxilnaik pushed a commit to branch v2-1-test
in repository https://gitbox.apache.org/repos/asf/airflow.git

commit 9bb26b7d51fdd21576b21b79b2f9f3efd3db398c
Author: Jarek Potiuk <[email protected]>
AuthorDate: Fri Sep 10 20:13:31 2021 +0200

    Fixes warm shutdown for celery worker. (#18068)
    
    The way dumb-init propagated signals by default
    prevented the celery worker from handling termination well.
    
    The default behaviour of dumb-init is to propagate signals to the
    process group rather than to the single child it spawns. This is
    protective behaviour, in case a user runs a 'bash -c' command
    without 'exec' - in this case signals should be sent not only
    to bash but also to the process(es) it creates, otherwise
    bash exits without propagating the signal and you need a second
    signal to kill all processes.
    
    However, some airflow processes (in particular airflow celery worker)
    behave in a responsible way and handle the signals appropriately
    - when the first signal is received, the worker switches to offline
    mode and lets all workers terminate (until the grace period expires),
    resulting in a Warm Shutdown.
    
    Therefore we can disable the protection of dumb-init and let it
    propagate the signal to only the single child it spawns in the
    Helm Chart. Documentation of the image was also updated to include
    explanation of signal propagation. For explicitness the
    DUMB_INIT_SETSID variable has been set to 1 in the image as well.
    
    Fixes #18066
    
    (cherry picked from commit 9e13e450032f4c71c54d091e7f80fe685204b5b4)
---
 Dockerfile                                     |  1 +
 chart/templates/workers/worker-deployment.yaml |  3 ++
 docs/docker-stack/entrypoint.rst               | 41 ++++++++++++++++++++++++++
 3 files changed, 45 insertions(+)

diff --git a/Dockerfile b/Dockerfile
index e08a050..de9248c 100644
--- a/Dockerfile
+++ b/Dockerfile
@@ -479,6 +479,7 @@ LABEL org.apache.airflow.distro="debian" \
   org.opencontainers.image.title="Production Airflow Image" \
   org.opencontainers.image.description="Reference, production-ready Apache Airflow image"
 
+ENV DUMB_INIT_SETSID="1"
 
 ENTRYPOINT ["/usr/bin/dumb-init", "--", "/entrypoint"]
 CMD []
diff --git a/chart/templates/workers/worker-deployment.yaml b/chart/templates/workers/worker-deployment.yaml
index 38e4e6d..7ae2627 100644
--- a/chart/templates/workers/worker-deployment.yaml
+++ b/chart/templates/workers/worker-deployment.yaml
@@ -169,6 +169,9 @@ spec:
           envFrom:
           {{- include "custom_airflow_environment_from" . | default "\n  []" | indent 10 }}
           env:
+            # Only signal the main process, not the process group, to make Warm Shutdown work properly
+            - name: DUMB_INIT_SETSID
+              value: "0"
           {{- include "custom_airflow_environment" . | indent 10 }}
           {{- include "standard_airflow_environment" . | indent 10 }}
           {{- if .Values.workers.kerberosSidecar.enabled }}
diff --git a/docs/docker-stack/entrypoint.rst b/docs/docker-stack/entrypoint.rst
index a999892..4b64904 100644
--- a/docs/docker-stack/entrypoint.rst
+++ b/docs/docker-stack/entrypoint.rst
@@ -161,6 +161,47 @@ If there are any other arguments - they are simply passed to the "airflow" comma
   > docker run -it apache/airflow:2.1.0-python3.6 version
   2.1.0
 
+Signal propagation
+------------------
+
+Airflow uses ``dumb-init`` to run as "init" in the entrypoint. This is in order to propagate
+signals and reap child processes properly. This means that the process that you run does not have
+to install signal handlers to work properly and be killed when the container is gracefully terminated.
+The behaviour of signal propagation is configured by ``DUMB_INIT_SETSID`` variable which is set to
+``1`` by default - meaning that the signals will be propagated to the whole process group, but you can
+set it to ``0`` to enable ``single-child`` behaviour of ``dumb-init`` which only propagates the
+signals to only single child process.
+
+The table below summarizes ``DUMB_INIT_SETSID`` possible values and their use cases.
+
++----------------+----------------------------------------------------------------------+
+| Variable value | Use case                                                             |
++----------------+----------------------------------------------------------------------+
+| 1 (default)    | Propagates signals to all processes in the process group of the main |
+|                | process running in the container.                                    |
+|                |                                                                      |
+|                | If you run your processes via ``["bash", "-c"]`` command and bash    |
+|                | spawn  new processes without ``exec``, this will help to terminate   |
+|                | your container gracefully as all processes will receive the signal.  |
++----------------+----------------------------------------------------------------------+
+| 0              | Propagates signals to the main process only.                         |
+|                |                                                                      |
+|                | This is useful if your main process handles signals gracefully.      |
+|                | A good example is warm shutdown of Celery workers. The ``dumb-init`` |
+|                | in this case will only propagate the signals to the main process,    |
+|                | but not to the processes that are spawned in the same process        |
+|                | group as the main one. For example in case of Celery, the main       |
+|                | process will put the worker in "offline" mode, and will wait         |
+|                | until all running tasks complete, and only then it will              |
+|                | terminate all processes.                                             |
+|                |                                                                      |
+|                | For Airflow's Celery worker, you should set the variable to 0        |
+|                | and either use ``["celery", "worker"]`` command.                     |
+|                | If you are running it through ``["bash", "-c"]`` command,            |
+|                | you  need to start the worker via ``exec airflow celery worker``     |
+|                | as the last command executed.                                        |
++----------------+----------------------------------------------------------------------+
+
 Additional quick test options
 -----------------------------
 

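The choice that DUMB_INIT_SETSID controls, as described in the commit
message above, can be sketched in Python. This is an illustrative model
only, not dumb-init's own code (dumb-init is written in C). The function
names targets_process_group and forward_signal are hypothetical, and the
sketch assumes the child leads its own process group, so that its PID
doubles as the process group ID.

```python
import os
import signal


def targets_process_group(environ=None):
    """Return True if signals should go to the child's whole process group.

    Models the behaviour described above: group-wide propagation is the
    default, and DUMB_INIT_SETSID=0 switches to single-child mode.
    """
    env = os.environ if environ is None else environ
    return env.get("DUMB_INIT_SETSID", "1") != "0"


def forward_signal(child_pid, signum=signal.SIGTERM, environ=None):
    """Forward signum to the child, or to its process group (POSIX only)."""
    if targets_process_group(environ):
        # Assumption: the child leads its own process group, so its PID
        # is also the process group ID.
        os.killpg(child_pid, signum)
    else:
        # Single-child mode: only the main process is signalled, and it
        # decides how to wind down its own children (for example the
        # Celery worker's warm shutdown).
        os.kill(child_pid, signum)
```

With DUMB_INIT_SETSID=0 only the main process receives the signal, which
is what lets a Celery worker finish its running tasks before it exits.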