qzyu999 opened a new issue, #32637:
URL: https://github.com/apache/airflow/issues/32637
### Official Helm Chart version
1.9.0
### Apache Airflow version
2.6.1
### Kubernetes Version
1.21
### Helm Chart configuration
```
workers:
  # Number of airflow celery workers in StatefulSet
  replicas: 1
  # Max number of old replicasets to retain
  revisionHistoryLimit: ~
  # Command to use when running Airflow workers (templated).
  command: ~
  # command: ["/bin/bash", "-c", "/opt/airflow/entrypoint_kubernetes_worker.sh"]
  # Args to use when running Airflow workers (templated).
  args:
    - "bash"
    - "-c"
    # The format below is necessary to get `helm lint` happy
    - |-
      exec \
      airflow {{ semverCompare ">=2.0.0" .Values.airflowVersion | ternary "celery worker" "worker" }} \
      && /bin/bash -c /opt/airflow/entrypoint_kubernetes_worker.sh
```
### Docker Image customizations
```
FROM apache/airflow:2.6.1-python3.8
ADD requirements.txt .
RUN pip install -r requirements.txt
COPY ./entrypoint_kubernetes_worker.sh /opt/airflow
COPY ./dags/ /opt/airflow/dags/
# some other stuff hidden
USER root
RUN chmod -R 777 /opt/airflow/
USER default
```
### What happened
I'm running the Airflow Helm chart in a company Kubernetes namespace, using the KubernetesExecutor for Airflow tasks. Because of restricted privileges, I had to implement logging with a fluentd sidecar container in each worker pod; from there, the logs are streamed to OpenSearch/Dashboards and back to the Airflow web UI. My issue is that when the Airflow container in a worker pod finishes, the fluentd sidecar keeps running and just hangs. I found a promising solution ([here](https://medium.com/apache-airflow/enable-kerberos-with-airflow-kubernetesexecutor-6e86621e97a5)) that uses a shared volumeMount: the Airflow worker writes a file when it finishes, the sidecar watches for that file, and once the file appears the sidecar is sent a SIGTERM to shut down.
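For reference, the sidecar half of that approach can be sketched roughly like this (the sentinel path `/shared/airflow-worker-done` and the fluentd invocation are assumptions for illustration, not chart defaults):

```shell
#!/usr/bin/env bash
# Hypothetical fluentd sidecar entrypoint: start fluentd in the background,
# poll a shared emptyDir volume for a sentinel file written by the Airflow
# container on exit, then terminate fluentd.
DONE_FILE="${DONE_FILE:-/shared/airflow-worker-done}"

wait_for_done_file() {
  # Block until the sentinel file appears on the shared volume.
  while [ ! -f "$DONE_FILE" ]; do
    sleep 1
  done
}

# In the real sidecar this would be followed by something like:
#   fluentd -c /fluentd/etc/fluent.conf &
#   FLUENTD_PID=$!
#   wait_for_done_file
#   kill -TERM "$FLUENTD_PID"
#   wait "$FLUENTD_PID"
```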
That looked reasonable, so I added the shell scripts to both my Airflow and fluentd Docker images and tried to set up the pod YAML so that it would run the necessary commands/args. However, I noticed that you can't really configure those through `values.yaml`: values set under `Values.workers.command`/`Values.workers.args` never make it into `pod-template-file.kubernetes-helm-yaml`.
When I run the chart in my namespace, I see the command (indirectly, via the entrypoint) and args being generated dynamically. Looking at the source on GitHub, it seems I would have to change something [here](https://github.com/apache/airflow/blob/f1e1cdcc3b2826e68ba133f350300b5065bbca33/airflow/executors/kubernetes_executor_utils.py#L339), perhaps in the `run_next` function, by appending to the `command` variable. For example, I would need to add `&& /bin/bash -c my_script.sh`.
Is there a way to accomplish what I'm looking to do, or is my approach wrong? I would appreciate any help. Thank you for your time.
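On the Airflow-container side, the wrapper I have in mind would look roughly like this (the function name and sentinel path are hypothetical, chosen just to illustrate the idea):

```shell
#!/usr/bin/env bash
# Hypothetical wrapper for the worker command: run it, then always write
# the sentinel file so the fluentd sidecar knows it can shut down, while
# preserving the worker's exit code.
DONE_FILE="${DONE_FILE:-/shared/airflow-worker-done}"

run_and_signal() {
  "$@"
  local rc=$?
  touch "$DONE_FILE"   # signal the sidecar regardless of success/failure
  return "$rc"
}

# Usage in the pod spec would then be something like:
#   run_and_signal airflow celery worker
```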
### What you think should happen instead
I would like the `command` and `args` under `workers` in `values.yaml` to be honored so that they eventually end up in the worker pods. Perhaps they could run after initialization, once the dynamic values are set at run time, and a comment could document what kinds of things can safely be appended to the command/args.
### How to reproduce
Download the Helm chart and try to set `Values.workers.command` or `Values.workers.args`. The values don't end up in the final worker pod, and there doesn't seem to be an easy way to change them by other means.
### Anything else
_No response_
### Are you willing to submit PR?
- [X] Yes I am willing to submit a PR!
### Code of Conduct
- [X] I agree to follow this project's [Code of
Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]