fengsi commented on pull request #8150: URL: https://github.com/apache/airflow/pull/8150#issuecomment-664664791
I followed the [dumping YAML hack](https://github.com/apache/airflow/pull/8150#issuecomment-663355935) @xEviL mentioned and it worked. The Helm template (based on the [`stable/airflow`](https://github.com/helm/charts/tree/master/stable/airflow) chart) I ended up using is:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "airflow.fullname" . }}-files
  labels:
    app: {{ include "airflow.labels.app" . }}
    chart: {{ include "airflow.labels.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
data:
  kubernetes-executor-template.yaml: |
    apiVersion: v1
    kind: Pod
    metadata:
      name: {{ include "airflow.fullname" . }}-worker
    spec:
      containers:
        - name: {{ .Chart.Name }}-worker
          image: {{ .Values.airflow.airflow.image.repository }}:{{ .Values.airflow.airflow.image.tag }}
          imagePullPolicy: {{ .Values.airflow.airflow.image.pullPolicy }}
          envFrom:
            - configMapRef:
                name: "{{ include "airflow.fullname" . }}-env"
          env:
            - name: AIRFLOW__CORE__EXECUTOR
              value: LocalExecutor
            {{- if .Values.airflow.airflow.extraEnv }}
            {{- toYaml .Values.airflow.airflow.extraEnv | nindent 12 }}
            {{- end }}
          {{- if .Values.airflow.airflow.extraVolumeMounts }}
          volumeMounts:
            {{- toYaml .Values.airflow.airflow.extraVolumeMounts | nindent 12 }}
          {{- end }}
      {{- if .Values.airflow.airflow.extraVolumes }}
      volumes:
        {{- toYaml .Values.airflow.airflow.extraVolumes | nindent 8 }}
      {{- end }}
```

The paths are `.Values.airflow.airflow` rather than `.Values.airflow` because I used `stable/airflow` as a dependency of my own chart. The catches here are:

1. I needed to specify both `metadata.name` and `spec.containers[*].name`. Even though they are overridden by other logic later, the "reference Pod" cannot be created with them omitted;
2. I needed to hard-code `AIRFLOW__CORE__EXECUTOR=LocalExecutor` in the template for the `KubernetesExecutor` **Pod**, otherwise it picks up `AIRFLOW__CORE__EXECUTOR=KubernetesExecutor` from the global settings and fails – that's what I mean by "partially contaminated by `KubernetesExecutor` settings";
3. As you can see, I also made it pick up `extraEnv`, `extraVolumeMounts` and `extraVolumes` from the global settings, which allows mounting the right volumes and environment.

The approach I took is partly dictated by the way the [`stable/airflow`](https://github.com/helm/charts/tree/master/stable/airflow) chart is implemented.

### What I learned about `pod_template_file` so far:

It works, at least. **BUT**, unfortunately, it is more of a hack than a solution. It amounts to reverse-engineering a scratch Pod file: an object is created from the template (some parameters are required but are really just placeholders), the Airflow `KubernetesExecutor` logic then mutates it on the fly into the real Pod object, and the result is serialized back to config with the modified values. The problems I can see:

1. There is really no way for a user to know exactly what the final outcome will be, short of diving into every single line of the source code and hacking their way through it;
2. This "two-way" binding approach is too dynamic and sometimes scary;
3. With `pod_template_file`, the many `AIRFLOW__KUBERNETES__*` settings are now split into two groups: those "meant for the scheduler when launching Pods but that don't really apply to any individual Pod", and those that "should be passed down to Pods individually". I don't see a clear boundary between these two use cases, and for the latter a fixed template cannot do much anyway; it may still require more hacks like the `pod_mutation_hook` or `executor_config` mentioned by @mik-laj (a rough sketch of the hook is below).
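For completeness, here is a minimal sketch of the `pod_mutation_hook` escape hatch, assuming an Airflow version where the hook receives a `kubernetes.client` `V1Pod` (the late 1.10.x releases and 2.x); older releases pass Airflow's own `Pod` wrapper with a slightly different attribute layout, and the label used here is purely illustrative:

```python
# airflow_local_settings.py -- placed on the scheduler's PYTHONPATH so Airflow
# picks it up automatically. This is only a sketch: the label key and value are
# hypothetical, not anything Airflow requires.
from kubernetes.client import models as k8s


def pod_mutation_hook(pod: k8s.V1Pod) -> None:
    """Called for every worker Pod just before it is submitted to the API server."""
    # Per-Pod adjustments that a static pod_template_file cannot express can
    # still be made here, e.g. tagging every dynamically launched worker.
    pod.metadata.labels = {**(pod.metadata.labels or {}), "mutated-by": "local-settings"}
```

The per-task alternative is `executor_config` on the operator itself, which lets individual tasks override parts of their worker Pod without touching the global template.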
Guess this is an example of the "two-way" binding pattern we should probably avoid in the future. As for the right direction for next steps: my use case is more or less covered by the hacks above, but rather than rushing into any particular approach, I think we need more time gathering K8s use cases from more users (plus the K8s ecosystem is evolving fast too). It may make more sense to start with a new executor when the right time comes, if we are lucky enough to find an elegant, brand-new solution. We can then keep the current `KubernetesExecutor` as is and not try to make it more complicated than it should have been. Just my 2 cents. :)
