[
https://issues.apache.org/jira/browse/AIRFLOW-3126?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17166967#comment-17166967
]
ASF GitHub Bot commented on AIRFLOW-3126:
-----------------------------------------
fengsi commented on pull request #8150:
URL: https://github.com/apache/airflow/pull/8150#issuecomment-664664791
I followed the [dumping YAML
hack](https://github.com/apache/airflow/pull/8150#issuecomment-663355935)
@xEviL mentioned and it worked. The Helm template (based on
[`stable/airflow`](https://github.com/helm/charts/tree/master/stable/airflow)
chart) I ended up using is:
```
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ include "airflow.fullname" . }}-files
  labels:
    app: {{ include "airflow.labels.app" . }}
    chart: {{ include "airflow.labels.chart" . }}
    release: {{ .Release.Name }}
    heritage: {{ .Release.Service }}
data:
  kubernetes-executor-template.yaml: |
    apiVersion: v1
    kind: Pod
    metadata:
      name: {{ include "airflow.fullname" . }}-worker
    spec:
      containers:
        - name: {{ .Chart.Name }}-worker
          image: {{ .Values.airflow.airflow.image.repository }}:{{ .Values.airflow.airflow.image.tag }}
          imagePullPolicy: {{ .Values.airflow.airflow.image.pullPolicy }}
          envFrom:
            - configMapRef:
                name: "{{ include "airflow.fullname" . }}-env"
          env:
            - name: AIRFLOW__CORE__EXECUTOR
              value: LocalExecutor
            {{- if .Values.airflow.airflow.extraEnv }}
            {{- toYaml .Values.airflow.airflow.extraEnv | nindent 12 }}
            {{- end }}
          {{- if .Values.airflow.airflow.extraVolumeMounts }}
          volumeMounts:
            {{- toYaml .Values.airflow.airflow.extraVolumeMounts | nindent 12 }}
          {{- end }}
      {{- if .Values.airflow.airflow.extraVolumes }}
      volumes:
        {{- toYaml .Values.airflow.airflow.extraVolumes | nindent 8 }}
      {{- end }}
```
The values live under `.Values.airflow.airflow` rather than `.Values.airflow` because I used
`stable/airflow` as a dependency of my own chart.
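For reference, with `stable/airflow` pulled in as a dependency named `airflow`, the relevant part of the parent chart's `values.yaml` would look roughly like this (the image tag, volume names, and claim name are purely illustrative, not from a real deployment):
```
airflow:
  airflow:
    image:
      repository: apache/airflow
      tag: 1.10.12
      pullPolicy: IfNotPresent
    extraEnv:
      - name: AIRFLOW__CORE__LOAD_EXAMPLES
        value: "False"
    extraVolumeMounts:
      - name: data
        mountPath: /opt/airflow/data
    extraVolumes:
      - name: data
        persistentVolumeClaim:
          claimName: airflow-data
```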
The catches here are:
1. I needed to specify both `metadata.name` and `spec.containers[*].name`.
Although they will be overridden by other logic later, the "reference Pod"
cannot be created with them omitted;
2. I needed to hard-code `AIRFLOW__CORE__EXECUTOR=LocalExecutor` in the
template for the `KubernetesExecutor` **Pod**; otherwise it picks up
`AIRFLOW__CORE__EXECUTOR=KubernetesExecutor` from the global settings and fails –
that's what I meant by "partially contaminated by `KubernetesExecutor` settings";
3. As you can see, I also made it pick up `extraEnv`, `extraVolumeMounts`,
and `extraVolumes` from the global settings, which allows mounting the right volumes.
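The first two catches can be encoded as a quick sanity check on the rendered template. This is a minimal Python sketch (all names and values below are hypothetical, not from a real cluster) that asserts the invariants against the dict a rendered pod template would parse into:
```python
# Hypothetical sanity check for a rendered KubernetesExecutor pod template.
# The `pod` dict mirrors the structure the Helm template above renders to.

def validate_pod_template(pod: dict) -> None:
    """Check the gotchas: required names and the executor override."""
    # Catch 1: metadata.name and every container name must be present,
    # even though Airflow overrides them with generated names later.
    assert pod["metadata"].get("name"), "metadata.name is required"
    containers = pod["spec"]["containers"]
    assert all(c.get("name") for c in containers), "every container needs a name"

    # Catch 2: the worker pod must run with LocalExecutor, otherwise it
    # inherits AIRFLOW__CORE__EXECUTOR=KubernetesExecutor and fails.
    env = {e["name"]: e.get("value") for e in containers[0].get("env", [])}
    assert env.get("AIRFLOW__CORE__EXECUTOR") == "LocalExecutor"

pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "myrelease-airflow-worker"},
    "spec": {
        "containers": [
            {
                "name": "airflow-worker",
                "image": "apache/airflow:1.10.12",
                "env": [
                    {"name": "AIRFLOW__CORE__EXECUTOR", "value": "LocalExecutor"},
                ],
            }
        ]
    },
}

validate_pod_template(pod)  # raises AssertionError if an invariant is violated
```
Wiring a check like this into CI against the output of `helm template` catches a broken pod template before the scheduler ever tries to launch a worker.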
This approach I took was partially due to the way the
[`stable/airflow`](https://github.com/helm/charts/tree/master/stable/airflow)
chart was implemented.
### What I learned about `pod_template_file` so far:
Well, it worked, at least.
**BUT**, unfortunately, it's more of a hack than a solution. It's like
reverse engineering a scratch Pod file: create an object from the template
(some parameters are required but are really just placeholders), patch it on the
fly inside the Airflow `KubernetesExecutor` logic to build the real Pod object, and
serialize it back to config with the modified values. The problems I can see:
1. There is really no way for a user to know exactly what the final Pod
will look like without diving into every single line of the source code and hacking
the hell out of it;
2. This "two-way" binding approach is too dynamic, sometimes scary;
3. With `pod_template_file`, tons of `AIRFLOW__KUBERNETES__*` settings are
now split into two groups: those "meant for the scheduler when launching
Pods but that don't really apply to any individual Pod", and those that
"should be passed down to each Pod individually". I don't see a clear boundary
between these two use cases, and of course for the latter a fixed template
cannot do much, so it may still require more hacks like the
`pod_mutation_hook` or `executor_config` mentioned by @mik-laj. I guess this is
an example of the "two-way" binding pattern we should probably avoid in the
future.
As for the right direction for next steps, my use case is somewhat covered
by the hacks above. But rather than rushing into any particular approach,
we may need more time gathering K8s use cases from more users (plus the K8s
ecosystem is evolving fast, too).
And when the right time comes, it may make more sense to start with a new
executor, if we are lucky enough to find an elegant brand-new solution. We can then
keep the current `KubernetesExecutor` as is and not make it more
complicated than it should have been. Just my 2 cents. :)
> kubernetes executor not allowing new PV
> ---------------------------------------
>
> Key: AIRFLOW-3126
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3126
> Project: Apache Airflow
> Issue Type: Bug
> Components: executors
> Affects Versions: 1.10.0
> Reporter: Rahul Singh
> Assignee: Brandon Willard
> Priority: Major
>
> Kubernetes executor allows only two PVs: one for DAGs and the other for
> logs. In real project scenarios multiple PVs are needed, e.g. a separate
> PV to store data, another for common script files, etc. The current
> implementation is statically limited to just these two PVs.