Brandon Willard created AIRFLOW-6778:
----------------------------------------
Summary: Add a DAGs PVC Mount Point Option for Workers under
Kubernetes Executor
Key: AIRFLOW-6778
URL: https://issues.apache.org/jira/browse/AIRFLOW-6778
Project: Apache Airflow
Issue Type: Improvement
Components: executor-kubernetes, worker
Affects Versions: 1.10.9, 1.10.8, 1.10.7, 1.10.6
Reporter: Brandon Willard

The worker pods generated by the Kubernetes Executor force the DAGs PVC to be
mounted at the Airflow DAGs folder. This, combined with a general inability to
specify arbitrary PVCs on workers (see AIRFLOW-3126 and the linked/duplicated
issues), severely constrains the usability of worker pods and the Kubernetes
Executor as a whole.

For example, if a DAGs-containing PVC is rooted at a Python package (e.g.
{{package/}}) that needs to be installed on each worker (e.g. DAGs in
{{package/dags/}}, package install point at {{package/setup.py}}, and Airflow
DAGs location {{/airflow/dags}}), then the current static mount point logic
allows only two choices. A worker can mount the entire package directly into
the Airflow DAGs location, even though the actual DAGs are in a subdirectory,
or it can exclusively mount the package's sub-path {{package/dags}} (using the
existing {{kubernetes.dags_volume_subpath}} config option). While the latter is
at least correct, it completely forgoes the required parent directory, making
the requisite package unavailable for installation.
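
The two choices above can be illustrated with a small, self-contained sketch. It
does not touch Kubernetes or Airflow; it only models which container paths
become visible for a given mount path and sub-path. The helper
{{visible_paths}}, the file {{dags/example_dag.py}}, and the assumption that
the PVC root is the {{package/}} directory itself (so the sub-path is written
as {{dags}}) are illustrative and not taken from Airflow.

```python
from pathlib import PurePosixPath

# Hypothetical PVC contents, relative to the PVC root (assumed to be package/).
PVC_FILES = ["setup.py", "dags/example_dag.py"]


def visible_paths(pvc_files, mount_path, sub_path=""):
    """Model the container paths visible when a PVC is mounted at
    mount_path, optionally restricted to sub_path inside the PVC."""
    mount = PurePosixPath(mount_path)
    sub = PurePosixPath(sub_path) if sub_path else None
    visible = []
    for name in pvc_files:
        rel = PurePosixPath(name)
        if sub is not None:
            try:
                # Files outside the sub-path are not mounted at all.
                rel = rel.relative_to(sub)
            except ValueError:
                continue
        visible.append(str(mount / rel))
    return sorted(visible)


# Choice 1: whole package mounted at the DAGs folder (DAGs end up nested).
whole = visible_paths(PVC_FILES, "/airflow/dags")

# Choice 2: only the dags sub-path is mounted (setup.py is not visible).
sub_only = visible_paths(PVC_FILES, "/airflow/dags", sub_path="dags")
```

In this model, the first choice places the DAG at
{{/airflow/dags/dags/example_dag.py}}, and the second drops {{setup.py}} from
the worker entirely.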

In general, the only approach that seems to work for the Kubernetes Executor is
to specify a worker image with all DAG dependencies pre-loaded, which largely
voids the usefulness of a single DAGs PVC that can be dynamically updated. At
best, one can include a {{requirements.txt}} in the PVC and use it with an
entry-point script built into the image, but that still doesn't help with
source installations.

One quick fix for this situation is to allow one to specify the mount point.
With this option, one can mount the PVC anywhere and specify an Airflow DAGs
location that works in conjunction with the mount point (e.g. mount the PVC at
{{/airflow/package}} and independently set the Airflow DAGs location to
{{/airflow/package/dags}}). This option would, in many cases, obviate the need
for the marginally useful {{kubernetes.dags_volume_subpath}} option as well.
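
As a rough sketch of the proposal (not Airflow's actual worker configuration
code), the function below builds a Kubernetes {{volumeMounts}} entry as a plain
dict. The function name, the {{mount_point}} parameter, and the volume name
{{airflow-dags}} are hypothetical; only the {{name}}, {{mountPath}},
{{subPath}}, and {{readOnly}} keys are standard Kubernetes fields. When no
mount point is given, it falls back to the current behaviour of mounting at the
DAGs folder.

```python
def dags_volume_mount(dags_folder, mount_point=None, sub_path=None,
                      name="airflow-dags"):
    """Build a volumeMounts entry for the DAGs PVC.

    mount_point is the hypothetical new option: if unset, the PVC is
    mounted at dags_folder, which mirrors the current static behaviour.
    """
    mount = {
        "name": name,
        "mountPath": mount_point or dags_folder,
        "readOnly": True,
    }
    if sub_path:
        # Existing sub-path behaviour is kept as an optional extra.
        mount["subPath"] = sub_path
    return mount


# Mount the whole package at /airflow/package, while Airflow's DAGs
# folder is independently set to /airflow/package/dags.
proposed = dags_volume_mount(
    dags_folder="/airflow/package/dags",
    mount_point="/airflow/package",
)
```

With this arrangement, {{package/setup.py}} would be reachable at
{{/airflow/package/setup.py}} on the worker and no sub-path is required.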