[ 
https://issues.apache.org/jira/browse/AIRFLOW-6778?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

YuChuanQi reassigned AIRFLOW-6778:
----------------------------------

    Assignee: YuChuanQi  (was: Daniel Imberman)

> Add a DAGs PVC Mount Point Option for Workers under Kubernetes Executor
> -----------------------------------------------------------------------
>
>                 Key: AIRFLOW-6778
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-6778
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: executor-kubernetes, worker
>    Affects Versions: 1.10.6, 1.10.7, 1.10.8, 1.10.9
>            Reporter: Brandon Willard
>            Assignee: YuChuanQi
>            Priority: Blocker
>              Labels: kubernetes, options
>
> The worker pods generated by the Kubernetes Executor force the DAGs PVC to be 
> mounted at the Airflow DAGs folder.  This, combined with a general inability 
> to specify arbitrary PVCs on workers (see AIRFLOW-3126 and the 
> linked/duplicated issues), severely constrains the usability of worker pods 
> and the Kubernetes Executor as a whole.
>  
> For example, suppose a DAGs-containing PVC is rooted at a Python package 
> (e.g. {{package/}}) that needs to be installed on each worker, with DAGs in 
> {{package/dags/}}, the package install point at {{package/setup.py}}, and the 
> Airflow DAGs location at {{/airflow/dags}}.  The current static mount point 
> logic then allows only two choices: mount the entire package directly into 
> the Airflow DAGs location, even though the actual DAGs are in a subdirectory, 
> or mount only the package's sub-path {{package/dags}} (using the existing 
> {{kubernetes.dags_volume_subpath}} config option).  While the latter at least 
> points Airflow at the right directory, it forgoes the required parent 
> directory and makes the package unavailable for installation (i.e. the files 
> under {{package/}} are not available).
>  
> -In general, the only approach that seems to work for the Kubernetes Executor 
> is to specify a worker image with all DAG dependencies pre-loaded, which 
> largely voids the usefulness of a single DAGs PVC that can be dynamically 
> updated.  At best, one can include a {{requirements.txt}} in the PVC and use 
> it in tandem with an entry-point script built into the image, but that still 
> doesn't help with source installations of custom packages stored and updated 
> in a PVC.-
> Edit: Even the entry-point script approach isn't possible, because worker 
> pods are created using [the {{command}} field instead of 
> {{args}}|https://kubernetes.io/docs/tasks/inject-data-application/define-command-argument-container/#notes], 
> which overrides the image's entry point!
>  
> A quick fix for this situation is to allow one to specify the DAGs PVC mount 
> point.  With this option, one can mount the PVC anywhere and specify an 
> Airflow DAGs location that works in conjunction with the mount point (e.g. 
> mount the PVC at {{/airflow/package}} and independently set the Airflow DAGs 
> location to {{/airflow/package/dags}}).  In many cases, this option would 
> also obviate the need for the marginally useful 
> {{kubernetes.dags_volume_subpath}} option.
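
The proposed mount point option could look roughly like the following sketch. It is only an illustration of the idea in the description, not Airflow's actual worker configuration code: the function names, the `mount_point` parameter (standing in for a hypothetical `kubernetes.dags_volume_mount_point` setting), and the `airflow-dags` volume name are assumptions made for this example. When no mount point is given, it falls back to the current behaviour of mounting the PVC at the DAGs folder.

```python
import posixpath


def worker_dags_volume_mount(dags_folder, mount_point=None, subpath=None,
                             volume_name="airflow-dags"):
    """Build a Kubernetes volumeMount dict for the DAGs PVC.

    `mount_point` is the hypothetical new option; `subpath` mirrors the
    existing kubernetes.dags_volume_subpath option.
    """
    mount = {
        "name": volume_name,  # assumed volume name, for illustration only
        # Fall back to the current static behaviour: mount at the DAGs folder.
        "mountPath": mount_point or dags_folder,
    }
    if subpath:
        mount["subPath"] = subpath
    return mount


def dags_folder_is_under(mount_point, dags_folder):
    """Return True if the DAGs folder is the mount point or lies inside it."""
    mount_point = posixpath.normpath(mount_point)
    dags_folder = posixpath.normpath(dags_folder)
    return posixpath.commonpath([mount_point, dags_folder]) == mount_point


# The example from the description: mount the whole package and point
# Airflow at the dags/ subdirectory inside it.
mount = worker_dags_volume_mount(
    dags_folder="/airflow/package/dags",
    mount_point="/airflow/package",
)
```

With these settings the whole of `package/` (including `setup.py`) is visible to the worker, while Airflow still reads DAGs only from `/airflow/package/dags`. A check like `dags_folder_is_under` could be used to warn when the configured DAGs folder does not sit inside the mounted PVC.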



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
