[
https://issues.apache.org/jira/browse/AIRFLOW-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Nicolas updated AIRFLOW-6953:
-----------------------------
Description:
As far as I can tell, there is no easy way to mount one or more volumes into
all worker pods when using the {{KubernetesExecutor}}. Though most of the code
infrastructure is there (added to address AIRFLOW-3022), one has to add a
snippet like the following to every DAG to be able to mount a volume
systematically.
{code:python}
volume_config = {
    'nfs': {
        'path': '/',
        'server': 'my-server.provider.cloud',
    },
}

default_args = {
    # Other args ...
    'executor_config': {
        'KubernetesExecutor': {
            'volumes': [{'name': 'nfs-synchronised-dags', **volume_config}],
            'volume_mounts': [{
                'name': 'nfs-synchronised-dags',
                'mountPath': '/usr/local/airflow/dags',
                'readOnly': True,
                'subPath': None,
            }],
        },
    },
}
{code}
Not very DRY.
This makes it particularly difficult to use methods other than git-sync or a
PVC to mount a DAG volume. In the long run, allowing this could also greatly
simplify the configuration: instead of having many config stanzas for specific
cases, you would just declare volumes and volume mounts; if one of them happens
to be a DAG volume, it is only a matter of setting the right path in
dags_folder. I really don't understand why Airflow cares so much about PVCs
and such; that's K8S's deal.
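In the meantime, the repetition can at least be centralised in a shared helper module that every DAG imports, so the volume spec lives in one place (a sketch; the module layout and function name here are mine, not anything Airflow provides):
{code:python}
# shared/volumes.py -- hypothetical helper module, imported by every DAG,
# so the volume/mount spec is declared exactly once.

NFS_VOLUME = {
    'name': 'nfs-synchronised-dags',
    'nfs': {'path': '/', 'server': 'my-server.provider.cloud'},
}

NFS_MOUNT = {
    'name': 'nfs-synchronised-dags',
    'mountPath': '/usr/local/airflow/dags',
    'readOnly': True,
    'subPath': None,
}

def nfs_executor_config():
    """Return an executor_config that mounts the shared NFS volume."""
    return {
        'KubernetesExecutor': {
            'volumes': [NFS_VOLUME],
            'volume_mounts': [NFS_MOUNT],
        },
    }
{code}
Each DAG then only needs {{'executor_config': nfs_executor_config()}} in its default_args, but this is still per-DAG boilerplate rather than a real fix.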
It would be a matter of adding these sections to Airflow's configuration:
{code}
[kubernetes_volumes]
# e.g. for an NFS volume
<volume name1> = {"nfs": {"path": "/", "server": "my-server.provider.cloud"}}
<volume name2> = ...

[kubernetes_volume_mounts]
<volume name1> = {"mountPath": "/usr/local/airflow/dags", "readOnly": true, "subPath": null}
<volume name2> = ...
{code}
The key is the name of the volume and the value is the volume spec, as a JSON
document.
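To illustrate, the proposed sections could be read with nothing more than configparser and json from the standard library. This is only a sketch of how the executor might load them; the section names are part of this proposal and the function name is mine:
{code:python}
import configparser
import json

SAMPLE_CFG = """
[kubernetes_volumes]
dags-volume = {"nfs": {"path": "/", "server": "my-server.provider.cloud"}}

[kubernetes_volume_mounts]
dags-volume = {"mountPath": "/usr/local/airflow/dags", "readOnly": true, "subPath": null}
"""

def load_volumes(cfg_text):
    """Parse the proposed sections into (volumes, volume_mounts) lists.

    The option name becomes the volume name; the JSON value is the spec.
    The result could then be merged into every worker pod, exactly as the
    per-DAG executor_config is today.
    """
    cfg = configparser.ConfigParser()
    cfg.read_string(cfg_text)
    volumes = [
        {'name': name, **json.loads(spec)}
        for name, spec in cfg['kubernetes_volumes'].items()
    ]
    mounts = [
        {'name': name, **json.loads(spec)}
        for name, spec in cfg['kubernetes_volume_mounts'].items()
    ]
    return volumes, mounts
{code}
Note that JSON's {{true}}/{{null}} map directly onto Python's {{True}}/{{None}}, so the parsed mounts match what the executor_config snippet above builds by hand.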
> KubernetesExecutor: be able to mount volumes in all worker pods
> ----------------------------------------------------------------
>
> Key: AIRFLOW-6953
> URL: https://issues.apache.org/jira/browse/AIRFLOW-6953
> Project: Apache Airflow
> Issue Type: Improvement
> Components: configuration
> Affects Versions: 1.10.0, 1.10.1, 1.10.2, 1.10.3, 1.10.4, 1.10.5, 1.10.6,
> 1.10.7, 1.10.8, 1.10.9
> Environment: Linux (Debian), on K8S
> Reporter: Nicolas
> Priority: Major
--
This message was sent by Atlassian Jira
(v8.3.4#803005)