CorsettiS opened a new issue, #30347: URL: https://github.com/apache/airflow/issues/30347
### Description

Just like remote logging, it would be interesting and very useful to allow the _DAGS_FOLDER_ & _PLUGINS_FOLDER_ settings to refer to a path in any cloud provider. Ideally, a connection to the cloud provider would be created only when the DAGs are being parsed.

A possible implementation I thought about is to create temp dirs with the downloaded contents of the _DAGS_FOLDER_ & _PLUGINS_FOLDER_ buckets, point _DAGS_FOLDER_ & _PLUGINS_FOLDER_ at these temp dirs, and, every time the DAGs are parsed, download the cloud contents into another dynamically created temp dir, compare them with the ones in use, and replace any DAGs or plugins that have changed. That is:

1. Create new config options, **remote_dags_folder_conn_id** & **remote_plugins_folder_conn_id**, so the cloud provider credentials can be fetched with a mechanism similar to the one currently used for remote logging.
2. If _DAGS_FOLDER_ or _PLUGINS_FOLDER_ starts with **s3://**, **gs://**, etc., then:
3. A temp dir is created locally.
4. The files from the referenced cloud path are downloaded into the temp dir, which Airflow then uses as its dag_folder & plugins_folder.
5. When the DAGs folder is parsed, the updated version is downloaded into a newly created temp dir and compared with the temp dir currently in use; if anything has changed, the latest version overwrites the previous one where needed.

### Use case/motivation

The first motivation is that it would make it easier to run Airflow on Kubernetes, where both the scheduler and the workers need access to up-to-date DAGs & plugins folders. As of now this is not straightforward to set up, especially since the current best approach involves gitSync, which sometimes may not work due to restrictions on the company's cluster (which is my case). With those folders reachable from a cloud provider, the Airflow setup becomes simpler as a whole.
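The download-and-compare sync in steps 3–5 above could be sketched roughly as below. `fetch_remote` is a hypothetical callable standing in for the actual provider download (e.g. an S3/GCS hook resolved from the proposed `remote_dags_folder_conn_id`); the sketch only handles added and changed files, not deletions:

```python
import filecmp
import shutil
import tempfile
from pathlib import Path


def sync_folder(fetch_remote, active_dir: Path) -> bool:
    """Download the remote folder into a fresh temp dir, compare it with
    the directory Airflow is currently using, and copy over any files
    that are new or whose contents changed. Returns True if anything
    was updated. (Sketch only: files deleted remotely are not removed.)
    """
    with tempfile.TemporaryDirectory() as tmp:
        staging = Path(tmp)
        fetch_remote(staging)  # e.g. download s3://bucket/dags into staging
        changed = False
        for src in staging.rglob("*"):
            if src.is_dir():
                continue
            dst = active_dir / src.relative_to(staging)
            # shallow=False compares file contents, not just stat info,
            # so re-downloaded but identical files are not treated as changes
            if not dst.exists() or not filecmp.cmp(src, dst, shallow=False):
                dst.parent.mkdir(parents=True, exist_ok=True)
                shutil.copy2(src, dst)
                changed = True
        return changed
```

On each parse cycle the scheduler would call `sync_folder(fetch_remote, active_dir)` and could use the return value to decide whether a re-parse is needed.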
The second motivation is that it is a very elegant way of decoupling Airflow into infrastructure & components. Many organisations have a unified git repo containing Airflow DAGs & plugins plus infra-specific files (Dockerfile, Docker Compose YAML file, .txt files listing libraries, etc.), and it would be nice to at least have the possibility of separating them.

### Related issues

_No response_

### Are you willing to submit a PR?

- [ ] Yes I am willing to submit a PR!

### Code of Conduct

- [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
