I'd also definitely be interested in this, as we have an async cron job that syncs with a remote S3 location. I'd be happy to help tackle some of this work if there's a ticket involved.
On Thu, Mar 15, 2018 at 4:38 PM, Joy Gao <[email protected]> wrote:
> Hi guys,
>
> A related topic has been discussed recently via a separate email thread
> (see 'How to add hooks for strong deployment consistency?
> <https://lists.apache.org/thread.html/%3CCAB=[email protected]%3E>')
>
> The idea brought up by Maxime is to modify DagBag and implement a
> DagFetcher abstraction, where the default is "FileSystemDagFetcher", but it
> opens up doors for "GitRepoDagFetcher", "ArtifactoryDagFetcher",
> "TarballInS3DagFetcher", or in this case, "HDFSDagFetcher", "S3DagFetcher",
> and "GCSDagFetcher".
>
> We are all in favor of this, but as far as I'm aware no one has owned this
> yet. So if you (or anyone) wants to work on this, please create a JIRA and
> call it out :)
>
> Cheers,
> Joy
>
>
> On Thu, Mar 15, 2018 at 3:54 PM, Chris Fei <[email protected]> wrote:
>
> > Hi Diogo,
> >
> > This would be valuable for me as well, I'd love first-class support for
> > hdfs://..., s3://..., gcs://..., etc as a value for dags_folder. As a
> > workaround, I deploy a maintenance DAG that periodically downloads other
> > DAGs from GCS into my DAG folder. Not perfect, but gets the job done.
> > Chris
> >
> > On Thu, Mar 15, 2018, at 6:32 PM, Diogo Franco wrote:
> > > Hi all,
> > >
> > > I think that the ability to fill up the DagBag from remote locations
> > > would be useful (in my use case, having the dags folder in HDFS would
> > > greatly simplify the release process).
> > >
> > > Was there any discussion on this previously? I looked around briefly
> > > but couldn't find it.
> > >
> > > Maybe the method **DagBag.collect_dags** in *airflow/models.py* could
> > > delegate the walking part to specific methods based on the
> > > *dags_folder* prefix, in a sort of plugin architecture. This would
> > > allow the dags_folder to be defined like
> > > hdfs://namenode/user/airflow/dags, or s3://...
> > >
> > > If this makes sense, I'd love to work on it.
> > >
> > > Cheers,
> > > Diogo Franco
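
For anyone picking this up, here is a rough sketch of how the DagFetcher dispatch described above could look. To be clear, none of these names exist in Airflow today — `DagFetcher`, `FETCHERS`, and `get_dag_fetcher` are all hypothetical, and a real implementation would live inside DagBag and actually download files. This only illustrates the "pick a fetcher based on the dags_folder prefix" idea:

```python
# Hypothetical sketch only -- these classes are NOT part of Airflow.
import os
from abc import ABC, abstractmethod
from urllib.parse import urlparse


class DagFetcher(ABC):
    """Fetches DAG definition files from some backing store."""

    @abstractmethod
    def fetch(self, dags_folder):
        """Return local file paths containing DAG definitions."""


class FileSystemDagFetcher(DagFetcher):
    """Default: walk a local directory, as DagBag does today."""

    def fetch(self, dags_folder):
        paths = []
        for root, _dirs, files in os.walk(dags_folder):
            paths.extend(
                os.path.join(root, f) for f in files if f.endswith(".py")
            )
        return paths


class HDFSDagFetcher(DagFetcher):
    """Stub: would download from HDFS into a local cache, then walk it."""

    def fetch(self, dags_folder):
        raise NotImplementedError("sync from HDFS to a local cache first")


# Registry keyed by URI scheme, so dags_folder can be a plain path,
# hdfs://namenode/..., s3://..., etc.
FETCHERS = {
    "": FileSystemDagFetcher,       # bare paths like /usr/local/airflow/dags
    "file": FileSystemDagFetcher,
    "hdfs": HDFSDagFetcher,
}


def get_dag_fetcher(dags_folder):
    """Choose a fetcher from the dags_folder URI scheme."""
    scheme = urlparse(dags_folder).scheme
    try:
        return FETCHERS[scheme]()
    except KeyError:
        raise ValueError(f"No DagFetcher registered for scheme {scheme!r}")
```

New backends (S3, GCS, Artifactory, ...) would then just register a class under their scheme, which is the plugin architecture Diogo described.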
