On Tue, Nov 3, 2020 at 12:05 PM Alan K Chin <[email protected]> wrote:
> @Jarek - It sounds like git-sync is or rather should be the default way users
> add/modify DAGs. With that said, have you had any experience with customers
> syncing their dags to other forms of dag storage (S3 etc.) and what the
> outcomes were?
>
We have been using S3 for DAG sync in production for more than a year now. The biggest benefit is that we basically never have to worry about scalability or availability, unlike with other solutions. Access control can be managed through IAM roles, which is entirely transparent to application code. On top of that, S3 event notifications can be used to replace the polling loop, which makes sync almost real time. The only downside is that you need to set up a CI/CD pipeline to publish DAG changes to S3. I wrote about our implementation on our tech blog at https://tech.scribd.com/blog/2020/breaking-up-the-dag-repo.html.
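
For anyone curious what the event-driven part could look like, here is a minimal sketch. It is not the implementation described in the blog post. It only shows how an S3 event notification payload might be turned into a list of local sync actions. The function name plan_sync, the "dags/" key prefix, and the /opt/airflow/dags directory are hypothetical choices for illustration; the actual download or delete step (for example with boto3) is left out on purpose.

```python
import posixpath
from urllib.parse import unquote_plus


def plan_sync(event, dag_root="/opt/airflow/dags", prefix="dags/"):
    """Map an S3 event notification to (action, key, local_path) tuples.

    dag_root and prefix are assumed defaults for this sketch, not values
    taken from any real deployment.
    """
    actions = []
    for record in event.get("Records", []):
        # Object keys are URL-encoded in S3 notifications (spaces arrive as "+").
        key = unquote_plus(record["s3"]["object"]["key"])
        # Ignore objects outside the DAG prefix and "folder" placeholder keys.
        if not key.startswith(prefix) or key.endswith("/"):
            continue
        relative = posixpath.normpath(key[len(prefix):])
        # Refuse keys that would resolve outside dag_root.
        if relative == ".." or relative.startswith(("../", "/")):
            continue
        local_path = posixpath.join(dag_root, relative)
        event_name = record["eventName"]
        if event_name.startswith("ObjectCreated:"):
            actions.append(("download", key, local_path))
        elif event_name.startswith("ObjectRemoved:"):
            actions.append(("delete", key, local_path))
    return actions
```

A worker that receives these notifications (through a queue, for instance) would call plan_sync on each message and then fetch or remove the listed files, so no periodic listing of the bucket is needed.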
