Hi all,

Per another thread, I wanted to share the video from a recent talk I gave at PyData DC 2016 about my company's experience finding and modifying Airflow for our needs, and our design paradigm around ETLs:

https://www.youtube.com/watch?v=60FUHEkcPyY&index=35&list=PLGVZCDnMOq0qLoYpkeySVtfdbQg1A_GiB

In that talk I discuss a file dependency system we added to our Airflow installation, a concept we ported from other pipelining systems we had evaluated, like Make/Drake, Pydoit, and Luigi. It supports storing and accessing arbitrarily formatted files between dependent tasks, so you can ship data down a DAG.

Some people at the talk were interested in seeing it, so we open-sourced it here, and we'd welcome feedback or ideas from the dev list on how to make it better:

https://github.com/industrydive/fileflow

Thanks!
Laura
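To give a rough flavor of the pattern, here is a minimal, self-contained sketch of the file-dependency idea described above: each task writes its output to a path derived from its task ID and run date, and a downstream task reads its upstream dependency by the same convention. The names here (`storage_path`, `write_output`, `read_upstream`) are illustrative only, not the actual fileflow API; see the repo for the real interface.

```python
import json
import os
import tempfile

def storage_path(base_dir, task_id, execution_date):
    """Deterministic location for a task's output file for a given run."""
    return os.path.join(base_dir, execution_date, f"{task_id}.json")

def write_output(base_dir, task_id, execution_date, data):
    """Upstream task ships its data down the DAG as a file."""
    path = storage_path(base_dir, task_id, execution_date)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        json.dump(data, f)
    return path

def read_upstream(base_dir, upstream_task_id, execution_date):
    """Downstream task picks up its dependency by task ID and run date."""
    with open(storage_path(base_dir, upstream_task_id, execution_date)) as f:
        return json.load(f)

if __name__ == "__main__":
    base = tempfile.mkdtemp()
    write_output(base, "extract", "2016-10-07", {"rows": [1, 2, 3]})
    data = read_upstream(base, "extract", "2016-10-07")
    print(data["rows"])  # [1, 2, 3]
```

Because the path convention is deterministic, a task only needs to know its upstream task's ID to find that task's output, which is what lets arbitrary file formats flow between dependent operators.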
