> The main extension point we are missing is DagBag. Its current
> implementation is multi-layered, and it looks like it's starting to be the
> main point of interest in the community.
>
> DagBag affects user deployments in the following ways:
> * being able to read DAGs from a remote FS/DB
> * being able to read DAGs from different sources (starting from multiple
>   folders)
> * being able to read a specific version of a DAG (used by the UI; we now
>   have an optimization around the latest version of the DAG coming from
>   the DB, but the UI still can't see an old version)
> * being able to generate DAGs effectively without putting the code into
>   dag.py
I think we have the call about DAG serialisation next week:
https://lists.apache.org/thread.html/rb3b89fb90709782c78fdbd2c181a7df4acc34c8ff7ed997e4b9792a7%40%3Cdev.airflow.apache.org%3E
so part of this could be interesting there.

But it looks to me that we need to restart the discussion on DAG fetching.
It was a heated discussion a year or so ago, but there seems to be renewed
interest in it in the community. Maybe someone would like to take the lead
and start a separate discussion on it?

Are there any other reasons people are modifying Airflow's source code to
get what Airflow does not give them?

J.

> The workaround for this is monkey-patching Airflow code. However,
> that's not a solution; defining this extension as a plugin can help a lot!
> DagBag is a single-purpose element, so the API around it can be easily
> defined.
>
> Another area is hooks around Operator execution. I'll leave it for another
> day :)
>
> Evgeny.
> databand.ai
>
> On 2020/02/18 12:33:17, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > I have another discussion to start. We've recently talked to a number of
> > customers who are extending Airflow. It's often the case that people are
> > modifying Airflow's source code and later have a hard time updating it
> > when newer versions of Airflow are released.
> >
> > This is a common trait - we've heard the same story from at least three of
> > our customers, and it was also mentioned today on Slack by one of the users:
> > https://apache-airflow.slack.com/archives/CSS36QQS1/p1582014236100800 and
> > I have heard anecdotal evidence of people doing it at many events where I
> > spoke.
> >
> > We have a plugin mechanism that allows for some extensibility, but I wonder
> > if we should do something more complete.
> > Maybe we could gather from people
> > a list of ways they are currently extending Airflow by modifying its
> > code, and maybe we can come up with some "extension points" that we might
> > introduce to Airflow to let them add the custom functionality they want
> > rather than modifying Airflow's code.
> >
> > I think many of the extensions do not need to modify Airflow's source code,
> > and there won't be many extensions (maybe we even already have all that we
> > need, but people do not know that they can extend Airflow without modifying
> > the code). I think it would require a bit more description (and maybe some
> > verification) of what the internal API of Airflow is for those extensions,
> > so that we can keep backwards compatibility.
> >
> > Let me know what you think. Is it worth it? Do you see any problems with
> > trying to manage that? Maybe that's something we could introduce in 2.0 (at
> > least by better documenting what "an extension" is and providing some
> > examples of how Airflow can be extended).
> >
> > J.
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129
> >

--
Jarek Potiuk
Polidea | Principal Software Engineer
M: +48 660 796 129