> The main extension point we are missing is DagBag. Its current
> implementation is multi-layered, and it looks like it is becoming a main
> point of interest in the community.
>
> DagBag affects user deployments in the following ways:
>   * being able to read DAGs from a remote FS/DB
>   * being able to read DAGs from different sources (starting with multiple
> folders)
>   * being able to read a specific version of a DAG (used by the UI; we now
> have an optimization around the latest version of the DAG coming from the
> DB, but the UI still can't see older versions)
>   * being able to generate DAGs effectively without putting the code into
> dag.py
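
To make the list above a bit more concrete, here is a minimal sketch of
what a pluggable DAG source could look like. Everything in it is an
assumption for the sake of discussion: DagSource, InMemoryDagSource and
PluggableDagBag are hypothetical names, not existing Airflow classes, and
a plain dict stands in for a real DAG object.

```python
# Hypothetical sketch only: none of these names exist in Airflow.
from abc import ABC, abstractmethod
from typing import Dict, Iterable, List, Optional


class DagSource(ABC):
    """Assumed extension point: one place DAG definitions can come from."""

    @abstractmethod
    def list_dag_ids(self) -> List[str]:
        """Return the ids of all DAGs this source knows about."""

    @abstractmethod
    def get_dag(self, dag_id: str, version: Optional[int] = None) -> dict:
        """Return a DAG definition; version=None means the latest one."""


class InMemoryDagSource(DagSource):
    """Stand-in for a remote FS/DB source; keeps every version of a DAG."""

    def __init__(self) -> None:
        self._versions: Dict[str, List[dict]] = {}

    def add(self, dag_id: str, definition: dict) -> int:
        """Store a new version and return its 1-based version number."""
        versions = self._versions.setdefault(dag_id, [])
        versions.append(definition)
        return len(versions)

    def list_dag_ids(self) -> List[str]:
        return sorted(self._versions)

    def get_dag(self, dag_id: str, version: Optional[int] = None) -> dict:
        versions = self._versions[dag_id]  # KeyError if the DAG is unknown
        if version is None:
            return versions[-1]
        if not 1 <= version <= len(versions):
            raise KeyError(f"{dag_id} has no version {version}")
        return versions[version - 1]


class PluggableDagBag:
    """Hypothetical DagBag that reads from several sources in order."""

    def __init__(self, sources: Iterable[DagSource]) -> None:
        self._sources = list(sources)

    def dag_ids(self) -> List[str]:
        seen = set()
        for source in self._sources:
            seen.update(source.list_dag_ids())
        return sorted(seen)

    def get_dag(self, dag_id: str, version: Optional[int] = None) -> dict:
        # The first source that knows the DAG wins.
        for source in self._sources:
            if dag_id in source.list_dag_ids():
                return source.get_dag(dag_id, version)
        raise KeyError(dag_id)
```

With something like this, reading from multiple folders or from a DB would
be a matter of registering another source, and the UI could ask for an
older version through the same call. Whether that is the right shape for
Airflow is exactly what the discussion would need to settle.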

I think next week we have the call about DAG serialisation
https://lists.apache.org/thread.html/rb3b89fb90709782c78fdbd2c181a7df4acc34c8ff7ed997e4b9792a7%40%3Cdev.airflow.apache.org%3E
- so part of this could be interesting. But it looks to me like we
need to restart the discussion on DAG fetching. It was a heated
discussion a year or so ago, but there seems to be renewed interest in
it in the community. Maybe someone would like to take the lead and
start a separate discussion on it?

Any other reasons people are modifying Airflow's source code to get
what Airflow does not give them?

J.
>
> The current workaround is to monkey-patch Airflow code. However, that's
> not a real solution; defining this extension as a plugin would help a lot!
> DagBag is a single-purpose element, so the API around it can be easily
> defined.
>
> Another area is hooks around Operator execution. I'll leave it for another 
> day :)
>
> Evgeny.
> databand.ai
>
> On 2020/02/18 12:33:17, Jarek Potiuk <jarek.pot...@polidea.com> wrote:
> > I have another discussion to start. We've recently talked to a number of
> > customers who are extending Airflow. It's often the case that people are
> > modifying Airflow's source code and later have a hard time updating it
> > when newer versions of Airflow are released.
> >
> > This is a common pattern - we've heard the same story from at least three
> > of our customers and it was also mentioned today on Slack by one of the users:
> > https://apache-airflow.slack.com/archives/CSS36QQS1/p1582014236100800 and
> > I have heard anecdotal evidence of people doing it at many events where I spoke.
> >
> > We have a plugin mechanism that allows for some extensibility, but I wonder
> > if we should do something more complete. Maybe we could gather a list of
> > the ways people are currently extending Airflow by modifying its
> > code, and maybe we can come up with some "extension points" that we might
> > introduce to Airflow to let them add the custom functionality they want
> > rather than modifying Airflow's code.
> >
> > I think many of the extensions do not need to modify Airflow's source code,
> > and there won't be many extensions (maybe we even already have all that we
> > need, but people do not know that they can extend Airflow without modifying
> > the code). I think it would require a bit more description (and maybe some
> > verification) of what the internal API of Airflow is for those extensions so
> > that we can keep backwards compatibility.
> >
> > Let me know what you think. Is it worth it? Do you see any problems with
> > trying to manage that? Maybe that's something we could introduce in 2.0 (at
> > least by better documenting what "an extension" is and providing some
> > examples of how Airflow can be extended).
> >
> > J.
> >
> > --
> >
> > Jarek Potiuk
> > Polidea <https://www.polidea.com/> | Principal Software Engineer
> >
> > M: +48 660 796 129 <+48660796129>
> >



--

Jarek Potiuk
Polidea | Principal Software Engineer

M: +48 660 796 129
