amoghrajesh commented on PR #59956: URL: https://github.com/apache/airflow/pull/59956#issuecomment-3709189310
> > Docs: Unsure what to do about it, should we continue endorsing imports of format: `from airflow.plugins_manager` or move to `from airflow.sdk.plugins_manager`? I am doubtful because plugins can also be used for CORE only modifications like react_apps or fastapi_apps.
>
> I think what we should do now is to define `sdk.AirflowProviderPlugin` and `airflow.AirflowCorePlugin` and make the old "AirflowPlugin" (for now) - and possibly we can already start importing the **right** ones in the right places as base classes. There is a bit of difficulty with Listeners though - because some listeners are "core" listeners, and some are "worker" listeners - but I think making it clear which listener is "task-sdk" and which one is "core" is something that we should have done a long time ago anyway. Our users (and ourselves) are utterly confused where each listener is executed and it would be nice to make it clear by separating the "core" and "task-sdk" plugin interfaces now - even if for now we will use the same discovery mechanism in both. Possibly also eventually those two different plugin types should be advertised using different entrypoints - but that's something we can likely do later.
>
> A bit more context why and what is my "North Star" here.
>
> I think **eventually** we should have two (or maybe even 3) types of plugins. I thought about it in the past when discussing the task isolation with @ashb and splitting the distributions, and I am quite sure we will have to do it sooner or later (and rather sooner) and have different distributions for "core" and "task.sdk" plugins - when things are different than regular "providers" (i.e. when we have provider-specific executors, listeners and macros).
>
> This is related to something we discussed yesterday with @kacpermuda in [#59921 (comment)](https://github.com/apache/airflow/pull/59921#issuecomment-3700663762) -> and it's clearly visible there because as of Airflow 3.2, openlineage is a bit in a Schrödinger state - it both needs, and does not need `sqlalchemy`, depending on whether it is installed with `airflow-core` (needs), or with `task-sdk` (might use when installed but does not need it). We get by for now by declaring sqlalchemy as an optional dependency, and handling the optionality in the "worker" side of things.
>
> I don't think we are "ready" to discuss the exact way how to do the split - with naming and implementation details - (we need to complete core isolation from task.sdk first).
>
> But I think I have a very clear idea how it can work on a high level (and would be the ultimate distribution split I had in mind when we started discussing task-isolation).
>
> My thinking is that we might have two types of distribution - current `task-sdk providers` (hooks, operators, macros and other task-sdk side plugins) - and `core-plugins` (`apache-airflow-core-plugins-openlineage`) where we will only have "core" types of plugins. Possibly even later we can split out `UI` plugin types (`apache-airflow-ui-plugins-openlineage`??) - but that would only be needed if we decide to split out "airflow-ui" from "airflow-core" - which we might not want to do at all - depending on how different "UI" and "core" dependencies turn out after we complete current isolation work.
>
> That would be very, very visible with the edge provider - which (if we do all the split till the end) should have all 3 (!) types of distributions (cc: @jscheffl):
>
> * `apache-airflow-providers-edge3` -> worker side (edge3 does not have task-sdk plugins I think but they could be defined here)
> * `apache-airflow-core-plugins-edge3` -> executor
> * `apache-airflow-ui-plugins-edge3` -> UI plugin
>
> Similarly:
>
> * `apache-airflow-providers-amazon` (including amazon-specific macros if there are any).
> * `apache-airflow-core-plugins-amazon` -> executor
>
> A bit of a problem there is that those "separate" distributions will necessarily need to share some code. But this problem is largely solved now with the `shared` concept of ours. We would just have to extend the `shared` concept we have now to providers - because those different distributions (say for amazon) will have some common code. We should have for example `shared.amazon_common` and use it in both `apache-airflow-providers-amazon` and `apache-airflow-providers-core-plugins` for example.
>
> That is my `dream structure of Airflow 3` that is my "North Star".
>
> We do not have to decide on all that now - naming, folders, etc. can (and should be) discussed later when we complete the current isolation work. But we can do some preliminary work - i.e. start separating "types" of plugins.
>
> > Guess I can move most of the tests to shared
>
> Yes.

That's a lot of good detail and very valid! I had similar thoughts and would probably want to have a plugins SDK some time in the future (I was discussing this with @uranusjr just a while earlier) and the motivation to do that was https://github.com/apache/airflow/issues/59093. The thought was to have a plugin SDK that allows loading components into protected processes without serialization. But we can probably chat about that just a little later.

@potiuk would you be OK to throw these thoughts you have into an issue for broader reach and formulation of some plan? Let me know if you would want me to take a look too.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
