potiuk opened a new issue, #44511: URL: https://github.com/apache/airflow/issues/44511
Each provider should have its own independent sub-project in our monorepo. They should be fully "standalone", so that you can not only develop them completely independently from Airflow core, but also so that all of their dependencies are stored in their own `pyproject.toml`. The `uv workspace` feature should be used to bind all the provider sub-projects together, so that you can continue development with airflow, task-sdk, providers and all the other sub-projects of the Airflow monorepo together in a single editable environment.

This move was attempted earlier in https://github.com/apache/airflow/pull/28292 and https://github.com/apache/airflow/pull/28291, but we did not have a good "workspace" solution, and the Airflow 2 namespace approach prevented us from making it a good environment for provider development. With Airflow 3 and the `uv workspace` feature (which was added largely with our input, so that Airflow's provider structure could benefit from it), it is now entirely possible to do.

This means that dependencies should be moved from `provider.yaml` files to `pyproject.toml`.

The ideal setup is to have this kind of structure (details to be worked out):

```
providers/
    providers-amazon/
        src
        tests-integration
        tests-system
        tests
        docs
        pyproject.toml
    ...
```

Some important properties of the solution:

* Airflow "core" projects should not rely on providers being installed
* It should be possible to install all airflow core packages and providers and synchronize/resolve them with `uv sync`
* It should be possible to install a provider in `--editable` mode, treating it as a separate project from the workspace
* It should be possible to install a provider from a GitHub URL
* docs, all kinds of tests, images etc. should all work independently (though thanks to the monorepo, we can keep the code that runs those in `breeze`, as we do currently). Some of the current `doc` code will likely need to be moved to breeze as well for that
* We should be able to apply `pyproject.toml` changes to all providers automatically (this might be semi-automated or done with pre-commit). Quite often we make "global" changes that affect all providers, and currently this is done by modifying breeze and the templates for the dynamically generated `pyproject.toml` file
* We need to keep reproducibility of provider packages intact, which likely means that they should still be generated with breeze, with all the "extra" stuff such as making sure we have a controlled package build environment
* We will need to change how packages are built in CI in the `docker container` environment. Currently we use `flit` as the build backend, and it comes from a generated `pyproject.toml` that is placed inside breeze. If we keep `pyproject.toml` files in the repository, incoming PRs from forks might change build backends and thus inject arbitrary code into our build process

The most likely way to implement it is to:

1. Manually convert one or a few representative, but not the biggest, providers first (POC) and make a few releases with them, while updating our breeze automation to work in both cases. That will allow us to iron out some teething problems
2. Develop automation for converting the providers, similar to https://github.com/apache/airflow/pull/28291
3. Test whether the `uv workspace` feature is usable at the scale of 100+ projects bound together (and work with the `uv` team to fix it if not)
4. Apply the automation to all of our providers, rather quickly but incrementally, while letting all contributors with in-progress work know about the changes upfront and explaining what needs to be done

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
