potiuk opened a new issue, #44511:
URL: https://github.com/apache/airflow/issues/44511

   Each provider should have its own independent sub-project in our monorepo.
   
   They should be fully "standalone", so that you can not only develop them 
completely independently from airflow core, but also so that all of their 
dependencies are stored in their own pyproject.toml. The `uv workspace` 
feature should be used to bind together all the provider sub-projects so that 
you can continue development with airflow, task-sdk, providers 
and all the other sub-projects of the Airflow monorepo together in a single 
editable environment.
   
   This move was attempted earlier in 
https://github.com/apache/airflow/pull/28292 and 
https://github.com/apache/airflow/pull/28291, but we did not have a good 
"workspace" solution, and the Airflow 2 namespace approach prevented us from 
making it a good environment for provider development. With Airflow 3 and the 
`uv workspace` feature, which was added largely with our input so that 
Airflow's provider structure could benefit from it, this is now 
entirely possible.
   
   This means that dependencies should be moved from `provider.yaml` files to 
`pyproject.toml`.
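   
   As a rough illustration of what such a move could look like, the sketch 
below renders a `[project]` `dependencies` array from an already-parsed 
`provider.yaml` mapping. The `dependencies` key, the `render_dependencies` 
name and the sample requirement strings are assumptions made for illustration; 
this is not the actual conversion tooling.

```python
def render_dependencies(provider_info: dict) -> str:
    """Render a TOML ``dependencies`` array from a parsed provider.yaml mapping.

    The ``dependencies`` key is an assumption about the provider.yaml layout.
    """
    requirements = provider_info.get("dependencies", [])
    lines = ["[project]", "dependencies = ["]
    for requirement in requirements:
        # Escape backslashes and double quotes so the TOML basic string stays valid.
        escaped = requirement.replace("\\", "\\\\").replace('"', '\\"')
        lines.append(f'    "{escaped}",')
    lines.append("]")
    return "\n".join(lines) + "\n"


# Hypothetical input, as if it had been loaded from a provider.yaml file.
sample = {"dependencies": ["apache-airflow>=2.9.0", "boto3>=1.34.0"]}
print(render_dependencies(sample))
```

   A real conversion would also need to carry over optional dependencies and 
the rest of the package metadata, which this sketch leaves out.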
   
   The ideal setup there is to have this kind of structure (details to be 
worked out):
   
   ```
   providers/
            providers-amazon/
                           src
                           tests-integration
                           tests-system
                           tests
                           docs
                           pyproject.toml
                           ...
   ```
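   
   With a layout like the one above, tooling could discover provider 
sub-projects simply by looking for `pyproject.toml` files. A minimal sketch, 
assuming every provider sits directly under `providers/` (the 
`find_provider_projects` helper is hypothetical, not existing breeze code):

```python
from pathlib import Path


def find_provider_projects(repo_root: Path) -> list[str]:
    """Return repo-relative paths of provider sub-projects.

    A directory counts as a provider sub-project if it sits directly under
    ``providers/`` and contains its own ``pyproject.toml``.
    """
    providers_dir = repo_root / "providers"
    if not providers_dir.is_dir():
        return []
    return sorted(
        child.relative_to(repo_root).as_posix()
        for child in providers_dir.iterdir()
        if (child / "pyproject.toml").is_file()
    )


if __name__ == "__main__":
    # Print the provider sub-projects found under the current directory.
    for project in find_provider_projects(Path.cwd()):
        print(project)
```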
   
   Some important properties of the solution:
   
   * Airflow "core" projects should not rely on providers being installed
   * It should be possible to install all airflow core packages and providers 
and synchronize/resolve them with `uv sync`
   * It should be possible to install a provider in `--editable` mode, treating 
it as a separate project from the workspace
   * It should be possible to install a provider from a GitHub URL
   * docs, all kinds of tests, images etc. should all work independently 
(though thanks to the monorepo, we can keep the code to run those in `breeze` 
as we do currently). Some of the current `doc` code will likely need to be 
moved to breeze as well for that
   * we should be able to apply pyproject.toml changes to all providers 
automatically (might be semi-automated or done with pre-commit). Quite often we 
make "global" changes there that affect all providers, and currently this is 
done by modifying breeze and the templates for the dynamically generated 
pyproject.toml files
   * we need to keep reproducibility of provider packages intact - which likely 
means that they should be still generated with breeze - with all the "extra" 
stuff such as making sure we have controlled package build environment.
   * we will need to change how packages are built in CI in the `docker 
container` environment: currently we use `flit` as the build backend, and this 
comes from the generated `pyproject.toml` that is placed inside breeze. If we 
keep pyproject.toml files in the repository, incoming PRs from forks might 
change build backends and thus inject arbitrary code into our build process
   
   The most likely way to implement it is to:
   
   1. manually convert one or a few representative, but not the biggest, 
providers first (POC) and make a few releases with those, while updating our 
breeze automation to work in both cases. That will allow us to iron out some 
teething problems
   2. develop automation for converting the providers, similar to 
https://github.com/apache/airflow/pull/28291
   3. test whether the `uv workspace` feature is usable at the scale of 100+ 
projects bound together (and work with the `uv` team to fix it if not)
   4. apply the automation, rather quickly but incrementally, to all of our 
providers, while letting all the in-progress contributors know about the 
changes upfront and explaining what needs to be done
   