Hello everyone,

I have an almost-ready PR (I still need to fix the doc building and add some
more tests / docs for local development environments) for the first step of
separating providers into sub-projects inside our mono-repo. This was
discussed and agreed a long time ago, and I had a POC for it some years
ago, but only recently, with `uv workspaces`, could we make progress on it
(the tooling finally caught up with what Airflow needs). The `uv
workspaces` were implemented by the Astral team after discussing with us
how they should be implemented so that they could be used in Airflow.

PR here: https://github.com/apache/airflow/pull/45259

Only one provider, "airbyte", is moved in this PR.

The way it is implemented is that breeze and CI support both "old style"
and "new style" providers at the same time. I want to move two or three more
providers that are a little more complex, but once that is completed we
should be able to move all providers one-by-one relatively quickly. I would
love the usual involvement of others: I will create a script to move things
mostly automatically, but there will likely be some small things to fix in
each provider, so it is better to do it one-by-one and solve a smaller
number of problems at a time. During the move, all regular processes
(including all CI builds and releasing packages) should work as "usual" -
relevant breeze commands are converted to support both cases automatically.

Once completed, we should be able to remove the complex-ish code for
"old-style" providers. There are also a few next steps - rearranging
how we use workspaces for `tests_common` and the final split of airflow core
into multiple packages (and likely moving the airflow code to a "src"
subdirectory of the project) - but those should be done later, once we agree
how exactly we should split the packages. Currently you can't yet run
provider tests in the provider "standalone", for example because of the
`tests_common` dependency (they have to be run as part of the "airflow"
project). We might also further reduce the data kept in provider.yaml (in
this PR dependencies are moved from provider.yaml to the provider's
pyproject.toml). There are a few more cleanups there, but it's best to move
the providers first and then do the next steps.

The issue in the DEV/CI project for this work:
https://github.com/apache/airflow/issues/44511

Here is - in general - the new provider directory structure: in effect,
each provider is a separate "standard" python project, so we will not have
to copy files around when building directories - each provider will be just
another provider package.

providers
        |- PROVIDER_ID
        |            |- src
        |            |    |- airflow
        |            |            |- providers
        |            |                       |- PROVIDER_ID
        |            |- tests
        |            |      |- providers
        |            |                 |- PROVIDER_ID
        |            |- docs
        |            |     |- .latest-doc-only-changes.txt
        |            |- pyproject.toml
        |            |- CHANGELOG.rst
        |            |- provider.yaml
        |            |- README.rst
        |- PROVIDER_ID2
        ...
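
To make the layout above concrete, here is a minimal sketch of a check that
a provider directory follows it. This is not code from the PR or from
breeze: the function names are made up for illustration, it assumes a
simple single-segment provider id such as "airbyte", and treating "has its
own pyproject.toml" as the marker of a "new style" provider is my
assumption, not something the PR states.

```python
from pathlib import Path

# Top-level entries expected in each new-style provider, taken from the
# tree above (docs/.latest-doc-only-changes.txt is left out for brevity).
EXPECTED_ENTRIES = (
    "src",
    "tests",
    "docs",
    "pyproject.toml",
    "CHANGELOG.rst",
    "provider.yaml",
    "README.rst",
)


def expected_paths(providers_root: Path, provider_id: str) -> list[Path]:
    """Return the paths a new-style provider is expected to contain."""
    base = providers_root / provider_id
    paths = [base / entry for entry in EXPECTED_ENTRIES]
    # Nested source and test packages, as shown in the tree.
    paths.append(base / "src" / "airflow" / "providers" / provider_id)
    paths.append(base / "tests" / "providers" / provider_id)
    return paths


def missing_paths(providers_root: Path, provider_id: str) -> list[Path]:
    """List the expected paths that do not exist on disk."""
    return [p for p in expected_paths(providers_root, provider_id) if not p.exists()]


def is_new_style(providers_root: Path, provider_id: str) -> bool:
    """Hypothetical marker: a provider with its own pyproject.toml is new style."""
    return (providers_root / provider_id / "pyproject.toml").is_file()
```

For example, `missing_paths(Path("providers"), "airbyte")` run from the
repository root would list whatever is still missing for that provider, and
an empty list would mean the directory matches the structure above.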

Looking forward to reviews and merging it soon.

J.
