That's hardcore (pun intended) :D
Great work and good luck merging it!

On Fri, Mar 14, 2025 at 9:28 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Hey here,
>
> I have a first PR (very much a draft that still requires a number of
> changes) for the final step of the big refactoring of our projects to use
> a workspace. This is to let you know about the coming changes (so please
> take a look at the consequences so that you are not surprised).
>
> This is the most *scary* one -> moving all airflow code to
> "airflow-core". And I have a draft version of it in
> https://github.com/apache/airflow/pull/47798
>
> And it's not for the faint of heart :)
>
> [image: image.png]
>
> Note! It's not yet complete and unless you have some general comments,
> it's likely not worth pointing to individual changes (yet) - it's more to
> take a look at what things will look like eventually. I will work over the
> next two days to get it to a reviewable state, and will keep it rebased and
> running till the middle of next week. I would like to have it ready
> (including the release process) for the fourth (and final?) beta.
>
> Some resulting packaging changes:
>
> *FOR DEVELOPMENT:*
>
> * the pyproject.toml in the "root" of Airflow is still the "apache-airflow"
> package - but this will be an empty "meta" package that will install
> "apache-airflow-core" and "apache-airflow-task-sdk" together, and optionally
> providers (via extras)
>
> * the airflow-core is a new "apache-airflow-core" distribution, where only
> airflow dependencies and airflow "core" extras are configured (smtp, otel,
> pandas, rabbitmq etc.) - I will likely clean up some of those as well, as
> some of them are not needed. The nice thing is that this package has all
> dependencies static (no hatch_build.py - everything is in pyproject.toml) -
> which is pretty cool and allows us to better use dependabot for security
> upgrades and notifications
>
> The airflow-core structure is pretty standard:
>
> airflow-core  # <- this is the folder where the airflow-core distribution is
>             \- src
>             |     \- airflow # <- This is the airflow package
>             |             \- api
>             |             |- api_fastapi
>             |             |- assets
>             |             ...
>             |- tests
>             |       \- always
>             |       |- api
>             |       ...
>             |- docs
>             |
>             |- pyproject.toml
>             |- README.md
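
A quick way to confirm that a checkout matches the layout above is a small
shell check. This is only a sketch: `check_core_layout` is a hypothetical
helper (not an Airflow script), and the paths are the ones from the tree in
this message, which may still change while the PR is a draft.

```shell
# Hypothetical helper: verifies that a checkout contains the airflow-core
# layout described in this message. Prints each missing path and returns
# non-zero if anything is missing.
check_core_layout() {
    root="${1:-.}"   # checkout root, defaults to the current directory
    missing=0
    for path in airflow-core/src/airflow \
                airflow-core/tests \
                airflow-core/pyproject.toml; do
        if [ ! -e "$root/$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    return "$missing"
}
```

Run it from the top of the repository as `check_core_layout .`; no output
and a zero exit status mean the three paths are present.
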
>
>
> * for development - I will describe the `pypi` way later, but with `uv`
> things get simpler and we have a few new options (Dennis - this is a
> continuation of the discussion on the uv sync commands, so it's worth
> looking at closely):
>
> There are a number of ways you will (eventually) be able to interact with
> the venv. After you check out Airflow, you can change the working directory
> and work on different packages, and depending on the directory in which you
> run `uv sync`, uv (using the workspace feature) will sync the **expected**
> dependencies.
>
> It's best to get used to the fact that instead of one airflow project we
> will have ~100 pretty independent projects. While you can continue
> working with all of them as a single huge "workspace", it is generally way
> more convenient to change directory to the "distribution" you are currently
> working on and do everything there, with an isolated set of dependencies
> required only for that "distribution". "airflow-core", "task-sdk",
> "providers/amazon", "providers/mongo" - those are all separate
> distributions, and more and more we will be able to treat them as
> independent projects. We will conveniently keep the option to develop
> and run tests in a joined "workspace" environment at the top of the project,
> where we can install and test everything together - that's a bit of `uv
> workspace` magic in play.
>
> Here are typical patterns:
>
> 1) Installing all development dependencies for everything (i.e. a complete
> environment like in breeze) - allows you to run all tests for all of
> airflow and all providers
>
> cd .
> uv sync --all-packages
>
> 2) installing just airflow core with required dependencies (ready for most
> core tests)
>
> cd airflow-core
> uv sync
>
> 3) installing airflow core with optional dependencies (should allow you to
> run all core tests - including those for optional core features such as
> otel).
>
> cd airflow-core
> uv sync --all-extras
>
> 4) installing individual provider dependencies (say amazon) - this allows
> you to run all tests of the provider you are working on, including
> installing everything needed for cross-provider dependencies (i.e. if you
> have google tests in the amazon provider, it will also install the necessary
> google dependencies).
>
> cd providers/amazon
> uv sync
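
The four patterns above can be collected into one small dry-run helper.
This is a sketch under assumptions: `sync_plan` and its target names
(`all`, `core`, `core-extras`) are made up for illustration and are not
part of Airflow or uv; only the directories and the `uv sync` flags come
from the list above. It just prints the command so you can review it first.

```shell
# Hypothetical helper: prints the directory change and `uv sync` invocation
# for one of the four patterns described in this message. Nothing is
# executed; the command is only echoed.
sync_plan() {
    target="${1:-all}"
    case "$target" in
        all)         echo "cd . && uv sync --all-packages" ;;        # pattern 1
        core)        echo "cd airflow-core && uv sync" ;;            # pattern 2
        core-extras) echo "cd airflow-core && uv sync --all-extras" ;; # pattern 3
        providers/*) echo "cd $target && uv sync" ;;                 # pattern 4
        *)           echo "unknown target: $target" >&2; return 1 ;;
    esac
}

sync_plan core               # prints: cd airflow-core && uv sync
sync_plan providers/amazon   # prints: cd providers/amazon && uv sync
```

Keeping it as a dry run is deliberate: the exact extras to install per
distribution are still being worked out, so it is safer to look at the
command before running it from the top of the checkout.
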
>
> Generally speaking - "airflow-core" will become (eventually) a truly
> airflow-only distribution. It will have a few dependencies on the "standard"
> and "fab" providers - but I hope we will be able to get rid of those during
> the resulting cleanup.
>
> The IDE (IntelliJ) setup will just require "airflow-core/src" and
> "airflow-core/tests" to be marked as source/test roots, as is usual for
> other distributions.
>
> I will update the docs after I complete the PR. There are some small
> variations on when to install which extras, and I will experiment a bit to
> get to the best developer experience with the fewest surprises.
>
> *FOR USERS*
>
> For "installable" airflow (i.e. the user's experience) - the changes will be
> pretty much 100% transparent. When a user installs "apache-airflow" or
> "apache-airflow[google]" - things will work as they did before - only
> instead of one "apache-airflow" distribution, they will have
> "apache-airflow", "apache-airflow-core" and "apache-airflow-task-sdk"
> installed.
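
To see the split from the user's side, one could list which of the three
distributions pip reports in the active environment. This is a sketch, not
an official tool: the function names are hypothetical, the distribution
names are the ones given in this message, and it assumes `python -m pip` is
available (anything pip cannot find is simply reported as not installed).

```shell
# Distribution names as listed in this message.
expected_distributions() {
    printf '%s\n' apache-airflow apache-airflow-core apache-airflow-task-sdk
}

# Hypothetical check: reports, for each expected distribution, whether
# `pip show` can find it in the currently active environment.
report_installed() {
    for dist in $(expected_distributions); do
        if python -m pip show "$dist" >/dev/null 2>&1; then
            echo "$dist: installed"
        else
            echo "$dist: not installed"
        fi
    done
}
```

Calling `report_installed` after installing "apache-airflow" should, if the
packaging works as described, show all three lines as installed.
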
>
> Regarding version numbers etc., I will start a separate discussion later
> next week, after we see how those packages will interact ("apache-airflow"
> will only contain extras, but for compatibility reasons we likely want to
> pin "apache-airflow" and "apache-airflow-core" to each other, so that
> users will be able to upgrade "core" by upgrading "apache-airflow" - we
> likely do not want to change those habits).
>
> The "apache-airflow-task-sdk" will be versioned separately.
>
> Please take a look - also at the PR - and see if you have any big
> issues/questions/doubts. Let's start the discussion here - I am happy to
> answer all general questions and adapt the PR to respond to
> questions/suggestions.
>
> In the meantime I will be working on making the PR green and adding
> missing bits and pieces for the release process.
>
> J.
>
>
>
>
