Hello here,

It took a bit longer than I thought.

Painting the corners of the house and all the small cabinets is sometimes
difficult: it requires a very diverse set of painting tools, a bit of
finesse, and cleaning up things here and there that you discover while
repainting. You can even discover a few dead bodies in the closet (I did)
and you need to get rid of them.

But I think I am quite close to making the PR green (speaking of colors):
https://github.com/apache/airflow/pull/48223 - it also includes a
newsfragment describing the "user-facing" changes that splitting the
airflow distributions and extras brings.

We now have 4 more "internal" distributions in our code - all with their
own `pyproject.toml` and requirements - and you can easily switch
between them with `cd distribution`, `uv sync`.

* docker-tests
* kubernetes-tests
* helm-tests
* dev

One (and only one) backwards-incompatible change introduced so far in the
installation process is that the `leveldb` extra is now removed from both
`apache-airflow` and `apache-airflow-core`, and the way to enable it is to
install `apache-airflow-providers-google[leveldb]` - which I think makes
more sense and is a very small incompatibility. Also, we now have very
clearly defined `[all]` and `[all-core]` extras, which should be nice for
CI/testing etc. for our users. All the other extras are gone and either
moved to internal distributions or linked via dependency groups.

I still (as usual) have to wrestle a bit with sphinx speaking in riddles, and
I am waiting for the current providers release to unblock "pip constraints"
and some small dev dependencies setup.

The PR is - of course - huge (it's hard to make it smaller, but a
lot of it is just moving files) - and I tried to be as detailed as possible
in describing what's in it in the description (in short, a lot of small
things that had to be fixed).

This is **still** not the final setup - and we will all continue iterating
on it over the next weeks and months - even after we release Airflow 3.

There are still some TODOs and improvements to be done - and I also
described in the PR what is NOT yet done and deferred for later - I prefer
to solve any teething problems this week (assuming that we will merge it
early this week) and iterate on some of the other things later.

For example, I hope I will be able to make "doc" generation much easier even
before we have to make some of the airflow 3 documentation updates, along
with a few other things that might make it easier to run tests locally.
Eventually I want to push the vast majority of the tests down to be
runnable both in CI (for reproducibility) and in a local venv, so that you
will not need the breeze CI image as often as before - but that requires
switching to the `uv.lock` mechanism (this is coming right after, hopefully).

So if you are tempted to comment "let's do this and that" - first consult
the description of the PR / commit message - and see if what you propose is
not already planned.

The ultimate goal I have is that each distribution we have in the monorepo
is **really** isolated and only uses the other distributions via
pyproject.toml dependencies - we are not there yet, but it's getting
closer.

Once we reach that goal, moving the distributions should take hours
rather than weeks (as it currently does).

So TP - if you still want to have a discussion about moving distributions
around and repainting our house again, we will be able to do it some time
later much more easily (even after we release airflow 3.0). I continue
to suggest raising that discussion on the devlist - but later,
please: it is already taking a lot of time to implement and test things now
while "small" changes are being made in our setup.

J.


On Thu, Mar 27, 2025 at 4:51 PM Jarek Potiuk <ja...@potiuk.com> wrote:

> Hello,
>
> The lazy consensus has been reached.
>
> I have a PR [1] that I hope to make green soon - it moves a number of
> (mostly tests) files around and creates a few more internal distributions -
> all dependencies are moved to the corresponding pyproject.toml files and a
> lot of custom code to create environments to run our tests is going to be
> simplified, essentially making it into a pattern of:
>
> cd DISTRIBUTION (say `cd kubernetes-tests`)
> uv run pytest
>
> I am also moving the `docs` code to `devel-common` with the ultimate goal
> of having specific configuration per providers, airflow-core, helm chart
> and others to be "isolated". We are not yet there - but I did not want to
> make too many changes at once, but the next step will be to separate those
> projects and with the power of `uv` build the docs for each package locally
> - without having to do it all in the image and from within the
> "distribution". I hope to be able to complete it next week. It does require
> some untangling of the interconnected test and doc harness, but maybe (let's
> see) I will finally be able to isolate it enough to be able to iterate on
> the docs and rebuild them mostly on the fly while you are working on them.
> That's a bit of side effect of all the uv/standard distribution approach -
> and this splitting is a necessary step to make it happen.
>
> My personal goal is to make it isolated enough and simple enough to move
> distributions (i.e. folders) around that once we release airflow 3, or even
> before, we will be able to finally have the discussion here: "Ok, we have now
> split and isolated everything and maybe we want to move things around to
> make it even better-structured" and - yes - maybe we will - together - figure
> out some better consistency patterns.
>
> Maybe then repainting the interiors will be - indeed - an afterthought and
> something that we will be able to do with a flick of a finger. We are not
> yet there, but I am working to get us there.
>
> https://github.com/apache/airflow/pull/48223
>
> J.
>
>
> On Mon, Mar 24, 2025 at 12:36 PM Jarek Potiuk <ja...@potiuk.com> wrote:
>
>> Also: The lazy consensus will be reached if no objections are raised in
>> 72 HRs: 1pm CET, 27 March, 2025
>> https://www.timeanddate.com/countdown/generic?iso=20250327T13
>>
>> On Mon, Mar 24, 2025 at 10:53 AM Jarek Potiuk <ja...@potiuk.com> wrote:
>>
>>> Hello here,
>>>
>>> *TL;DR: Following the
>>> discussion https://lists.apache.org/thread/88yj3qxqdmc4ony7k8nvp292m28df31c I am
>>> calling for a lazy consensus on using only `uv` for local development of
>>> Airflow.*
>>>
>>> This adds a bit of reliance on the VC-backed, Astral-owned (but fully
>>> open-sourced) `uv`, which has some risks, but we think the risks are small.
>>> It's only for airflow development - we will keep the constraint mechanism for
>>> our users, so that their workflows will not need to change the frontend
>>> they use to install airflow (both `pip` and `uv` support our
>>> constraints; `poetry` and others install airflow without constraints and
>>> have to manage their own limits - that does not change).
>>>
>>> A bit more context and summary of the discussion follows.
>>>
>>> I also have a half-ready change that performs all the removals and I
>>> have a bit more info on what's going to happen when we go `uv` only.
>>>
>>> ---------
>>>
>>> More context:
>>>
>>> Here is what we will get if we go `uv` only:
>>>
>>> Good:
>>>
>>> * currently we support `uv` and `pip` (or more generally "any
>>> PEP-standard frontend") workflows
>>> * however, some uv features (particularly `workspaces` and `uv.lock`)
>>> that are not yet standard make our development workflows very easy
>>> to follow (AKA uv provides much better "Contributor journey optimisation")
>>> * by focusing on "one way to do things" with uv we will be able to
>>> simplify a lot of contributor experience but also a lot of our packaging
>>> code and configuration:
>>>   - we will be able to get rid of the dynamic  pyproject.toml
>>> dependencies around preinstalled providers
>>>   - we will be able to get rid of hatch_build.py in the main repo
>>> completely
>>>   - we will only leave `hatch_build.py` in the "airflow-core" for custom
>>> build step compiling assets and performing git version injection (but this
>>> is purely optional and used only when releases are prepared)
>>>   - we will be able to get rid of generated_provider_dependencies.json
>>> prepared by pre-commit - it was needed for hatch_build.py mainly and we can
>>> replace its use completely by directly reading pyproject.toml files in
>>> pre-commit and build scripts.
>>>   - that in turn will enable us (eventually - that is another change) to
>>> switch to a dependabot-only driven process of upgrading our dependencies and
>>> use `uv.lock` for local development, avoiding the "works for me" syndrome -
>>> this will make it more controllable when things are going to break main and
>>> our "responses" will be based on Dependabot PRs rather than notifications of
>>> the canary build breaking, which will make it - I think - much easier to handle.
>>>
>>> * uv is true open-source - i.e. OSI-compliant licences (uv is
>>> dual-licenced with Apache 2 and MIT)
>>>
>>> Somewhat concerning (but acceptable risk it seems):
>>>
>>> * Those features we rely on are `uv-only` for now. The uv team is
>>> committed to support standards and actively participate in all the PEP
>>> packaging discussions - including reproducible installation, dependency
>>> groups and others (workspaces in the future most likely) - which means that
>>> in the future other tools will catch up and we will be able to follow
>>> standards when they are approved and implemented.
>>>
>>> * unlike other packaging tools (under Python Software Foundation -
>>> Python Packaging Governance) - `uv` is privately owned, by VC-backed
>>> Astral. They are great open-source stakeholders and understand the value of
>>> `uv` being open-source and we personally know the team and they have very
>>> good intentions, no doubt about it. But as various events have shown in the
>>> past - change-of-control events might influence "openness", so we cannot be
>>> sure what the future brings.
>>>
>>> * However, being "true OSS" - `uv` can be forked by anyone if anything
>>> bad happens. Also it does not impact our users - just contributing
>>> workflow, so the risks are small and mitigable (like freezing tooling and
>>> working on a solution in the meantime).
>>>
>>>
>>> J.
>>>
>>>
>>>
