potiuk commented on PR #43001:
URL: https://github.com/apache/airflow/pull/43001#issuecomment-2412632076
Yeah, that's expected and it should fix itself when we merge to main (you
will be able to see it in `canary` builds).
The relevant piece of output is:
```
#48 4.743 Installing airflow from main. It is used to cache dependencies
#48 4.743
#48 4.745 + curl -fsSL https://github.com/apache/airflow/archive/main.tar.gz
#48 4.745 + tar xz -C /tmp/tmp.0VWy67lki1 --strip 1
#48 6.344 + uv pip install --python /usr/local/bin/python --editable '/tmp/tmp.0VWy67lki1[devel-ci]'
```
If you look closely - it downloads airflow from `main` and then installs it
locally. This is a "smart" way of caching - we are downloading the
"main" version so that we can pre-install packages without invalidating the
docker image when dependencies of airflow change.
The way docker layers work makes it difficult to cache dependencies from
Python - because you first need to copy the dependency specifications
(pyproject.toml, hatch_build.py, provider.yaml files) or airflow sources to the
image in order to perform the installation.
As an example (simplified):
```
1# COPY pyproject.toml src .
2# uv pip install .
```
The thing is that when `pyproject.toml` changes, the `1#` layer gets
invalidated (which also invalidates layer `2#`) - and it means that EVERY TIME
`pyproject.toml` changes, we need to reinstall the whole of airflow from
scratch (because the `2#` layer holds the "installed airflow" and it gets
invalidated).
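The chain reaction described above can be sketched with a toy model. This is only an illustration, not how Docker actually computes cache keys: the `layer_keys` and `simple_build` helpers, the `base-image` string and the dependency strings are all made up for the example. The only assumption carried over from Docker is that a layer's key depends on its parent layer, its instruction and (for `COPY`) the copied content.

```python
import hashlib


def layer_keys(steps, parent="base-image"):
    """Return one cache key per step; each key chains on the previous one."""
    keys = []
    for instruction, inputs in steps:
        payload = f"{parent}|{instruction}|{inputs}".encode()
        parent = hashlib.sha256(payload).hexdigest()
        keys.append(parent)
    return keys


def simple_build(pyproject_content):
    """Hypothetical two-layer build matching the simplified example."""
    return layer_keys([
        # 1#: the key of a COPY step depends on the copied file content
        ("COPY pyproject.toml src .", pyproject_content),
        # 2#: the install step has no inputs of its own, only its parent layer
        ("RUN uv pip install .", ""),
    ])


before = simple_build("dependencies = ['asgiref']")
after = simple_build("dependencies = ['asgiref', 'some-new-package']")

# True means the layer has to be rebuilt
invalidated = [old != new for old, new in zip(before, after)]
print(invalidated)  # [True, True]
```

Even though the install instruction itself did not change, its key changes because its parent did - so the full installation runs again.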
There are various strategies to cope with it - most of them rely on the `pip`
or `uv` cache, or on cache mounts for local builds:
https://docs.docker.com/build/cache/optimize/#use-cache-mounts
But airflow is a BEAST. The uv cache almost doubles the size of our image
(2GB -> 4GB) because the `uv` cache is huge and is optimized for speed, not
for size. The "cache mounts" only work for local builds and it takes ~6
minutes or so (less than with `pip` but still substantial) to install airflow
for the first time locally into the cache - also such a local cache has some
edge cases when it needs to be invalidated etc.
Instead we are using a remote cache
https://docs.docker.com/build/cache/optimize/#use-an-external-cache - basically
our images, when they are built locally, use `--cache-from` - and our CI builds
and uploads the cache to ghcr.io (with `--cache-to`).
This way the cache is refreshed every time `main` is green, and anyone who
builds breeze image locally will use that cache.
And the "download main archive + install it" step will generally prepare a
`base` installation. This layer is not invalidated for quite some time (usually
only when a new python base image is released, or apt dependencies are
changed). Until then it provides a "base" cache layer, which is not
invalidated when pyproject.toml is added afterwards:
```
1# curl -fsSL https://github.com/apache/airflow/archive/main.tar.gz |
tar xz -C /tmp/tmp.0VWy67lki1 --strip 1 &&
uv pip install --python /usr/local/bin/python --editable
'/tmp/tmp.0VWy67lki1[devel-ci]'
2# COPY pyproject.toml src .
3# uv pip install --python /usr/local/bin/python --editable .
```
In this scenario:
* The `1#` layer gets refreshed every few weeks -> when the python base image
changes. It does not get invalidated when pyproject.toml or src changes. This
layer is pulled (rather quickly compared to installation) from ghcr.io when
`--cache-from` is used during `breeze ci-image build`.
* The `2#` layer gets invalidated when `pyproject.toml` or `src` change
(`3#` is also invalidated as it follows `2#`). But then `uv pip install` already
has **most** packages installed in `1#` - so `uv pip install`
generally will only incrementally install whatever changed in `pyproject.toml`
(and `hatch_build.py` and `provider.yaml` in our case).
This means that in most cases rebuilding images takes < 1 minute (and in some
cases under 20 seconds) when sources or pyproject.toml change. This saves
enormous build time for CI and wait time for developers using breeze.
That's why this step currently still installs `asgiref` from main. But this
will change once we merge this change to main (and we will again install things
from main).
Now - I think the caching is currently slightly broken after the "providers"
move (that's why you see it in the first place) - I am going to take a look at
it shortly https://github.com/apache/airflow/issues/42999