One more update - I am still looking at this and fine-tuning things, and I will have a few more changes coming.
I found out that we were still using `pip` for constraints generation (those are the constraints that our users use). I switched that step to `uv`, and it now takes 30 seconds instead of more than 5 minutes, which is more than a 10x improvement. As a bonus, we get all-canonical PyPI names back, because I also switched to `uv pip freeze`, and `uv` nicely canonicalizes all the generated constraints.

With https://github.com/apache/airflow/pull/37754 I am also switching to the new 0.1.11 version of `uv`, which has some bug fixes and new features. This PR also adds an upgrade check that will tell us when new versions of `pip` and `uv` are available (by failing the canary build job).

J.

On Tue, Feb 27, 2024 at 7:49 PM Oliveira, Niko <oniko...@amazon.com.invalid> wrote:

> Fantastic results!
>
> > It also means that if you've been using breeze and were sometimes
> > afraid to hit "y" to rebuild the image, being afraid that it will take
> > 20 minutes or so - not any more. It should be WAY faster now.
>
> I'm very excited about this speed up as well as our CI :)
>
> ________________________________
> From: Jarek Potiuk <ja...@potiuk.com>
> Sent: Tuesday, February 27, 2024 2:44:14 AM
> To: dev@airflow.apache.org
> Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering trying
> out uv for our CI workflows
>
> Summarising where we are:
>
> After ~24 hrs of operations, it looks really cool and fulfills (and
> actually exceeds) all my expectations.
>
> * Multiple PRs succeeded, and we got quite a few constraints updated
>   automatically after successful canary runs:
>   https://github.com/apache/airflow/commits/constraints-main/ (and they
>   look perfectly fine - pretty much what I'd expect)
> * I looked through a number of image builds in "canary" runs, and the
>   regular 10-12 minute build-image jobs are down to 3-4 minutes
> * I just ran an experiment: on my machine I built a complete CI image
>   from scratch with new dependencies for breeze (with `breeze ci image
>   build --python 3.9 --docker-cache disabled
>   --upgrade-to-newer-dependencies`) and compared it with the v2-8-test
>   branch, where we do not have the change applied yet
>
> Results (on my desktop machine: 16 cores, network 1Gb download and a very
> fast disk):
>
> * v2-8-test: 730 s -> *12 minutes*
> * main: 227 s -> less than *4 minutes (!)*
>
> That's roughly 70% (!) less time, or about 3.2x faster. This is a complete
> full rebuild of the image, including installing all dependencies from
> scratch and attempting to upgrade them to the latest compatible versions.
> That is the WORST case. Of course it will vary, depending on your network
> speed and number of CPUs (unlike `pip` (for now), `uv` heavily uses
> parallelism, both for downloads and installation, and that is one of the
> reasons why the difference is so huge). I'd love to hear the results of
> such comparisons from others with different machines/networking/disks, to
> get a few more scientific data points.
>
> It also means that if you've been using breeze and were sometimes afraid
> to hit "y" to rebuild the image, being afraid that it will take 20 minutes
> or so - not any more. It should be WAY faster now.
>
> I will also soon attempt to use `--resolution lowest` and see if we can
> put some nice automation in place to bump our min-versions to the
> "actually working" versions - for all our extras.
> That would be a major win for our users, as there would never again be a
> case where they upgrade airflow to a newer version and some old
> dependency remains that is not compatible. It does not happen often,
>
> Seeing the speed difference, I am now going to use `uv pip` regularly for
> local installations as well. It should save a LOT of time, especially if
> you have multiple environments: it keeps a single cache for all your
> installed packages (and their metadata). This means that if you have
> several virtualenvs and switch between them, installing and reinstalling
> packages across those virtualenvs should be lightning fast (single
> seconds rather than tens of seconds for the smallest installation). I'd
> heartily recommend it to anyone.
>
> Let's see about the stability. I know there are a few edge cases that are
> not handled well - Damian helpfully pointed out the "apache-airflow[all]"
> case, which is currently problematic - so I will keep an eye on new
> versions and fixes. (In our CI we are currently pinned to 0.1.10, so we
> are shielded from any potential stability problems, and we will need to
> manually upgrade to newer versions when they appear.)
>
> J.
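
For anyone who wants to sanity-check the numbers in this thread, or compare them with timings from their own machine, here is a minimal sketch of the arithmetic. The function names and the `MEASUREMENTS` table are made up for illustration; the 730 s and 227 s figures are the ones quoted above, and "more than 5 minutes" is assumed to mean a lower bound of 300 s.

```python
def speedup(before_s: float, after_s: float) -> float:
    """Return how many times faster the new run is (before / after)."""
    if before_s <= 0 or after_s <= 0:
        raise ValueError("durations must be positive")
    return before_s / after_s


def time_saved_pct(before_s: float, after_s: float) -> float:
    """Return the share of the original wall-clock time that was removed, in percent."""
    if before_s <= 0 or after_s <= 0:
        raise ValueError("durations must be positive")
    return (before_s - after_s) / before_s * 100.0


# Timings in seconds, taken from the messages in this thread.
MEASUREMENTS = {
    "CI image rebuild (v2-8-test vs main)": (730, 227),
    # Assumption: "more than 5 minutes" is treated as a 300 s lower bound.
    "constraints generation (pip vs uv)": (300, 30),
}

if __name__ == "__main__":
    for label, (before, after) in MEASUREMENTS.items():
        print(
            f"{label}: {speedup(before, after):.1f}x faster, "
            f"{time_saved_pct(before, after):.0f}% less time"
        )
```

Keeping "times faster" and "percent of time saved" as separate functions avoids mixing the two up: going from 730 s to 227 s removes about 69% of the time, which is the same thing as being about 3.2x faster. Replace the tuples with your own measurements to add a data point.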