Of course the PR is not yet complete - there are a few things to fix there for the special cases we have.

On Sun, Feb 25, 2024 at 12:33 AM Jarek Potiuk <[email protected]> wrote:

> Hello here.
>
> I have a PR https://github.com/apache/airflow/pull/37683 that implements:
>
> * the ability to choose either uv or pip when building our images
> * CI images are built with uv by default (but you can use the `--no-use-uv` flag to switch back to `pip`)
> * PROD images are built with pip by default (but you can use the `--use-uv` flag to switch to uv)
>
> The preliminary tests show that uv not only has a much faster baseline, but its use of caching also fits extremely well into our strategy of building images, and we will get huge improvements in our CI build timing when using uv.
>
> Just for context - our CI images, when built, use a caching strategy to optimise for:
>
> 1) fast building when there are no changes (around 1 minute to build with pip),
> 2) slower building when someone adds or modifies a non-conflicting dependency (around 8 minutes to build, of which ~6m is pip resolution and installation),
> 3) much longer build time when there are conflicting dependencies, or when we change the Dockerfile or scripts, or when the Python base image changes (around 27 minutes to build, of which pip resolving is ~20m).
>
> Those are all `pip` numbers. Currently `pip` does not use resolution caching between the steps. A comparison of some basic installation steps from initial tests shows that uv is way faster:
>
> * Resolving and installing airflow with [devel-ci] (610 dependencies): pip ~6m, uv ~1m 30s
> * Re-resolving and reinstalling [devel-ci] using local pyproject.toml: pip ~4m (cache is not used), uv ~4s (!!!!) - because the cache is used in this case.
>
> I have not yet tested well (but I will once they happen) the --eager upgrade of dependencies (pip - it very much depends, but it's often in the range of 10 minutes) - I expect it not to take more than 2-3 minutes with uv.
>
> So overall it looks like we are looking at these improvements:
>
> 1) Regular builds with no dependency changes: pip ~1m, uv ~1m (because we are using docker layer caching, and pip resolution and installation are not used at all)
> 2) Updating dependencies: 8m with pip will probably go down with uv to ~3m 30s => 60% improvement, and in many cases to ~2.5m when there are no remote changes and the cache is used (70% improvement)
> 3) Re-resolving and reinstalling everything: 27m will probably go down with uv to ~9m => 67% improvement.
>
> If those numbers hold and the resolution quality is comparable to `pip` - then well, it's definitely worth it - and the numbers are very close to what the `uv` authors claimed.
>
> I am impressed :)
>
> J.
>
> On Thu, Feb 22, 2024 at 5:25 AM Amogh Desai <[email protected]> wrote:
>
>> I agree with Niko here.
>>
>> If someone is willing to give it a try, we should enable it experimentally and give it a stint for a couple of weeks. If we see significant results, we can adopt it.
>>
>> Thanks & Regards,
>> Amogh Desai
>>
>> On Thu, Feb 22, 2024 at 3:32 AM Oliveira, Niko <[email protected]> wrote:
>>
>> > The Astral folks also seem very focused on it being a drop-in/compliant replacement for pip. So I think it's definitely worth dropping it in and seeing if we get the expected performance improvements. If tests still pass and user-facing constraints and install instructions remain unchanged, I don't see why not, if someone is willing to spend the time on it. Never mind the extra features it would give us (I, like others, am also very excited about the `--resolution=lowest` ability).
>> >
>> > ________________________________
>> > From: Andrey Anshin <[email protected]>
>> > Sent: Tuesday, February 20, 2024 12:26:56 AM
>> > To: [email protected]
>> > Subject: RE: [EXTERNAL] [COURRIEL EXTERNE] [DISCUSS] Considering trying out uv for our CI workflows
>> >
>> > > I share Andrey's skepticism. It's just yet another tool which has an unclear development strategy.
>> >
>> > My point was more about a matter of presentation. If someone tells you "this is a new tool, a killer of previous tools" then you might think "Yeah...yeah...yeah.. yet another replacement for tool X... not really interesting". On the other hand, if someone tells you which cases it might solve, then this might be a mind changer.
>> >
>> > Especially the promising `--resolution=lowest` option. We always want to test something with minimal dependencies because we are not sure that it will work with pretty old dependencies, and recently I started to work on a POC to collect minimal versions for Airflow and the Providers. And at the moment when I had almost finished it, uv was released. Well, sometimes it is better to wait a bit and maybe someone will invent the same solution 😁 and you don't have to spend your personal time.
>> >
>> > So as a POC, I'm in. We still need `pip`, and need to validate some things with pip, because it is the only officially supported way to install Airflow; but if something could be improved in the CI then I'm in. In most cases it would be behind Breeze, and many of the contributors might not even notice that something changed.
>> >
>> > On Tue, 20 Feb 2024 at 09:56, Jarek Potiuk <[email protected]> wrote:
>> >
>> > > Actually - if you read that blog post, the strategy is clear - they aim to create comprehensive packaging tooling, and the improvements are measured (80-100 times, they claim, when using caching - they (unlike pip) use a lot of local caching, including for resolving dependencies).
>> > >
>> > > So I think both arguments are not valid if you ask me.
>> > >
>> > > On Tue, 20 Feb 2024, 02:37 Alexander Shorin <[email protected]> wrote:
>> > >
>> > > > I share Andrey's skepticism. It's just yet another tool which has an unclear development strategy. Should you make it a free testing suite? What would the project receive in exchange? A lot of words about being faster, but how much? Are these milliseconds worth changing the stable tool for a new one? And will it notably improve something?
>> > > >
>> > > > I think it's worth trying it just for fun and providing feedback, but it'll have to travel a long road to become as stable as pip.
>> > > >
>> > > > --
>> > > > ,,,^..^,,,
>> > > >
>> > > > On Tue, Feb 20, 2024 at 3:06 AM Jarek Potiuk <[email protected]> wrote:
>> > > >
>> > > > > My opinion:
>> > > > >
>> > > > > I think there is a place for a number of such tools.
>> > > > > For a long time the packaging team and the `pip` team have been working not only on the `pip` implementation but also (and most importantly) on making sure that what `pip` does is the beacon of standardisation of packaging APIs and PEPs. It will never, IMHO, have a lot of the fancy features that other tools might provide (like the ones I mentioned). It will always be there to provide a robust and solid CLI to run all packaging things, but there are plenty of opportunities to provide improved or modified, or more (or less) opinionated ways of doing things that address some cases that the `pip` team simply will not be able or willing to handle, preferring a "pure" standard approach vs. implementing all the optional things. For example, the way pre-releases are handled can be improved to be more selective. The PEP describing it gives the tools an option to add more fancy behaviours (some of which we could find useful in our CI tooling). Should `pip` implement those? I don't think so. It would distract maintainers from other more important things. It is quite ok to use other tooling in places like our CI, where they do some parts of the installation better.
>> > > > >
>> > > > > For me `pip` is going more in the direction of a `usable reference implementation of a package installer` - any standard/PEP will not matter if `pip` does not implement it. But others might go in different directions and implement some less popular features and do it better, faster, with greater flexibility. IMHO it's a win-win.
>> > > > >
>> > > > > J.
>> > > > >
>> > > > > On Mon, Feb 19, 2024 at 11:40 PM Andrey Anshin <[email protected]> wrote:
>> > > > >
>> > > > > > Yesterday my friend shared that tool with me, and I told him that most probably it would be a niche tool. I said "who needs yet another installer which claims to resolve all your problems".
>> > > > > > I guess I was wrong?
>> > > > > >
>> > > > > > On Tue, 20 Feb 2024 at 00:53, Jarek Potiuk <[email protected]> wrote:
>> > > > > >
>> > > > > > > Hey everyone,
>> > > > > > >
>> > > > > > > A few days ago the ruff creators released a new tool, uv - an extremely fast (written in Rust) and fully featured tool that is generally fully compatible with `pip`.
>> > > > > > >
>> > > > > > > Blog post here: https://astral.sh/blog/uv
>> > > > > > >
>> > > > > > > It looks like it has a number of things that would make our CI cases and tooling quite a bit faster and better, including a few things that I have implemented workarounds for and some that I have not implemented because `pip` had no good solution.
>> > > > > > >
>> > > > > > > I looked at the docs and it solves some problems that are currently difficult or impossible to handle with `pip`:
>> > > > > > >
>> > > > > > > * the ability to use overrides (which are constraints on steroids - allowing us to override limits specified by the packages). This will be very useful to better handle our cases with "chicken-egg" providers (for example, like we had in FAB) where we have pre-release packages depending on each other
>> > > > > > >
>> > > > > > > * different resolution strategies, including `--resolution=lowest`, which will finally allow us to see whether airflow's lower bounds still hold (i.e. will our tests still pass if we use the lowest supported versions of our dependencies?). This is something I have wanted to do for quite some time and recorded an issue for - https://github.com/apache/airflow/issues/35549 - but lack of tooling support made it a wish; with `--resolution=lowest` it seems like a super-easy thing to do.
>> > > > > > >
>> > > > > > > * It is said to be many, many times faster - with better caching and resolution speeds (as with ruff, they claim orders-of-magnitude speedups in a number of cases). We can likely make very good use of it and speed up some parts of our CI workflow significantly.
>> > > > > > >
>> > > > > > > I will likely do some experimenting with uv in our toolchain, but wanted to make sure we are all aware of it - and ask if someone has something against it (and maybe someone would like to do some work there trying it out - I will be happy to guide others with a dev/tooling mindset and an inclination to make some changes there, review PRs, and cooperate on testing those things).
>> > > > > > >
>> > > > > > > It's not a user-facing change, and I do not think we want to get rid of `pip` as an installation tool in general (in our images and on the user-facing side) - it's mostly an internal CI tooling improvement I am thinking of. Maybe at some point in time we can also recommend it for development workflows, and maybe someday it will gain enough popularity to think about recommending it to our users, but definitely not now, nor even in the mid-term future.
>> > > > > > >
>> > > > > > > Let me know what you think.
>> > > > > > >
>> > > > > > > Repo here: https://github.com/astral-sh/uv
>> > > > > > >
>> > > > > > > J.
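
As a rough sanity check of the build-time estimates quoted in the Feb 25 message above, here is a minimal Python sketch that recomputes the percentage improvements. The timings are the approximate figures from that message (estimates, not new measurements), and the function and variable names are purely illustrative, not part of Airflow or uv.

```python
def improvement_pct(pip_minutes: float, uv_minutes: float) -> float:
    """Percentage reduction in build time when going from pip to uv."""
    return (pip_minutes - uv_minutes) / pip_minutes * 100


# Approximate timings in minutes, taken from the estimates quoted in the thread.
# These are assumptions for illustration, not measured results.
scenarios = {
    "updating dependencies": (8.0, 3.5),
    "updating dependencies (uv cache used)": (8.0, 2.5),
    "re-resolving and reinstalling everything": (27.0, 9.0),
}

for name, (pip_m, uv_m) in scenarios.items():
    pct = improvement_pct(pip_m, uv_m)
    print(f"{name}: {pip_m:g}m -> {uv_m:g}m, ~{pct:.0f}% less build time")
```

With these inputs the reductions come out at roughly 56%, 69% and 67%, which is consistent with the rounded 60%, 70% and 67% figures in the message.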
