I note that I've mostly noped out of this discussion (for the reasons in
https://mstdn.jp/@landley/115504860540842713 and
https://mastodon.sdf.org/@washbear/115646255465589454), but as long as
I'm catching up on back email anyway...
On 11/12/25 12:27, Adrian Bunk wrote:
> We are already providing a non-PIE version of the Python interpreter for
> users who need it for performance reasons, and it is for example
> possible that the benefits of providing packages without hardening (for
> situations where hardening is not necessary) might bring larger benefits
> than architecture-optimized versions.
Long ago when I was doing https://landley.net/aboriginal/about.html
(work which eventually allowed Alpine to be based on busybox), I benched
that statically linking busybox let the autoconf stage of package builds
complete about 20% faster under qemu.
(My theory was that lazy binding patched out the PLT indirection on the
first call, which dirtied the executable page and forced QEMU to discard
its native code cache and retranslate, often multiple times as more
indirections got dynamically patched. I later found it hilarious that
the dynamic linking people went on to do snap and flatpak and so on,
using FAR more space for no obvious gain...)
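For context, a rough sketch of what lazy binding does for a dynamically
linked call. Nothing here is busybox or aboriginal specific, just the
stock glibc/binutils behavior and the standard knobs for turning it off:

  /* Build it a few ways and compare:
   *   gcc -O2 lazy.c -o lazy             # default: lazy binding
   *   gcc -O2 -Wl,-z,now lazy.c -o now   # resolve everything at load time
   *   LD_BIND_NOW=1 ./lazy               # same effect, set at run time
   *
   * With lazy binding, the first call to each library function detours
   * through the PLT into the dynamic linker, which writes the resolved
   * address into the GOT slot backing that PLT entry; later calls jump
   * straight through the patched slot. Static linking (or -z now) avoids
   * those run-time writes entirely. */
  #include <stdio.h>

  int main(void)
  {
      puts("first call: resolved through the dynamic linker");
      puts("second call: jumps through the already-patched GOT entry");
      return 0;
  }

Timing the lazy build against the -z now build under qemu user mode is
the cheap way to poke at this yourself, although whether a difference
still shows up depends on your qemu and glibc versions.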
Does that mean static linking is faster everywhere? Dunno, I haven't
tried "everywhere". You can't "optimize" without saying what you're
optimizing FOR, and the ground changes out from under you.
Loop unrolling was an optimization, then became a pessimization when cpu
caches showed up, then an optimization again when L2 caches showed up,
and the pendulum went back and forth multiple times before I stopped
trying to even track it sometime around when branch prediction turned
into a security hole and people started doing TLB invalidation
mitigations for it. My takeaway lesson was: outside of tight inner
loops, do the simple thing and let the hardware and optimizers take care
of themselves.
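For reference, a sketch of the kind of transformation I mean; compilers
can do this themselves (gcc has -funroll-loops for exactly this), which
is the "let the optimizers take care of it" part:

  #include <stddef.h>

  /* Straightforward version: one element, one branch test per pass. */
  unsigned sum_simple(const unsigned char *p, size_t n)
  {
      unsigned total = 0;
      for (size_t i = 0; i < n; i++) total += p[i];
      return total;
  }

  /* Manually unrolled version: four elements per pass, fewer branch
   * tests, more code. Whether that's a win depends on the cache
   * situation of the decade you're compiling in. */
  unsigned sum_unrolled(const unsigned char *p, size_t n)
  {
      unsigned total = 0;
      size_t i = 0;

      for (; i + 4 <= n; i += 4)
          total += p[i] + p[i+1] + p[i+2] + p[i+3];
      for (; i < n; i++) total += p[i];
      return total;
  }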
I do know I left the Red Hat world for the Debian world when the new
Fedora CD wouldn't install on the Pentium Pro I had at the time (because
they'd "moved on" to an architecture newer than the hardware I was still
using).
I had to learn what x86-64-v1 vs v2 were when an Android NDK update made
all binaries it produced segfault on my netbook. I cared because I was
maintaining their command line utilities, and it was nice to be able to
actually test that environment. But I didn't discard my hardware to
humor the change, I just ran my test binaries under qemu until that
netbook died...
There was talk back then (what, 2018?) about teaching repositories to
know about various architecture flags so they could pull optimized
packages for your machine, but the discussion petered out because the
gains were small and the overhead was huge.
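Tangent: if anyone wants to check what their own hardware supports, I
believe gcc 12 or so taught __builtin_cpu_supports() the x86-64 level
names (clang picked them up later), so a quick runtime check looks
something like:

  /* Which x86-64 microarchitecture levels does this CPU provide?
   * Needs a compiler whose __builtin_cpu_supports() understands the
   * level names (gcc 12+, recent clang); the argument has to be a
   * string literal. */
  #include <stdio.h>

  int main(void)
  {
      __builtin_cpu_init();
      printf("x86-64-v2: %s\n",
             __builtin_cpu_supports("x86-64-v2") ? "yes" : "no");
      printf("x86-64-v3: %s\n",
             __builtin_cpu_supports("x86-64-v3") ? "yes" : "no");
      printf("x86-64-v4: %s\n",
             __builtin_cpu_supports("x86-64-v4") ? "yes" : "no");
      return 0;
  }

Plain gcc -O2 is enough to build it, no special flags required.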
> Would x32 optimized for v3 be the best option for many use cases?
It would prevent the x86-64-v2 laptop I'm typing this on from running
those binaries, but I've already talked to the netbsd guys, and for them
running on systems people want to use their stuff on is a point of
pride. Like it used to be on Linux, before everybody got old and tired
and needed to lighten the load.
Decisions have costs. It's your call to cull your herd and chastise the
outliers, but it usually means some subset will move on to things that
are still fun.
It's an interesting move, giving ultimatums to people who never got
forced onto windows and never moved to GPLv3. Not "I am stepping down
from this and going this way instead", but "xfree86 is now under this
new license, you will all comply, hey where are you going"...
*shrug* You do you.
Rob