On 6/28/25 23:18, Andy Chu wrote:
Hm I looked at the goals of toybox again:

Toybox's main goal is to make Android self-hosting by improving Android's 
command line utilities so it can build an installable Android Open Source 
Project image entirely from source under a stock Android system.

Toybox aims to provide one quarter of a theoretical "minimal native development 
environment"

In theory, this should only require four packages

I don't know much about Android -- is this at all realistic for FIVE
packages -- if you add mksh, which I believe is the Android system
shell ?

Eh, define realistic. AOSP is built around git (kind of conceptually), and their build infrastructure uses python 3. So you'd build a system to build a system.

Aboriginal Linux had 7 packages: linux, busybox, uclibc, gcc, binutils, make, and bash, and could build Linux From Scratch under the result in a fully automated target-independent fashion using https://landley.net/aboriginal/control-images/

Ok, https://github.com/landley/control-images/tree/master/images/lfs-bootstrap/mnt cheated slightly with one extra package, as https://github.com/landley/control-images/blob/master/images/lfs-bootstrap/download.sh attests. But https://landley.net/aboriginal/mirror/gettext-stub-1.tar.gz was a tiny little thing to stub out some gnu/stupid: stub versions of a dozen internationalization functions that all either returned their first argument, NULL, or "C". The header could have been a here document, plus an empty .a file to satisfy gnu builds that insisted on pulling in the library.

As for getting mkroot to do what aboriginal linux used to, I have no interest in testing mksh beyond not breaking Android's use of the toybox test suite (which runs it under mksh).

The AOSP build is large and has a lot of other dependencies, but Elliott's been doing what he calls "hermetic builds" where AOSP tries to provide a lot of its build prerequisites as shipped binaries, and Toybox provides a lot of those. (Search for the word "hermetic" in toybox's news.html page, it's been mentioned with links a few times.)

The https://landley.net/toybox/roadmap.html#dev_env section of the toybox roadmap is my old dependency list that Aboriginal Linux needed to rebuild itself under itself, and then build Linux From Scratch under the result. But it's been a moving target. I regression test kernel builds with mkroot each release. It uses the same "airlock step" that aboriginal had, where the build $PATH is replaced with a single directory with all the binaries the build needs before building the packages:

https://github.com/landley/toybox/blob/master/mkroot/mkroot.sh#L54

The airlock is mostly set up by toybox's "make install_airlock" target which uses a PENDING and TOOLCHAIN command list, the first being commands that toybox should eventually provide (but doesn't yet) and the second being commands the host needs to provide (mostly the compiler):

https://github.com/landley/toybox/blob/master/scripts/install.sh#L105

Currently PENDING has: expr git tr bash sh gzip awk bison flex make ar

(All but bison, flex, and make have semi-complete "pending" versions in toybox.)

And TOOLCHAIN has: as cc ld objdump bc gcc

And the last two of those I have patches to remove the need for from the kernel build, https://landley.net/bin/mkroot/0.8.12/linux-patches/0004-Replace-timeconst.bc-with-mktimeconst.c.patch and https://landley.net/bin/mkroot/0.8.12/linux-patches/0001-try-generic-compiler-name-cc-before-falling-back-to-.patch respectively.
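As a rough illustration of the airlock idea (not the actual install_airlock code), the sketch below symlinks a short list of host commands into one directory and then runs with only that directory in the $PATH. The AIRLOCK variable, the WANTED list, and the three commands in it are names made up for this example; the real PENDING and TOOLCHAIN lists are the ones quoted above.

```shell
#!/bin/sh
# Hypothetical airlock sketch, NOT toybox's scripts/install.sh.
# AIRLOCK and WANTED are illustrative names, not toybox variables.
AIRLOCK="${AIRLOCK:-$(mktemp -d)}"
WANTED="sh ls cat"

for cmd in $WANTED; do
  # Find the host's copy of each command.
  real="$(command -v "$cmd")"
  case "$real" in
    /*) ln -sf "$real" "$AIRLOCK/$cmd" ;;          # symlink it into the airlock
    *)  echo "no host binary for: $cmd" >&2 ;;     # builtin-only or missing
  esac
done

# Run the "build" in a subshell whose $PATH is just the airlock, so
# anything not symlinked above is simply not found.
(
  PATH="$AIRLOCK"
  export PATH
  command -v cat    # resolves to the symlink inside the airlock
)
```

Anything the build tries to run that is not in the directory fails immediately with "not found", which is the point: a missing dependency shows up as an error instead of being silently satisfied by the host.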

My tool to instrument a build so I can see every command line called out of the $PATH is currently mkroot/record-commands (which builds toys/example/logpath.c), and descends from the "command logging wrapper" described in https://landley.net/aboriginal/FAQ.html#debug_logging

(This doesn't catch the ones called from absolute paths, usually by scripts with #!/usr/blah at the start. Also gmake will call /bin/sh (instead of sh out of the $PATH) unless you set SHELL, see https://www.gnu.org/software/make/manual/make.html#Choosing-the-Shell for the gnu/stupid du jour.)
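The same logging idea can be sketched in plain shell. This is only an illustration under stated assumptions, it says nothing about how logpath.c is actually written: WRAPDIR, LOGFILE, ORIG_PATH, and the .wrapper file name are all invented for the example. Each wrapper appends its command line to a log, restores the original $PATH, and execs the real command.

```shell
#!/bin/sh
# Hypothetical $PATH logging wrapper, not toybox's logpath.c.
WRAPDIR="$(mktemp -d)"
LOGFILE="$WRAPDIR.log"
ORIG_PATH="$PATH"
export LOGFILE ORIG_PATH

# One wrapper script, symlinked under each command name to be logged.
cat > "$WRAPDIR/.wrapper" << 'EOF'
#!/bin/sh
name="${0##*/}"                        # which command name were we called as?
printf '%s\n' "$name $*" >> "$LOGFILE" # record the command line
PATH="$ORIG_PATH"                      # drop the wrapper dir from the search
exec "$name" "$@"                      # hand off to the real command
EOF
chmod +x "$WRAPDIR/.wrapper"
for cmd in ls cat; do ln -s .wrapper "$WRAPDIR/$cmd"; done

# Run something with the wrappers first in the $PATH, then show the log.
( PATH="$WRAPDIR:$PATH"; export PATH; ls / | cat > /dev/null )
cat "$LOGFILE"
```

As with the real tool, this only sees commands looked up through the $PATH; anything invoked by absolute path bypasses the wrappers.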

Can Android even be built on Android at all, with any number of
packages?   e.g. if you download all the dev tools onto an Android
device ...  I imagine it is a ton of tools, and not very fun.

That's an Elliott question, and they did some sort of container infrastructure (which may or may not be related to https://www.youtube.com/watch?v=Eu-rqMHqM6I ) in newer versions of Android than my phone runs, which can presumably install arbitrary linux distros in either containers or VMs, so it's a semi-philosophical question? (But the countering trusting trust stuff still applies.)

I've been working _towards_ it since 2011, but... let's just say the past decade has not provided my ideal work environment.

Anyway, if there is something realistic we could do here with OSH,
that may be of interest to our funders http://nlnet.nl

https://github.com/landley/toybox/blob/master/toys/pending/sh.c has most of the infrastructure in place already. If I wanted to use bash or mksh in another aboriginal linux style LFS build setup, I could. (And Alpine Linux exists, which benefited from all the busybox work I did back in the day.)

e.g. testing that important packages can actually be built, and
reducing real failures to reproducible test cases.  That is a lot of
real work

Which Alpine has presumably done. I'm not trying to patch packages, I'm using them as test cases. Which is how you wind up with stuff like:

https://github.com/landley/toybox/commit/32b3587af261

Which is CLEARLY THEIR BUG, yet we must cope.

From some viewpoints it could be theoretical, but proving that you can
build a real system is important!

I've done it. The old "lfs-bootstrap" images in https://landley.net/aboriginal/downloads/old/binaries/1.4.1/extras/ were "here's the linux from scratch 6.x root filesystem that built under qemu from the minimal native development environment system image this release".

I got FANCY back then. If you're wondering why the (current) airlock scripts detect multiple instances of the same command in the $PATH and symlink them into numbered fallback directories, it's for things like distcc, which the old scripts used to move the heavy lifting of compilation out of the emulated environment to run on the host machine:

https://landley.net/notes-2008.html#07-06-2008

I probably blathered about that at Ottawa Linux Symposium:

https://bootlin.com/pub/video/2008/ols/ols2008-rob-landley-linux-compiler.ogg

Still on the TODO list for the new stuff. Back in the day I could get about -j3 usefully going before the emulator became the bottleneck. Well, that was using SMP for the actual compile part; the configure stage was 100% the bottleneck in all the gnu package builds. Still is. More totally unnecessary gnu/stupid: the compiler sets a zillion builtin macros you can see with:

$ :|cc -dM -E -

And between that, C23's __has_include() (which gcc and clang supported for years before it was standardized), and features.h you can eliminate almost all configure time probes because it ALREADY KNOWS. Just set your cross compiler and let your headers pick through the symbols to figure out what to do.

One of my many todo items is re-testing whether running ./configure with static linked binaries is still 20% faster under QEMU these days:

https://landley.net/notes-2009.html#14-10-2009

I _think_ that was back before PLT and GOT were collated into arrays, meaning the dynamic references were patched in-situ instead of redirecting off an object table, so QEMU had to re-translate each executable page every time it was written to (self modifying code REALLY fscks with dynamic translation) meaning the overhead of dynamic linking patching all the jumps in place was just pathological. Then there was that terrible RTLD_LAZY nonsense which SOMEHOW MADE IT WORSE, and of course SOME linking variations would always indirect off the PLT/GOT and others would patch the relocation into the caller as part of the first call... I think -fPIC or not was involved here somehow? (PIE is SORT of nice, but static PIE not using the dynamic linker but STILL DYNAMIC LINKING ITSELF means it has to be STATICALLY LINKING THE RUNTIME DYNAMIC LINKER and that's about where I step away from the keyboard.)

Don't ask me how using dlopen messes with any of that. Sigh, I keep thinking Rich Felker's dlopen() rant is on https://ewontfix.com/ somewhere but no, it's buried in the musl openwall list which Google can't find anymore since https://www.wheresyoured.at/the-men-who-killed-google/

Anyway, it's been a while since I last seriously dug into linking, because it's a can of worms. (https://landley.net/bin/mkroot/0.8.11/linux-patches/0002-sh4-fdpic.patch doesn't count because I actually needed it for something.)

Sigh, everything has so much backstory. QEMU having to translate pages is a thing I blathered about back when I was trying to do a "qemu weekly news", I explained how/why dyngen worked:

https://landley.net/qemu/2008-01-15.html#Jan_17,_2008_-_[PATCH_0_5]_Enable_building_of_op.o_on_gcc4

Right before it got ripped out and replaced:

https://landley.net/qemu/2008-01-29.html#Feb_1,_2008_-_TCG

But the general principles still apply. (SO MUCH of computer science is "we learned how the principles worked from some old obsolete thing that's been replaced, and the new one still works fundamentally the same way but it's a lot more complicated so you can't actually SEE that unless you understand where it came from". It's a pedagogical disaster leading to https://www.landley.net/history/mirror/institutional_memory.html loss and I dunno what to do about that, but what else is new?)

Andy

Rob
_______________________________________________
Toybox mailing list
Toybox@lists.landley.net
http://lists.landley.net/listinfo.cgi/toybox-landley.net
