Re: Rethinking configuration tuples

Jacob Bachmeyer Wed, 30 Aug 2023 19:25:04 -0700

John Ericson wrote:

On 8/27/23 23:59, Jacob Bachmeyer wrote:
[...]
This is also the framework in which *-*-linux-gnu-musl makes sense for a system that uses Musl libc but is otherwise a GNU/Linux system.
Right but again where do we draw the line? For example, can one use systemd and its large entourage of intertwined software, or must one use GNU Shepherd or System V init?
In the case of *-*-linux-gnu and *-*-linux-gnu-musl, the difference is the C runtime library (GNU libc vs. Musl libc) such that shared objects linked for one ABI are not compatible with the other. If Musl libc were exactly 100% binary compatible with GNU libc, then there would be no *-*-linux-gnu-musl platform, since it would be indistinguishable from *-*-linux-gnu.
Err I mean, is there an example of a *-*-linux-$nongnu-musl?

I would expect that to name an embedded environment using Musl libc and the Linux kernel, but that is not a full system. (Example: may not even have a shell at all)

[...]
The choice of system service management is orthogonal to this, since it has minimal impact on user programs. (Unless systemd gets even more outrageously invasive...)
Agreed, just wanted to double check.

Of course, if systemd *does* get sufficiently outrageously invasive, we might need a *-*-linux-systemd-glibc tuple... (Since systemd gleefully makes extensive use of Linux-kernel-specific features, it cannot possibly be a standard on the GNU system, which supports multiple Free kernels.)

Except configure usually does not need a "fully disambiguated" form---the canonical form produced by config.sub is fine, since configure is usually matching against the full tuple using shell case patterns. The flat list with a defined order is optimal for this strategy, since it makes it easy to check for the presence of any tag or combination of tags.
Shell case patterns can be a bit of a footgun. For example, a common mistake is doing * instead of *-*.

If the allowed pattern elements are sufficiently unambiguous, there is no mistake, since `*' matches text including `-'. In fact, when testing an "is tag FOO present?" predicate `*-foo-* | *-foo' would be correct. (I assume that a CPU type will remain required and will remain first in the list.)
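
A minimal sketch of that "is tag FOO present?" predicate in POSIX shell. The helper name `has_tag` is hypothetical (it is not part of config.sub or Autoconf), and the tuples are only the examples discussed in this thread:

```shell
# has_tag TUPLE TAG
# Succeeds if TAG appears as a complete component of TUPLE anywhere after
# the first (CPU) field, using the `*-foo-* | *-foo' pattern from above.
# Hypothetical helper for illustration only.
has_tag() {
  case $1 in
    *-"$2"-* | *-"$2") return 0 ;;
    *) return 1 ;;
  esac
}

if has_tag x86_64-pc-linux-gnu-musl musl; then
  echo "musl tag present"
fi
```

Because the pattern requires a leading `-`, the CPU field never matches, which relies on the same assumption stated above: the CPU type stays first in the list.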

I would rather case on disambiguated variables. Indeed, AC_CANONICAL_HOST computes host_cpu, host_vendor, and host_os for precisely that purpose. If config.sub could split out the disambiguated form, those variables could be defined more simply and robustly.

Allow the hypothetical --parse option to accept a PREFIX argument and you are pretty much there:


$ ./config.sub --parse=host x86_64-linux-gnu
host=x86_64-pc-linux-gnu
host_cpu=x86_64
host_vendor=pc
host_kernel=linux
host_os=gnu
$

That form should be both easily parsed by other tools and suitable for `eval` in shell scripts.
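
A sketch of the "parse by eval" use, under loud assumptions: the `--parse` option does not exist, so the function `fake_config_sub_parse` below is a hypothetical stand-in that merely replays the example output shown above and does no real parsing.

```shell
# Hypothetical stand-in for `./config.sub --parse=host x86_64-linux-gnu`.
# It only prints the example key=value lines from this message.
fake_config_sub_parse() {
  cat <<'EOF'
host=x86_64-pc-linux-gnu
host_cpu=x86_64
host_vendor=pc
host_kernel=linux
host_os=gnu
EOF
}

# Each output line is a plain shell assignment, so eval defines the variables.
eval "$(fake_config_sub_parse)"
echo "cpu=$host_cpu kernel=$host_kernel os=$host_os"
```

Since `eval` executes whatever it is given, this only makes sense when the producer of the text is trusted, as config.sub would be inside a configure script.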

Note that config.sub is itself a shell script, and handling JSON in shell is a giant pain. The most we could reasonably do is what config.sub already does: determine each component as a separate variable and then output that by substituting text into a template.
Yes I agree config.sub in its current form (must be highly portable across different Bourne-shell derivatives) has no hope of parsing JSON. It could output it or it could also output your ${key}=${value}\n format, and it could also consume your format. Your format is ideal for it!
Adding a prefix to each key in the key=value format is trivial and would further help shell scripts that want to "parse by eval" but configure itself tests predicates rather than caring exactly what part of the configuration tuple means what. Put another way, configure is usually looking for a yes/no answer, so a pre-parsed form is less useful than a single string that can be used for pattern matches.
I agree testing is more robust, but for better or worse I still do see scripts using those host_* variables mentioned above. (Testing is possible but requires more care to get right for cross-compilation, for one.)


In this case the test is `case $host in ... esac`.
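
For illustration, such a test might look like the sketch below. The helper `classify_libc` and its output words are hypothetical, and *-*-linux-gnu-musl is the tuple form discussed in this thread rather than an established one:

```shell
# classify_libc HOST
# Matches the whole canonical tuple with case patterns, the way configure
# usually does, instead of inspecting pre-parsed fields.
# Hypothetical helper for illustration only.
classify_libc() {
  case $1 in
    *-*-linux-gnu-musl) echo musl ;;
    *-*-linux-gnu)      echo glibc ;;
    *)                  echo unknown ;;
  esac
}

host=x86_64-pc-linux-gnu
echo "libc for $host: $(classify_libc "$host")"
```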

There is no reasonable way to feed the key=value format /into/ config.sub: configuration tuples are hyphen-delimited lists.
I think there is. The overall algorithm is roughly "(a) decide which component is which, (b) sanitize and normalize components according to that decision". We would skip step (a) and go straight to step (b) in order to do this.
This indicates part of the value of doing this: rather than just "system testing" the entirety of config.sub, we would now have something closer to a "unit test" of part of it in isolation.
FWIW, this is similar to rearranging the code to support a mode where non-normal-form configs are rejected instead of normalized.

The problem is still getting it /into/ config.sub: config.sub expects a single command-line argument, while pre-parsed form spans a few lines.

[...]
I am not entirely certain why, but I know that there is some reason we call the common GNU/Linux systems *-*-linux-gnu instead of *-*-linux.
To be honest, I think this is basically the "call it GNU/Linux not Linux" controversy --- i.e. at the time it was done for social not technical reasons. I don't mind, since now that we have multiple libcs there /is/ a technical reason to distinguish. But this circles back to my hunch that Kernel (syscall interface) + libc (ABI) determines OS uniquely enough for config.sub's purposes.

That is possible, but still a valid reason for the GNU Project to stay with that angle.

I called the fifth field "LIBCABI" because it can be a libc name or an ABI name; in practice the two are usually closely related. Some existing tuples place a libc name in that slot, while others use a more generic ABI or file format name, such as "elf" in your example. For it to be a source of confusion, there would need to be a libc that supports multiple ABIs, and you would simply use the ABI names in that case.
Perhaps you know of examples of existing ones out in the wild that I am not aware of that need to include kernel, OS, and libc? Do share if you do!
The major example that immediately comes to mind would be a GNU/Linux distribution using Musl libc. But that comes back to why *-*-linux-gnu exists in the first place...
Erm I mean not an extant system that would use such a config under your system, but an extant config (not necessarily a GNU one, could be an LLVM, Rust, or something else one) for such a system. In other words, I am asking whether there was a case where someone else evidently decided that kernel+libc was not enough info and OS was also needed to further disambiguate.


I do not know of any off the top of my head.


-- Jacob
