Re: Rethinking configuration tuples (was: Re: config.sub should normalize *-*-windows-*)
On 8/24/23 23:54, Jacob Bachmeyer wrote:

> John Ericson wrote:
>
>> This is why I opened with "Operating System" lacks a coherent objective definition. [...]
>
> As I understand, historically, "operating systems" were proprietary monoliths and the GNU Project originally expected to produce another monolith, but /our/ monolith would be Free Software. As an interim measure, the GNU utilities were designed to be widely portable across the various individually-monolithic proprietary operating systems then in use across a wide variety of hardware. The broader Free Software Movement unexpectedly shattered that state of affairs, leading to the 4-element configuration tuple form, when the Linux kernel became available and it was noticed that---oops!---GNU on Linux and GNU on HURD would have significant differences that at least some of the GNU packages would need to handle. (For example, GNU libc is very different between Linux, where POSIX I/O maps fairly directly to underlying syscalls, and HURD, where POSIX I/O must be translated to Mach IPC, but both of these are Free GNU systems.) This means that the GNU system is a somewhat blurry category, with many variants possible, and is orthogonal to "Linux": there are GNU/Linux systems, GNU systems using other kernels, and Linux-based systems not using GNU at all. This latter category is fairly common in embedded systems, where the GNU utilities are often eschewed for lighter-weight alternatives to save flash space (or, less honorably, to avoid GPL3).

Yes, I agree with this state of affairs. I sometimes (but not always!) detect a sort of "Linux scooped us" sentiment in GNU quarters, but as I see it portability and diversity of distros was pretty much inevitable --- replacing proprietary Unix userlands with GNU software was a huge point in how GNU got going in academic/institutional environments in the early days, and even if Hurd got there before Linux there would be no reason to rip out that portability.
> JSON is pretty much a hard no for me: it is far too complex for what really needs to be a simple structure. Flat strings work very well for the way that GNU software typically expects to parse a configuration tuple using shell constructs. Perhaps it would be better to redefine configuration tuples as a flat list of tags with a canonical ordering? (The reason for a canonical ordering is in part to ensure that all existing coherent configuration tuple strings remain valid and to ensure that text-based pattern matching continues to work.)

Ah sorry, I shouldn't have made reference to JSON at all --- what I really was getting at is the /abstract syntax/. In particular, rather than having an abstract syntax of "list of strings" (parsing today's concrete syntax by breaking on dash), where the meaning of each string is ambiguous / context-sensitive, we would have one of "keys mapped to enumerations", i.e. one always knows the meaning of each component explicitly / without inspecting it or its context. JSON or your flat list in canonical ordering (where I assume we are careful to never skip a type of component) are both valid concrete syntaxes that can be parsed / printed from this abstract syntax.

>> --- Concretely, I think these are pretty clear configs:
>>
>> CPU-VENDOR-windows-mingnu # MinGW, MS C + GNU C++ and other GNU-ish things, TODO distinguish between MSVCRT and UCRT
>
> I say that this one really should just be *-mingw.

Sure. I went with mingnu because the "w" is redundant with the "windows", but ultimately I care more about the pattern than the exact choice of identifiers / enumeration tags. (As we say in programming language land, I care about the thing "up to alpha-renaming".)

> Note that there are both MinGW32 and MinGW64, corresponding to 32-bit and 64-bit Windows APIs. Should that be included or should the CPU type be used to distinguish? (e.g. i686-pc-windows-mingw is MinGW32 and x86_64-pc-windows-mingw is MinGW64?)

Yes I think so.
If you look at https://www.mingw-w64.org/downloads/ one even sees |x86_64-w64-mingw32| which is quite something, and 64-bit! I think what happened is that "w32" was chosen to mean the then-new win32 API/ABI, as opposed to DOS. Win64 as I understand is necessarily a new ABI because of the change in CPU arch, but not really a new API, being more of a "let's make the minimal amount of changes so the source/headers are portable" situation. So a combination of "same API" and "too lazy to update GNU config" made "mingw32" stick around. f16804b79ee5a23a9994a1cdc760cd9ba813148a added mingw64 to GNU config in 2012, which is long after the advent of 64-bit Windows.

> In the proposed five-element form, MSVCRT and UCRT are easily distinguished. Example:
>
> i686-pc-windows-mingw-msvcrt
> i686-pc-windows-mingw-ucrt
> x86_64-pc-windows-mingw-msvcrt
> x86_64-pc-windows-mingw-ucrt

That is very true, I will grant you that :)

>> CPU-VENDOR-windows-cygnus # Cygwin
>> CPU-VENDOR-
Re: config.sub should normalize *-*-windows-*
Thanks Connor. I think we are both on the same page!

On 8/24/23 14:51, connor horman wrote:

> It seems to me reading this thread that we've come into two conflicting realities:
>
> * There exist targets that need to be distinguished, and
> * They are not distinct in any component that config.sub has, therefore they cannot and should not be distinguished.
>
> mingw and msvc both use the NT kernel, and the windows operating system. So it seems to me that windows, the OS, is the correct way to describe them. According to the discussions on this thread, they should thusly both canonicalize to the same target. And yet, not only is there desire to separate these targets, they already are.

Agreed. We can have our cake and eat it too by both: (a) distinguishing things which are already distinguished and (b) having configs follow consistent conventions.

> LLVM (as well as my own target parsing tool) refer to the last two components as "sys" with two subcomponents (of which at least one exists), being os and env. IMO, this seems a far more coherent definition that satisfies the requirements, and even more correctly matches targets that already exist.

Agreed!

> musl is another extreme example: There is no musl OS. The last component being musl refers to the use of the musl libc. The resulting binary can then be used on either a GNU system or a non-GNU linux system like alpine, void, or iglunix. Thus musl cannot be regarded as an "OPERATING_SYSTEM" but rather as an environment.

Agreed!

> Even on linux-gnu the definition is murky at best. While I won't dispute the existence of a GNU operating system running atop the linux kernel, in many cases, the actual linux-gnu tag merely refers to glibc. Few things using targets nowadays actually care about the rest of the tools, and when they do care that they exist (on --host or even --target), they typically don't care that they're provided by GNU, and even may not care that they match the interface of the tool provided by GNU. Only on --build are the tools really cared about, and I don't see many things matching the build tuple or even canonicalizing it. If we thus define an "Operating System" as "kernel+libc+tools atop that" it becomes clear to me that few things written nowadays care about the "GNU Operating System" and only really care about the "GNU Environment".

Agreed! Well put --- even if we were to find a rigorous objective definition for "Operating System" in general, encompassing a long tail of auxiliary interfaces, it would be overly specific for what things inspecting the output of config.sub actually care about.

(FWIW I am also fine saying there exists the "GNU Operating System", but to me "Operating System" is always an exercise in branding, tying together disparate components which always in principle (e.g. if we had the source code) could be mixed-and-matched in other ways.)

> I would like this very much to happen, along with the Rust project which has its own target defs (but similar as well).

I am glad I am not the only one!

John
Re: config.sub should normalize *-*-windows-*
connor horman writes: > It seems to me reading this thread that we've come into two > conflicting realities: * There exists targets that need to be > distinguished, and * They are not distinct in any component that > config.sub has, therefore they cannot and should not be distinguished. > > mingw and msvc both use the NT kernel, and the windows operating > system. So it seems to me that windows, the OS, is the correct way to > describe them. According to the discussions on this thread, they > should thusly both canonicalize to the same target. And yet, not only > is there desire to separate these targets, they already are. > > LLVM (as well as my own target parsing tool) refer to the last two > components as "sys" with two subcomponents (of which at least one > exists), being os and env. IMO, this seems a far more coherent > definition that satisfies the requirements, and even more correctly > matches targets that already exist. The objective is to keep the status quo unchanged till Hell freezes over, so that no programs will ever be broken. > musl is another extreme example: There is no musl OS. The last > component being musl refers to the use of the musl libc. The resulting > binary can then be used on either a GNU system or a non-GNU linux > system like alpine, void, or iglunix. Thus musl cannot be regarded as > an "OPERATING_SYSTEM" but rather an an environment. > > Even on linux-gnu the definition is murky at best. While I won't > dispute the existence of a GNU operating system running atop the linux > kernel, in many cases, the actual linux-gnu tag merely refers to > glibc. Few things using targets nowadays actually cares about the rest > of the tools, and when they do care that they exist (on --host or even > --target), they typically don't care that they're provided by GNU, and > even may not care that they match the interface of the tool provided > by GNU. 
Only on --build are the tools really cared about, and I don't > see many things matching the build tuple or even canonicalizing it. If > we thus define an "Operating System" as "kernel+libc+tools atop that" > it becomes clear to me that few things written nowadays care about the > "GNU Operating System" and only really care about the "GNU > Environment". For the purpose of compiling programs, systems using the GNU libc are equivalent to GNU systems. config.* does not draw excessively fine distinctions between them. In keeping with that, systems using the Musl libc are so similar that they may as well be considered as a single operating system. This contrasts with MinGW and MSVC, whose discrepancies are of sufficient consequence to warrant individual identification by config.*. And as configurations which embody these distinctions _already exist_, they should never change, nor be supplanted by new and purportedly ``improved'' configurations. I reiterate, until the very end of time...
Rethinking configuration tuples (was: Re: config.sub should normalize *-*-windows-*)
John Ericson wrote: This is why I opened with "Operating System" lacks a coherent objective definition. The more pugilistic message is to say the rest of the world doesn't think the GNU operating system exists --- that there is simply a choice of kernel (Linux, k*BSD, Hurd, something else...) and choices of libraries and system components on top of that, and many combinations are possible. The rest of the world might say this in a mean way, but I say it is actually a /good/ thing --- software freedom means one /can/ choose my components à la carte, and only a lack of software freedom results in a kernel and mass of libraries outside one's control blurring together into a scary "take it or leave it" monolith we call an operating system. As I understand, historically, "operating systems" were proprietary monoliths and the GNU Project originally expected to produce another monolith, but /our/ monolith would be Free Software. As an interim measure, the GNU utilities were designed to be widely portable across the various individually-monolithic proprietary operating systems then in use across a wide variety of hardware. The broader Free Software Movement unexpectedly shattered that state of affairs, leading to the 4-element configuration tuple form, when the Linux kernel became available and it was noticed that---oops!---GNU on Linux and GNU on HURD would have significant differences that at least some of the GNU packages would need to handle. (For example, GNU libc is very different between Linux, where POSIX I/O maps fairly directly to underlying syscalls, and HURD, where POSIX I/O must be translated to Mach IPC, but both of these are Free GNU systems.) This means that the GNU system is a somewhat blurry category, with many variants possible, and is orthogonal to "Linux": there are GNU/Linux systems, GNU systems using other kernels, and Linux-based systems not using GNU at all. 
This latter category is fairly common in embedded systems, where the GNU utilities are often eschewed for lighter-weight alternatives to save flash space (or, less honorably, to avoid GPL3). On 8/24/23 08:51, Adam Joseph wrote: [...] It seems like a lot of the proposals in this thread are being evaluated not based on whether or not they are coherent, but rather on whether or not they take us a few nanometers closer to whatever happens to whatever LLVM's internal implementation details happen to be this week. I care about coherence, the reason I like to see what LLVM does that working from a parsed representation forces the software to be much more honest. Since GNU config doesn't reveal its categories but just spits out another opaque string, there is no external pressure for its categorization to be any good. LLVM, on the other hand, dispenses with strings entirely and just uses the enums, so it is forced to make sure those enums make sense and work for the branching the program has to do. LLVM parsing of configs is ad-hoc Postel's law stuff like everyone else, but its internal representation is actually quite stable. Parsing is the ugly nasty part that gets to the pristine clear ontology on the other side. Ultimately I would like to convene everyone to commit to an agreed upon internal representation too. E.g. clang and GNU config could both spit out some JSON that is unambiguous and should match. I think that would alleviate a lot of Adam's concerns about "following LLVM". But I don't think it is possible to convene the working group needed to standardize such a format yet, because there is little trust between parties. Moving us a "a few nanometers closer" on each side demonstrates that there is willingness to compromise. JSON is pretty much a hard no for me: it is far too complex for what really needs to be a simple structure. Flat strings work very well for the way that GNU software typically expects to parse a configuration tuple using shell constructs. 
Perhaps it would be better to redefine configuration tuples as a flat list of tags with a canonical ordering? (The reason for a canonical ordering is in part to ensure that all existing coherent configuration tuple strings remain valid and to ensure that text-based pattern matching continues to work.) --- Concretely, I think these are pretty clear configs: CPU-VENDOR-windows-mingnu # MinGW, MS C + GNU C++ and other GNU-ish things, TODO distinguish between MSVCRT and UCRT I say that this one really should just be *-mingw. Note that there are both MinGW32 and MinGW64, corresponding to 32-bit and 64-bit Windows APIs. Should that be included or should the CPU type be used to distinguish? (e.g. i686-pc-windows-mingw is MinGW32 and x86_64-pc-windows-mingw is MinGW64?) In the proposed five-element form, MSVCRT and UCRT are easily distinguished. Example: i686-pc-windows-mingw-msvcrt i686-pc-windows-mingw-ucrt x86_64-pc-windows-mingw-msvcrt x86_64-pc-windows-mingw-ucrt
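To make the "flat list of tags with a canonical ordering" idea concrete, here is a minimal Python sketch that reads the five-element examples above into named fields and prints them back unchanged. The field names follow the CPU-VENDOR-KERNEL-OS-LIBCABI form discussed in this thread; which slot "windows" or "mingw" belongs in is an assumption made only for this illustration, and `parse_five` / `render` are hypothetical helpers, not part of config.sub.

```python
from typing import NamedTuple


class FiveTuple(NamedTuple):
    """Positional view of a five-element tuple (slot names are assumptions)."""
    cpu: str
    vendor: str
    kernel: str
    os: str
    libcabi: str


def parse_five(config: str) -> FiveTuple:
    """Split a flat string on '-' and require exactly five tags, in order."""
    parts = config.split("-")
    if len(parts) != 5:
        raise ValueError(f"expected 5 tags, got {len(parts)}: {config!r}")
    return FiveTuple(*parts)


def render(parsed: FiveTuple) -> str:
    """Print the tags back in canonical order as a flat string."""
    return "-".join(parsed)


examples = [
    "i686-pc-windows-mingw-msvcrt",
    "i686-pc-windows-mingw-ucrt",
    "x86_64-pc-windows-mingw-msvcrt",
    "x86_64-pc-windows-mingw-ucrt",
]
parsed_examples = [parse_five(e) for e in examples]
```

Because the ordering is fixed, parsing and printing round-trip exactly, and the last tag alone is enough to tell MSVCRT from UCRT. Shorter existing tuples would need a rule for omitted slots, which this sketch deliberately does not attempt.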
Re: config.sub should normalize *-*-windows-*
Adam Joseph wrote: Quoting Jacob Bachmeyer (2023-08-21 19:35:05) No, we are not. CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the If you want to redefine existing terms, please be forthright about the fact that your proposal does so. I argue that this is something that has already happened under our collective noses (and before I had any interest in configuration tuples beyond using them with configure). The "OS" field is no longer consistent. This usage is in conflict with the existing definition; LIBC and ABI are subfields of OS. It isn't resolving any "technical debt" -- it's sowing mass confusion. >From config.sub: # The goal of this file is to map all the various variations of a given # machine specification into a single specification in the form: # CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM # or in some cases, the newer four-part form: # CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM # It is wrong to echo any other type of specification. The variable name "LIBCABI" comes from config.guess, where it is not described, but is always parsed as a refinement of the OPERATING_SYSTEM field. There is never a hyphen between OPERATING_SYSTEM and LIBCABI because they are in fact different parsings of the same string. While I may have drawn inspiration from past work on config.guess, I used the name "LIBCABI" to reflect that it can be either a libc implementation or an ABI name; the two are usually closely related in practice. config tuples were originally triplets and now often feature a 4-element CPU-VENDOR-KERNEL-OS form Yes, we've had ~20 years to appreciate the confusion it caused, and now we know better than to do something like that again. It seems like a lot of the proposals in this thread are being evaluated not based on whether or not they are coherent, but rather on whether or not they take us a few nanometers closer to whatever happens to whatever LLVM's internal implementation details happen to be this week. 
My proposals have been an effort (in my view at least) to restore coherency here, and if LLVM is using *-windows-gnu at the moment, LLVM is /wrong/. That tuple should describe a POSIX GNU environment running on a Windows system. Such an environment is theoretically possible, but does not currently exist as far as I know. `CPU-VENDOR-linux-gnu-musl` I lack words to describe this. I suppose it could be useful if the goal were to drive config.sub into such a self-inconsistent state that everybody abandons it. Then I need to explain it again: CPU and VENDOR are all caps because they remain variable in that pattern. Perhaps I should have written `*-*-linux-gnu-musl' there. That tuple describes a GNU system (*-gnu-*) running on a Linux kernel (*-linux-*) using Musl libc (*-musl). Do you argue that (*-gnu) should mean specifically GNU libc even though there is more to the GNU system than libc? -- Jacob
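The pattern reading of `*-*-linux-gnu-musl' can be checked with a short Python sketch. It uses `fnmatch.fnmatchcase` from the standard library as a stand-in for shell `case` patterns; that substitution, the helper name `is_gnu`, and the example CPU and vendor values are assumptions for illustration only.

```python
from fnmatch import fnmatchcase

five = "x86_64-pc-linux-gnu-musl"   # proposed five-element form
four = "x86_64-pc-linux-gnu"        # existing four-element form

# Each pattern from the message matches the five-element tuple.
five_matches = {
    pattern: fnmatchcase(five, pattern)
    for pattern in ("*-gnu-*", "*-linux-*", "*-musl")
}

# The existing four-element tuple ends in "-gnu", so "*-gnu-*" alone misses it.
four_matches_gnu_infix = fnmatchcase(four, "*-gnu-*")


def is_gnu(config: str) -> bool:
    """Accept 'gnu' as either the last tag or an interior tag."""
    return fnmatchcase(config, "*-gnu") or fnmatchcase(config, "*-gnu-*")
```

One consequence worth noting: a script that wants to treat both forms as GNU systems has to test two patterns, as `is_gnu` does, since the trailing hyphen in `*-gnu-*` excludes tuples where gnu is the final tag.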
Re: config.sub should normalize *-*-windows-*
It seems to me reading this thread that we've come into two conflicting realities: * There exists targets that need to be distinguished, and * They are not distinct in any component that config.sub has, therefore they cannot and should not be distinguished. mingw and msvc both use the NT kernel, and the windows operating system. So it seems to me that windows, the OS, is the correct way to describe them. According to the discussions on this thread, they should thusly both canonicalize to the same target. And yet, not only is there desire to separate these targets, they already are. LLVM (as well as my own target parsing tool) refer to the last two components as "sys" with two subcomponents (of which at least one exists), being os and env. IMO, this seems a far more coherent definition that satisfies the requirements, and even more correctly matches targets that already exist. musl is another extreme example: There is no musl OS. The last component being musl refers to the use of the musl libc. The resulting binary can then be used on either a GNU system or a non-GNU linux system like alpine, void, or iglunix. Thus musl cannot be regarded as an "OPERATING_SYSTEM" but rather an an environment. Even on linux-gnu the definition is murky at best. While I won't dispute the existence of a GNU operating system running atop the linux kernel, in many cases, the actual linux-gnu tag merely refers to glibc. Few things using targets nowadays actually cares about the rest of the tools, and when they do care that they exist (on --host or even --target), they typically don't care that they're provided by GNU, and even may not care that they match the interface of the tool provided by GNU. Only on --build are the tools really cared about, and I don't see many things matching the build tuple or even canonicalizing it. 
If we thus define an "Operating System" as "kernel+libc+tools atop that" it becomes clear to me that few things written nowadays care about the "GNU Operating System" and only really care about the "GNU Environment". On Thu, Aug 24, 2023 at 12:22 John Ericson wrote: > This is why I opened with "Operating System" lacks a coherent objective > definition. > > The more pugilistic message is to say the rest of the world doesn't think > the GNU operating system exists --- that there is simply a choice of kernel > (Linux, k*BSD, Hurd, something else...) and choices of libraries and system > components on top of that, and many combinations are possible. The rest of > the world might say this in a mean way, but I say it is actually a *good* > thing --- software freedom means one *can* choose my components à la > carte, and only a lack of software freedom results in a kernel and mass of > libraries outside one's control blurring together into a scary "take it or > leave it" monolith we call an operating system. > On 8/24/23 08:51, Adam Joseph wrote: > > Quoting Jacob Bachmeyer (2023-08-21 19:35:05) > > No, we are not. CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the > > If you want to redefine existing terms, please be forthright about the fact > that > your proposal does so. > > This usage is in conflict with the existing definition; LIBC and ABI are > subfields of OS. It isn't resolving any "technical debt" -- it's sowing mass > confusion. > > From config.sub: > > # The goal of this file is to map all the various variations of a given > # machine specification into a single specification in the form: > # CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM > # or in some cases, the newer four-part form: > # CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM > # It is wrong to echo any other type of specification. > > The variable name "LIBCABI" comes from config.guess, where it is not > described, > but is always parsed as a refinement of the OPERATING_SYSTEM field. 
There is > never a hyphen between OPERATING_SYSTEM and LIBCABI because they are in fact > different parsings of the same string. > > I'll add that all linux-* configs in config.guess use LIBC/LIBCABI. I take > this as further evidence that distinguishing OSes atop Linux is useless. > Per the above, I think this is good! > > config tuples were originally triplets and now often feature a 4-element > CPU-VENDOR-KERNEL-OS form > > Yes, we've had ~20 years to appreciate the confusion it caused, and now we > know > better than to do something like that again. > > Adam are you saying you prefer the state of 3-component configs? > > It seems like a lot of the proposals in this thread are being evaluated not > based on whether or not they are coherent, but rather on whether or not they > take us a few nanometers closer to whatever happens to whatever LLVM's > internal > implementation details happen to be this week. > > I care about coherence, the reason I like to see what LLVM does that > working from a parsed representation forces the software to be much more > honest. Since GNU config doesn't reveal its categories but just spits out > another opaque
Re: config.sub should normalize *-*-windows-*
This is why I opened with "Operating System" lacks a coherent objective definition. The more pugilistic message is to say the rest of the world doesn't think the GNU operating system exists --- that there is simply a choice of kernel (Linux, k*BSD, Hurd, something else...) and choices of libraries and system components on top of that, and many combinations are possible. The rest of the world might say this in a mean way, but I say it is actually a /good/ thing --- software freedom means one /can/ choose my components à la carte, and only a lack of software freedom results in a kernel and mass of libraries outside one's control blurring together into a scary "take it or leave it" monolith we call an operating system. On 8/24/23 08:51, Adam Joseph wrote: Quoting Jacob Bachmeyer (2023-08-21 19:35:05) No, we are not. CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the If you want to redefine existing terms, please be forthright about the fact that your proposal does so. This usage is in conflict with the existing definition; LIBC and ABI are subfields of OS. It isn't resolving any "technical debt" -- it's sowing mass confusion. From config.sub: # The goal of this file is to map all the various variations of a given # machine specification into a single specification in the form: # CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM # or in some cases, the newer four-part form: # CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM # It is wrong to echo any other type of specification. The variable name "LIBCABI" comes from config.guess, where it is not described, but is always parsed as a refinement of the OPERATING_SYSTEM field. There is never a hyphen between OPERATING_SYSTEM and LIBCABI because they are in fact different parsings of the same string. I'll add that all linux-* configs in config.guess use LIBC/LIBCABI. I take this as further evidence that distinguishing OSes atop Linux is useless. Per the above, I think this is good! 
config tuples were originally triplets and now often feature a 4-element CPU-VENDOR-KERNEL-OS form Yes, we've had ~20 years to appreciate the confusion it caused, and now we know better than to do something like that again. Adam are you saying you prefer the state of 3-component configs? It seems like a lot of the proposals in this thread are being evaluated not based on whether or not they are coherent, but rather on whether or not they take us a few nanometers closer to whatever happens to whatever LLVM's internal implementation details happen to be this week. I care about coherence, the reason I like to see what LLVM does that working from a parsed representation forces the software to be much more honest. Since GNU config doesn't reveal its categories but just spits out another opaque string, there is no external pressure for its categorization to be any good. LLVM, on the other hand, dispenses with strings entirely and just uses the enums, so it is forced to make sure those enums make sense and work for the branching the program has to do. LLVM parsing of configs is ad-hoc Postel's law stuff like everyone else, but its internal representation is actually quite stable. Parsing is the ugly nasty part that gets to the pristine clear ontology on the other side. Ultimately I would like to convene everyone to commit to an agreed upon internal representation too. E.g. clang and GNU config could both spit out some JSON that is unambiguous and should match. I think that would alleviate a lot of Adam's concerns about "following LLVM". But I don't think it is possible to convene the working group needed to standardize such a format yet, because there is little trust between parties. Moving us a "a few nanometers closer" on each side demonstrates that there is willingness to compromise. 
---

Concretely, I think these are pretty clear configs:

CPU-VENDOR-windows-mingnu # MinGW, MS C + GNU C++ and other GNU-ish things, TODO distinguish between MSVCRT and UCRT
CPU-VENDOR-windows-cygnus # Cygwin
CPU-VENDOR-windows-msys # MSYS2, a lot like Cygwin
CPU-VENDOR-windows-msvc # MS C + MS C++
CPU-VENDOR-linux-gnu # gnu libc
CPU-VENDOR-linux-musl # musl libc
CPU-VENDOR-linux-android # bionic libc

I know Po Lu doesn't like them, because they overlap with existing ones. But what about you two, Adam and Jacob? I am trying to compromise between what various things do already, and also correct things like windows-gnu (even if there is no such thing as the GNU operating system (only multiple GNU Hurd-supporting distros), I agree that MinGW is clearly not a complete enough set of GNU software to earn the right to drop the "minimal" part).

If we can accept these, I think I will have no problem getting LLVM to accept windows-mingnu, and perhaps even warn/deprecate windows-gnu. After that, I think we are close enough to convene a working group for a JSON/whatever explicit standard. And that would be amazing.

--
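As a rough sketch of how the configs listed above could be handled as an (os, env) pair rather than as one opaque string, the following Python fragment parses a four-element tuple and looks up the pair in a table. The table entries and descriptions are copied from the list in this message; the names `KNOWN_SYS`, `Target`, `parse`, and `describe` are hypothetical and do not correspond to anything in config.sub or LLVM.

```python
from typing import NamedTuple

# (os, env) pairs and descriptions, taken from the list in the message.
KNOWN_SYS = {
    ("windows", "mingnu"): "MinGW, MS C + GNU C++ and other GNU-ish things",
    ("windows", "cygnus"): "Cygwin",
    ("windows", "msys"): "MSYS2, a lot like Cygwin",
    ("windows", "msvc"): "MS C + MS C++",
    ("linux", "gnu"): "gnu libc",
    ("linux", "musl"): "musl libc",
    ("linux", "android"): "bionic libc",
}


class Target(NamedTuple):
    cpu: str
    vendor: str
    os: str
    env: str


def parse(config: str) -> Target:
    """Parse CPU-VENDOR-OS-ENV and reject (os, env) pairs not in the table."""
    parts = config.split("-")
    if len(parts) != 4:
        raise ValueError(f"expected 4 components: {config!r}")
    target = Target(*parts)
    if (target.os, target.env) not in KNOWN_SYS:
        raise ValueError(f"unknown os/env pair in {config!r}")
    return target


def describe(config: str) -> str:
    """Return the description attached to the tuple's (os, env) pair."""
    target = parse(config)
    return KNOWN_SYS[(target.os, target.env)]
```

With this table, `describe("x86_64-pc-windows-msvc")` yields the MSVC entry, while `x86_64-pc-windows-gnu` is rejected because windows-gnu is not among the listed pairs. A real tool would presumably warn rather than fail, but that policy is outside what the message specifies.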
Re: config.sub should normalize *-*-windows-*
Quoting Jacob Bachmeyer (2023-08-21 19:35:05)

> No, we are not. CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the

If you want to redefine existing terms, please be forthright about the fact that your proposal does so.

This usage is in conflict with the existing definition; LIBC and ABI are subfields of OS. It isn't resolving any "technical debt" -- it's sowing mass confusion.

From config.sub:

# The goal of this file is to map all the various variations of a given
# machine specification into a single specification in the form:
#   CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM
# or in some cases, the newer four-part form:
#   CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM
# It is wrong to echo any other type of specification.

The variable name "LIBCABI" comes from config.guess, where it is not described, but is always parsed as a refinement of the OPERATING_SYSTEM field. There is never a hyphen between OPERATING_SYSTEM and LIBCABI because they are in fact different parsings of the same string.

> config tuples were originally triplets and now often feature a 4-element
> CPU-VENDOR-KERNEL-OS form

Yes, we've had ~20 years to appreciate the confusion it caused, and now we know better than to do something like that again.

It seems like a lot of the proposals in this thread are being evaluated not based on whether or not they are coherent, but rather on whether or not they take us a few nanometers closer to whatever LLVM's internal implementation details happen to be this week.

> `CPU-VENDOR-linux-gnu-musl`

I lack words to describe this. I suppose it could be useful if the goal were to drive config.sub into such a self-inconsistent state that everybody abandons it. Perhaps that is the plan.

- a
Re: config.sub should normalize *-*-windows-*
Jacob Bachmeyer writes:

> There are several GNU/Linux distributions that either use or can use musl libc already. (See: <https://en.wikipedia.org/w/index.php?title=Musl&oldid=1164590075>) Musl libc does not have the same features as GNU libc, so it is rightly a different ABI target, however, the system is still a GNU variant, so its configure tuple should still match *-gnu-* for the same reasons that the GNU project wants to call the overall system GNU/Linux.

These systems are not GNU/Linux systems, since they don't incorporate the GNU libc; this particularly applies to Autoconf users, since they will not have access to extensions furnished by the GNU libc.

>> To top it all off, considerations for such systems affect the entire GNU project, and config cannot unilaterally ordain decisions regarding their treatment. config-patches is definitely the wrong mailing list.
>
> OK then, what is the right mailing list?

gnu-system-discuss, maybe?
Re: config.sub should normalize *-*-windows-*
Po Lu wrote:

> Jacob Bachmeyer writes:
>
>> No, we are not. CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the latter three omitted, fits the bill. In that case, the reference to MinGW means that "OS" and/or "KERNEL" are omitted and MinGW is the ABI. The next logical extension is to allow all five to be present, to describe systems flexible enough to accommodate multiple ABIs.
>
> We're not trying to change the world here, so let's wait until a more urgent need presents itself before issuing plans for drastic changes to a format that has been firmly established for well over two decades, okay?

I do not see this as planning a drastic change. I see this issue as acknowledging a change that has already happened unnoticed.

>> The problem is that that example does exist, so we need to find a systematic way to accommodate it before more such variant GNU systems are produced and we have a real mess.
>
> Which systems are distributed in this manner? And what difference does the C library they elect to use for system utilities make towards the compilation of user programs with Autoconf?

There are several GNU/Linux distributions that either use or can use musl libc already. (See: <https://en.wikipedia.org/w/index.php?title=Musl&oldid=1164590075>) Musl libc does not have the same features as GNU libc, so it is rightly a different ABI target, however, the system is still a GNU variant, so its configure tuple should still match *-gnu-* for the same reasons that the GNU project wants to call the overall system GNU/Linux.

> To top it all off, considerations for such systems affect the entire GNU project, and config cannot unilaterally ordain decisions regarding their treatment. config-patches is definitely the wrong mailing list.

OK then, what is the right mailing list?

-- Jacob
Re: config.sub should normalize *-*-windows-*
Jacob Bachmeyer writes:

> No, we are not. CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of
> the latter three omitted, fits the bill. In that case, the reference
> to MinGW means that "OS" and/or "KERNEL" are omitted and MinGW is the
> ABI. The next logical extension is to allow all five to be present,
> to describe systems flexible enough to accommodate multiple ABIs.

We're not trying to change the world here, so let's wait until a more urgent need presents itself before issuing plans for drastic changes to a format that has been firmly established for well over two decades, okay?

> The problem is that that example does exist, so we need to find a
> systematic way to accommodate it before more such variant GNU systems
> are produced and we have a real mess.

Which systems are distributed in this manner? And what difference does the C library they elect to use for system utilities make towards the compilation of user programs with Autoconf?

To top it all off, considerations for such systems affect the entire GNU project, and config cannot unilaterally ordain decisions regarding their treatment. config-patches is definitely the wrong mailing list.

Thanks.
Re: config.sub should normalize *-*-windows-*
Po Lu wrote: Jacob Bachmeyer writes: [...] but several existing tuples use a libc or ABI name in place of a kernel and/or operating system. In each of those cases, the ABI name _can_ be construed as a kernel (since there is no kernel at all), or the libc name refers to a general category of OS. Neither of these situations are applicable to MinGW or MSVC. Arguably, MinGW *is* an ABI name. Either way, that ship has already sailed. So we're stuck with dubbing MinGW an operating system. No, we are not. CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the latter three omitted, fits the bill. In that case, the reference to MinGW means that "OS" and/or "KERNEL" are omitted and MinGW is the ABI. The next logical extension is to allow all five to be present, to describe systems flexible enough to accommodate multiple ABIs. Think about why the GNU project pushes to call the common system "GNU/Linux" and you should see the reason for using `*-*-linux-gnu-musl' to express a GNU/Linux system using musl libc. If the GNU libc isn't being used, it's not a complete GNU system. We should defer establishing suitable configuration names for Frankenstein systems until the moment they come into existence. The problem is that that example does exist, so we need to find a systematic way to accommodate it before more such variant GNU systems are produced and we have a real mess. -- Jacob
Re: config.sub should normalize *-*-windows-*
Jacob Bachmeyer writes: > Then its present use is *wrong* and a bug that should be fixed. The subject of this thread, indeed. > It is a little more complex than that: the GNU system theoretically > can run on any of multiple kernels. While Linux is most commonly > used, GNU HURD is still in development and I understand that there is > a Debian variant using the GNU utilities on a FreeBSD kernel. They're *-*-kfreebsd-gnu and *-*-gnu. >>> but several existing tuples use a libc or ABI name in place of a >>> kernel and/or operating system. >>> >> >> In each of those cases, the ABI name _can_ be construed as a kernel >> (since there is no kernel at all), or the libc name refers to a general >> category of OS. Neither of these situations are applicable to MinGW or >> MSVC. >> > > Arguably, MinGW *is* an ABI name. Either way, that ship has already sailed. So we're stuck with dubbing MinGW an operating system. > Think about why the GNU project pushes to call the common system > "GNU/Linux" and you should see the reason for using > `*-*-linux-gnu-musl' to express a GNU/Linux system using musl libc. If the GNU libc isn't being used, it's not a complete GNU system. We should defer establishing suitable configuration names for Frankenstein systems until the moment they come into existence.
Re: config.sub should normalize *-*-windows-*
John Ericson wrote:
> On Mon, Aug 21, 2023, at 1:17 AM, Po Lu wrote:
>> Jacob Bachmeyer <jcb62...@gmail.com> writes:
>>> No: MinGW is Windows native "Win32" API while a future `windows-gnu'
>>> would be the GNU system's POSIX API on an NT kernel. These are *very*
>>> different configurations; `windows-gnu' would more closely resemble
>>> Cygwin.
>>
>> This is not what the `x86_64-pc-windows-gnu' configuration presently canonicalized by config.sub represents.
>
> I have offered multiple times to change it to windows-mingnu or something else. Let's not be hung up on this, it is just making a straw man of the broader project of making configs that are more consistent.

If it describes MinGW, then it should be windows-mingw32 or windows-mingw64 as appropriate. The CPU field /should/ be redundant to that, but x86_64 can run 32-bit code, so it would probably be a good idea, unless we want to canonicalize `x86_64-pc-*mingw32' to `i686-pc-windows-mingw'. Should canonicalization change the CPU field when one CPU type has a compatibility mode for another CPU type and the ABI implies use of that mode?

[...]

>>> But they already have drifted: config tuples were originally triplets
>>> and now often feature a 4-element CPU-VENDOR-KERNEL-OS form
>>
>> Only as a result of a technical need to distinguish Linux-based GNU systems from other GNU systems. Absent that requirement, we would simply call GNU/Linux systems *-*-gnu, Alpine *-*-alpine, and Android *-*-android.
>
> You just said it! We have the exact same "technical need" on Windows as with Linux of identifying different platforms that share the same syscall interface. For the same reason we don't want people to have to write *-*-gnu | *-*-alpine | *-*-android (an endlessly growing list of special cases) to use e.g. the clone system call, we don't want them to have to maintain a big and ever growing list of Windows variants for a conditional that is just about Windows in general.
The catch here is that any package recognizing both *-gnu and *-windows-* would need to ensure that the match for *-gnu has priority, since an actual *-windows-gnu environment would be the (POSIX) GNU system running on an NT kernel and would /not/ have the standard Windows-isms that *-windows-* otherwise implies. Cygwin may have similar issues; I believe that it is currently treated as a unique OS unrelated to Windows. Or should we define a new `windowsix' KERNEL value for POSIX environments on NT kernels?

-- Jacob
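The ordering concern raised in this message can be illustrated with a small shell sketch. It assumes the hypothetical reading in which `*-windows-gnu` means a POSIX GNU system on an NT kernel (not what config.sub canonicalizes today, as noted elsewhere in the thread), and both function names are invented for the illustration. A `case` statement takes the first matching arm, so the position of the `*-gnu` arm decides the outcome.

```shell
# Hypothetical helpers for illustration only; neither exists in config.sub
# or Autoconf. They differ only in the order of the first two arms.

# Order suggested in the message: the *-gnu arm is tried first.
classify_gnu_first() {
    case $1 in
        *-gnu)          echo gnu ;;
        *-*-windows-*)  echo windows ;;
        *)              echo other ;;
    esac
}

# Same arms in the opposite order, for contrast.
classify_windows_first() {
    case $1 in
        *-*-windows-*)  echo windows ;;
        *-gnu)          echo gnu ;;
        *)              echo other ;;
    esac
}

classify_gnu_first x86_64-pc-windows-gnu       # takes the *-gnu arm
classify_windows_first x86_64-pc-windows-gnu   # takes the *-*-windows-* arm
```

With the Windows arm first, a `*-windows-gnu` tuple would be handled as generic Windows, which is the misclassification the message warns about.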
Re: config.sub should normalize *-*-windows-*
Jacob Bachmeyer writes:

> I said "with only 3-or-4-of-5 elements present"; that using all 5
> elements is currently invalid does not change that we effectively
> /have/ 5 elements, with a restriction that only 3 or 4 of them can
> actually be present in any one tuple.
>
> -- Jacob

And my point was that you're suffering from a bout of pareidolia...
Re: config.sub should normalize *-*-windows-*
Po Lu wrote: Jacob Bachmeyer writes: No: MinGW is Windows native "Win32" API while a future `windows-gnu' would be the GNU system's POSIX API on an NT kernel. These are *very* different configurations; `windows-gnu' would more closely resemble Cygwin. This is not what the `x86_64-pc-windows-gnu' configuration presently canonicalized by config.sub represents. Then its present use is *wrong* and a bug that should be fixed. I say it would be more appropriate to accept `x86_64-pc-mingw64' as a short form for `x86_64-pc-windows-mingw64', since Wine could enable a `x86_64-pc-linux-mingw64' platform to exist. (Wine's goal is that that platform should be indistinguishable from `x86_64-pc-windows-mingw64', but it is certainly a distinct configuration from the user's perspective.) Wine is a compatibility layer that emulates the MS-Windows kernel. It is not config's role to report the intricacies of the operating system implementation, only details that affect user programs running under that operating system. As I said, Wine's goal is that `x86_64-pc-linux-mingw64' be indistinguishable from `x86_64-pc-windows-mingw64' but that does not preclude using the KERNEL-OS form. But they already have drifted: config tuples were originally triplets and now often feature a 4-element CPU-VENDOR-KERNEL-OS form Only as a result of a technical need to distinguish Linux-based GNU systems from other GNU systems. Absent that requirement, we would simply call GNU/Linux systems *-*-gnu, Alpine *-*-alpine, and Android *-*-android. It is a little more complex than that: the GNU system theoretically can run on any of multiple kernels. While Linux is most commonly used, GNU HURD is still in development and I understand that there is a Debian variant using the GNU utilities on a FreeBSD kernel. but several existing tuples use a libc or ABI name in place of a kernel and/or operating system. 
In each of those cases, the ABI name _can_ be construed as a kernel (since there is no kernel at all), or the libc name refers to a general category of OS. Neither of these situations are applicable to MinGW or MSVC. Arguably, MinGW *is* an ABI name. I simply note this and suggest recognizing this fact that config tuples are actually now currently 3-or-4-of-5 elements. The GNU system is definitely flexible enough for that 5-element form to be appropriate: `CPU-VENDOR-linux-gnu' (GNU/Linux, implied to be using glibc) and `CPU-VENDOR-linux-gnu-musl' (GNU/Linux using musl libc) are plausibly distinguishable, for example, and could even both be useful on the *same* machine, if, for example, some low-level system utilities are linked against musl libc while most user programs use glibc. Such configurations do not exist, so we need not provide for them in config.*. And in any case, these ``low level utilities'' would be configured for *-*-linux-musl, while user programs would be configured for *-*-linux-gnu. I see no reason config.* must take special measures to recognize these Frankenstein systems, since the C library used to build some system utilities has no bearing on the operation of other user programs built for *-*-linux-gnu. Think about why the GNU project pushes to call the common system "GNU/Linux" and you should see the reason for using `*-*-linux-gnu-musl' to express a GNU/Linux system using musl libc. -- Jacob
Re: config.sub should normalize *-*-windows-*
Po Lu wrote:
> Jacob Bachmeyer writes:
>> This is why I am arguing that we should acknowledge that the naming conventions have, in practice, already changed to CPU-VENDOR-KERNEL-OS-LIBCABI, with only 3-or-4-of-5 elements present at the moment.
>
> $ ./config.sub a-b-c-d-e
> Invalid configuration 'a-b-c-d-e': more than four components

I said "with only 3-or-4-of-5 elements present"; that using all 5 elements is currently invalid does not change that we effectively /have/ 5 elements, with a restriction that only 3 or 4 of them can actually be present in any one tuple.

-- Jacob
Re: config.sub should normalize *-*-windows-*
"John Ericson" writes: > I have offered multiple times to change it to windows-mingnu or > something else. Let's not be hung up on this, it is just making a straw > man of the broader project of making configs that are more > consistent. The point is, it should not contain `windows' at all, and it should not differ from an existing triplet designating the same configuration. The objective of config development is to uniquely identify a system configuration, and nothing beyond that -- changing the set of values reported for the same configuration in the name of ``consistency'' certainly runs contrary to such an objective. > You have misunderstood Jacob's point, which is that the MinGW > interface has multiple implementations, namely top Windows itself > and via Wine. Indeed see > https://manpages.ubuntu.com/manpages/kinetic/man1/winegcc-development.1.html, > this already exists via winelib and GCC. (MinGW is basically "modify > MS headers so they work with GCC", and thus the same > modifications are needed whether we are going to run on windows > or on some Unix with winelib.) My understanding is that headers used by winegcc are provided by Winelib, not MinGW. It would be fine to distinguish this configuration from others: x86_64-pc-linux-winelib perhaps? With that said, > We never want to distinguish implementations that present the exact > same interface (indeed, that defeats the point of re-implementing the > same interface, going down this road yields a cat-and-mouse of > endless lies like browser user-agent strings), but we do want to > distinguish different interfaces. All of the above is tangential to the subject at hand, since neither of the configurations being debated fall under the latter category. > You just said it! We have the exact same "technical need" on > Windows as with Linux of identifying different platforms that share > the same syscall interface. 
For the same reason we don't want > people to have to write *-*-gnu | *-*-alpine | *-*-android (an endlessly > growing list of special cases) to use e.g. the clone system call, we > don't want them to have to maintain a big and ever growing list of > Windows variants for a conditional that is just about Windows in > general. We have the means to adequately distinguish between the different Windows configurations already. Programs that want to use MSVC write *-*-winnt*, and those which need to detect MinGW write *-*-mingw*. > As a Nixpkgs developer, my goal is to see all the free software in the > world packaged in a single way which will run (or can be > cross-compiled to everywhere). This is very ambitious, and the only > way it will work is if it is easy for upstream software to be portable. > And it way to make it easy to be portable supporting all the variations > is if we clean up foot-guns like this where upstream software has to > maintain every-growing disjunctions rather than future-proof > wildcards. Chances are that software written for MSVC or MinGW will not work OOTB with future Win32 toolchains, should any come into existence, making any such future proofing redundant. It is definitely no justification for introducing duplicate triplets, especially ones inconsistent with the naming scheme delineated at the start of config.sub. You are also portraying the situation from the perspective of a packager. They are not Autoconf's primary audience: users configuring programs are. Duplicate configuration names designating the same systems will aggravate the conundrum experienced by most when configuring software. Meanwhile, if config.sub normalizes the *-*-windows-* triplets as I've proposed, users can continue providing these invalid triplets, with no changes to Autoconf scripts or build files. Which is the purpose of config.sub after all. 
> Sure, this sort of tech debt OCDing I am doing freely admit seems
> over kill for just one package (emacs) and one windows platform
> (MinGW), but please believe me when I say when considering all the
> variations and all the package it *does* become something worth
> practically caring about.

Having ported software other than Emacs to MS-Windows (and witnessed others porting even more), I cannot agree.
Re: config.sub should normalize *-*-windows-*
On Mon, Aug 21, 2023, at 1:17 AM, Po Lu wrote: > Jacob Bachmeyer writes: > > > No: MinGW is Windows native "Win32" API while a future `windows-gnu' > > would be the GNU system's POSIX API on an NT kernel. These are *very* > > different configurations; `windows-gnu' would more closely resemble > > Cygwin. > > This is not what the `x86_64-pc-windows-gnu' configuration presently > canonicalized by config.sub represents. I have offered multiple times to change it to windows-mingnu or something else. Let's not be hung up on this, it is just making a straw man of the broader project of making configs that are more consistent. > > I say it would be more appropriate to accept `x86_64-pc-mingw64' as a > > short form for `x86_64-pc-windows-mingw64', since Wine could enable a > > `x86_64-pc-linux-mingw64' platform to exist. (Wine's goal is that > > that platform should be indistinguishable from > > `x86_64-pc-windows-mingw64', but it is certainly a distinct > > configuration from the user's perspective.) > > Wine is a compatibility layer that emulates the MS-Windows kernel. It > is not config's role to report the intricacies of the operating system > implementation, only details that affect user programs running under > that operating system. You have misunderstood Jacob's point, which is that the MinGW *interface* has multiple implementations, namely top Windows itself and via Wine. Indeed see https://manpages.ubuntu.com/manpages/kinetic/man1/winegcc-development.1.html, this already exists via winelib and GCC. (MinGW is basically "modify MS headers so they work with GCC", and thus the same modifications are needed whether we are going to run on windows or on some Unix with winelib.) 
We never want to distinguish implementations that present the exact same interface (indeed, that defeats the point of re-implementing the same interface, going down this road yields a cat-and-mouse of endless lies like browser user-agent strings), but we *do* want to distinguish different interfaces. > > But they already have drifted: config tuples were originally triplets > > and now often feature a 4-element CPU-VENDOR-KERNEL-OS form > > Only as a result of a technical need to distinguish Linux-based GNU > systems from other GNU systems. Absent that requirement, we would > simply call GNU/Linux systems *-*-gnu, Alpine *-*-alpine, and Android > *-*-android. You just said it! We have the exact same "technical need" on Windows as with Linux of identifying different platforms that share the same syscall interface. For the same reason we don't want people to have to write *-*-gnu | *-*-alpine | *-*-android (an endlessly growing list of special cases) to use e.g. the clone system call, we don't want them to have to maintain a big and ever growing list of Windows variants for a conditional that is just about Windows in general. As a Nixpkgs developer, my goal is to see all the free software in the world packaged in a single way which will run (or can be cross-compiled to everywhere). This is very ambitious, and the only way it will work is if it is easy for upstream software to be portable. And it way to make it easy to be portable supporting all the variations is if we clean up foot-guns like this where upstream software has to maintain every-growing disjunctions rather than future-proof wildcards. Sure, this sort of tech debt OCDing I am doing freely admit seems over kill for just one package (emacs) and one windows platform (MinGW), but please believe me when I say when considering all the variations and all the package it *does* become something worth practically caring about. John
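The disjunction-versus-wildcard argument in this message can be sketched in shell as follows. Both function names are invented for the illustration, and the three-part names `*-*-gnu`, `*-*-alpine` and `*-*-android` are the counterfactual spellings mentioned in the thread rather than tuples config.sub emits for Linux-based systems.

```shell
# Hypothetical helpers for illustration only.

# Counterfactual scheme: every Linux-based system has its own OS name,
# so a check for "Linux syscalls available" must list each one.
has_linux_syscalls_by_list() {
    case $1 in
        *-*-gnu | *-*-alpine | *-*-android) return 0 ;;
        *) return 1 ;;
    esac
}

# 4-element CPU-VENDOR-KERNEL-OS scheme: one wildcard on the kernel field
# covers current and future userlands alike.
has_linux_syscalls_by_kernel() {
    case $1 in
        *-*-linux-*) return 0 ;;
        *) return 1 ;;
    esac
}

for host in x86_64-pc-linux-gnu x86_64-pc-linux-musl aarch64-unknown-linux-android; do
    has_linux_syscalls_by_kernel "$host" && echo "$host: linux"
done
```

The proposal in this message amounts to wanting the same single-wildcard form, `*-*-windows-*`, for conditionals that are about Windows in general.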
Re: config.sub should normalize *-*-windows-*
Jacob Bachmeyer writes:

> This is why I am arguing that we should acknowledge that the naming
> conventions have, in practice, already changed to
> CPU-VENDOR-KERNEL-OS-LIBCABI, with only 3-or-4-of-5 elements present
> at the moment.

$ ./config.sub a-b-c-d-e
Invalid configuration 'a-b-c-d-e': more than four components
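For readers without config.sub at hand, the length check shown in the transcript above can be approximated with a short sketch. This is not config.sub's actual code; `count_components` and `check_tuple` are names invented here, and the sketch reproduces only the component-count rule and the wording of the error message, not any canonicalization.

```shell
# Hypothetical helpers for illustration only; not taken from config.sub.

# Count the dash-separated components of a configuration name.
# The function body runs in a subshell so that the IFS change and
# 'set -f' (globbing off, so a literal '*' is not expanded) stay local.
count_components() (
    IFS=-
    set -f
    set -- $1
    echo "$#"
)

# Reject names with more than four components, echo anything else back.
check_tuple() {
    if [ "$(count_components "$1")" -gt 4 ]; then
        echo "Invalid configuration '$1': more than four components" >&2
        return 1
    fi
    echo "$1"
}

check_tuple x86_64-pc-linux-gnu
check_tuple a-b-c-d-e || echo "rejected"
```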
Re: config.sub should normalize *-*-windows-*
John Ericson writes:

> I agree the GNU project is not under any such obligation, and that's
> why I proposed windows-mingw as a compromise.

Once again, what's wrong with plain mingw? Or winnt?

> It is more work for me to go make both GCC and LLVM support, but I
> rather do that than be stuck with plain mingw32.

Your preferences do not necessarily reflect those of the thousands of Autoconf users, all of whom have lived with the status quo for decades.

> There is no existing convention for windows.

Really? What's alpha-dec-winnt*, or i586-pc-mingw32?

> So far every time a new "brand name" 3rd position component has been
> chosen without any sort of pattern.

There doesn't have to be a pattern.

> Now that I've made (over the past few years) GNU config be more
> structured and more easily support longer configs, it is time to
> establish a convention. windows-* makes sense

Neither makes sense. And the overriding objective of all config.* development is to _NEVER_ change the set of canonical values, or even worse, introduce duplicate ones.

> and has precedent.

From LLVM? That may be so, but the GNU project elected to use `mingw' and `winnt' decades prior to LLVM's very existence.
Re: config.sub should normalize *-*-windows-*
Jacob Bachmeyer writes: > No: MinGW is Windows native "Win32" API while a future `windows-gnu' > would be the GNU system's POSIX API on an NT kernel. These are *very* > different configurations; `windows-gnu' would more closely resemble > Cygwin. This is not what the `x86_64-pc-windows-gnu' configuration presently canonicalized by config.sub represents. > I say it would be more appropriate to accept `x86_64-pc-mingw64' as a > short form for `x86_64-pc-windows-mingw64', since Wine could enable a > `x86_64-pc-linux-mingw64' platform to exist. (Wine's goal is that > that platform should be indistinguishable from > `x86_64-pc-windows-mingw64', but it is certainly a distinct > configuration from the user's perspective.) Wine is a compatibility layer that emulates the MS-Windows kernel. It is not config's role to report the intricacies of the operating system implementation, only details that affect user programs running under that operating system. > But they already have drifted: config tuples were originally triplets > and now often feature a 4-element CPU-VENDOR-KERNEL-OS form Only as a result of a technical need to distinguish Linux-based GNU systems from other GNU systems. Absent that requirement, we would simply call GNU/Linux systems *-*-gnu, Alpine *-*-alpine, and Android *-*-android. > but several existing tuples use a libc or ABI name in place of a > kernel and/or operating system. In each of those cases, the ABI name _can_ be construed as a kernel (since there is no kernel at all), or the libc name refers to a general category of OS. Neither of these situations are applicable to MinGW or MSVC. > I simply note this and suggest recognizing this fact that config > tuples are actually now currently 3-or-4-of-5 elements. 
The GNU > system is definitely flexible enough for that 5-element form to be > appropriate: `CPU-VENDOR-linux-gnu' (GNU/Linux, implied to be using > glibc) and `CPU-VENDOR-linux-gnu-musl' (GNU/Linux using musl libc) are > plausibly distinguishable, for example, and could even both be useful > on the *same* machine, if, for example, some low-level system > utilities are linked against musl libc while most user programs use > glibc. Such configurations do not exist, so we need not provide for them in config.*. And in any case, these ``low level utilities'' would be configured for *-*-linux-musl, while user programs would be configured for *-*-linux-gnu. I see no reason config.* must take special measures to recognize these Frankenstein systems, since the C library used to build some system utilities has no bearing on the operation of other user programs built for *-*-linux-gnu.
Re: config.sub should normalize *-*-windows-*
Po Lu wrote: Jacob Bachmeyer writes: At this time, yes. However, the GNU utilities are designed to be fairly portable and the NT kernel was designed to support multiple ABIs, so a hypothetical port of GNU to run under MS-Windows is within the realm of possibility. (In fact, the underlying architecture of NT should have all of the primitives needed to support HURD or a closely related system.) It is more likely that this would be implemented on ReactOS (which aims for ABI compatibility with NT 5.1, is a stable target, and is Free) first, but a hypothetical `x86_64-pc-windows-gnu' (or perhaps `x86_64-pc-reactos-gnu') config tuple *is* a future possibility. This hypothesizing is not relevant here. x86_64-pc-windows-* represents MinGW, and should be normalized correspondingly. No: MinGW is Windows native "Win32" API while a future `windows-gnu' would be the GNU system's POSIX API on an NT kernel. These are *very* different configurations; `windows-gnu' would more closely resemble Cygwin. And what would we canonicalize `x86_64-pc-windows-gnu' to, other than itself, currently? x86_64-pc-mingw64, which I mentioned at the outset of this thread. I say it would be more appropriate to accept `x86_64-pc-mingw64' as a short form for `x86_64-pc-windows-mingw64', since Wine could enable a `x86_64-pc-linux-mingw64' platform to exist. (Wine's goal is that that platform should be indistinguishable from `x86_64-pc-windows-mingw64', but it is certainly a distinct configuration from the user's perspective.) It appears that config tuples may be drifting towards a 5-element CPU-VENDOR-KERNEL-OS-LIBCABI form, with each of the last three elements potentially optional, which makes any real tuple ambiguous, except that the valid strings for KERNEL, OS, and LIBCABI are from distinct sets. Configuration tuples don't ``drift'', and they certainly should not change or duplicate other triplets. 
But they already have drifted: config tuples were originally triplets and now often feature a 4-element CPU-VENDOR-KERNEL-OS form, but several existing tuples use a libc or ABI name in place of a kernel and/or operating system. I simply note this and suggest recognizing this fact that config tuples are actually now currently 3-or-4-of-5 elements. The GNU system is definitely flexible enough for that 5-element form to be appropriate: `CPU-VENDOR-linux-gnu' (GNU/Linux, implied to be using glibc) and `CPU-VENDOR-linux-gnu-musl' (GNU/Linux using musl libc) are plausibly distinguishable, for example, and could even both be useful on the *same* machine, if, for example, some low-level system utilities are linked against musl libc while most user programs use glibc. -- Jacob
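The "3-or-4-of-5" reading argued in this message relies on KERNEL, OS and LIBCABI names coming from distinct sets, so that a trailing component can be assigned to its slot by name alone. A rough sketch of that idea follows. The vocabularies are deliberately tiny and are assumptions made for the illustration (in particular, filing `windows` under KERNEL and the mingw names under LIBCABI follows this message's reading, not config.sub), and the function names are invented. As shown elsewhere in the thread, config.sub itself currently rejects five-component names.

```shell
# Hypothetical helpers for illustration only; the word lists below are
# assumptions, not config.sub's tables.

# Map one trailing component to the slot it would occupy.
slot_of() {
    case $1 in
        linux | kfreebsd | windows)  echo KERNEL ;;
        gnu)                         echo OS ;;
        musl | mingw32 | mingw64)    echo LIBCABI ;;
        *)                           echo UNKNOWN ;;
    esac
}

# Label every component of a tuple. Runs in a subshell so the IFS change
# and 'set -f' stay local to this function.
label_tuple() (
    IFS=-
    set -f
    set -- $1
    [ "$#" -ge 3 ] || exit 1
    out="CPU=$1 VENDOR=$2"
    shift 2
    for part in "$@"; do
        out="$out $(slot_of "$part")=$part"
    done
    echo "$out"
)

label_tuple x86_64-pc-linux-gnu-musl
label_tuple x86_64-pc-mingw64
```

Because the sets do not overlap, the same labelling works whether three, four or five components are present.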
Re: config.sub should normalize *-*-windows-*
Po Lu wrote: John Ericson writes: Thanks Jacob, That's absolutely right that Win NT supports multiple personalities and so all sorts of things are possible. (Indeed that is how WSL1 worked.) MinGW stands for "Minimalist GNU for Windows" [1], and I suspect that is why Saleem choose windows-gnu in that commit almost a decade ago. I supposed we could say that "minimalist GNU" is not "GNU", and do windows-mingnu or something, and then I could submit an LLVM patch to try to support that. But I suppose I lean towards support configs that at least one of GCC or Clang supports already, rather than making up completely new stuff. GNU config is part of the GNU project, developing the GNU operating system, which opted for ``mingw'' many, many moons ago. We are under no obligation to adhere to LLVM standards, especially when they require us to misrepresent the nature of a specific system configuration. This is also correct: `windows-gnu' does not currently exist and its use to describe MinGW in LLVM is /wrong/. The GNU system implements/extends POSIX and MinGW attempts to port GNU utilities to run under native Windows, which does *not* implement POSIX, therefore MinGW is *not* `windows-gnu'. Also, I would like to point out that the "scales to more variations" argument is not at all hypothetical. If one looks at [2] one will see that MSYS is a variation of Cygwin, and a mingw-style environments can be made from the newer ucrt or older msvcrt. Today there are just too many subtle variations to capture them all with sensible. It looks like MSYS [3] reuses a triple for multiple configurations, and just relies on users getting the PATH right, but that clearly isn't ideal. Creating windows-* variants to handle them all in a consistent and predictable manner is much better. We can create new triplets for new environments once they do come into existence. But they should not duplicate existing ones, and they must conform to the existing naming convention for configuration triplets. 
This is why I am arguing that we should acknowledge that the naming conventions have, in practice, already changed to CPU-VENDOR-KERNEL-OS-LIBCABI, with only 3-or-4-of-5 elements present at the moment. -- Jacob
Re: config.sub should normalize *-*-windows-*
On 8/21/23 00:39, Po Lu wrote: John Ericson writes: Thanks Jacob, That's absolutely right that Win NT supports multiple personalities and so all sorts of things are possible. (Indeed that is how WSL1 worked.) MinGW stands for "Minimalist GNU for Windows" [1], and I suspect that is why Saleem choose windows-gnu in that commit almost a decade ago. I supposed we could say that "minimalist GNU" is not "GNU", and do windows-mingnu or something, and then I could submit an LLVM patch to try to support that. But I suppose I lean towards support configs that at least one of GCC or Clang supports already, rather than making up completely new stuff. GNU config is part of the GNU project, developing the GNU operating system, which opted for ``mingw'' many, many moons ago. We are under no obligation to adhere to LLVM standards, especially when they require us to misrepresent the nature of a specific system configuration. I agree the GNU project is not under any such obligation, and that's why I proposed windows-mingw as a compromise. It is more work for me to go make both GCC and LLVM support, but I rather do that than be stuck with plain mingw32. Also, I would like to point out that the "scales to more variations" argument is not at all hypothetical. If one looks at [2] one will see that MSYS is a variation of Cygwin, and a mingw-style environments can be made from the newer ucrt or older msvcrt. Today there are just too many subtle variations to capture them all with sensible. It looks like MSYS [3] reuses a triple for multiple configurations, and just relies on users getting the PATH right, but that clearly isn't ideal. Creating windows-* variants to handle them all in a consistent and predictable manner is much better. We can create new triplets for new environments once they do come into existence. But they should not duplicate existing ones, and they must conform to the existing naming convention for configuration triplets. There is no existing convention for windows. 
So far every time a new "brand name" 3rd position component has been chosen without any sort of pattern. Now that I've made (over the past few years) GNU config be more structured and more easily support longer configs, it is time to establish a convention. windows-* makes sense, and has precedent. John
Re: config.sub should normalize *-*-windows-*
John Ericson writes: > Thanks Jacob, > > That's absolutely right that Win NT supports multiple personalities > and so all sorts of things are possible. (Indeed that is how WSL1 > worked.) > > MinGW stands for "Minimalist GNU for Windows" [1], and I suspect that > is why Saleem choose windows-gnu in that commit almost a decade ago. I > supposed we could say that "minimalist GNU" is not "GNU", and do > windows-mingnu or something, and then I could submit an LLVM patch to > try to support that. But I suppose I lean towards support configs that > at least one of GCC or Clang supports already, rather than making up > completely new stuff. GNU config is part of the GNU project, developing the GNU operating system, which opted for ``mingw'' many, many moons ago. We are under no obligation to adhere to LLVM standards, especially when they require us to misrepresent the nature of a specific system configuration. > Also, I would like to point out that the "scales to more variations" > argument is not at all hypothetical. If one looks at [2] one will see > that MSYS is a variation of Cygwin, and a mingw-style environments can > be made from the newer ucrt or older msvcrt. Today there are just too > many subtle variations to capture them all with sensible. It looks > like MSYS [3] reuses a triple for multiple configurations, and just > relies on users getting the PATH right, but that clearly isn't > ideal. Creating windows-* variants to handle them all in a consistent > and predictable manner is much better. We can create new triplets for new environments once they do come into existence. But they should not duplicate existing ones, and they must conform to the existing naming convention for configuration triplets.
Re: config.sub should normalize *-*-windows-*
Thanks Jacob,

That's absolutely right that Win NT supports multiple personalities and so all sorts of things are possible. (Indeed that is how WSL1 worked.)

MinGW stands for "Minimalist GNU for Windows" [1], and I suspect that is why Saleem chose windows-gnu in that commit almost a decade ago. I suppose we could say that "minimalist GNU" is not "GNU", and do windows-mingnu or something, and then I could submit an LLVM patch to try to support that. But I suppose I lean towards supporting configs that at least one of GCC or Clang supports already, rather than making up completely new stuff.

Also, I would like to point out that the "scales to more variations" argument is not at all hypothetical. If one looks at [2] one will see that MSYS is a variation of Cygwin, and mingw-style environments can be made from the newer ucrt or older msvcrt. Today there are just too many subtle variations to capture them all with sensible. It looks like MSYS [3] reuses a triple for multiple configurations, and just relies on users getting the PATH right, but that clearly isn't ideal. Creating windows-* variants to handle them all in a consistent and predictable manner is much better.

John

P.S. I've also CC'd Martin Storjso who has worked on LLVM MinGW things recently

[1]: https://en.wikipedia.org/wiki/MinGW
[2]: https://www.msys2.org/docs/environments/
[3]: https://packages.msys2.org/package/mingw-w64-x86_64-gcc
     https://packages.msys2.org/package/mingw-w64-ucrt-x86_64-gcc

On 8/20/23 22:40, Po Lu wrote:
> Jacob Bachmeyer writes:
>
>> At this time, yes. However, the GNU utilities are designed to be fairly portable and the NT kernel was designed to support multiple ABIs, so a hypothetical port of GNU to run under MS-Windows is within the realm of possibility. (In fact, the underlying architecture of NT should have all of the primitives needed to support HURD or a closely related system.) It is more likely that this would be implemented on ReactOS (which aims for ABI compatibility with NT 5.1, is a stable target, and is Free) first, but a hypothetical `x86_64-pc-windows-gnu' (or perhaps `x86_64-pc-reactos-gnu') config tuple *is* a future possibility.
>
> This hypothesizing is not relevant here. x86_64-pc-windows-* represents MinGW, and should be normalized correspondingly.
>
>> And what would we canonicalize `x86_64-pc-windows-gnu' to, other than itself, currently?
>
> x86_64-pc-mingw64, which I mentioned at the outset of this thread.
>
>> It appears that config tuples may be drifting towards a 5-element CPU-VENDOR-KERNEL-OS-LIBCABI form, with each of the last three elements potentially optional, which makes any real tuple ambiguous, except that the valid strings for KERNEL, OS, and LIBCABI are from distinct sets.
>
> Configuration tuples don't ``drift'', and they certainly should not change or duplicate other triplets.
Re: config.sub should normalize *-*-windows-*
Jacob Bachmeyer writes:

> At this time, yes. However, the GNU utilities are designed to be fairly portable and the NT kernel was designed to support multiple ABIs, so a hypothetical port of GNU to run under MS-Windows is within the realm of possibility. (In fact, the underlying architecture of NT should have all of the primitives needed to support HURD or a closely related system.) It is more likely that this would be implemented on ReactOS (which aims for ABI compatibility with NT 5.1, is a stable target, and is Free) first, but a hypothetical `x86_64-pc-windows-gnu' (or perhaps `x86_64-pc-reactos-gnu') config tuple *is* a future possibility.

This hypothesizing is not relevant here. x86_64-pc-windows-* represents MinGW, and should be normalized correspondingly.

> And what would we canonicalize `x86_64-pc-windows-gnu' to, other than itself, currently?

x86_64-pc-mingw64, which I mentioned at the outset of this thread.

> It appears that config tuples may be drifting towards a 5-element CPU-VENDOR-KERNEL-OS-LIBCABI form, with each of the last three elements potentially optional, which makes any real tuple ambiguous, except that the valid strings for KERNEL, OS, and LIBCABI are from distinct sets.

Configuration tuples don't ``drift'', and they certainly should not change or duplicate other triplets.
Re: config.sub should normalize *-*-windows-*
Po Lu wrote:
> "John Ericson" writes:
>
> [...]
>
>> Had those Windows-based platforms been introduced later, something like the configs that Saleem added to LLVM would have been used from the get go --- grouping the Windows-based platforms and grouping the Linux-based platforms are both advantageous ways of categorizing things, and advantageous for the same reasons.
>
> We are trying to develop the GNU operating system, and it is in our interest to convey the distinction between GNU systems employing the Linux kernel, and other operating systems that are - by happenstance - built on top of the same kernel. OTOH, MinGW does not provide an operating system founded upon the Windows kernel, so it is incorrect to apply the: machine-vendor-kernel-OS quadruplet scheme to it. To rub salt into the wound, the GNU operating system does NOT run under a MS-Windows kernel. So ``windows-gnu'' is not just conjecture, it is also a misnomer.

At this time, yes. However, the GNU utilities are designed to be fairly portable and the NT kernel was designed to support multiple ABIs, so a hypothetical port of GNU to run under MS-Windows is within the realm of possibility. (In fact, the underlying architecture of NT should have all of the primitives needed to support HURD or a closely related system.) It is more likely that this would be implemented on ReactOS (which aims for ABI compatibility with NT 5.1, is a stable target, and is Free) first, but a hypothetical `x86_64-pc-windows-gnu' (or perhaps `x86_64-pc-reactos-gnu') config tuple *is* a future possibility.

>> As I said in the other email, I am not forcing anyone to do anything.
>
> You are, as users will soon begin to provide invalid triplets such as `x86_64-pc-windows-gnu' to their configuration files. And instead of canonicalizing them, the express purpose of config.sub, they are reproduced verbatim, much to the detriment of configure scripts and to the chagrin of package maintainers.

And what would we canonicalize `x86_64-pc-windows-gnu' to, other than itself, currently?

It appears that config tuples may be drifting towards a 5-element CPU-VENDOR-KERNEL-OS-LIBCABI form, with each of the last three elements potentially optional, which makes any real tuple ambiguous, except that the valid strings for KERNEL, OS, and LIBCABI are from distinct sets.

-- Jacob
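To make the disambiguation argument concrete, here is a minimal Python sketch of how the trailing elements of such a tuple could be classified purely by set membership. The three sets and the name `classify` are hypothetical and chosen only for illustration. In particular, putting `gnu` in the OS set takes one side of the very question debated in this thread, and nothing here reflects what config.sub actually does.

```python
# Hypothetical, deliberately tiny vocabularies; real lists would be much longer.
KERNELS = {"linux", "windows", "reactos"}
OSES = {"gnu", "android"}
LIBC_ABIS = {"musl", "msvc", "eabi"}


def classify(config):
    """Split CPU-VENDOR-[KERNEL-][OS-][LIBCABI] into named fields.

    Each trailing element is assigned to a slot by looking it up in the
    three disjoint sets above, so omitted elements cause no ambiguity.
    The relative order of the trailing elements is not checked.
    """
    cpu, vendor, *rest = config.split("-")
    fields = {"cpu": cpu, "vendor": vendor,
              "kernel": None, "os": None, "libcabi": None}
    for token in rest:
        if token in KERNELS:
            slot = "kernel"
        elif token in OSES:
            slot = "os"
        elif token in LIBC_ABIS:
            slot = "libcabi"
        else:
            raise ValueError(f"unknown element: {token!r}")
        if fields[slot] is not None:
            raise ValueError(f"duplicate {slot}: {token!r}")
        fields[slot] = token
    return fields
```

With these assumed sets, `classify("riscv64-unknown-linux-musl")` fills the kernel and libc/ABI slots and leaves the OS slot empty, which is the kind of reading the 5-element form would permit.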
Re: config.sub should normalize *-*-windows-*
"John Ericson" writes: > If Musl, GNU Libc, and Android are all different operating systems, > why are MSVCRT, MinGW, and Cygwin not different operating systems? Musl is not an operating system, but Musl-based systems are distinct from GNU and Android systems, in that they share nothing in common except for the Linux kernel. These considerations are GNU project policy, see: https://www.gnu.org/gnu/why-gnu-linux.en.html In contrast, MinGW and MSVC are merely different compilers for the same OS, using a different ABI, but the same system libraries and services. A program will run on any MS-Windows system irrespective of whether it was compiled with MSVC or MinGW. And config.* already regards Cygwin as a separate operating system. > The simplest reading of history that doesn't require any contortions > is that MinGW and Cygwin predated configs with more than 3 components, > but Android did not. That's an inaccurate portrayal of history, at best. See below. > Had those Windows-based platforms been introduced later, something > like the configs that Saleem added to LLVM would have been used from > the get go --- grouping the Windows-based platforms and grouping the > Linux-based platforms are both advantageous ways of categorizing > things, and advantageous for the same reasons. We are trying to develop the GNU operating system, and it is in our interest to convey the distinction between GNU systems employing the Linux kernel, and other operating systems that are - by happenstance - built on top of the same kernel. OTOH, MinGW does not provide an operating system founded upon the Windows kernel, so it is incorrect to apply the: machine-vendor-kernel-OS quadruplet scheme to it. To rub salt into the wound, the GNU operating system does NOT run under a MS-Windows kernel. So ``windows-gnu'' is not just conjecture, it is also a misnomer. > As I said in the other email, I am not forcing anyone to do anything. 
You are, as users will soon begin to provide invalid triplets such as `x86_64-pc-windows-gnu' to their configuration files. And instead of canonicalizing them, the express purpose of config.sub, they are reproduced verbatim, much to the detriment of configure scripts and to the chagrin of package maintainers. > You can take the latest version and do nothing else. Anyone that uses > *-windows-gnu will have their build fail, just as it fails > today. There is no problem. Users will start expecting configure to grok such configurations. We will start receiving bug reports, for the simple reason that the present config.* cabal failed, in this case, to excercise the elementary degree of circumspection and good judgement that should be applied when maintaining a program underpinning thousands of important GNU (and other) projects.
Re: config.sub should normalize *-*-windows-*
On Fri, Aug 18, 2023, at 8:24 PM, Po Lu wrote:
> GNU is an operating system. Musl-based systems are not GNU, so -musl represents a ``musl-based operating system''.
>
> > I do not think this is something to be frowned upon because "Operating System.", after all, also lacks any rigorous objective definition.
>
> It does not, within the GNU project at least. GNU is one operating system; Android is another, as are Musl-based systems. And MS-Windows is a single operating system.

If Musl, GNU Libc, and Android are all different operating systems, why are MSVCRT, MinGW, and Cygwin not different operating systems? Listing off examples is *not* providing an objective definition.

The simplest reading of history that doesn't require any contortions is that MinGW and Cygwin predated configs with more than 3 components, but Android did not. Had those Windows-based platforms been introduced later, something like the configs that Saleem added to LLVM would have been used from the get go --- grouping the Windows-based platforms and grouping the Linux-based platforms are both advantageous ways of categorizing things, and advantageous for the same reasons.

> How is that worse than forcing every program wishing to support MS-Windows to introduce express support for 2 or 3 disparate and incorrect triplets?

As I said in the other email, I am not forcing anyone to do anything.

> Anyway, I plan to merge the latest config.* into Emacs soon. So speaking as someone responsible, in part, for keeping the MS-Windows port of Emacs in working order, I would like to see the change I illustrated installed ASAP.

You can take the latest version and do nothing else. Anyone that uses *-windows-gnu will have their build fail, just as it fails today. There is no problem.

John
Re: config.sub should normalize *-*-windows-*
"John Ericson" writes: > In fairness I just recently submitted the patches added them, so > absent a clear notion of GNU config releases I think a grace period > where we can reconsider recently added changes is acceptable. So let's please remove that change, and replace it with one that canonicalizes *-*-windows-*. > Even more important than this is the principle that config.sub canonical > names are *never* changed, even if > they are wrong according to some external standard. > > That said, I don't think we should so normalize them. I took them from LLVM, > which has supported them for years and normalized in > the *other* direction (i.e. to these), and Rust which follows LLVM's lead. I > knew we couldn't change the old ones to normalize to the new > ones, so I thought a fair middle ground was that neither would normalize to > the other. > > For the record LLVM, Rust, and even sometimes GNU config don't treat > *-*-foo-bar as *-*-$kernel-$os, but rather *-*-$kernel-$abi. Where > ABI is a sort of catch-all residual. This is why > e.g. riscv-unknown-linux-musl is accepted --- no one would think the > Musl libc is an operating system! Rather "gnu" is interpreted to be > mean "glibc, libstdcxx++, etc.". GNU is an operating system. Musl-based systems are not GNU, so -musl represents a ``musl-based operating system''. > I do not think this is something to be frowned upon because "Operating > System.", after all, also lacks any rigorous objective definition. It does not, within the GNU project at least. GNU is one operating system; Android is another, as are Musl-based systems. And MS-Windows is a single operating system. > At the end of the day there is: > > 1 The syscall interface to communicate with "the outside world". (By > "kernel" we really mean syscall interface, it is possible for two > different implementations, like the actual Windows kernel, and Wine, > to support the same syscall interface.) 
There is already a set precedent for how such changes are treated by config*: see mips*n32, arm*eabi, et cetera. > 2 Other code linked into the same process. ABI covers the "most > important" parts of this, especially when the userland ABI is more > stable than the syscall ABI (Many BSDs, some Windows NT things, etc.) ABI differences don't constitute a new operating system. > And even ignoring all that, the *windows*-* convention makes clear > that these are variations of extra stuff atop on Windows. In many > instances, it doesn't matter which one of them is in use. Using the > new triples makes it easier for that agnostic code to roll with the > punches. They are invalid triples. > > > My intention in adding these to GNU config was to then rework our > Windows cross compilation in Nixpkgs to use them, which would mean > likewise submitting patches to GCC and other things to accept them > too. Normalizing them away would prevent me from doing all these other > yak shaves, and trying to get the various flavors of Windows cross to > work more consistently, because everything downstream from config.sub > invocations would work the same way as before. IMO that would > basically defeat the purpose of accepting them at all in GNU config > --- better to reject that do a normalization that may not be desired. If these configuration triplets are normalized, GNU projects (and others using config*) will automagically work when they are provided. How is that worse than forcing every program wishing to support MS-Windows to introduce express support for 2 or 3 disparate and incorrect triplets? Anyway, I plan to merge the latest config.* into Emacs soon. So speaking as someone responsible, in part, for keeping the MS-Windows port of Emacs in working order, I would like to see the change I illustrated installed ASAP.
Re: config.sub should normalize *-*-windows-*
On 8/18/23 07:42, Zack Weinberg wrote:
> On Thu, Aug 17, 2023, at 8:34 PM, Po Lu wrote:
> ...
>
>> Given that the MinGW ABI does not constitute the GNU operating system executing on the MS-Windows kernel, and MSVC is not an operating system, such blunders should be ignored, or at least normalized...

In fairness I just recently submitted the patches that added them, so absent a clear notion of GNU config releases I think a grace period where we can reconsider recently added changes is acceptable.

> Even more important than this is the principle that config.sub canonical names are *never* changed, even if they are wrong according to some external standard.

That said, I don't think we should so normalize them. I took them from LLVM, which has supported them for years and normalized in the *other* direction (i.e. to these), and Rust which follows LLVM's lead. I knew we couldn't change the old ones to normalize to the new ones, so I thought a fair middle ground was that neither would normalize to the other.

For the record LLVM, Rust, and even sometimes GNU config don't treat *-*-foo-bar as *-*-$kernel-$os, but rather *-*-$kernel-$abi. Where ABI is a sort of catch-all residual. This is why e.g. riscv-unknown-linux-musl is accepted --- no one would think the Musl libc is an operating system! Rather "gnu" is interpreted to mean "glibc, libstdcxx++, etc.".

I do not think this is something to be frowned upon because "Operating System.", after all, also lacks any rigorous objective definition. At the end of the day there is:

1. The syscall interface to communicate with "the outside world". (By "kernel" we really mean syscall interface, it is possible for two different implementations, like the actual Windows kernel, and Wine, to support the same syscall interface.)

2. Other code linked into the same process. ABI covers the "most important" parts of this, especially when the userland ABI is *more* stable than the syscall ABI (Many BSDs, some Windows NT things, etc.)

And even ignoring all that, the *windows*-* convention makes clear that these are variations of extra stuff atop Windows. In many instances, it doesn't matter which one of them is in use. Using the new triples makes it easier for that agnostic code to roll with the punches.

My intention in adding these to GNU config was to then rework our Windows cross compilation in Nixpkgs to use them, which would mean likewise submitting patches to GCC and other things to accept them too. Normalizing them away would prevent me from doing all these other yak shaves, and trying to get the various flavors of Windows cross to work more consistently, because everything downstream from config.sub invocations would work the same way as before. IMO that would basically defeat the purpose of accepting them at all in GNU config --- better to reject than do a normalization that may not be desired.

Cheers,

John

P.S. CCing Saleem Abdulrasool who originally added them to LLVM in https://reviews.llvm.org/D2947, and who has continued to work on LLVM-land Windows support, e.g. for Swift. (I imagine Swift, like Rust, also uses the *windows*-* ones.) Perhaps he may have some additional insight to add.
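As a rough illustration of the "agnostic code" point, the following Python sketch shows how downstream logic might test for any Windows flavor with a single glob pattern. The function name `is_windows_flavor` is made up for this example, and the pattern is simply the one from the subject line; real configure scripts would do this in shell, so this only illustrates the idea.

```python
from fnmatch import fnmatchcase


def is_windows_flavor(config):
    """Return True for any CPU-VENDOR-windows-ENV style config string."""
    # One pattern covers every windows-* variant, whatever ENV happens to be.
    return fnmatchcase(config, "*-*-windows-*")
```

Under this sketch, `x86_64-pc-windows-gnu` and `x86_64-pc-windows-msvc` both match, while an older name such as `x86_64-pc-mingw64` does not and would need its own pattern.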
Re: config.sub should normalize *-*-windows-*
On Thu, Aug 17, 2023, at 8:34 PM, Po Lu wrote:
...
> Given that the MinGW ABI does not constitute the GNU operating system executing on the MS-Windows kernel, and MSVC is not an operating system, such blunders should be ignored, or at least normalized...

Even more important than this is the principle that config.sub canonical names are *never* changed, even if they are wrong according to some external standard.

zw
config.sub should normalize *-*-windows-*
x86_64-pc-windows-* is first and foremost a _misnomer_. The format of a configuration triplet (or quadruplet) is set in stone: MACHINE-VENDOR-[KERNEL-]OS. Given that the MinGW ABI does not constitute the GNU operating system executing on the MS-Windows kernel, and MSVC is not an operating system, such blunders should be ignored, or at least normalized to one of the existing operating system values: x86_64-pc-mingw* for MinGW, and x86_64-pc-winnt for MSVC.
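The normalization proposed here can be sketched as follows. This is a minimal Python illustration, not config.sub code: the function name `normalize_windows` is hypothetical, only the x86_64 results (`mingw64` and `winnt`) come from this thread, and the `mingw32` fallback for other CPUs is an assumption made so the example is complete.

```python
def normalize_windows(config):
    """Map CPU-VENDOR-windows-ENV onto the older OS names proposed above.

    Anything that is not a four-element windows-gnu or windows-msvc
    config is returned unchanged.
    """
    parts = config.split("-")
    if len(parts) != 4 or parts[2] != "windows":
        return config
    cpu, vendor, _kernel, env = parts
    if env == "gnu":
        # x86_64 -> mingw64 is stated in the thread; mingw32 for every
        # other CPU is an assumption of this sketch.
        os_name = "mingw64" if cpu == "x86_64" else "mingw32"
    elif env == "msvc":
        os_name = "winnt"
    else:
        return config
    return f"{cpu}-{vendor}-{os_name}"
```

For example, `normalize_windows("x86_64-pc-windows-gnu")` yields `x86_64-pc-mingw64`, while a config such as `x86_64-pc-linux-gnu` passes through untouched.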