Re: Rethinking configuration tuples (was: Re: config.sub should normalize *-*-windows-*)

2023-08-26 Thread John Ericson


On 8/24/23 23:54, Jacob Bachmeyer wrote:

John Ericson wrote:


This is why I opened with "Operating System" lacks a coherent 
objective definition.


[...]


As I understand, historically, "operating systems" were proprietary 
monoliths and the GNU Project originally expected to produce another 
monolith, but /our/ monolith would be Free Software.  As an interim 
measure, the GNU utilities were designed to be widely portable across 
the various individually-monolithic proprietary operating systems then 
in use across a wide variety of hardware.  The broader Free Software 
Movement unexpectedly shattered that state of affairs, leading to the 
4-element configuration tuple form, when the Linux kernel became 
available and it was noticed that---oops!---GNU on Linux and GNU on 
HURD would have significant differences that at least some of the GNU 
packages would need to handle.  (For example, GNU libc is very 
different between Linux, where POSIX I/O maps fairly directly to 
underlying syscalls, and HURD, where POSIX I/O must be translated to 
Mach IPC, but both of these are Free GNU systems.)


This means that the GNU system is a somewhat blurry category, with 
many variants possible, and is orthogonal to "Linux":  there are 
GNU/Linux systems, GNU systems using other kernels, and Linux-based 
systems not using GNU at all.  This latter category is fairly common 
in embedded systems, where the GNU utilities are often eschewed for 
lighter-weight alternatives to save flash space (or, less honorably, 
to avoid GPL3).


Yes I agree with this state of affairs. I sometimes (but not always!) 
detect a sort of "Linux Scooped us" sentiment in GNU quarters, but as I 
see it portability and diversity of distros was pretty much inevitable 
--- replacing proprietary Unix userlands with GNU software was a huge 
point in how GNU got going in academic/institutional environments in the 
early days, and even if Hurd got there before Linux there would be no 
reason to rip out that portability.


JSON is pretty much a hard no for me:  it is far too complex for what 
really needs to be a simple structure.  Flat strings work very well 
for the way that GNU software typically expects to parse a 
configuration tuple using shell constructs.  Perhaps it would be 
better to redefine configuration tuples as a flat list of tags with a 
canonical ordering?  (The reason for a canonical ordering is in part 
to ensure that all existing coherent configuration tuple strings 
remain valid and to ensure that text-based pattern matching continues 
to work.)


Ah sorry, I shouldn't have made reference to JSON at all --- what I 
really was getting at is the /abstract syntax/. In particular, rather 
than having an abstract syntax of "list of strings" (parsing today's 
concrete syntax by breaking on dash), where the meaning of each string 
is ambiguous / context-sensitive, we would have one of "keys mapped to 
enumerations", i.e. one always knows the meaning of each component 
explicitly / without inspecting it or its context.


JSON or your flat list in canonical ordering (where I assume we are 
careful to never skip a type of component) are both valid concrete 
syntaxes that can be parsed / printed from this abstract syntax.
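
To make the abstract-syntax point concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the class names, the tiny enumerations, and the choice to keep the vendor as a free string are not taken from GNU config or LLVM. It only shows one abstract value being printed into two concrete syntaxes and parsed back.

```python
import json
from dataclasses import dataclass
from enum import Enum


# Hypothetical enumerations; a real list would be much longer.
class Cpu(Enum):
    X86_64 = "x86_64"
    I686 = "i686"


class Os(Enum):
    WINDOWS = "windows"
    LINUX = "linux"


class Env(Enum):
    MINGW = "mingw"
    MSVC = "msvc"
    GNU = "gnu"
    MUSL = "musl"


@dataclass(frozen=True)
class Config:
    """Abstract syntax: every component has an explicit key."""
    cpu: Cpu
    vendor: str  # assumption: vendor stays a free-form string
    os: Os
    env: Env

    def to_flat(self) -> str:
        # Concrete syntax 1: flat string in a fixed order, no component skipped.
        return "-".join([self.cpu.value, self.vendor, self.os.value, self.env.value])

    def to_json(self) -> str:
        # Concrete syntax 2: JSON object with the same keys.
        return json.dumps(
            {"cpu": self.cpu.value, "vendor": self.vendor,
             "os": self.os.value, "env": self.env.value},
            sort_keys=True,
        )

    @classmethod
    def from_flat(cls, text: str) -> "Config":
        # Position alone determines the meaning because nothing is skipped.
        cpu, vendor, os_name, env = text.split("-")
        return cls(Cpu(cpu), vendor, Os(os_name), Env(env))


cfg = Config(Cpu.X86_64, "pc", Os.WINDOWS, Env.MINGW)
print(cfg.to_flat())
print(cfg.to_json())
```

The round trip through `from_flat` works only because the ordering is fixed and no component is omitted, which is the assumption stated above.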





---

Concretely, I think these are pretty clear configs:

CPU-VENDOR-windows-mingnu # MinGW, MS C + GNU C++ and other GNU-ish 
things, TODO distinguish between MSVCRT and UCRT




I say that this one really should just be *-mingw.


Sure. I went with mingnu because the "w" is redundant with the 
"windows", but ultimately I care more about the pattern than the exact 
choice of identifiers / enumeration tags. (As we say in programming 
language land, I care about the thing "up to alpha-renaming".)


Note that there are both MinGW32 and MinGW64, corresponding to 32-bit 
and 64-bit Windows APIs.  Should that be included or should the CPU 
type be used to distinguish?  (e.g.  i686-pc-windows-mingw is MinGW32 
and x86_64-pc-windows-mingw is MinGW64?)


Yes I think so. If you look at https://www.mingw-w64.org/downloads/ one 
even sees |x86_64-w64-mingw32| which is quite something, and 64-bit!


I think what happened is that "w32" was chosen to mean the then-new 
win32 API/ABI, as opposed to DOS. Win64 as I understand is necessarily a 
new ABI because of the change in CPU arch, but not really a new API, 
being more of a "let's make the minimal amount of changes so the 
source/headers are portable" situation. So a combination of "same API" 
and "too lazy to update GNU config" made "mingw32" stick around.


Commit f16804b79ee5a23a9994a1cdc760cd9ba813148a added mingw64 to GNU config in 
2012, which is long after the advent of 64-bit Windows.


In the proposed five-element form, MSVCRT and UCRT are easily 
distinguished.  Example:


i686-pc-windows-mingw-msvcrt
i686-pc-windows-mingw-ucrt
x86_64-pc-windows-mingw-msvcrt
x86_64-pc-windows-mingw-ucrt


That is very true, I will grant you that :)
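
A small sketch of how the five-element examples above could be taken apart, assuming the layout CPU-VENDOR-OS-ENV-CRT and assuming the CPU field alone tells MinGW32 from MinGW64. The helper names and the two-entry CPU table are hypothetical, not part of config.sub.

```python
# Field order assumed from the quoted five-element examples.
FIELDS = ("cpu", "vendor", "os", "env", "crt")

# Assumption: only these two CPU names are considered in this sketch.
API_BITS = {"i686": 32, "x86_64": 64}


def parse_five(config):
    """Split a five-element string into named fields."""
    parts = config.split("-")
    if len(parts) != len(FIELDS):
        raise ValueError("expected five components: " + config)
    return dict(zip(FIELDS, parts))


def describe(config):
    """Name the MinGW flavour and C runtime implied by the string."""
    fields = parse_five(config)
    bits = API_BITS[fields["cpu"]]
    return "MinGW{} with {}".format(bits, fields["crt"].upper())


for example in (
    "i686-pc-windows-mingw-msvcrt",
    "i686-pc-windows-mingw-ucrt",
    "x86_64-pc-windows-mingw-msvcrt",
    "x86_64-pc-windows-mingw-ucrt",
):
    print(example, "->", describe(example))
```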


CPU-VENDOR-windows-cygnus # Cygwin

CPU-VENDOR-

Re: config.sub should normalize *-*-windows-*

2023-08-26 Thread John Ericson

Thanks Connor. I think we are both on the same page!

On 8/24/23 14:51, connor horman wrote:

It seems to me reading this thread that we've come into two 
conflicting realities:

* There exist targets that need to be distinguished, and
* They are not distinct in any component that config.sub has, 
therefore they cannot and should not be distinguished.


mingw and msvc both use the NT kernel, and the windows operating 
system. So it seems to me that windows, the OS, is the correct way to 
describe them. According to the discussions on this thread, they 
should thusly both canonicalize to the same target. And yet, not only 
is there desire to separate these targets, they already are.

Agreed. We can have our cake and eat it too by both (a) distinguishing 
things which are already distinguished and (b) having configs follow 
consistent conventions.


LLVM (as well as my own target parsing tool) refer to the last two 
components as "sys" with two subcomponents (of which at least one 
exists), being os and env. IMO, this seems a far more coherent 
definition that satisfies the requirements, and even more correctly 
matches targets that already exist.

Agreed!
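
As a rough illustration of the "sys = os + env, at least one present" idea, here is a sketch in Python. The two vocabularies and the function name are assumptions made up for this example; this is not LLVM's actual parser, only the shape of the rule described above.

```python
# Hypothetical vocabularies, deliberately tiny.
KNOWN_OS = {"linux", "windows"}
KNOWN_ENV = {"gnu", "musl", "msvc", "mingw"}


def parse_sys(sys_part):
    """Split the trailing 'sys' portion into (os, env); either may be None."""
    parts = sys_part.split("-")
    if len(parts) == 2:
        os_name, env = parts
    elif len(parts) == 1 and parts[0] in KNOWN_OS:
        os_name, env = parts[0], None
    elif len(parts) == 1:
        os_name, env = None, parts[0]
    else:
        raise ValueError("too many components: " + sys_part)
    if os_name is not None and os_name not in KNOWN_OS:
        raise ValueError("unknown os: " + os_name)
    if env is not None and env not in KNOWN_ENV:
        raise ValueError("unknown env: " + str(env))
    return os_name, env


print(parse_sys("linux-musl"))
print(parse_sys("windows"))
print(parse_sys("musl"))
```

With explicit os and env slots, mingw and msvc can share the same os while remaining distinct targets.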


musl is another extreme example: There is no musl OS. The last 
component being musl refers to the use of the musl libc. The resulting 
binary can then be used on either a GNU system or a non-GNU linux 
system like alpine, void, or iglunix. Thus musl cannot be regarded as 
an "OPERATING_SYSTEM" but rather an environment.

Agreed!


Even on linux-gnu the definition is murky at best. While I won't 
dispute the existence of a GNU operating system running atop the linux 
kernel, in many cases, the actual linux-gnu tag merely refers to 
glibc. Few things using targets nowadays actually care about the rest 
of the tools, and when they do care that they exist (on --host or even 
--target), they typically don't care that they're provided by GNU, and 
even may not care that they match the interface of the tool provided 
by GNU. Only on --build are the tools really cared about, and I don't 
see many things matching the build tuple or even canonicalizing it. If 
we thus define an "Operating System" as "kernel+libc+tools atop that" 
it becomes clear to me that few things written nowadays care about the 
"GNU Operating System" and only really care about the "GNU Environment".


Agreed! Well put --- even if we were to find a rigorous objective 
definition for "Operating System" in general, encompassing a long tail 
of auxiliary interfaces, it would be overly specific for what things 
inspecting the output of config.sub actually care about.


(FWIW I am also fine saying there exists the "GNU Operating System", but 
to me "Operating System" is always an exercise in branding, tying 
together disparate components which always in principle (e.g. if we had 
the source code) could be mixed-and-matched in other ways.)


I would like this very much to happen, along with the Rust project 
which has its own target defs (but similar as well).


I am glad I am not the only one!

John


Re: config.sub should normalize *-*-windows-*

2023-08-26 Thread Po Lu
connor horman writes:

> It seems to me reading this thread that we've come into two
> conflicting realities: * There exist targets that need to be
> distinguished, and * They are not distinct in any component that
> config.sub has, therefore they cannot and should not be distinguished.
>
> mingw and msvc both use the NT kernel, and the windows operating
> system. So it seems to me that windows, the OS, is the correct way to
> describe them. According to the discussions on this thread, they
> should thusly both canonicalize to the same target. And yet, not only
> is there desire to separate these targets, they already are.
>
> LLVM (as well as my own target parsing tool) refer to the last two
> components as "sys" with two subcomponents (of which at least one
> exists), being os and env. IMO, this seems a far more coherent
> definition that satisfies the requirements, and even more correctly
> matches targets that already exist.

The objective is to keep the status quo unchanged till Hell freezes
over, so that no programs will ever be broken.

> musl is another extreme example: There is no musl OS. The last
> component being musl refers to the use of the musl libc. The resulting
> binary can then be used on either a GNU system or a non-GNU linux
> system like alpine, void, or iglunix. Thus musl cannot be regarded as
> an "OPERATING_SYSTEM" but rather an environment.
>
> Even on linux-gnu the definition is murky at best. While I won't
> dispute the existence of a GNU operating system running atop the linux
> kernel, in many cases, the actual linux-gnu tag merely refers to
> glibc. Few things using targets nowadays actually care about the rest
> of the tools, and when they do care that they exist (on --host or even
> --target), they typically don't care that they're provided by GNU, and
> even may not care that they match the interface of the tool provided
> by GNU. Only on --build are the tools really cared about, and I don't
> see many things matching the build tuple or even canonicalizing it. If
> we thus define an "Operating System" as "kernel+libc+tools atop that"
> it becomes clear to me that few things written nowadays care about the
> "GNU Operating System" and only really care about the "GNU
> Environment".

For the purpose of compiling programs, systems using the GNU libc are
equivalent to GNU systems.  config.* does not draw excessively fine
distinctions between them.

In keeping with that, systems using the Musl libc are so similar that
they may as well be considered as a single operating system.  This
contrasts with MinGW and MSVC, whose discrepancies are of sufficient
consequence to warrant individual identification by config.*.

And as configurations which embody these distinctions _already exist_,
they should never change, nor be supplanted by new and purportedly
``improved'' configurations.  I reiterate, until the very end of time...



Rethinking configuration tuples (was: Re: config.sub should normalize *-*-windows-*)

2023-08-24 Thread Jacob Bachmeyer

John Ericson wrote:


This is why I opened with "Operating System" lacks a coherent 
objective definition.


The more pugilistic message is to say the rest of the world doesn't 
think the GNU operating system exists --- that there is simply a 
choice of kernel (Linux, k*BSD, Hurd, something else...) and choices 
of libraries and system components on top of that, and many 
combinations are possible. The rest of the world might say this in a 
mean way, but I say it is actually a /good/ thing --- software freedom 
means one /can/ choose one's components à la carte, and only a lack of 
software freedom results in a kernel and mass of libraries outside 
one's control blurring together into a scary "take it or leave it" 
monolith we call an operating system.




As I understand, historically, "operating systems" were proprietary 
monoliths and the GNU Project originally expected to produce another 
monolith, but /our/ monolith would be Free Software.  As an interim 
measure, the GNU utilities were designed to be widely portable across 
the various individually-monolithic proprietary operating systems then 
in use across a wide variety of hardware.  The broader Free Software 
Movement unexpectedly shattered that state of affairs, leading to the 
4-element configuration tuple form, when the Linux kernel became 
available and it was noticed that---oops!---GNU on Linux and GNU on HURD 
would have significant differences that at least some of the GNU 
packages would need to handle.  (For example, GNU libc is very different 
between Linux, where POSIX I/O maps fairly directly to underlying 
syscalls, and HURD, where POSIX I/O must be translated to Mach IPC, but 
both of these are Free GNU systems.)


This means that the GNU system is a somewhat blurry category, with many 
variants possible, and is orthogonal to "Linux":  there are GNU/Linux 
systems, GNU systems using other kernels, and Linux-based systems not 
using GNU at all.  This latter category is fairly common in embedded 
systems, where the GNU utilities are often eschewed for lighter-weight 
alternatives to save flash space (or, less honorably, to avoid GPL3).



On 8/24/23 08:51, Adam Joseph wrote:

[...]
It seems like a lot of the proposals in this thread are being evaluated not
based on whether or not they are coherent, but rather on whether or not they
take us a few nanometers closer to whatever LLVM's internal
implementation details happen to be this week.



I care about coherence; the reason I like to see what LLVM does is that 
working from a parsed representation forces the software to be much 
more honest. Since GNU config doesn't reveal its categories but just 
spits out another opaque string, there is no external pressure for its 
categorization to be any good. LLVM, on the other hand, dispenses with 
strings entirely and just uses the enums, so it is forced to make sure 
those enums make sense and work for the branching the program has to do.


LLVM parsing of configs is ad-hoc Postel's law stuff like everyone 
else, but its internal representation is actually quite stable. 
Parsing is the ugly nasty part that gets to the pristine clear 
ontology on the other side.


Ultimately I would like to convene everyone to commit to an agreed 
upon internal representation too. E.g. clang and GNU config could both 
spit out some JSON that is unambiguous and should match. I think that 
would alleviate a lot of Adam's concerns about "following LLVM". But I 
don't think it is possible to convene the working group needed to 
standardize such a format yet, because there is little trust between 
parties. Moving us "a few nanometers closer" on each side 
demonstrates that there is willingness to compromise.




JSON is pretty much a hard no for me:  it is far too complex for what 
really needs to be a simple structure.  Flat strings work very well for 
the way that GNU software typically expects to parse a configuration 
tuple using shell constructs.  Perhaps it would be better to redefine 
configuration tuples as a flat list of tags with a canonical ordering?  
(The reason for a canonical ordering is in part to ensure that all 
existing coherent configuration tuple strings remain valid and to ensure 
that text-based pattern matching continues to work.)
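
One way to read "a flat list of tags with a canonical ordering" is sketched below in Python. The tag table and the rank order (CPU, VENDOR, KERNEL, OS, LIBCABI) are assumptions chosen to line up with the five-element form discussed in this thread; classifying "gnu" as an OS tag and "musl" as a LIBCABI tag is exactly the contested part, so treat the table as hypothetical.

```python
# Assumed category order for the canonical form.
RANK = {"cpu": 0, "vendor": 1, "kernel": 2, "os": 3, "libcabi": 4}

# Hypothetical tag table; each tag belongs to exactly one category.
TAG_KIND = {
    "x86_64": "cpu",
    "i686": "cpu",
    "pc": "vendor",
    "linux": "kernel",
    "gnu": "os",
    "musl": "libcabi",
}


def canonicalize(tags):
    """Order a bag of tags by category and join them with hyphens."""
    kinds = [TAG_KIND[tag] for tag in tags]  # KeyError on an unknown tag
    if len(set(kinds)) != len(kinds):
        raise ValueError("two tags of the same category")
    return "-".join(sorted(tags, key=lambda tag: RANK[TAG_KIND[tag]]))


print(canonicalize(["musl", "gnu", "linux", "pc", "x86_64"]))
print(canonicalize(["gnu", "x86_64", "linux", "pc"]))
```

Under these assumptions an existing four-element string such as x86_64-pc-linux-gnu comes out unchanged, which is the compatibility property the canonical ordering is meant to protect.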



---

Concretely, I think these are pretty clear configs:

CPU-VENDOR-windows-mingnu # MinGW, MS C + GNU C++ and other GNU-ish 
things, TODO distinguish between MSVCRT and UCRT




I say that this one really should just be *-mingw.  Note that there are 
both MinGW32 and MinGW64, corresponding to 32-bit and 64-bit Windows 
APIs.  Should that be included or should the CPU type be used to 
distinguish?  (e.g.  i686-pc-windows-mingw is MinGW32 and 
x86_64-pc-windows-mingw is MinGW64?)


In the proposed five-element form, MSVCRT and UCRT are easily 
distinguished.  Example:


i686-pc-windows-mingw-msvcrt
i686-pc-windows-mingw-ucrt
x86_64-pc-windows-mingw-msvcrt
x86_64-pc-windows-mingw-ucrt

Re: config.sub should normalize *-*-windows-*

2023-08-24 Thread Jacob Bachmeyer

Adam Joseph wrote:

Quoting Jacob Bachmeyer (2023-08-21 19:35:05)
  

No, we are not.  CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the



If you want to redefine existing terms, please be forthright about the fact that
your proposal does so.
  


I argue that this is something that has already happened under our 
collective noses (and before I had any interest in configuration tuples 
beyond using them with configure).  The "OS" field is no longer consistent.



This usage is in conflict with the existing definition; LIBC and ABI are
subfields of OS.  It isn't resolving any "technical debt" -- it's sowing mass
confusion.

From config.sub:

# The goal of this file is to map all the various variations of a given
# machine specification into a single specification in the form:
#   CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM
# or in some cases, the newer four-part form:
#   CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM
# It is wrong to echo any other type of specification.

The variable name "LIBCABI" comes from config.guess, where it is not described,
but is always parsed as a refinement of the OPERATING_SYSTEM field.  There is
never a hyphen between OPERATING_SYSTEM and LIBCABI because they are in fact
different parsings of the same string.
  


While I may have drawn inspiration from past work on config.guess, I 
used the name "LIBCABI" to reflect that it can be either a libc 
implementation or an ABI name; the two are usually closely related in 
practice.



config tuples were originally triplets and now often feature a 4-element
CPU-VENDOR-KERNEL-OS form



Yes, we've had ~20 years to appreciate the confusion it caused, and now we know
better than to do something like that again.

It seems like a lot of the proposals in this thread are being evaluated not
based on whether or not they are coherent, but rather on whether or not they
take us a few nanometers closer to whatever LLVM's internal
implementation details happen to be this week.
  


My proposals have been an effort (in my view at least) to restore 
coherency here, and if LLVM is using *-windows-gnu at the moment, LLVM 
is /wrong/.  That tuple should describe a POSIX GNU environment running 
on a Windows system.  Such an environment is theoretically possible, but 
does not currently exist as far as I know.



`CPU-VENDOR-linux-gnu-musl`



I lack words to describe this.  I suppose it could be useful if the goal were to
drive config.sub into such a self-inconsistent state that everybody abandons it.
  


Then I need to explain it again:  CPU and VENDOR are all caps because 
they remain variable in that pattern.  Perhaps I should have written 
`*-*-linux-gnu-musl' there.  That tuple describes a GNU system (*-gnu-*) 
running on a Linux kernel (*-linux-*) using Musl libc (*-musl).  Do you 
argue that (*-gnu) should mean specifically GNU libc even though there 
is more to the GNU system than libc?
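
The glob patterns in the paragraph above can be tried out with a short Python sketch. It uses `fnmatch.fnmatchcase` as a stand-in for shell `case` matching, which is an assumption that holds for simple `*` patterns like these; the concrete CPU and vendor values are only examples.

```python
from fnmatch import fnmatchcase

# Example five-element string; cpu and vendor are arbitrary here.
FIVE = "x86_64-pc-linux-gnu-musl"
# A four-element string of the kind already in use today.
FOUR = "x86_64-pc-linux-gnu"


def matches(config, pattern):
    """Case-sensitive glob match, roughly like a shell case pattern."""
    return fnmatchcase(config, pattern)


for pattern in ("*-*-linux-gnu-musl", "*-gnu-*", "*-linux-*", "*-musl"):
    print(pattern, matches(FIVE, pattern))

# The trailing hyphen matters: the four-element form ends in "-gnu".
print("*-gnu-*", matches(FOUR, "*-gnu-*"))
print("*-gnu", matches(FOUR, "*-gnu"))
```

One thing the sketch makes visible: `*-gnu-*` matches the five-element string but not the existing four-element one, so a script that wants both would need two patterns (for example `*-gnu` and `*-gnu-*`).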



-- Jacob



Re: config.sub should normalize *-*-windows-*

2023-08-24 Thread connor horman
It seems to me reading this thread that we've come into two conflicting
realities:
* There exist targets that need to be distinguished, and
* They are not distinct in any component that config.sub has, therefore
they cannot and should not be distinguished.

mingw and msvc both use the NT kernel, and the windows operating system. So
it seems to me that windows, the OS, is the correct way to describe them.
According to the discussions on this thread, they should thusly both
canonicalize to the same target. And yet, not only is there desire to
separate these targets, they already are.

LLVM (as well as my own target parsing tool) refer to the last two
components as "sys" with two subcomponents (of which at least one exists),
being os and env. IMO, this seems a far more coherent definition that
satisfies the requirements, and even more correctly matches targets that
already exist.

musl is another extreme example: There is no musl OS. The last component
being musl refers to the use of the musl libc. The resulting binary can
then be used on either a GNU system or a non-GNU linux system like alpine,
void, or iglunix. Thus musl cannot be regarded as an "OPERATING_SYSTEM" but
rather an environment.

Even on linux-gnu the definition is murky at best. While I won't dispute
the existence of a GNU operating system running atop the linux kernel, in
many cases, the actual linux-gnu tag merely refers to glibc. Few things
using targets nowadays actually care about the rest of the tools, and when
they do care that they exist (on --host or even --target), they typically
don't care that they're provided by GNU, and even may not care that they
match the interface of the tool provided by GNU. Only on --build are the
tools really cared about, and I don't see many things matching the build
tuple or even canonicalizing it. If we thus define an "Operating System" as
"kernel+libc+tools atop that" it becomes clear to me that few things
written nowadays care about the "GNU Operating System" and only really care
about the "GNU Environment".

On Thu, Aug 24, 2023 at 12:22 John Ericson  wrote:

> This is why I opened with "Operating System" lacks a coherent objective
> definition.
>
> The more pugilistic message is to say the rest of the world doesn't think
> the GNU operating system exists --- that there is simply a choice of kernel
> (Linux, k*BSD, Hurd, something else...) and choices of libraries and system
> components on top of that, and many combinations are possible. The rest of
> the world might say this in a mean way, but I say it is actually a *good*
> thing --- software freedom means one *can* choose my components à la
> carte, and only a lack of software freedom results in a kernel and mass of
> libraries outside one's control blurring together into a scary "take it or
> leave it" monolith we call an operating system.
> On 8/24/23 08:51, Adam Joseph wrote:
>
> Quoting Jacob Bachmeyer (2023-08-21 19:35:05)
>
> No, we are not.  CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the
>
> If you want to redefine existing terms, please be forthright about the fact 
> that
> your proposal does so.
>
> This usage is in conflict with the existing definition; LIBC and ABI are
> subfields of OS.  It isn't resolving any "technical debt" -- it's sowing mass
> confusion.
>
> From config.sub:
>
> # The goal of this file is to map all the various variations of a given
> # machine specification into a single specification in the form:
> #   CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM
> # or in some cases, the newer four-part form:
> #   CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM
> # It is wrong to echo any other type of specification.
>
> The variable name "LIBCABI" comes from config.guess, where it is not 
> described,
> but is always parsed as a refinement of the OPERATING_SYSTEM field.  There is
> never a hyphen between OPERATING_SYSTEM and LIBCABI because they are in fact
> different parsings of the same string.
>
> I'll add that all linux-* configs in config.guess use LIBC/LIBCABI. I take
> this as further evidence that distinguishing OSes atop Linux is useless.
> Per the above, I think this is good!
>
> config tuples were originally triplets and now often feature a 4-element
> CPU-VENDOR-KERNEL-OS form
>
> Yes, we've had ~20 years to appreciate the confusion it caused, and now we 
> know
> better than to do something like that again.
>
> Adam are you saying you prefer the state of 3-component configs?
>
> It seems like a lot of the proposals in this thread are being evaluated not
> based on whether or not they are coherent, but rather on whether or not they
> take us a few nanometers closer to whatever happens to whatever LLVM's 
> internal
> implementation details happen to be this week.
>
> I care about coherence, the reason I like to see what LLVM does that
> working from a parsed representation forces the software to be much more
> honest. Since GNU config doesn't reveal its categories but just spits out
> another opaque

Re: config.sub should normalize *-*-windows-*

2023-08-24 Thread John Ericson
This is why I opened with "Operating System" lacks a coherent objective 
definition.


The more pugilistic message is to say the rest of the world doesn't 
think the GNU operating system exists --- that there is simply a choice 
of kernel (Linux, k*BSD, Hurd, something else...) and choices of 
libraries and system components on top of that, and many combinations 
are possible. The rest of the world might say this in a mean way, but I 
say it is actually a /good/ thing --- software freedom means one /can/ 
choose one's components à la carte, and only a lack of software freedom 
results in a kernel and mass of libraries outside one's control blurring 
together into a scary "take it or leave it" monolith we call an 
operating system.


On 8/24/23 08:51, Adam Joseph wrote:

Quoting Jacob Bachmeyer (2023-08-21 19:35:05)

No, we are not.  CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the

If you want to redefine existing terms, please be forthright about the fact that
your proposal does so.

This usage is in conflict with the existing definition; LIBC and ABI are
subfields of OS.  It isn't resolving any "technical debt" -- it's sowing mass
confusion.

From config.sub:

# The goal of this file is to map all the various variations of a given
# machine specification into a single specification in the form:
#   CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM
# or in some cases, the newer four-part form:
#   CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM
# It is wrong to echo any other type of specification.

The variable name "LIBCABI" comes from config.guess, where it is not described,
but is always parsed as a refinement of the OPERATING_SYSTEM field.  There is
never a hyphen between OPERATING_SYSTEM and LIBCABI because they are in fact
different parsings of the same string.


I'll add that all linux-* configs in config.guess use LIBC/LIBCABI. I 
take this as further evidence that distinguishing OSes atop Linux is 
useless. Per the above, I think this is good!



config tuples were originally triplets and now often feature a 4-element
CPU-VENDOR-KERNEL-OS form

Yes, we've had ~20 years to appreciate the confusion it caused, and now we know
better than to do something like that again.


Adam, are you saying you prefer the state of 3-component configs?


It seems like a lot of the proposals in this thread are being evaluated not
based on whether or not they are coherent, but rather on whether or not they
take us a few nanometers closer to whatever LLVM's internal
implementation details happen to be this week.


I care about coherence; the reason I like to see what LLVM does is that 
working from a parsed representation forces the software to be much more 
honest. Since GNU config doesn't reveal its categories but just spits 
out another opaque string, there is no external pressure for its 
categorization to be any good. LLVM, on the other hand, dispenses with 
strings entirely and just uses the enums, so it is forced to make sure 
those enums make sense and work for the branching the program has to do.


LLVM parsing of configs is ad-hoc Postel's law stuff like everyone else, 
but its internal representation is actually quite stable. Parsing is the 
ugly nasty part that gets to the pristine clear ontology on the other side.


Ultimately I would like to convene everyone to commit to an agreed upon 
internal representation too. E.g. clang and GNU config could both spit 
out some JSON that is unambiguous and should match. I think that would 
alleviate a lot of Adam's concerns about "following LLVM". But I don't 
think it is possible to convene the working group needed to standardize 
such a format yet, because there is little trust between parties. Moving 
us "a few nanometers closer" on each side demonstrates that there is 
willingness to compromise.


---

Concretely, I think these are pretty clear configs:

CPU-VENDOR-windows-mingnu # MinGW, MS C + GNU C++ and other GNU-ish 
things, TODO distinguish between MSVCRT and UCRT


CPU-VENDOR-windows-cygnus # Cygwin

CPU-VENDOR-windows-msys # MSYS2, a lot like Cygwin

CPU-VENDOR-windows-msvc # MS C + MS C++

CPU-VENDOR-linux-gnu # gnu libc

CPU-VENDOR-linux-musl # musl libc

CPU-VENDOR-linux-android # bionic libc
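
For readers who prefer a table, the list above can be written as a lookup keyed by the last two components. This is only a Python restatement of the list; the identifiers (`SYS_RUNTIME`, `describe`, `envs_for`) are made up for the sketch, and the example CPU and vendor are arbitrary.

```python
# (os, env) -> the comment attached to each config in the list above.
SYS_RUNTIME = {
    ("windows", "mingnu"): "MinGW: MS C + GNU C++ and other GNU-ish things",
    ("windows", "cygnus"): "Cygwin",
    ("windows", "msys"): "MSYS2, a lot like Cygwin",
    ("windows", "msvc"): "MS C + MS C++",
    ("linux", "gnu"): "gnu libc",
    ("linux", "musl"): "musl libc",
    ("linux", "android"): "bionic libc",
}


def describe(config):
    """Look up the runtime for a four-element CPU-VENDOR-OS-ENV string."""
    cpu, vendor, os_name, env = config.split("-")
    return SYS_RUNTIME[(os_name, env)]


def envs_for(os_name):
    """List the environments proposed for one os, sorted by name."""
    return sorted(env for (name, env) in SYS_RUNTIME if name == os_name)


print(describe("x86_64-pc-linux-musl"))
print(envs_for("windows"))
```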

I know Po Lu doesn't like them, because they overlap with existing ones. 
But what about you two, Adam and Jacob? I am trying to compromise 
between what various things do already, and also correct things like 
windows-gnu (even if there is no such thing as the GNU operating system, 
only multiple GNU Hurd-supporting distros, I agree that MinGW is 
clearly not a complete enough set of GNU software to earn the right 
to drop the "minimal" part).


If we can accept these, I think I will have no problem getting LLVM to 
accept windows-mingnu, and perhaps even warn/deprecate windows-gnu. 
After that, I think we are close enough to convene a working group for a 
JSON/whatever explicit standard. And that would be amazing.


--

Re: config.sub should normalize *-*-windows-*

2023-08-24 Thread Adam Joseph
Quoting Jacob Bachmeyer (2023-08-21 19:35:05)
> No, we are not.  CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the

If you want to redefine existing terms, please be forthright about the fact that
your proposal does so.

This usage is in conflict with the existing definition; LIBC and ABI are
subfields of OS.  It isn't resolving any "technical debt" -- it's sowing mass
confusion.

From config.sub:

# The goal of this file is to map all the various variations of a given
# machine specification into a single specification in the form:
#   CPU_TYPE-MANUFACTURER-OPERATING_SYSTEM
# or in some cases, the newer four-part form:
#   CPU_TYPE-MANUFACTURER-KERNEL-OPERATING_SYSTEM
# It is wrong to echo any other type of specification.

The variable name "LIBCABI" comes from config.guess, where it is not described,
but is always parsed as a refinement of the OPERATING_SYSTEM field.  There is
never a hyphen between OPERATING_SYSTEM and LIBCABI because they are in fact
different parsings of the same string.
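
The reading described above, where the last field is OPERATING_SYSTEM and LIBC/ABI are refinements found inside that same string, can be sketched in Python as follows. This is not how config.sub or config.guess is implemented; the function names, the short libc list, and the example strings are assumptions for illustration.

```python
# Hypothetical list of libc prefixes that may start the OS field.
KNOWN_LIBCS = ("gnu", "musl", "uclibc")


def split_canonical(config):
    """Split into the three- or four-part form from the config.sub comment."""
    fields = config.split("-")
    if len(fields) == 3:
        cpu, manufacturer, os_field = fields
        kernel = None
    elif len(fields) == 4:
        cpu, manufacturer, kernel, os_field = fields
    else:
        raise ValueError("not a three- or four-part form: " + config)
    return {"cpu": cpu, "manufacturer": manufacturer,
            "kernel": kernel, "os": os_field}


def refine_os(os_field):
    """Re-read the OS field as libc + ABI suffix, with no hyphen between."""
    for libc in KNOWN_LIBCS:
        if os_field.startswith(libc):
            return libc, os_field[len(libc):]
    return None, None


parsed = split_canonical("x86_64-pc-linux-gnux32")
print(parsed)
print(refine_os(parsed["os"]))
```

Here `refine_os` never adds a field; it only gives a second parsing of the string that `split_canonical` already returned as the OS.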

> config tuples were originally triplets and now often feature a 4-element
> CPU-VENDOR-KERNEL-OS form

Yes, we've had ~20 years to appreciate the confusion it caused, and now we know
better than to do something like that again.

It seems like a lot of the proposals in this thread are being evaluated not
based on whether or not they are coherent, but rather on whether or not they
take us a few nanometers closer to whatever LLVM's internal
implementation details happen to be this week.

> `CPU-VENDOR-linux-gnu-musl`

I lack words to describe this.  I suppose it could be useful if the goal were to
drive config.sub into such a self-inconsistent state that everybody abandons it.

Perhaps that is the plan.

  - a



Re: config.sub should normalize *-*-windows-*

2023-08-22 Thread Po Lu
Jacob Bachmeyer writes:

> There are several GNU/Linux distributions that either use or can use
> musl libc already.  (See:
> <https://en.wikipedia.org/w/index.php?title=Musl&oldid=1164590075>)
> Musl libc does not have the same features as GNU libc, so it is
> rightly a different ABI target, however, the system is still a GNU
> variant, so its configure tuple should still match *-gnu-* for the
> same reasons that the GNU project wants to call the overall system
> GNU/Linux.

These systems are not GNU/Linux systems, since they don't incorporate
the GNU libc; this particularly applies to Autoconf users, since they
will not have access to extensions furnished by the GNU libc.

>> To top it all off, considerations for such systems affect the entire GNU
>> project, and config cannot unilaterally ordain decisions regarding their
>> treatment.  config-patches is definitely the wrong mailing list.
>
> OK then, what is the right mailing list?

gnu-system-discuss, maybe?



Re: config.sub should normalize *-*-windows-*

2023-08-22 Thread Jacob Bachmeyer

Po Lu wrote:
> Jacob Bachmeyer  writes:
>
>> No, we are not.  CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of
>> the latter three omitted, fits the bill.  In that case, the reference
>> to MinGW means that "OS" and/or "KERNEL" are omitted and MinGW is the
>> ABI.  The next logical extension is to allow all five to be present,
>> to describe systems flexible enough to accommodate multiple ABIs.
>
> We're not trying to change the world here, so let's wait until a more
> urgent need presents itself before issuing plans for drastic changes to
> a format that has been firmly established for well over two decades,
> okay?

I do not see this as planning a drastic change.  I see this issue as 
acknowledging a change that has already happened unnoticed.

>> The problem is that that example does exist, so we need to find a
>> systematic way to accommodate it before more such variant GNU systems
>> are produced and we have a real mess.
>
> Which systems are distributed in this manner?  And what difference does
> the C library they elect to use for system utilities make towards the
> compilation of user programs with Autoconf?

There are several GNU/Linux distributions that either use or can use 
musl libc already.  (See:  
<https://en.wikipedia.org/w/index.php?title=Musl&oldid=1164590075>) 
Musl libc does not have the same features as GNU libc, so it is rightly 
a different ABI target, however, the system is still a GNU variant, so 
its configure tuple should still match *-gnu-* for the same reasons that 
the GNU project wants to call the overall system GNU/Linux.

> To top it all off, considerations for such systems affect the entire GNU
> project, and config cannot unilaterally ordain decisions regarding their
> treatment.  config-patches is definitely the wrong mailing list.

OK then, what is the right mailing list?


-- Jacob




Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread Po Lu
Jacob Bachmeyer  writes:

> No, we are not.  CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of
> the latter three omitted, fits the bill.  In that case, the reference
> to MinGW means that "OS" and/or "KERNEL" are omitted and MinGW is the
> ABI.  The next logical extension is to allow all five to be present,
> to describe systems flexible enough to accommodate multiple ABIs.

We're not trying to change the world here, so let's wait until a more
urgent need presents itself before issuing plans for drastic changes to
a format that has been firmly established for well over two decades,
okay?

> The problem is that that example does exist, so we need to find a
> systematic way to accommodate it before more such variant GNU systems
> are produced and we have a real mess.

Which systems are distributed in this manner?  And what difference does
the C library they elect to use for system utilities make towards the
compilation of user programs with Autoconf?

To top it all off, considerations for such systems affect the entire GNU
project, and config cannot unilaterally ordain decisions regarding their
treatment.  config-patches is definitely the wrong mailing list.

Thanks.



Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread Jacob Bachmeyer

Po Lu wrote:

Jacob Bachmeyer  writes:

  
[...]



but several existing tuples use a libc or ABI name in place of a
kernel and/or operating system.



In each of those cases, the ABI name _can_ be construed as a kernel
(since there is no kernel at all), or the libc name refers to a general
category of OS.  Neither of these situations are applicable to MinGW or
MSVC.
  
  

Arguably, MinGW *is* an ABI name.



Either way, that ship has already sailed.  So we're stuck with dubbing
MinGW an operating system.
  


No, we are not.  CPU-VENDOR-KERNEL-OS-LIBCABI, with at least one of the 
latter three omitted, fits the bill.  In that case, the reference to 
MinGW means that "OS" and/or "KERNEL" are omitted and MinGW is the ABI.  
The next logical extension is to allow all five to be present, to 
describe systems flexible enough to accommodate multiple ABIs.



Think about why the GNU project pushes to call the common system
"GNU/Linux" and you should see the reason for using
`*-*-linux-gnu-musl' to express a GNU/Linux system using musl libc.



If the GNU libc isn't being used, it's not a complete GNU system.  We
should defer establishing suitable configuration names for Frankenstein
systems until the moment they come into existence.


The problem is that that example does exist, so we need to find a 
systematic way to accommodate it before more such variant GNU systems 
are produced and we have a real mess.



-- Jacob




Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread Po Lu
Jacob Bachmeyer  writes:

> Then its present use is *wrong* and a bug that should be fixed.

The subject of this thread, indeed.

> It is a little more complex than that:  the GNU system theoretically
> can run on any of multiple kernels.  While Linux is most commonly
> used, GNU HURD is still in development and I understand that there is
> a Debian variant using the GNU utilities on a FreeBSD kernel.

They're *-*-kfreebsd-gnu and *-*-gnu.

>>> but several existing tuples use a libc or ABI name in place of a
>>> kernel and/or operating system.
>>> 
>>
>> In each of those cases, the ABI name _can_ be construed as a kernel
>> (since there is no kernel at all), or the libc name refers to a general
>> category of OS.  Neither of these situations are applicable to MinGW or
>> MSVC.
>>   
>
> Arguably, MinGW *is* an ABI name.

Either way, that ship has already sailed.  So we're stuck with dubbing
MinGW an operating system.

> Think about why the GNU project pushes to call the common system
> "GNU/Linux" and you should see the reason for using
> `*-*-linux-gnu-musl' to express a GNU/Linux system using musl libc.

If the GNU libc isn't being used, it's not a complete GNU system.  We
should defer establishing suitable configuration names for Frankenstein
systems until the moment they come into existence.



Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread Jacob Bachmeyer

John Ericson wrote:

On Mon, Aug 21, 2023, at 1:17 AM, Po Lu wrote:

Jacob Bachmeyer writes:

> No:  MinGW is Windows native "Win32" API while a future `windows-gnu'
> would be the GNU system's POSIX API on an NT kernel.  These are *very*
> different configurations; `windows-gnu' would more closely resemble
> Cygwin.

This is not what the `x86_64-pc-windows-gnu' configuration presently
canonicalized by config.sub represents.


I have offered multiple times to change it to windows-mingnu or 
something else. Let's not be hung up on this, it is just making a 
straw man of the broader project of making configs that are more 
consistent.


If it describes MinGW, then it should be windows-mingw32 or 
windows-mingw64 as appropriate.  The CPU field /should/ be redundant to 
that, but x86_64 can run 32-bit code, so it would probably be a good 
idea, unless we want to canonicalize `x86_64-pc-*mingw32' to 
`i686-pc-windows-mingw'.  Should canonicalization change the CPU field 
when one CPU type has a compatibility mode for another CPU type and the 
ABI implies use of that mode?



[...]

> But they already have drifted:  config tuples were originally triplets
> and now often feature a 4-element CPU-VENDOR-KERNEL-OS form

Only as a result of a technical need to distinguish Linux-based GNU
systems from other GNU systems.  Absent that requirement, we would
simply call GNU/Linux systems *-*-gnu, Alpine *-*-alpine, and Android
*-*-android.


You just said it! We have the exact same "technical need" on Windows 
as with Linux of identifying different platforms that share the same 
syscall interface. For the same reason we don't want people to have to 
write *-*-gnu | *-*-alpine | *-*-android (an endlessly growing list of 
special cases) to use e.g. the clone system call, we don't want them 
to have to maintain a big and ever growing list of Windows variants 
for a conditional that is just about Windows in general.


The catch here is that any package recognizing both *-gnu and 
*-windows-* would need to ensure that the match for *-gnu has priority, 
since an actual *-windows-gnu environment would be the (POSIX) GNU 
system running on an NT kernel and would /not/ have the standard 
Windows-isms that *-windows-* otherwise implies.  Cygwin may have 
similar issues; I believe that it is currently treated as a unique OS 
unrelated to Windows.
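The ordering concern above can be sketched with a plain shell `case`, where the first matching arm wins.  This is only an illustration: `classify_host` and its labels are hypothetical names, and `*-windows-gnu` is taken here in the hypothetical "GNU system on an NT kernel" sense described above, not in the sense config.sub currently gives it.

```shell
# Hypothetical classifier, not part of config.sub or Autoconf.
# A case statement takes the first matching arm, so the *-gnu arm
# must come before *-windows-* for a windows-gnu tuple to be treated
# as a POSIX GNU system rather than as native Windows.
classify_host() {
    case $1 in
        *-gnu)        echo "posix-gnu" ;;
        *-windows-*)  echo "win32" ;;
        *)            echo "unknown" ;;
    esac
}

classify_host x86_64-pc-windows-gnu
classify_host x86_64-pc-windows-mingw64
```

If the two arms were swapped, the first call would fall into the `*-windows-*` arm, which is the failure mode the paragraph above warns about.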


Or should we define a new `windowsix' KERNEL value for POSIX 
environments on NT kernels?



-- Jacob



Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread Po Lu
Jacob Bachmeyer  writes:

> I said "with only 3-or-4-of-5 elements present"; that using all 5
> elements is currently invalid does not change that we effectively
> /have/ 5 elements, with a restriction that only 3 or 4 of them can
> actually be present in any one tuple.
>
>
> -- Jacob

And my point was that you're suffering from a bout of pareidolia...



Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread Jacob Bachmeyer

Po Lu wrote:

Jacob Bachmeyer  writes:

  

No:  MinGW is Windows native "Win32" API while a future `windows-gnu'
would be the GNU system's POSIX API on an NT kernel.  These are *very*
different configurations; `windows-gnu' would more closely resemble
Cygwin.



This is not what the `x86_64-pc-windows-gnu' configuration presently
canonicalized by config.sub represents.
  


Then its present use is *wrong* and a bug that should be fixed.


I say it would be more appropriate to accept `x86_64-pc-mingw64' as a
short form for `x86_64-pc-windows-mingw64', since Wine could enable a
`x86_64-pc-linux-mingw64' platform to exist.  (Wine's goal is that
that platform should be indistinguishable from
`x86_64-pc-windows-mingw64', but it is certainly a distinct
configuration from the user's perspective.)



Wine is a compatibility layer that emulates the MS-Windows kernel.  It
is not config's role to report the intricacies of the operating system
implementation, only details that affect user programs running under
that operating system.
  


As I said, Wine's goal is that `x86_64-pc-linux-mingw64' be 
indistinguishable from `x86_64-pc-windows-mingw64' but that does not 
preclude using the KERNEL-OS form.



But they already have drifted:  config tuples were originally triplets
and now often feature a 4-element CPU-VENDOR-KERNEL-OS form



Only as a result of a technical need to distinguish Linux-based GNU
systems from other GNU systems.  Absent that requirement, we would
simply call GNU/Linux systems *-*-gnu, Alpine *-*-alpine, and Android
*-*-android.
  


It is a little more complex than that:  the GNU system theoretically can 
run on any of multiple kernels.  While Linux is most commonly used, GNU 
HURD is still in development and I understand that there is a Debian 
variant using the GNU utilities on a FreeBSD kernel.



but several existing tuples use a libc or ABI name in place of a
kernel and/or operating system.



In each of those cases, the ABI name _can_ be construed as a kernel
(since there is no kernel at all), or the libc name refers to a general
category of OS.  Neither of these situations are applicable to MinGW or
MSVC.
  


Arguably, MinGW *is* an ABI name.


I simply note this and suggest recognizing this fact that config
tuples are actually now currently 3-or-4-of-5 elements.  The GNU
system is definitely flexible enough for that 5-element form to be
appropriate: `CPU-VENDOR-linux-gnu' (GNU/Linux, implied to be using
glibc) and `CPU-VENDOR-linux-gnu-musl' (GNU/Linux using musl libc) are
plausibly distinguishable, for example, and could even both be useful
on the *same* machine, if, for example, some low-level system
utilities are linked against musl libc while most user programs use
glibc.



Such configurations do not exist, so we need not provide for them in
config.*.  And in any case, these ``low level utilities'' would be
configured for *-*-linux-musl, while user programs would be configured
for *-*-linux-gnu.  I see no reason config.* must take special measures
to recognize these Frankenstein systems, since the C library used to
build some system utilities has no bearing on the operation of other
user programs built for *-*-linux-gnu.


Think about why the GNU project pushes to call the common system 
"GNU/Linux" and you should see the reason for using `*-*-linux-gnu-musl' 
to express a GNU/Linux system using musl libc.



-- Jacob




Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread Jacob Bachmeyer

Po Lu wrote:
> Jacob Bachmeyer  writes:
>
>> This is why I am arguing that we should acknowledge that the naming
>> conventions have, in practice, already changed to
>> CPU-VENDOR-KERNEL-OS-LIBCABI, with only 3-or-4-of-5 elements present
>> at the moment.
>
> $ ./config.sub a-b-c-d-e
> Invalid configuration 'a-b-c-d-e': more than four components

I said "with only 3-or-4-of-5 elements present"; that using all 5 
elements is currently invalid does not change that we effectively /have/ 
5 elements, with a restriction that only 3 or 4 of them can actually be 
present in any one tuple.



-- Jacob




Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread Po Lu
"John Ericson"  writes:

> I have offered multiple times to change it to windows-mingnu or
> something else. Let's not be hung up on this, it is just making a straw
> man of the broader project of making configs that are more
> consistent.

The point is, it should not contain `windows' at all, and it should not
differ from an existing triplet designating the same configuration.  The
objective of config development is to uniquely identify a system
configuration, and nothing beyond that -- changing the set of values
reported for the same configuration in the name of ``consistency''
certainly runs contrary to such an objective.

> You have misunderstood Jacob's point, which is that the MinGW
> interface has multiple implementations, namely atop Windows itself
> and via Wine. Indeed see
> https://manpages.ubuntu.com/manpages/kinetic/man1/winegcc-development.1.html,
> this already exists via winelib and GCC. (MinGW is basically "modify
> MS headers so they work with GCC", and thus the same
> modifications are needed whether we are going to run on windows
> or on some Unix with winelib.)

My understanding is that headers used by winegcc are provided by
Winelib, not MinGW.  It would be fine to distinguish this configuration
from others: x86_64-pc-linux-winelib perhaps?  With that said,

> We never want to distinguish implementations that present the exact
> same interface (indeed, that defeats the point of re-implementing the
> same interface, going down this road yields a cat-and-mouse of
> endless lies like browser user-agent strings), but we do want to
> distinguish different interfaces.

All of the above is tangential to the subject at hand, since neither of
the configurations being debated fall under the latter category.

> You just said it! We have the exact same "technical need" on
> Windows as with Linux of identifying different platforms that share
> the same syscall interface. For the same reason we don't want
> people to have to write *-*-gnu | *-*-alpine | *-*-android (an endlessly
> growing list of special cases) to use e.g. the clone system call, we
> don't want them to have to maintain a big and ever growing list of
> Windows variants for a conditional that is just about Windows in
> general.

We have the means to adequately distinguish between the different
Windows configurations already.  Programs that want to use MSVC write
*-*-winnt*, and those which need to detect MinGW write *-*-mingw*.
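As a rough sketch of the two patterns just named, a configure-time check could look like the following.  The function name `windows_toolchain` and its output labels are made up for illustration; only the two glob patterns come from the text above.

```shell
# Hypothetical helper: map a canonical tuple to the Windows toolchain
# family it implies, using the *-*-winnt* and *-*-mingw* patterns.
windows_toolchain() {
    case $1 in
        *-*-winnt*) echo "msvc" ;;
        *-*-mingw*) echo "mingw" ;;
        *)          echo "other" ;;
    esac
}

windows_toolchain alpha-dec-winnt
windows_toolchain i586-pc-mingw32
windows_toolchain x86_64-pc-windows-gnu
```

The last call matches neither pattern, which is why an un-normalized `*-*-windows-*` spelling would slip past scripts written this way.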

> As a Nixpkgs developer, my goal is to see all the free software in the
> world packaged in a single way which will run (or can be
> cross-compiled to everywhere). This is very ambitious, and the only
> way it will work is if it is easy for upstream software to be portable.
> And it way to make it easy to be portable supporting all the variations
> is if we clean up foot-guns like this where upstream software has to
> maintain every-growing disjunctions rather than future-proof
> wildcards.

Chances are that software written for MSVC or MinGW will not work OOTB
with future Win32 toolchains, should any come into existence, making any
such future proofing redundant.  It is definitely no justification for
introducing duplicate triplets, especially ones inconsistent with the
naming scheme delineated at the start of config.sub.

You are also portraying the situation from the perspective of a
packager.  They are not Autoconf's primary audience: users configuring
programs are.  Duplicate configuration names designating the same
systems will aggravate the conundrum experienced by most when
configuring software.

Meanwhile, if config.sub normalizes the *-*-windows-* triplets as I've
proposed, users can continue providing these invalid triplets, with no
changes to Autoconf scripts or build files.  Which is the purpose of
config.sub after all.

> Sure, this sort of tech debt OCDing I am doing, I freely admit, seems
> overkill for just one package (emacs) and one windows platform
> (MinGW), but please believe me when I say that when considering all the
> variations and all the packages it *does* become something worth
> practically caring about.

Having ported software other than Emacs to MS-Windows (and witnessed
others porting even more), I cannot agree.




Re: config.sub should normalize *-*-windows-*

2023-08-21 Thread John Ericson
On Mon, Aug 21, 2023, at 1:17 AM, Po Lu wrote:
> Jacob Bachmeyer  writes:
> 
> > No:  MinGW is Windows native "Win32" API while a future `windows-gnu'
> > would be the GNU system's POSIX API on an NT kernel.  These are *very*
> > different configurations; `windows-gnu' would more closely resemble
> > Cygwin.
> 
> This is not what the `x86_64-pc-windows-gnu' configuration presently
> canonicalized by config.sub represents.

I have offered multiple times to change it to windows-mingnu or something else. 
Let's not be hung up on this, it is just making a straw man of the broader 
project of making configs that are more consistent.

> > I say it would be more appropriate to accept `x86_64-pc-mingw64' as a
> > short form for `x86_64-pc-windows-mingw64', since Wine could enable a
> > `x86_64-pc-linux-mingw64' platform to exist.  (Wine's goal is that
> > that platform should be indistinguishable from
> > `x86_64-pc-windows-mingw64', but it is certainly a distinct
> > configuration from the user's perspective.)
> 
> Wine is a compatibility layer that emulates the MS-Windows kernel.  It
> is not config's role to report the intricacies of the operating system
> implementation, only details that affect user programs running under
> that operating system.

You have misunderstood Jacob's point, which is that the MinGW *interface* has 
multiple implementations, namely atop Windows itself and via Wine. Indeed see 
https://manpages.ubuntu.com/manpages/kinetic/man1/winegcc-development.1.html, 
this already exists via winelib and GCC. (MinGW is basically "modify MS headers 
so they work with GCC", and thus the same modifications are needed whether we 
are going to run on windows or on some Unix with winelib.)

We never want to distinguish implementations that present the exact same 
interface (indeed, that defeats the point of re-implementing the same 
interface, going down this road yields a cat-and-mouse of endless lies like 
browser user-agent strings), but we *do* want to distinguish different 
interfaces.

> > But they already have drifted:  config tuples were originally triplets
> > and now often feature a 4-element CPU-VENDOR-KERNEL-OS form
> 
> Only as a result of a technical need to distinguish Linux-based GNU
> systems from other GNU systems.  Absent that requirement, we would
> simply call GNU/Linux systems *-*-gnu, Alpine *-*-alpine, and Android
> *-*-android.

You just said it! We have the exact same "technical need" on Windows as with 
Linux of identifying different platforms that share the same syscall interface. 
For the same reason we don't want people to have to write *-*-gnu | *-*-alpine 
| *-*-android (an endlessly growing list of special cases) to use e.g. the 
clone system call, we don't want them to have to maintain a big and ever 
growing list of Windows variants for a conditional that is just about Windows 
in general.
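To make the contrast concrete, here is a small sketch of the two styles of check.  Both function names are hypothetical, and the three-part tuples follow the imagined kernel-less naming from the quoted text rather than anything config.sub emits today.

```shell
# Hypothetical check written against kernel-less names: every new
# Linux-based system needs another alternative added to the list.
has_linux_syscalls_list() {
    case $1 in
        *-*-gnu | *-*-alpine | *-*-android) echo "yes" ;;
        *)                                  echo "no" ;;
    esac
}

# Hypothetical check written against the four-part form: one pattern
# on the KERNEL field covers every current and future variant.
has_linux_syscalls_wildcard() {
    case $1 in
        *-*-linux-*) echo "yes" ;;
        *)           echo "no" ;;
    esac
}

has_linux_syscalls_list x86_64-pc-linux-musl
has_linux_syscalls_wildcard x86_64-pc-linux-musl
```

The list-based check misses the musl tuple until someone extends it, while the wildcard keeps working; the same reasoning is what motivates a common `*-*-windows-*` prefix.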

As a Nixpkgs developer, my goal is to see all the free software in the world 
packaged in a single way which will run (or can be cross-compiled to) 
everywhere. This is very ambitious, and the only way it will work is if it is 
easy for upstream software to be portable. And the way to make it easy to be 
portable while supporting all the variations is to clean up foot-guns like this, 
where upstream software has to maintain ever-growing disjunctions rather than 
future-proof wildcards.

Sure, this sort of tech debt OCDing I am doing, I freely admit, seems overkill 
for just one package (emacs) and one windows platform (MinGW), but please 
believe me when I say that when considering all the variations and all the 
packages it *does* become something worth practically caring about.

John

Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Po Lu
Jacob Bachmeyer  writes:

> This is why I am arguing that we should acknowledge that the naming
> conventions have, in practice, already changed to
> CPU-VENDOR-KERNEL-OS-LIBCABI, with only 3-or-4-of-5 elements present
> at the moment.

$ ./config.sub a-b-c-d-e
Invalid configuration 'a-b-c-d-e': more than four components



Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Po Lu
John Ericson  writes:

> I agree the GNU project is not under any such obligation, and that's
> why I proposed windows-mingw as a compromise.

Once again, what's wrong with plain mingw?  Or winnt?

> It is more work for me to go make both GCC and LLVM support it, but I
> would rather do that than be stuck with plain mingw32.

Your preferences do not necessarily reflect those of the thousands of
Autoconf users, all of whom have lived with the status quo for decades.

> There is no existing convention for windows.

Really?  What's alpha-dec-winnt*, or i586-pc-mingw32?

> So far, every new "brand name" 3rd-position component has been
> chosen without any sort of pattern.

There doesn't have to be a pattern.

> Now that I've made (over the past few years) GNU config be more
> structured and more easily support longer configs, it is time to
> establish a convention. windows-* makes sense

Neither makes sense.

And the overriding objective of all config.* development is to _NEVER_
change the set of canonical values, or even worse, introduce duplicate
ones.

> and has precedent.

From LLVM?  That may be so, but the GNU project elected to use `mingw'
and `winnt' decades prior to LLVM's very existence.



Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Po Lu
Jacob Bachmeyer  writes:

> No:  MinGW is Windows native "Win32" API while a future `windows-gnu'
> would be the GNU system's POSIX API on an NT kernel.  These are *very*
> different configurations; `windows-gnu' would more closely resemble
> Cygwin.

This is not what the `x86_64-pc-windows-gnu' configuration presently
canonicalized by config.sub represents.

> I say it would be more appropriate to accept `x86_64-pc-mingw64' as a
> short form for `x86_64-pc-windows-mingw64', since Wine could enable a
> `x86_64-pc-linux-mingw64' platform to exist.  (Wine's goal is that
> that platform should be indistinguishable from
> `x86_64-pc-windows-mingw64', but it is certainly a distinct
> configuration from the user's perspective.)

Wine is a compatibility layer that emulates the MS-Windows kernel.  It
is not config's role to report the intricacies of the operating system
implementation, only details that affect user programs running under
that operating system.

> But they already have drifted:  config tuples were originally triplets
> and now often feature a 4-element CPU-VENDOR-KERNEL-OS form

Only as a result of a technical need to distinguish Linux-based GNU
systems from other GNU systems.  Absent that requirement, we would
simply call GNU/Linux systems *-*-gnu, Alpine *-*-alpine, and Android
*-*-android.

> but several existing tuples use a libc or ABI name in place of a
> kernel and/or operating system.

In each of those cases, the ABI name _can_ be construed as a kernel
(since there is no kernel at all), or the libc name refers to a general
category of OS.  Neither of these situations are applicable to MinGW or
MSVC.

> I simply note this and suggest recognizing this fact that config
> tuples are actually now currently 3-or-4-of-5 elements.  The GNU
> system is definitely flexible enough for that 5-element form to be
> appropriate: `CPU-VENDOR-linux-gnu' (GNU/Linux, implied to be using
> glibc) and `CPU-VENDOR-linux-gnu-musl' (GNU/Linux using musl libc) are
> plausibly distinguishable, for example, and could even both be useful
> on the *same* machine, if, for example, some low-level system
> utilities are linked against musl libc while most user programs use
> glibc.

Such configurations do not exist, so we need not provide for them in
config.*.  And in any case, these ``low level utilities'' would be
configured for *-*-linux-musl, while user programs would be configured
for *-*-linux-gnu.  I see no reason config.* must take special measures
to recognize these Frankenstein systems, since the C library used to
build some system utilities has no bearing on the operation of other
user programs built for *-*-linux-gnu.



Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Jacob Bachmeyer

Po Lu wrote:

Jacob Bachmeyer  writes:

  

At this time, yes.  However, the GNU utilities are designed to be
fairly portable and the NT kernel was designed to support multiple
ABIs, so a hypothetical port of GNU to run under MS-Windows is within
the realm of possibility.  (In fact, the underlying architecture of NT
should have all of the primitives needed to support HURD or a closely
related system.)  It is more likely that this would be implemented on
ReactOS (which aims for ABI compatibility with NT 5.1, is a stable
target, and is Free) first, but a hypothetical `x86_64-pc-windows-gnu'
(or perhaps `x86_64-pc-reactos-gnu') config tuple *is* a future
possibility.



This hypothesizing is not relevant here.  x86_64-pc-windows-* represents
MinGW, and should be normalized correspondingly.
  


No:  MinGW is Windows native "Win32" API while a future `windows-gnu' 
would be the GNU system's POSIX API on an NT kernel.  These are *very* 
different configurations; `windows-gnu' would more closely resemble Cygwin.



And what would we canonicalize `x86_64-pc-windows-gnu' to, other than
itself, currently?



x86_64-pc-mingw64, which I mentioned at the outset of this thread.
  


I say it would be more appropriate to accept `x86_64-pc-mingw64' as a 
short form for `x86_64-pc-windows-mingw64', since Wine could enable a 
`x86_64-pc-linux-mingw64' platform to exist.  (Wine's goal is that that 
platform should be indistinguishable from `x86_64-pc-windows-mingw64', 
but it is certainly a distinct configuration from the user's perspective.)
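A minimal sketch of the short-form idea, assuming the proposal above were adopted: `expand_mingw_short_form` is a hypothetical function, not something config.sub does, and it only handles the `mingw32`/`mingw64` spellings mentioned in this thread.

```shell
# Hypothetical expansion of the proposed short form.  A three-part
# CPU-VENDOR-mingwNN tuple gets "windows" inserted as the KERNEL field;
# tuples that already name a kernel (four parts) pass through unchanged.
expand_mingw_short_form() {
    case $1 in
        *-*-*-*)
            echo "$1" ;;
        *-*-mingw32 | *-*-mingw64)
            cpu_vendor=${1%-*}   # everything before the last hyphen
            abi=${1##*-}         # everything after the last hyphen
            echo "$cpu_vendor-windows-$abi" ;;
        *)
            echo "$1" ;;
    esac
}

expand_mingw_short_form x86_64-pc-mingw64
expand_mingw_short_form x86_64-pc-linux-mingw64
```

Under this sketch the Wine case keeps its explicit `linux` kernel field, so the two configurations stay distinguishable while the short spelling still resolves to the Windows one.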



It appears that config tuples may be drifting towards a 5-element
CPU-VENDOR-KERNEL-OS-LIBCABI form, with each of the last three
elements potentially optional, which makes any real tuple ambiguous,
except that the valid strings for KERNEL, OS, and LIBCABI are from
distinct sets.



Configuration tuples don't ``drift'', and they certainly should not
change or duplicate other triplets.


But they already have drifted:  config tuples were originally triplets 
and now often feature a 4-element CPU-VENDOR-KERNEL-OS form, but several 
existing tuples use a libc or ABI name in place of a kernel and/or 
operating system.  I simply note this and suggest recognizing this fact 
that config tuples are actually now currently 3-or-4-of-5 elements.  The 
GNU system is definitely flexible enough for that 5-element form to be 
appropriate:  `CPU-VENDOR-linux-gnu' (GNU/Linux, implied to be using 
glibc) and `CPU-VENDOR-linux-gnu-musl' (GNU/Linux using musl libc) are 
plausibly distinguishable, for example, and could even both be useful on 
the *same* machine, if, for example, some low-level system utilities are 
linked against musl libc while most user programs use glibc.
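The "distinct sets" argument can be sketched as follows.  Everything here is an assumption made for illustration: `role_of` and `label_tuple` are hypothetical names, and the three word lists are small samples drawn from tuples mentioned in this thread, not config.sub's real tables.

```shell
# Hypothetical lookup: decide which field a component belongs to.
# The lists are illustrative samples only.
role_of() {
    case $1 in
        linux | kfreebsd | windows) echo "kernel" ;;
        gnu | android)              echo "os" ;;
        musl | mingw32 | mingw64)   echo "libcabi" ;;
        *)                          echo "unknown" ;;
    esac
}

# Label every component after CPU-VENDOR by its role, so that a tuple
# with any of KERNEL, OS, LIBCABI omitted can still be read unambiguously.
label_tuple() {
    old_ifs=$IFS
    IFS=-
    set -- $1            # split the tuple on hyphens
    IFS=$old_ifs
    [ $# -ge 3 ] || { echo "too few components"; return 1; }
    out="cpu=$1 vendor=$2"
    shift 2
    for part in "$@"; do
        out="$out $(role_of "$part")=$part"
    done
    echo "$out"
}

label_tuple x86_64-pc-linux-gnu-musl
label_tuple x86_64-pc-linux-musl
```

Because no word appears in more than one list, the position of a component does not matter for deciding its role; that is the property the 3-or-4-of-5 reading relies on.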



-- Jacob




Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Jacob Bachmeyer

Po Lu wrote:

John Ericson  writes:

  

Thanks Jacob,

That's absolutely right that Win NT supports multiple personalities
and so all sorts of things are possible. (Indeed that is how WSL1
worked.)

MinGW stands for "Minimalist GNU for Windows" [1], and I suspect that
is why Saleem chose windows-gnu in that commit almost a decade ago. I
suppose we could say that "minimalist GNU" is not "GNU", and do
windows-mingnu or something, and then I could submit an LLVM patch to
try to support that. But I suppose I lean towards supporting configs that
at least one of GCC or Clang supports already, rather than making up
completely new stuff.



GNU config is part of the GNU project, developing the GNU operating
system, which opted for ``mingw'' many, many moons ago.  We are under no
obligation to adhere to LLVM standards, especially when they require us
to misrepresent the nature of a specific system configuration.
  


This is also correct:  `windows-gnu' does not currently exist and its 
use to describe MinGW in LLVM is /wrong/.  The GNU system 
implements/extends POSIX and MinGW attempts to port GNU utilities to run 
under native Windows, which does *not* implement POSIX, therefore MinGW 
is *not* `windows-gnu'.



Also, I would like to point out that the "scales to more variations"
argument is not at all hypothetical. If one looks at [2] one will see
that MSYS is a variation of Cygwin, and mingw-style environments can
be made from the newer ucrt or older msvcrt. Today there are just too
many subtle variations to capture them all with sensible. It looks
like MSYS [3] reuses a triple for multiple configurations, and just
relies on users getting the PATH right, but that clearly isn't
ideal. Creating windows-* variants to handle them all in a consistent
and predictable manner is much better.



We can create new triplets for new environments once they do come into
existence.  But they should not duplicate existing ones, and they must
conform to the existing naming convention for configuration triplets.
  


This is why I am arguing that we should acknowledge that the naming 
conventions have, in practice, already changed to 
CPU-VENDOR-KERNEL-OS-LIBCABI, with only 3-or-4-of-5 elements present at 
the moment.



-- Jacob



Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread John Ericson



On 8/21/23 00:39, Po Lu wrote:

John Ericson  writes:


Thanks Jacob,

That's absolutely right that Win NT supports multiple personalities
and so all sorts of things are possible. (Indeed that is how WSL1
worked.)

MinGW stands for "Minimalist GNU for Windows" [1], and I suspect that
is why Saleem chose windows-gnu in that commit almost a decade ago. I
suppose we could say that "minimalist GNU" is not "GNU", and do
windows-mingnu or something, and then I could submit an LLVM patch to
try to support that. But I suppose I lean towards supporting configs that
at least one of GCC or Clang supports already, rather than making up
completely new stuff.

GNU config is part of the GNU project, developing the GNU operating
system, which opted for ``mingw'' many, many moons ago.  We are under no
obligation to adhere to LLVM standards, especially when they require us
to misrepresent the nature of a specific system configuration.


I agree the GNU project is not under any such obligation, and that's why 
I proposed windows-mingw as a compromise. It is more work for me to go 
make both GCC and LLVM support, but I rather do that than be stuck with 
plain mingw32.



Also, I would like to point out that the "scales to more variations"
argument is not at all hypothetical. If one looks at [2] one will see
that MSYS is a variation of Cygwin, and a mingw-style environments can
be made from the newer ucrt or older msvcrt. Today there are just too
many subtle variations to capture them all with sensible. It looks
like MSYS [3] reuses a triple for multiple configurations, and just
relies on users getting the PATH right, but that clearly isn't
ideal. Creating windows-* variants to handle them all in a consistent
and predictable manner is much better.

We can create new triplets for new environments once they do come into
existence.  But they should not duplicate existing ones, and they must
conform to the existing naming convention for configuration triplets.


There is no existing convention for Windows. So far, each time, a new
"brand name" third-position component has been chosen without any sort
of pattern. Now that I have (over the past few years) made GNU config
more structured and better able to support longer configs, it is time to
establish a convention. windows-* makes sense, and has precedent.


John




Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Po Lu
John Ericson  writes:

> Thanks Jacob,
>
> That's absolutely right that Win NT supports multiple personalities
> and so all sorts of things are possible. (Indeed that is how WSL1
> worked.)
>
> MinGW stands for "Minimalist GNU for Windows" [1], and I suspect that
> is why Saleem chose windows-gnu in that commit almost a decade ago. I
> suppose we could say that "minimalist GNU" is not "GNU", and do
> windows-mingnu or something, and then I could submit an LLVM patch to
> try to support that. But I suppose I lean towards supporting configs
> that at least one of GCC or Clang supports already, rather than making
> up completely new stuff.

GNU config is part of the GNU project, developing the GNU operating
system, which opted for ``mingw'' many, many moons ago.  We are under no
obligation to adhere to LLVM standards, especially when they require us
to misrepresent the nature of a specific system configuration.

> Also, I would like to point out that the "scales to more variations"
> argument is not at all hypothetical. If one looks at [2] one will see
> that MSYS is a variation of Cygwin, and mingw-style environments can
> be made from the newer ucrt or older msvcrt. Today there are just too
> many subtle variations to capture them all sensibly. It looks
> like MSYS [3] reuses a triple for multiple configurations, and just
> relies on users getting the PATH right, but that clearly isn't
> ideal. Creating windows-* variants to handle them all in a consistent
> and predictable manner is much better.

We can create new triplets for new environments once they do come into
existence.  But they should not duplicate existing ones, and they must
conform to the existing naming convention for configuration triplets.



Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread John Ericson

Thanks Jacob,

That's absolutely right that Win NT supports multiple personalities and 
so all sorts of things are possible. (Indeed that is how WSL1 worked.)


MinGW stands for "Minimalist GNU for Windows" [1], and I suspect that is
why Saleem chose windows-gnu in that commit almost a decade ago. I
suppose we could say that "minimalist GNU" is not "GNU", and do
windows-mingnu or something, and then I could submit an LLVM patch to
try to support that. But I suppose I lean towards supporting configs that
at least one of GCC or Clang supports already, rather than making up
completely new stuff.


Also, I would like to point out that the "scales to more variations" 
argument is not at all hypothetical. If one looks at [2] one will see 
that MSYS is a variation of Cygwin, and mingw-style environments can
be made from the newer ucrt or older msvcrt. Today there are just too
many subtle variations to capture them all sensibly. It looks like
MSYS [3] reuses a triple for multiple configurations, and just relies on 
users getting the PATH right, but that clearly isn't ideal. Creating 
windows-* variants to handle them all in a consistent and predictable 
manner is much better.


John

P.S. I've also CC'd Martin Storsjö, who has worked on LLVM MinGW things
recently.


[1]: https://en.wikipedia.org/wiki/MinGW

[2]: https://www.msys2.org/docs/environments/

[3]: https://packages.msys2.org/package/mingw-w64-x86_64-gcc 
https://packages.msys2.org/package/mingw-w64-ucrt-x86_64-gcc


On 8/20/23 22:40, Po Lu wrote:

> Jacob Bachmeyer writes:
>
>> At this time, yes.  However, the GNU utilities are designed to be
>> fairly portable and the NT kernel was designed to support multiple
>> ABIs, so a hypothetical port of GNU to run under MS-Windows is within
>> the realm of possibility.  (In fact, the underlying architecture of NT
>> should have all of the primitives needed to support HURD or a closely
>> related system.)  It is more likely that this would be implemented on
>> ReactOS (which aims for ABI compatibility with NT 5.1, is a stable
>> target, and is Free) first, but a hypothetical `x86_64-pc-windows-gnu'
>> (or perhaps `x86_64-pc-reactos-gnu') config tuple *is* a future
>> possibility.
>
> This hypothesizing is not relevant here.  x86_64-pc-windows-* represents
> MinGW, and should be normalized correspondingly.
>
>> And what would we canonicalize `x86_64-pc-windows-gnu' to, other than
>> itself, currently?
>
> x86_64-pc-mingw64, which I mentioned at the outset of this thread.
>
>> It appears that config tuples may be drifting towards a 5-element
>> CPU-VENDOR-KERNEL-OS-LIBCABI form, with each of the last three
>> elements potentially optional, which makes any real tuple ambiguous,
>> except that the valid strings for KERNEL, OS, and LIBCABI are from
>> distinct sets.
>
> Configuration tuples don't ``drift'', and they certainly should not
> change or duplicate other triplets.




Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Po Lu
Jacob Bachmeyer  writes:

> At this time, yes.  However, the GNU utilities are designed to be
> fairly portable and the NT kernel was designed to support multiple
> ABIs, so a hypothetical port of GNU to run under MS-Windows is within
> the realm of possibility.  (In fact, the underlying architecture of NT
> should have all of the primitives needed to support HURD or a closely
> related system.)  It is more likely that this would be implemented on
> ReactOS (which aims for ABI compatibility with NT 5.1, is a stable
> target, and is Free) first, but a hypothetical `x86_64-pc-windows-gnu'
> (or perhaps `x86_64-pc-reactos-gnu') config tuple *is* a future
> possibility.

This hypothesizing is not relevant here.  x86_64-pc-windows-* represents
MinGW, and should be normalized correspondingly.

> And what would we canonicalize `x86_64-pc-windows-gnu' to, other than
> itself, currently?

x86_64-pc-mingw64, which I mentioned at the outset of this thread.

> It appears that config tuples may be drifting towards a 5-element
> CPU-VENDOR-KERNEL-OS-LIBCABI form, with each of the last three
> elements potentially optional, which makes any real tuple ambiguous,
> except that the valid strings for KERNEL, OS, and LIBCABI are from
> distinct sets.

Configuration tuples don't ``drift'', and they certainly should not
change or duplicate other triplets.



Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Jacob Bachmeyer

Po Lu wrote:

> "John Ericson" writes:

[...]

>> Had those Windows-based platforms been introduced later, something
>> like the configs that Saleem added to LLVM would have been used from
>> the get go --- grouping the Windows-based platforms and grouping the
>> Linux-based platforms are both advantageous ways of categorizing
>> things, and advantageous for the same reasons.
>
> We are trying to develop the GNU operating system, and it is in our
> interest to convey the distinction between GNU systems employing the
> Linux kernel, and other operating systems that are - by happenstance -
> built on top of the same kernel.  OTOH, MinGW does not provide an
> operating system founded upon the Windows kernel, so it is incorrect to
> apply the:
>
>   machine-vendor-kernel-OS
>
> quadruplet scheme to it.  To rub salt into the wound, the GNU operating
> system does NOT run under a MS-Windows kernel.  So ``windows-gnu'' is
> not just conjecture, it is also a misnomer.

At this time, yes.  However, the GNU utilities are designed to be fairly
portable and the NT kernel was designed to support multiple ABIs, so a
hypothetical port of GNU to run under MS-Windows is within the realm of
possibility.  (In fact, the underlying architecture of NT should have
all of the primitives needed to support HURD or a closely related
system.)  It is more likely that this would be implemented on ReactOS
(which aims for ABI compatibility with NT 5.1, is a stable target, and
is Free) first, but a hypothetical `x86_64-pc-windows-gnu' (or perhaps
`x86_64-pc-reactos-gnu') config tuple *is* a future possibility.

>> As I said in the other email, I am not forcing anyone to do anything.
>
> You are, as users will soon begin to provide invalid triplets such as
> `x86_64-pc-windows-gnu' to their configuration files.  And instead of
> canonicalizing them, the express purpose of config.sub, they are
> reproduced verbatim, much to the detriment of configure scripts and to
> the chagrin of package maintainers.
  


And what would we canonicalize `x86_64-pc-windows-gnu' to, other than 
itself, currently?


It appears that config tuples may be drifting towards a 5-element 
CPU-VENDOR-KERNEL-OS-LIBCABI form, with each of the last three elements 
potentially optional, which makes any real tuple ambiguous, except that 
the valid strings for KERNEL, OS, and LIBCABI are from distinct sets.
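
To make the "distinct sets" point concrete, here is a minimal Python sketch of how the trailing components could be assigned to slots by set membership alone. The three sets are tiny illustrative samples invented for this example (they are not the tables config.sub uses, and placing "gnu" under OS is itself an assumption), and `parse_tuple` is a hypothetical helper.

```python
# Illustrative samples only; NOT the component tables used by config.sub.
KERNELS = {"linux", "windows", "reactos"}
OSES = {"gnu", "android"}
LIBCABIS = {"musl", "msvc", "eabi"}

def parse_tuple(text):
    """Split CPU-VENDOR-[KERNEL]-[OS]-[LIBCABI] into named fields.

    Each of the last three fields is optional; because the sets are
    disjoint, membership decides which slot a component fills.
    """
    parts = text.split("-")
    if len(parts) < 3:
        raise ValueError(f"too few components: {text!r}")
    fields = {"cpu": parts[0], "vendor": parts[1],
              "kernel": None, "os": None, "libcabi": None}
    slots = [("kernel", KERNELS), ("os", OSES), ("libcabi", LIBCABIS)]
    pos = 0
    for part in parts[2:]:
        # Skip forward over omitted slots until one accepts this component.
        while pos < len(slots) and part not in slots[pos][1]:
            pos += 1
        if pos == len(slots):
            raise ValueError(f"unknown or misplaced component: {part!r}")
        fields[slots[pos][0]] = part
        pos += 1
    return fields
```

With these sample sets, `x86_64-pc-windows-gnu` fills the KERNEL and OS slots and leaves LIBCABI empty, while `riscv64-unknown-linux-musl` fills KERNEL and LIBCABI and leaves OS empty.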



-- Jacob



Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread Po Lu
"John Ericson"  writes:

> If Musl, GNU Libc, and Android are all different operating systems,
> why are MSVCRT, MinGW, and Cygwin not different operating systems?

Musl is not an operating system, but Musl-based systems are distinct
from GNU and Android systems, in that they share nothing in common
except for the Linux kernel.  These considerations are GNU project
policy, see:

  https://www.gnu.org/gnu/why-gnu-linux.en.html

In contrast, MinGW and MSVC are merely different compilers for the same
OS, using a different ABI, but the same system libraries and services.
A program will run on any MS-Windows system irrespective of whether it
was compiled with MSVC or MinGW.

And config.* already regards Cygwin as a separate operating system.

> The simplest reading of history that doesn't require any contortions
> is that MinGW and Cygwin predated configs with more than 3 components,
> but Android did not.

That's an inaccurate portrayal of history, at best.  See below.

> Had those Windows-based platforms been introduced later, something
> like the configs that Saleem added to LLVM would have been used from
> the get go --- grouping the Windows-based platforms and grouping the
> Linux-based platforms are both advantageous ways of categorizing
> things, and advantageous for the same reasons.

We are trying to develop the GNU operating system, and it is in our
interest to convey the distinction between GNU systems employing the
Linux kernel, and other operating systems that are - by happenstance -
built on top of the same kernel.  OTOH, MinGW does not provide an
operating system founded upon the Windows kernel, so it is incorrect to
apply the:

  machine-vendor-kernel-OS

quadruplet scheme to it.  To rub salt into the wound, the GNU operating
system does NOT run under a MS-Windows kernel.  So ``windows-gnu'' is
not just conjecture, it is also a misnomer.

> As I said in the other email, I am not forcing anyone to do anything.

You are, as users will soon begin to provide invalid triplets such as
`x86_64-pc-windows-gnu' to their configuration files.  And instead of
canonicalizing them, the express purpose of config.sub, they are
reproduced verbatim, much to the detriment of configure scripts and to
the chagrin of package maintainers.

> You can take the latest version and do nothing else. Anyone that uses
> *-windows-gnu will have their build fail, just as it fails
> today. There is no problem.

Users will start expecting configure to grok such configurations.  We
will start receiving bug reports, for the simple reason that the present
config.* cabal failed, in this case, to exercise the elementary degree
of circumspection and good judgement that should be applied when
maintaining a program underpinning thousands of important GNU (and
other) projects.



Re: config.sub should normalize *-*-windows-*

2023-08-20 Thread John Ericson
On Fri, Aug 18, 2023, at 8:24 PM, Po Lu wrote:
> GNU is an operating system.  Musl-based systems are not GNU, so -musl
> represents a ``musl-based operating system''.
> 
> > I do not think this is something to be frowned upon because "Operating
> > System.", after all, also lacks any rigorous objective definition.  
> 
> It does not, within the GNU project at least.  GNU is one operating
> system; Android is another, as are Musl-based systems.  And MS-Windows
> is a single operating system.

If Musl, GNU Libc, and Android are all different operating systems, why are 
MSVCRT, MinGW, and Cygwin not different operating systems? Listing off examples 
is *not* providing an objective definition.

The simplest reading of history that doesn't require any contortions is that 
MinGW and Cygwin predated configs with more than 3 components, but Android did 
not. Had those Windows-based platforms been introduced later, something like 
the configs that Saleem added to LLVM would have been used from the get go --- 
grouping the Windows-based platforms and grouping the Linux-based platforms are 
both advantageous ways of categorizing things, and advantageous for the same 
reasons.

> How is that worse than forcing every program wishing to support MS-Windows to
> introduce express support for 2 or 3 disparate and incorrect triplets?

As I said in the other email, I am not forcing anyone to do anything.

> Anyway, I plan to merge the latest config.* into Emacs soon.  So
> speaking as someone responsible, in part, for keeping the MS-Windows
> port of Emacs in working order, I would like to see the change I
> illustrated installed ASAP.

You can take the latest version and do nothing else. Anyone that uses 
*-windows-gnu will have their build fail, just as it fails today. There is no 
problem.

John

Re: config.sub should normalize *-*-windows-*

2023-08-18 Thread Po Lu
"John Ericson"  writes:

> In fairness I just recently submitted the patches that added them, so
> absent a clear notion of GNU config releases I think a grace period
> where we can reconsider recently added changes is acceptable.

So let's please remove that change, and replace it with one that
canonicalizes *-*-windows-*.

>  Even more important than this is the principle that config.sub canonical 
> names are *never* changed, even if
>  they are wrong according to some external standard.
>
> That said, I don't think we should so normalize them. I took them from LLVM, 
> which has supported them for years and normalized in
> the *other* direction (i.e. to these), and Rust which follows LLVM's lead. I 
> knew we couldn't change the old ones to normalize to the new
> ones, so I thought a fair middle ground was that neither would normalize to 
> the other.
>
> For the record LLVM, Rust, and even sometimes GNU config don't treat
> *-*-foo-bar as *-*-$kernel-$os, but rather *-*-$kernel-$abi.  Where
> ABI is a sort of catch-all residual. This is why
> e.g. riscv-unknown-linux-musl is accepted --- no one would think the
> Musl libc is an operating system! Rather "gnu" is interpreted to
> mean "glibc, libstdc++, etc.".

GNU is an operating system.  Musl-based systems are not GNU, so -musl
represents a ``musl-based operating system''.

> I do not think this is something to be frowned upon because "Operating
> System.", after all, also lacks any rigorous objective definition.  

It does not, within the GNU project at least.  GNU is one operating
system; Android is another, as are Musl-based systems.  And MS-Windows
is a single operating system.

> At the end of the day there is:
>
> 1 The syscall interface to communicate with "the outside world". (By
> "kernel" we really mean syscall interface, it is possible for two
> different implementations, like the actual Windows kernel, and Wine,
> to support the same syscall interface.)

There is already a set precedent for how such changes are treated by
config*: see mips*n32, arm*eabi, et cetera.

> 2 Other code linked into the same process. ABI covers the "most
> important" parts of this, especially when the userland ABI is more
> stable than the syscall ABI (Many BSDs, some Windows NT things, etc.)

ABI differences don't constitute a new operating system.

> And even ignoring all that, the *windows*-* convention makes clear
> that these are variations of extra stuff atop Windows. In many
> instances, it doesn't matter which one of them is in use. Using the
> new triples makes it easier for that agnostic code to roll with the
> punches.

They are invalid triples.

> 
>
> My intention in adding these to GNU config was to then rework our
> Windows cross compilation in Nixpkgs to use them, which would mean
> likewise submitting patches to GCC and other things to accept them
> too. Normalizing them away would prevent me from doing all these other
> yak shaves, and trying to get the various flavors of Windows cross to
> work more consistently, because everything downstream from config.sub
> invocations would work the same way as before. IMO that would
> basically defeat the purpose of accepting them at all in GNU config
> --- better to reject than do a normalization that may not be desired.

If these configuration triplets are normalized, GNU projects (and others
using config*) will automagically work when they are provided.  How is
that worse than forcing every program wishing to support MS-Windows to
introduce express support for 2 or 3 disparate and incorrect triplets?

Anyway, I plan to merge the latest config.* into Emacs soon.  So
speaking as someone responsible, in part, for keeping the MS-Windows
port of Emacs in working order, I would like to see the change I
illustrated installed ASAP.



Re: config.sub should normalize *-*-windows-*

2023-08-18 Thread John Ericson
On 8/18/23 07:42, Zack Weinberg wrote:

> On Thu, Aug 17, 2023, at 8:34 PM, Po Lu wrote:
> ...
> 
>> Given that the MinGW ABI does not constitute the GNU operating system
>> executing on the MS-Windows kernel, and MSVC is not an operating system,
>> such blunders should be ignored, or at least normalized...
In fairness I just recently submitted the patches that added them, so absent a
clear notion of GNU config releases I think a grace period where we can
reconsider recently added changes is acceptable.

> Even more important than this is the principle that config.sub canonical 
> names are *never* changed, even if they are wrong according to some external 
> standard.
That said, I don't think we should so normalize them. I took them from LLVM, 
which has supported them for years and normalized in the *other* direction 
(i.e. to these), and Rust which follows LLVM's lead. I knew we couldn't change 
the old ones to normalize to the new ones, so I thought a fair middle ground 
was that neither would normalize to the other.

For the record LLVM, Rust, and even sometimes GNU config don't treat 
*-*-foo-bar as *-*-$kernel-$os, but rather as *-*-$kernel-$abi, where ABI is a
sort of catch-all residual. This is why e.g. riscv-unknown-linux-musl is
accepted --- no one would think the Musl libc is an operating system! Rather
"gnu" is interpreted to mean "glibc, libstdc++, etc.".

I do not think this is something to be frowned upon because "Operating 
System.", after all, also lacks any rigorous objective definition. At the end 
of the day there is:

 1. The syscall interface to communicate with "the outside world". (By
    "kernel" we really mean syscall interface; it is possible for two
    different implementations, like the actual Windows kernel and Wine,
    to support the same syscall interface.)
 2. Other code linked into the same process. ABI covers the "most
    important" parts of this, especially when the userland ABI is *more*
    stable than the syscall ABI (many BSDs, some Windows NT things, etc.)

And even ignoring all that, the *windows*-* convention makes clear that these
are variations of extra stuff atop Windows. In many instances, it doesn't
matter which one of them is in use. Using the new triples makes it easier for
that agnostic code to roll with the punches.
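
As a rough illustration of that last point, the Python sketch below contrasts the two ways flavor-agnostic code can ask "is this a Windows target?". The legacy pattern list is an assumed, non-exhaustive sample chosen for the example, and both helper names are hypothetical.

```python
from fnmatch import fnmatchcase

# Assumed, non-exhaustive sample of the older "brand name" spellings.
LEGACY_WINDOWS_PATTERNS = ["*-*-mingw*", "*-*-cygwin*", "*-*-winnt*"]

def is_windows_legacy(config):
    """Legacy style: every brand name has to be listed explicitly."""
    return any(fnmatchcase(config, p) for p in LEGACY_WINDOWS_PATTERNS)

def is_windows_grouped(config):
    """Grouped style: one pattern covers every windows-* variant."""
    return fnmatchcase(config, "*-*-windows-*")
```

Under the grouped spelling, a new variant needs no change to such a check, whereas the legacy list has to grow by one pattern for each new brand name.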



My intention in adding these to GNU config was to then rework our Windows cross 
compilation in Nixpkgs to use them, which would mean likewise submitting 
patches to GCC and other things to accept them too. Normalizing them away would 
prevent me from doing all these other yak shaves, and trying to get the various 
flavors of Windows cross to work more consistently, because everything 
downstream from config.sub invocations would work the same way as before. IMO 
that would basically defeat the purpose of accepting them at all in GNU config 
--- better to reject than do a normalization that may not be desired.

Cheers,

John

P.S. CCing Saleem Abdulrasool who originally added them to LLVM in 
https://reviews.llvm.org/D2947, and who has continued to work on LLVM-land 
Windows support, e.g. for Swift. (I imagine Swift, like Rust, also uses the 
*windows*-* ones.) Perhaps he may have some additional insight to add.



Re: config.sub should normalize *-*-windows-*

2023-08-18 Thread Zack Weinberg
On Thu, Aug 17, 2023, at 8:34 PM, Po Lu wrote:
...
> Given that the MinGW ABI does not constitute the GNU operating system
> executing on the MS-Windows kernel, and MSVC is not an operating system,
> such blunders should be ignored, or at least normalized...

Even more important than this is the principle that config.sub canonical names 
are *never* changed, even if they are wrong according to some external standard.

zw



config.sub should normalize *-*-windows-*

2023-08-17 Thread Po Lu
x86_64-pc-windows-* is first and foremost a _misnomer_.  The format of a
configuration triplet (or quadruplet) is set in stone:
MACHINE-VENDOR-[KERNEL-]OS.

Given that the MinGW ABI does not constitute the GNU operating system
executing on the MS-Windows kernel, and MSVC is not an operating system,
such blunders should be ignored, or at least normalized to one of the
existing operating system values: x86_64-pc-mingw* for MinGW, and
x86_64-pc-winnt for MSVC.
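
For concreteness, the requested normalization could be sketched in Python roughly as follows. This only illustrates the proposal; it is not how config.sub (a shell script) behaves or is implemented. The alias table, including the exact `windows-gnu` and `windows-msvc` suffixes and the choice of `mingw64`, is an assumption made for this example, and `normalize` is a hypothetical helper.

```python
# Hypothetical alias table modelled on the names suggested above.
# "mingw64" only makes sense for x86_64; other CPUs would need other values.
WINDOWS_ALIASES = {
    "windows-gnu": "mingw64",
    "windows-msvc": "winnt",
}

def normalize(config):
    """Rewrite MACHINE-VENDOR-windows-* to an existing OS name."""
    parts = config.split("-", 2)
    if len(parts) != 3:
        return config  # not in MACHINE-VENDOR-REST form; leave untouched
    machine, vendor, rest = parts
    return "-".join([machine, vendor, WINDOWS_ALIASES.get(rest, rest)])
```

Configurations that are not in the alias table pass through unchanged, so existing names such as `x86_64-pc-linux-gnu` are unaffected.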