Re: LC_CTYPE=UTF-8

2020-06-25 Thread shwaresyst
There are plans for this, having a POSIX.UTF-8 locale as an XSI base requirement. There may be POSIX.UTF-E and UTF-I locales too; same features, simply the different charmaps. As options there may even be, albeit this is unlikely as no platform I'm aware of fully supports ISO-6429 now, a

Re: LC_CTYPE=UTF-8

2020-06-25 Thread shwaresyst
The locale requirements specified in the C standard are what is applicable for implementations that limit their character encoding to the basic source and execution character sets. POSIX requires implementations to support, in at least one provided charmap, the superset of the basic sets

Re: LC_CTYPE=UTF-8

2020-06-25 Thread Martijn Dekker
On 25-06-20 at 21:13, Alan Coopersmith wrote: The only thought I had along those lines was that I thought the "C" locale came from the C standard, and might be best left to the C committee to standardize, while this group controls the "POSIX" locale definition. Actually, as far as POSIX is

Re: LC_CTYPE=UTF-8

2020-06-25 Thread Ingo Schwarze
Hi Alan, Alan Coopersmith wrote on Thu, Jun 25, 2020 at 12:13:33PM -0700: > On 6/25/20 8:31 AM, Ingo Schwarze wrote: >> Whether to standardize only C.UTF-8 or both C.UTF-8 and POSIX.UTF-8 >> as synonyms looks a bit like asking for the best colour of a bikeshed. >> Given that the standard already

Re: LC_CTYPE=UTF-8

2020-06-25 Thread Alan Coopersmith
On 6/25/20 8:31 AM, Ingo Schwarze wrote: Whether to standardize only C.UTF-8 or both C.UTF-8 and POSIX.UTF-8 as synonyms looks a bit like asking for the best colour of a bikeshed. Given that the standard already contains the redundancy of requiring both "C" and "POSIX", maybe it is more

Re: LC_CTYPE=UTF-8

2020-06-25 Thread Martijn Dekker
On 25-06-20 at 16:59, Alan Coopersmith wrote: On 6/25/20 6:33 AM, Hans Åberg wrote: Perhaps there should be a default UTF-8 locale: It seems that the current construct does not apply so well to it. If the goal is to standardize existing behavior the standard could define the C.UTF-8 locale

Re: LC_CTYPE=UTF-8

2020-06-25 Thread Ingo Schwarze
Hi Alan, Alan Coopersmith wrote on Thu, Jun 25, 2020 at 07:59:39AM -0700: > On 6/25/20 6:33 AM, Hans Aberg wrote: >> Perhaps there should be a default UTF-8 locale: It seems that the >> current construct does not apply so well to it. > If the goal is to standardize existing behavior the

Re: LC_CTYPE=UTF-8

2020-06-25 Thread Alan Coopersmith
On 6/25/20 6:33 AM, Hans Åberg wrote: Perhaps there should be a default UTF-8 locale: It seems that the current construct does not apply so well to it. If the goal is to standardize existing behavior the standard could define the C.UTF-8 locale (or perhaps a POSIX.UTF-8 locale) that a number

Re: LC_CTYPE=UTF-8

2020-06-25 Thread Hans Åberg
> On 25 Jun 2020, at 15:19, Ingo Schwarze wrote: > > Hans Aberg wrote on Thu, Jun 25, 2020 at 10:15:03AM +0200: > >> MacOS sets as default LC_CTYPE=UTF-8, not appearing in the 'locale >> -a' list. Then some software interprets this as though the locale >> is C/POSIX, disregards the UTF-8

Re: LC_CTYPE=UTF-8

2020-06-25 Thread Ingo Schwarze
Hi Hans, Hans Aberg wrote on Thu, Jun 25, 2020 at 10:15:03AM +0200: > MacOS sets as default LC_CTYPE=UTF-8, not appearing in the 'locale > -a' list. Then some software interprets this as though the locale > is C/POSIX, disregards the UTF-8 encoding, and converts all non-ASCII > (high bit set)

Availability of 202x Draft 1 review draft

2020-06-25 Thread Andrew Josey
Hi all, I'm pleased to announce the availability of the first draft of the 202x revision of the standard. This draft is the first committee draft. The draft can be obtained from the login page of the Austin Group at: https://www.opengroup.org/austin/login.html The Mantis project for

LC_CTYPE=UTF-8

2020-06-25 Thread Hans Åberg
MacOS sets as default LC_CTYPE=UTF-8, not appearing in the 'locale -a' list. Then some software interprets this as though the locale is C/POSIX, disregards the UTF-8 encoding, and converts all non-ASCII (high bit set) characters into octal escape sequences. What is the correct interpretation here?
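
To make the symptom described above concrete, here is a minimal Python sketch. It is not taken from any of the programs under discussion: the function names (effective_ctype, names_utf8, escape_high_bytes, render) are hypothetical, the fallback to "C" when nothing is set is an assumption (POSIX leaves that default implementation-defined), and inspecting the locale name is only a heuristic. A real program would more likely call setlocale(LC_CTYPE, "") and then ask nl_langinfo(CODESET). The sketch applies the POSIX precedence LC_ALL, LC_CTYPE, LANG, and contrasts decoding UTF-8 with the octal-escape fallback.

```python
def effective_ctype(env):
    """Return the locale name that governs LC_CTYPE.

    POSIX precedence: LC_ALL, then LC_CTYPE, then LANG.
    Empty values are treated as unset.
    """
    for var in ("LC_ALL", "LC_CTYPE", "LANG"):
        value = env.get(var, "")
        if value:
            return value
    # Assumption: POSIX leaves this default implementation-defined.
    return "C"


def names_utf8(locale_name):
    """Heuristic: does the locale name itself indicate UTF-8?

    Accepts a codeset suffix such as "en_US.UTF-8" as well as the
    bare "UTF-8" value mentioned in the message above.
    """
    base = locale_name.split("@", 1)[0]    # drop any @modifier
    codeset = base.rsplit(".", 1)[-1]      # text after the last "."
    return codeset.upper().replace("-", "") == "UTF8"


def escape_high_bytes(data):
    """Render bytes the way a C/POSIX-only program might:
    every byte with the high bit set becomes an octal escape."""
    return "".join(chr(b) if b < 0x80 else "\\%03o" % b for b in data)


def render(data, env):
    """Decode as UTF-8 when the environment names UTF-8, else escape."""
    if names_utf8(effective_ctype(env)):
        return data.decode("utf-8", errors="replace")
    return escape_high_bytes(data)


if __name__ == "__main__":
    name = "Åberg".encode("utf-8")
    print(render(name, {"LC_CTYPE": "UTF-8"}))  # treated as UTF-8
    print(render(name, {"LC_CTYPE": "C"}))      # treated as C/POSIX
```

Under these assumptions, a program that honours LC_CTYPE=UTF-8 prints the name unchanged, while one that falls back to C/POSIX prints each byte of the two-byte sequence for "Å" as an octal escape.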