Re: UTF-8 locale & POSIX text model

2017-11-30 Thread keld
On Sun, Nov 26, 2017 at 04:16:36PM +, Stephane Chazelas wrote: > 2017-11-26 14:07:50 +0100, k...@keldix.com: > [...] > > > For instance, as currently specified, POSIX says that the > > > output of the "locale" utility be suitable for reinput to the > > > shell and requiring double-quote

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Hans Åberg
> On 27 Nov 2017, at 22:51, Chet Ramey wrote: > > On 11/27/17 1:12 PM, Hans Åberg wrote: > On MacOS 10.13, one can set locale environment variables. The Terminal default login shell reads .profile; xterm reads .bashrc. There are other ways to set them

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Chet Ramey
On 11/27/17 1:12 PM, Hans Åberg wrote: >>> On MacOS 10.13, one can set locale environment variables. The Terminal >>> default login shell reads .profile; xterm reads .bashrc. There are other >>> ways to set them system-wide, changing with the OS version. >> >> Terminal has been able to pass the

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Hans Åberg
> On 27 Nov 2017, at 22:04, Chet Ramey wrote: > > On 11/27/17 12:51 PM, Hans Åberg wrote: > >> On MacOS 10.13, one can set locale environment variables. The Terminal >> default login shell reads .profile; xterm reads .bashrc. There are other >> ways to set them

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Chet Ramey
On 11/27/17 12:51 PM, Hans Åberg wrote: > On MacOS 10.13, one can set locale environment variables. The Terminal > default login shell reads .profile; xterm reads .bashrc. There are other ways > to set them system-wide, changing with the OS version. Terminal has been able to pass the locale

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Hans Åberg
> On 27 Nov 2017, at 19:35, Chet Ramey wrote: > > On 11/27/17 1:19 AM, Hans Åberg wrote: > The deprecated HFS uses UTF-16, but MacOS has LC_CTYPE=UTF-8; thus with no additional qualifications like in LC_CTYPE=en_US.UTF-8. It would be interesting to know

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Hans Åberg
> On 27 Nov 2017, at 10:43, Stephane Chazelas > wrote: > > 2017-11-26 22:40:45 +0100, Hans Åberg: > [...] >> The deprecated HFS uses UTF-16, but MacOS has LC_CTYPE=UTF-8; >> thus with no additional qualifications like in >> LC_CTYPE=en_US.UTF-8. It would be

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Joerg Schilling
Joseph Myers wrote: > On Sat, 25 Nov 2017, k...@keldix.com wrote: > > > systems, and also implementations that can conform using UTF-16 and > > different 8-bit codesets. For instance 'A' is coded x0041 (two bytes) in > > UTF-16 > > and x41 (only one byte) in cp850, and

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Joseph Myers
On Sat, 25 Nov 2017, k...@keldix.com wrote: > systems, and also implementations that can conform using UTF-16 and > different 8-bit codesets. For instance 'A' is coded x0041 (two bytes) in > UTF-16 > and x41 (only one byte) in cp850, and UTF-8. ISO C includes (C99 TC2 onwards) the requirement
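
For readers following the quoted 'A' example, here is a minimal sketch that prints the byte sequences. It uses Python's codec names, which are an assumption of this illustration and not something from the thread; "utf-16-be" is picked only so that no byte-order mark is prepended.

```python
# Encode one character in the three codesets named in the quoted message.
# The codec names are Python's own; "utf-16-be" avoids a byte-order mark.
CODECS = ("utf-16-be", "cp850", "utf-8")

def encoded_forms(ch):
    """Return a mapping of codec name to the raw bytes for ch."""
    return {name: ch.encode(name) for name in CODECS}

forms = encoded_forms("A")
for name, raw in forms.items():
    print(f"{name:10} {raw.hex()}  ({len(raw)} byte(s))")
```

The two-byte UTF-16 form contains a zero byte, which is one reason a byte-oriented C string API cannot carry it unchanged.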

Re: UTF-8 locale & POSIX text model

2017-11-27 Thread Hans Åberg
> On 27 Nov 2017, at 03:16, Chet Ramey wrote: > > On 11/26/17 1:40 PM, Hans Åberg wrote: > >> The deprecated HFS uses UTF-16, but MacOS has LC_CTYPE=UTF-8; thus with no >> additional qualifications like in LC_CTYPE=en_US.UTF-8. It would be >> interesting to know if it

Re: UTF-8 locale & POSIX text model

2017-11-26 Thread Hans Åberg
> On 26 Nov 2017, at 13:43, k...@keldix.com wrote: > > Well, the pathname processing should be a function of the filesystem. Eg if > you have a windows > filesystem, or an apple filesystem mounted on a linux operating system, then > the file names > of the foreign system should be interpreted

Re: UTF-8 locale & POSIX text model

2017-11-26 Thread Hans Åberg
> On 26 Nov 2017, at 13:43, k...@keldix.com wrote: > > I don't have Windows nor apple systems, but they run utf-16 natively, and > recent > Windows 10 systems have a full linux (ubuntu) subsystem. I could also see > problems > with utf-16 and posix, but at least apple should have solved that

Re: UTF-8 locale & POSIX text model

2017-11-26 Thread Stephane Chazelas
2017-11-26 14:07:50 +0100, k...@keldix.com: [...] > > For instance, as currently specified, POSIX says that the > > output of the "locale" utility be suitable for reinput to the > > shell and requiring double-quote quoting in some cases. > > > > Using double-quote quoting is problematic because

Re: UTF-8 locale & POSIX text model

2017-11-26 Thread k...@keldix.com
On Sun, Nov 26, 2017 at 02:09:21AM +, Danny Niu wrote: > > > On 26 Nov 2017, at 3:53 AM, k...@keldix.com wrote: > > On Wed, Nov 22, 2017 at 05:43:51PM +, Stephane Chazelas wrote: > 2017-11-22 16:27:15 +0100, Martijn Dekker: > Op 22-11-17 om 16:02 schreef Geoff

Re: UTF-8 locale & POSIX text model

2017-11-26 Thread Stephane Chazelas
2017-11-25 20:53:20 +0100, k...@keldix.com: [...] > > It just says those characters are the one constituting the > > portable character set. It doesn't specify the encoding other > > than it mandates the encoding of those characters to be > > invariant in the charsets in the system's supported

Re: UTF-8 locale & POSIX text model

2017-11-25 Thread Danny Niu
On 26 Nov 2017, at 3:53 AM, k...@keldix.com wrote: On Wed, Nov 22, 2017 at 05:43:51PM +, Stephane Chazelas wrote: 2017-11-22 16:27:15 +0100, Martijn Dekker: Op 22-11-17 om 16:02 schreef Geoff Clare: Danny Niu > wrote,

Re: UTF-8 locale & POSIX text model

2017-11-25 Thread keld
On Wed, Nov 22, 2017 at 05:43:51PM +, Stephane Chazelas wrote: > 2017-11-22 16:27:15 +0100, Martijn Dekker: > > Op 22-11-17 om 16:02 schreef Geoff Clare: > > > Danny Niu wrote, on 22 Nov 2017: > > >> > > >> Q1: What is the rationale for not making POSIX an application of

Re: Re: UTF-8 locale & POSIX text model

2017-11-24 Thread Shware Systems
It's my understanding that column was added to provide the stock names for those code points in creating charmap files in the format supported by localedef. This is an informative reference for convenience as the standard also lists those names elsewhere. As to Q2, the general direction I see

Re: UTF-8 locale & POSIX text model

2017-11-22 Thread Stephane Chazelas
2017-11-22 16:27:15 +0100, Martijn Dekker: > Op 22-11-17 om 16:02 schreef Geoff Clare: > > Danny Niu wrote, on 22 Nov 2017: > >> > >> Q1: What is the rationale for not making POSIX an application of ASCII? > > > > So that systems which use other encodings (specifically

Re: UTF-8 locale & POSIX text model

2017-11-22 Thread Geoff Clare
Danny Niu wrote, on 22 Nov 2017: > > Q1: What is the rationale for not making POSIX an application of ASCII? So that systems which use other encodings (specifically EBCDIC) can be POSIX-conforming. IBM z/OS is certified UNIX 95 and uses EBCDIC. -- Geoff Clare
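
To make the EBCDIC point concrete, a small sketch comparing byte values follows. It assumes Python's "cp037" codec as a stand-in for EBCDIC; that is only one EBCDIC code page, and the thread does not say which one z/OS uses.

```python
# Compare the byte value of a character in ASCII and in one EBCDIC code page.
# Assumption: cp037 is used purely as an example EBCDIC codec.
def ascii_vs_ebcdic(ch):
    """Return (ASCII byte, cp037 byte) for a single ASCII character."""
    return ch.encode("ascii")[0], ch.encode("cp037")[0]

for ch in ("A", "a", "0"):
    ascii_byte, ebcdic_byte = ascii_vs_ebcdic(ch)
    print(f"{ch!r}: ASCII 0x{ascii_byte:02X}, cp037 0x{ebcdic_byte:02X}")
```

The characters are the same, only their byte values differ, which is why a standard that fixed the ASCII values would exclude such systems.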

Re: UTF-8 locale & POSIX text model

2017-11-22 Thread Martijn Dekker
Op 22-11-17 om 13:58 schreef Danny Niu: > Q1: What is the rationale for not making POSIX an application of ASCII? Actually, it mostly is. POSIX mandates that all supported locales include the "portable character set", which is ASCII minus some control characters.
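
A rough sketch of "ASCII minus some control characters" is below. The list of eight retained control characters (NUL, alert, backspace, tab, newline, vertical tab, form feed, carriage return) is my reading of the portable character set table and should be checked against the standard; the variable names are made up for this illustration.

```python
import string

# Assumption: these eight control characters are the ones the portable
# character set keeps; verify against the standard before relying on it.
PORTABLE_CONTROLS = "\0\a\b\t\n\v\f\r"

# Space plus the ASCII digits, letters, and punctuation (95 characters).
PRINTABLE = " " + string.digits + string.ascii_letters + string.punctuation

portable = set(PORTABLE_CONTROLS + PRINTABLE)
all_ascii = {chr(i) for i in range(128)}

# Whatever ASCII has beyond the portable set, sorted by code point.
missing = sorted(all_ascii - portable)
print(len(portable), "portable characters;", len(missing), "ASCII codes left out")
print("left out:", " ".join(f"0x{ord(c):02X}" for c in missing))
```

Under that assumption, everything left out is a control code, which matches the description above.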