Glenn Fowler wrote:
> On Thu, 17 Jan 2008 01:50:38 +0100 Roland Mainz wrote:
> > Glenn Fowler wrote:
> > > On Tue, 15 Jan 2008 16:57:26 -0800 (PST) Don Cragun wrote:
> > > > ... Note that in the C
> > > > Standard, "character" is a single-byte character.
> > >
> > > so absent a standard multibyte interface ast/ksh will stick with
> > > the single byte characters provided by localeconv():
> > >         struct lconv *decimal_point
> > >         struct lconv *thousands_sep
> >
> > But what happens when these data point to multibyte characters (see
> > http://mail.opensolaris.org/pipermail/ksh93-integration-discuss/2008-January/005846.html
> > for an idea to split the arabic locales into one version which uses
> > ASCII characters and a 2nd version which uses the correct arabic
> > (multibyte) characters) ? AFAIK (Don may correct me) it's the author(s)
> > of the locale data which are responsible to define this correctly and
> > the "consumer side" (e.g. libast/ksh93) should just use the strings (and
> > not just the first byte) from struct lconv char* elements...
>
> if I understand Don correctly the C standard states that
>         struct lconv *decimal_point
>         struct lconv *thousands_sep
> each point to a character, and a character is a one byte quantity
>
> so it doesn't matter how many bytes are pointed to, only the first
> counts for both decimal_point and thousands_sep
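
To make the two readings concrete, here is a minimal sketch in C (not
libast/ksh93 code). The helper names current_decimal_point(),
match_sep_full() and match_sep_first_byte() are hypothetical. The sketch
assumes that the |char *| members of |struct lconv| are '\0'-terminated,
and the multibyte byte sequences in the comments (U+066B and U+066C encoded
as UTF-8) are only an illustration of what locale data might supply, not a
claim about any shipped locale.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: the decimal point string of the current locale.
 * localeconv() returns a pointer to a struct lconv whose decimal_point
 * member is a char *; in the "C" locale it is ".". */
static const char *current_decimal_point(void)
{
    struct lconv *lc = localeconv();
    return lc->decimal_point;
}

/* "Use the whole string" reading: match every byte of the separator.
 * Assumes sep is '\0'-terminated. Returns the number of bytes matched
 * at p, or 0 if the separator is empty or does not match. */
static size_t match_sep_full(const char *p, const char *sep)
{
    size_t n = (sep != NULL) ? strlen(sep) : 0;

    if (n == 0)
        return 0;
    return (strncmp(p, sep, n) == 0) ? n : 0;
}

/* "Only the first byte counts" reading: compare a single byte.
 * Returns 1 on a match, 0 otherwise. */
static size_t match_sep_first_byte(const char *p, const char *sep)
{
    if (sep == NULL || sep[0] == '\0')
        return 0;
    return (p[0] == sep[0]) ? 1 : 0;
}

/* Illustration only (assumed locale data, UTF-8):
 *   decimal_point = "\xD9\xAB"   U+066B ARABIC DECIMAL SEPARATOR
 *   thousands_sep = "\xD9\xAC"   U+066C ARABIC THOUSANDS SEPARATOR
 * Both sequences begin with the byte 0xD9. */
```

With those two byte sequences the first-byte reading cannot tell the
decimal point from the thousands separator, because both start with 0xD9:
match_sep_first_byte("\xD9\xAC", "\xD9\xAB") reports a match, while
match_sep_full() rejects it. For single-byte separators such as "." the
two helpers agree.
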

The standard says the behaviour is "unspecified", not "forbidden" (or
"komodo dragons will eat you if you do this"):
-- snip --
"In contexts where standards limit the decimal_point to a single byte,
the result of specifying a multi-byte operand shall be unspecified."
and
"In contexts where standards limit the thousands_sep to a single byte
-- snip --

More interesting, AFAIK, is the practical side: Are all |char *| members
of |struct lconv| terminated by a '\0'? If this is true on all platforms,
it shouldn't be a problem to extend the existing single-byte behaviour to
multibyte characters. AFAIK it's only an extension of the existing
standard, and it's up to the authors of the locale data whether to use
multibyte characters... or not?

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)