Ienup Sung wrote:
> Roland Mainz wrote at 01/14/08 15:52:
> > BTW: Is it a bug that $ LC_ALL=ar_SA.UTF-8 locale -k decimal_point #
> > returns a comma (',' ) ?
>
> Using comma (',' or 0x2c) as the decimal_point value of ar_SA.UTF-8 isn't
> correct but rather it should be either Arabic decimal separator (U+066B in
> Unicode) or, in my opinion, period ('.' or 0x2e) as in "out of no proper
> single byte character", for the use with ASCII digits, or both.
>
> I though can guess why there is comma in the current locale definition:
>
> Majority of Arabic speaking countries at Middle East, the proper decimal
> point with Arabic-Indic digits would be ARABIC DECIMAL SEPARATOR U+066B and
> the proper thousand separator would be ARABIC THOUSANDS SEPARATOR U+066C.
>
> The U+066B looks similar to ASCII comma but the two are completely different
> characters. The same also goes to the U+066C and ASCII apostrophe.
>
> I *think* the locale owner,

Who is the locale owner ?

> experiencing problems mentioned in the noted
> CR 6558816 and other bugs

What are the CR #-numbers of the other bugs ?

> after the recent change of the decimal_point and
> thousands_sep values from ASCII period and ASCII comma to U+066B and U+066C,

Slightly offtopic: It seems that CDE's dtterm default font in the
en_US.UTF-8 locale doesn't have a glyph available for U+066C ... ;-(

> somehow figured that using ASCII comma for decimal_point and ASCII
> apostrophe for thousands_sep might be better (since possibly they look
> similar) and resorted into that compromise.
>
> To me, probably the best compromise might have been having ASCII period for
> decimal_point and an empty string for thousands_sep.

What about extending this compromise a bit (refining my proposal from
http://mail.opensolaris.org/pipermail/ksh93-integration-discuss/2008-January/005846.html)?
AFAIK the idea of multibyte characters for { decimal_point, thousands_sep,
etc. } sounds interesting, but I can also see the issue that
non-multibyte-aware applications won't like this:
1) Create "ar_SA.UTF-8@ascii_numeric" (uses ASCII characters for
|decimal_point| and |thousands_sep|)
2) Create "ar_SA.UTF-8@arabic_numeric" (uses multibyte characters, i.e. the
Arabic separators, for |decimal_point| and |thousands_sep|)
3) Make "ar_SA.UTF-8" an alias to "ar_SA.UTF-8@ascii_numeric" for now

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
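
P.S. To illustrate why applications that are not multibyte-aware may trip
over the Arabic separators, here is a minimal Python sketch, assuming the
locale's codeset is UTF-8. It only measures how many bytes each candidate
separator needs; the helper names (utf8_width, fits_single_byte) are
hypothetical and not part of any locale tooling.

```python
# Candidate separator values discussed in this thread.
ASCII_PERIOD = "."
ASCII_COMMA = ","
ARABIC_DECIMAL_SEPARATOR = "\u066b"    # U+066B
ARABIC_THOUSANDS_SEPARATOR = "\u066c"  # U+066C


def utf8_width(sep: str) -> int:
    """Return the number of bytes the separator needs in UTF-8."""
    return len(sep.encode("utf-8"))


def fits_single_byte(sep: str) -> bool:
    """Return True if code that keeps the separator in one byte can hold it.

    An empty string (no separator at all) trivially fits.
    This is an illustrative assumption about what a
    non-multibyte-aware application does, not a measured behaviour.
    """
    return utf8_width(sep) <= 1


if __name__ == "__main__":
    candidates = [
        ("ASCII period", ASCII_PERIOD),
        ("ASCII comma", ASCII_COMMA),
        ("U+066B", ARABIC_DECIMAL_SEPARATOR),
        ("U+066C", ARABIC_THOUSANDS_SEPARATOR),
        ("empty string", ""),
    ]
    for name, sep in candidates:
        print(name, utf8_width(sep), fits_single_byte(sep))
```

Both Arabic separators need two bytes in UTF-8, which is the case an
"@ascii_numeric" variant would sidestep.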