ast uses localeconv(3)
but
it assumes each struct lconv char* element points to a single byte
we will recode to treat the returned elements as 0-terminated strings

it will be up to the native localeconv() implementation
to do the right thing
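
a minimal sketch of treating the radix as a 0-terminated string instead of
a single byte. this is not the libast code: put_radix() and locale_radix()
are hypothetical names used only for illustration, and the input number is
assumed to be already formatted with '.' as its radix.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of libast.
 * Return the locale radix as a 0-terminated string, falling back to "."
 * when the locale does not define one.
 */
const char *locale_radix(void)
{
    const struct lconv *lc = localeconv();

    if (lc && lc->decimal_point && *lc->decimal_point)
        return lc->decimal_point;
    return ".";
}

/*
 * Hypothetical helper, not part of libast.
 * Copy num (formatted with '.' as radix) into buf, replacing the first '.'
 * with every byte of radix. Returns the resulting length, or -1 if buf is
 * too small.
 */
int put_radix(char *buf, size_t size, const char *num, const char *radix)
{
    const char *dot = strchr(num, '.');
    size_t head = dot ? (size_t)(dot - num) : strlen(num);
    size_t rlen = dot ? strlen(radix) : 0;
    size_t tail = dot ? strlen(dot + 1) : 0;

    if (head + rlen + tail + 1 > size)
        return -1;
    memcpy(buf, num, head);                 /* digits before the radix */
    memcpy(buf + head, radix, rlen);        /* whole radix, not just byte 0 */
    if (tail)
        memcpy(buf + head + rlen, dot + 1, tail);
    buf[head + rlen + tail] = '\0';
    return (int)(head + rlen + tail);
}
```

with a radix of 0xd9 0xab both bytes land in the output, which is the
behavior CR #6558816 asks for. whether localeconv() actually returns that
string is still up to the native implementation.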

I just checked on sol11 and for LC_ALL=ar_SA.UTF-8
struct lconv decimal_point = ","
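
a check like the one above can be sketched as follows. radix_hex() is a
hypothetical name, not an existing API; it assumes the named locale is
installed and simply dumps whatever bytes localeconv() reports.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: switch LC_NUMERIC to the named locale and write the
 * bytes of its decimal_point into out as space-separated hex.
 * Returns 0 on success, -1 if the locale is unavailable or out is too small.
 */
int radix_hex(const char *locale, char *out, size_t size)
{
    const unsigned char *p;
    size_t used = 0;

    if (size == 0 || !setlocale(LC_NUMERIC, locale))
        return -1;
    out[0] = '\0';
    for (p = (const unsigned char *)localeconv()->decimal_point; *p; p++) {
        int n = snprintf(out + used, size - used,
                         used ? " %02x" : "%02x", (unsigned)*p);

        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }
    return 0;
}
```

calling radix_hex("ar_SA.UTF-8", buf, sizeof buf) on a system whose locale
data defines a comma would leave "2c" in buf; a locale defining U+066B
would yield "d9 ab".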

-- Glenn Fowler -- AT&T Research, Florham Park NJ --

On Tue, 15 Jan 2008 00:52:19 +0100 Roland Mainz wrote:
> I'm currently trying to figure out whether
> http://bugs.opensolaris.org/view_bug.do?bug_id=6558816 ("printf variants
> behaving incorrectly for multibyte decimal point") applies to the ksh93
> "printf" builtin command, too.

> The (public) description for CR #6558816 says:
> -- snip --
> In snv_62 for ar_EG.UTF-8/ar_SA.UTF-8 locales the decimal point is
> defined as 0x066b, the arabic decimal point. This is a multibyte
> character with UTF-8 representation 0xd9 0xab.

> Compiling and running the following program in ar_EG.UTF-8/ar_SA.UTF-8
> locales

> #include <stdio.h>
> #include <stdlib.h>
> #include <locale.h>

> int main(void) {
>       float g=10.111;

>       setlocale(LC_ALL,"");
>       printf("%f\n", g);
>       return 0;
> }

> #./a.out | od -x
> gives

> 0000000 3130 d931 3131 3130 300a

> which is not correct, the decimal point is chopped off at the first byte

> The following should be the right output

> 0000000 3130 d9ab 3131 3131 3030 0a
>  
> according to man -s 3C printf

>      All forms of the printf() functions allow for the  insertion
>      of  a  language-dependent  radix  character  in  the  output
>      string. The radix character  is  defined  by  the  program's
>      locale  (category  LC_NUMERIC). In the POSIX locale, or in a
>      locale where the radix character is not defined,  the  radix
>      character defaults to a period (.).
> -- snip --

> ast-ksh.2008-01-06 returns the following output:
> -- snip --
> $ ksh93 -c 'float i=10.111 ; export LC_ALL=ar_EG.UTF-8 ; printf "%f" i' | od -t x1
> 0000000 31 30 2c 31 31 31 30 30 30
> 0000011
> -- snip --
> ... e.g. it uses a comma (',') and not Unicode 0x066b ...

> Kenjiro: Which API should be used to obtain the multibyte character
> value for the arabic decimal point (note that ksh93 uses
> |libast::printf()| and not Solaris's |libc::printf()|, e.g. any fix for
> Solaris libc needs to be ported to |libast::printf()|, too) ?

> BTW: Is it a bug that $ LC_ALL=ar_SA.UTF-8 locale -k decimal_point #
> returns a comma (',' ) ?

