[ksh93-integration-discuss] ksh93 "printf" builtin vs. CR #6558816 ("printf variants behaving incorrectly for multibyte decimal point") ...

Roland Mainz Tue, 15 Jan 2008 00:52:19 +0100

Hi!

----


I'm currently trying to figure out whether
http://bugs.opensolaris.org/view_bug.do?bug_id=6558816 ("printf variants
behaving incorrectly for multibyte decimal point") applies to the ksh93
"printf" builtin command, too.

The (public) description for CR #6558816 says:
-- snip --
In snv_62 for ar_EG.UTF-8/ar_SA.UTF-8 locales the decimal point is
defined as 0x066b, the Arabic decimal point. This is a multibyte
character with UTF-8 representation 0xd9 0xab.

Compiling and running the following program in ar_EG.UTF-8/ar_SA.UTF-8
locales

#include <stdio.h>
#include <stdlib.h>
#include <locale.h>

int main(void) {
        float g=10.111;

        /* use the locale from the environment (LC_ALL/LC_NUMERIC) */
        setlocale(LC_ALL,"");
        printf("%f\n", g);
        return 0;
}


#./a.out | od -x
gives

0000000 3130 d931 3131 3130 300a

which is not correct: the decimal point is cut off after its first byte.

The following should be the right output

0000000 3130 d9ab 3131 3131 3030 0a
 
according to man -s 3C printf

     All forms of the printf() functions allow for the  insertion
     of  a  language-dependent  radix  character  in  the  output
     string. The radix character  is  defined  by  the  program's
     locale  (category  LC_NUMERIC). In the POSIX locale, or in a
     locale where the radix character is not defined,  the  radix
     character defaults to a period (.).
-- snip --

ast-ksh.2008-01-06 returns the following output:
-- snip --
$ ksh93 -c 'float i=10.111 ; export LC_ALL=ar_EG.UTF-8 ; printf "%f" i' | od -t x1
0000000 31 30 2c 31 31 31 30 30 30
0000011
-- snip --
... i.e. it uses a comma (',', 0x2c) and not the Arabic decimal point U+066B ...

Kenjiro: Which API should be used to obtain the multibyte character
value for the Arabic decimal point? Note that ksh93 uses
|libast::printf()| and not Solaris's |libc::printf()|, i.e. any fix for
Solaris libc needs to be ported to |libast::printf()|, too.

BTW: Is it a bug that
$ LC_ALL=ar_SA.UTF-8 locale -k decimal_point
returns a comma (',')?

----

Bye,
Roland

-- 
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)
