Re: wchar_t <--> Unicode Conversion

H. Peter Anvin Sat, 02 Jun 2001 14:07:07 -0700
Followup to:  <[EMAIL PROTECTED]>
By author:    "Michael B. Allen" <[EMAIL PROTECTED]>
In newsgroup: linux.utf8
>
> > > Why doesn't wchar_t play nice with Unicode?
> > 
> > It does, if your C implementation defines the macro name
> > __STDC_ISO_10646__ (see the C standard for additional information).
> 
> No. It doesn't regardless of how the C99 macro is set. AFAICT you cannot
> convert from wchar_t to an arbitrary encoding without going through a
> 7 or 8 bit locale dependent encoding such as UTF-8 or ISO-8859-1. For
> example, if I have a lot of UCS-2 code and want to use wchar_t functions
> like wprintf I must first convert the string to UTF-8 using iconv and
> then again to wchar_t * with mbstowcs.
> 

Sorry, that's wrong.  If __STDC_ISO_10646__ is defined BY THE
IMPLEMENTATION (not by you!), then wchar_t is UCS.  ISO 9899:1999,
§6.10.8.2.
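
As a rough illustration of asking the implementation rather than guessing,
the sketch below reports what the compiler promises about wchar_t.  The
helper names iso10646_revision and report_wchar_encoding are made up for
this example and are not part of any library; the only thing taken from
the standard is the macro itself, which expands to a yyyymmL value when
it is defined.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: returns the yyyymm revision of ISO/IEC 10646 that
 * the implementation claims for wchar_t, or 0 if it makes no such claim. */
long iso10646_revision(void)
{
#ifdef __STDC_ISO_10646__
    return (long)__STDC_ISO_10646__;
#else
    return 0L;
#endif
}

/* Hypothetical helper: prints a one-line summary of the wchar_t encoding. */
void report_wchar_encoding(FILE *out)
{
    assert(out != NULL);

    long rev = iso10646_revision();
    if (rev != 0)
        fprintf(out, "wchar_t holds ISO 10646 code points (revision %ld), "
                     "%zu bytes wide\n", rev, sizeof(wchar_t));
    else
        fprintf(out, "wchar_t encoding is implementation-specific\n");
}
```

Calling report_wchar_encoding(stdout) at startup is one cheap way to find
out which of the two cases a given platform falls into.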

If the implementation doesn't define __STDC_ISO_10646__ then you're on
your own.  The sane way to code this is, of course:

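#include <stdint.h>     /* uint16_t */
#include <wchar.h>      /* wchar_t */

uint16_t *p;    /* UCS-2 input, 0-terminated */
wchar_t *q;     /* wchar_t output, room for the input plus a terminator */

#ifdef __STDC_ISO_10646__
        /* Convert UCS-2 to wchar_t (typically UCS-4): each 16-bit
           value is already the code point, so just widen it */
        while ( *p )
                *q++ = *p++;
        *q = L'\0';     /* terminate the output */
#else
        /* Do nasty things with iconv() and mbstowcs */
#endif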
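
For what the #else branch might look like, here is a sketch only, going
from UCS-2 to UTF-8 with iconv() and then to wchar_t with mbstowcs().
The function name ucs2_to_wcs_via_utf8 is hypothetical.  The encoding
names "UCS-2LE" and "UCS-2BE" are accepted by glibc and GNU libiconv but
are an assumption elsewhere, and mbstowcs() decodes according to the
current LC_CTYPE locale, so non-ASCII text only survives if the caller
has already selected a UTF-8 locale with setlocale().

```c
#include <assert.h>
#include <iconv.h>
#include <stdint.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: convert a 0-terminated, host-byte-order UCS-2
 * string to wchar_t by way of UTF-8.  Returns the number of wide
 * characters written (excluding the terminator), or (size_t)-1 on error
 * or if dst is too small. */
size_t ucs2_to_wcs_via_utf8(const uint16_t *src, wchar_t *dst, size_t dst_len)
{
    assert(src != NULL && dst != NULL && dst_len > 0);

    /* Pick the iconv encoding name that matches the host byte order. */
    const uint16_t probe = 1;
    const char *from = (*(const unsigned char *)&probe) ? "UCS-2LE"
                                                        : "UCS-2BE";

    /* Count the UCS-2 code units in the input. */
    size_t n = 0;
    while (src[n] != 0)
        n++;

    /* One UCS-2 code unit needs at most 3 bytes in UTF-8. */
    size_t utf8_cap = n * 3 + 1;
    char *utf8 = malloc(utf8_cap);
    if (utf8 == NULL)
        return (size_t)-1;

    iconv_t cd = iconv_open("UTF-8", from);
    if (cd == (iconv_t)-1) {
        free(utf8);
        return (size_t)-1;
    }

    /* Step 1: UCS-2 to UTF-8. */
    char *in = (char *)src;
    size_t in_left = n * sizeof *src;
    char *out = utf8;
    size_t out_left = utf8_cap - 1;
    size_t rc = iconv(cd, &in, &in_left, &out, &out_left);
    iconv_close(cd);
    if (rc == (size_t)-1) {
        free(utf8);
        return (size_t)-1;
    }
    *out = '\0';

    /* Step 2: UTF-8 to wchar_t, using the current LC_CTYPE locale. */
    size_t written = mbstowcs(dst, utf8, dst_len);
    free(utf8);
    if (written == (size_t)-1)
        return (size_t)-1;
    if (written == dst_len) {
        /* Output filled up before the terminator could be stored. */
        dst[dst_len - 1] = L'\0';
        return (size_t)-1;
    }
    return written;
}
```

Two conversions and a heap allocation for what is a plain widening copy
in the other branch is exactly why the direct loop is preferable whenever
the implementation defines __STDC_ISO_10646__.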


        -hpa


-- 
<[EMAIL PROTECTED]> at work, <[EMAIL PROTECTED]> in private!
"Unix gives you enough rope to shoot yourself in the foot."
http://www.zytor.com/~hpa/puzzle.txt
-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/