Re: FAQ !?

Markus Scherer Wed, 13 Dec 2000 09:46:46 -0800
[EMAIL PROTECTED] wrote:
> I guess this should be a FAQ (but isn't). I need code to convert Unicode
> data between various encoding schemes (UTF16LE to UTF32BE etc...). Are
> there standard routines I can use? If so, where can I find them?

The CD for the Unicode book should have some of this - in any case, these 
transformations are fairly simple.
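
To illustrate how simple, here is a minimal sketch of the UTF16LE to UTF32BE
case from the question. The function name and error convention are my own
assumptions for illustration, not taken from the book CD or from any library:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert UTF-16LE bytes to UTF-32BE bytes.
 * Returns the number of bytes written to dst, or (size_t)-1 if the input
 * is malformed (odd length, unpaired surrogate) or dst is too small. */
size_t utf16le_to_utf32be(const unsigned char *src, size_t src_len,
                          unsigned char *dst, size_t dst_cap)
{
    size_t i = 0, out = 0;

    if (src_len % 2 != 0)
        return (size_t)-1;

    while (i < src_len) {
        /* Read one 16-bit unit, least significant byte first. */
        uint32_t cp = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
        i += 2;

        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* Lead surrogate: a trail surrogate must follow. */
            uint32_t trail;
            if (src_len - i < 2)
                return (size_t)-1;
            trail = (uint32_t)src[i] | ((uint32_t)src[i + 1] << 8);
            if (trail < 0xDC00 || trail > 0xDFFF)
                return (size_t)-1;
            i += 2;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (trail - 0xDC00);
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            /* Trail surrogate without a preceding lead surrogate. */
            return (size_t)-1;
        }

        if (dst_cap - out < 4)
            return (size_t)-1;

        /* Write the code point as 4 bytes, most significant byte first. */
        dst[out++] = (unsigned char)(cp >> 24);
        dst[out++] = (unsigned char)(cp >> 16);
        dst[out++] = (unsigned char)(cp >> 8);
        dst[out++] = (unsigned char)cp;
    }
    return out;
}
```

Working on bytes rather than on 16-bit or 32-bit integers keeps the sketch
independent of the byte order of the machine it runs on. It does not handle
a byte order mark; a real converter would also need a policy for that.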

Unicode libraries provide these conversions; see 
http://www.unicode.org/unicode/onlinedat/products.html
For example, see ICU at http://oss.software.ibm.com/icu/ - look at the documentation 
and source code for the converters, and at the UTF macros in 
icu/source/common/unicode/utf.h

> As an aside. I have run into trouble porting a database application which
> stores UTF16LE data onto HPUX and SUN machines. I can see that wchar_t
> there is defined as unsigned long. So most probably all wcs*() functions
> would expect UTF32 encoded data. Am I correct in my assumption ?  What do
> I do to be certain ?

wchar_t is a very fuzzy type. It may be 8, 16, or 32 bits depending on the platform, 
and there is no general guarantee that it stores Unicode, so a 32-bit wchar_t does 
not by itself mean UTF-32. Most older systems use it for scalar character code 
points custom-built for the char* encoding.

> What online information can I look through for more information on such
> a problem ?

About wchar_t and Unicode, see "What size wchar_t do I need for Unicode?" at 
http://www-4.ibm.com/software/developer/library/uniwchar.html

To be certain, use typedefs whose size and meaning are fixed instead of relying 
on wchar_t. ICU and other libraries define types for string units and scalar code 
points that work on all platforms, and they provide functions to work with such 
Unicode strings and characters.
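
As a rough sketch of that idea, the typedef and function names below are
hypothetical (ICU's own types are called UChar and UChar32). It assumes a C99
compiler with <stdint.h>, and uses the C99 macro __STDC_ISO_10646__, which an
implementation defines when its wchar_t values are ISO 10646 code points:

```c
#include <limits.h>
#include <stdint.h>
#include <wchar.h>

/* Hypothetical fixed-size types, independent of the platform's wchar_t. */
typedef uint16_t utf16_unit_t;   /* one UTF-16 code unit     */
typedef uint32_t code_point_t;   /* one Unicode scalar value */

/* Width of wchar_t in bits on this platform. */
static int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* 1 if the implementation promises that wchar_t holds ISO 10646
 * (Unicode) code points, 0 if it makes no such promise. */
static int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

A result of 0 from wchar_is_iso10646() only means the compiler makes no
promise; in that case the platform documentation is the place to find out
what the wcs*() functions expect.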

Good luck,
markus