Bits is bits; storing UTF-16 into a 4-byte wchar_t string can work. But the unwary might make the mistake of passing a wchar_t string containing UTF-16 with surrogate pairs to a runtime library function that expects UTF-32. Seems to me that all bets are off at that point.
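
To make the hazard concrete, here is a minimal sketch of a check one could run before handing a wide string to a function that expects UTF-32. The helper names (is_surrogate, contains_surrogates) are made up for illustration and are not part of any library; the only fact relied on is that the range 0xD800 to 0xDFFF is reserved for UTF-16 surrogates and is never a valid UTF-32 code point.

```cpp
#include <string>

// True if the value lies in the UTF-16 surrogate range (high or low half).
// Such a value is never a valid UTF-32 code point on its own.
inline bool is_surrogate(unsigned long unit) {
    return unit >= 0xD800UL && unit <= 0xDFFFUL;
}

// Hypothetical helper: reports whether a wchar_t string carries UTF-16
// surrogate code units. On a platform with a 4-byte wchar_t, a true result
// means the string holds UTF-16 and should not go to UTF-32 functions.
inline bool contains_surrogates(const std::wstring& s) {
    for (wchar_t c : s) {
        if (is_surrogate(static_cast<unsigned long>(c))) {
            return true;
        }
    }
    return false;
}
```

A string that stays inside the BMP passes this check whichever encoding was stored, which is why the problem tends to go unnoticed until a character outside the BMP turns up.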
The central point, I think, is that you have to understand the different types and what gets stored in them to use them safely with various libraries. You do, and can make robust decisions based on your understanding, but many people don't understand that functions that process wchar_t strings expect different sequences of bytes on different platforms.

-----Original Message-----
From: Boris Kolpackov [mailto:[EMAIL PROTECTED]
Sent: Wednesday, September 19, 2007 3:12 PM
To: [email protected]
Subject: Re: WCHAR to XmlCh

Jesse Pelton <[EMAIL PROTECTED]> writes:

> That assumes wchar_t holds UTF-16 (as XMLCh does). It might not. See
> http://www.losingfight.com/blog/2006/07/28/wchar_t-unsafe-at-any-size/
> for a wchar_t story that would be amusing if it were fiction.

What most people fail to realize is that wchar_t holds whatever you put into it. If you want portable UTF-16 in wchar_t then put UTF-16 into it, even on platforms where wchar_t is 4 bytes long and can hold UTF-32.

Alternatively, it is possible to use UTF-16 on platforms where wchar_t is 2 bytes long and UTF-32 on the rest. The only parts that will need to know about this arrangement are those that are responsible for converting to/from wchar_t strings (e.g., XMLCh to/from wchar_t).

If the application does not need to do anything special with (e.g., search for) characters that are outside the BMP (Basic Multilingual Plane), then it can use wchar_t that contains either UTF-16 or UTF-32 without actually caring which one it is. And I am pretty sure this is 99.9% of applications.

We use this approach in our XML data binding tool when the user requests the underlying character type to be wchar_t.

Boris

--
Boris Kolpackov
Code Synthesis Tools CC
http://www.codesynthesis.com
Open-Source, Cross-Platform C++ XML Data Binding
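
The second arrangement in the quoted message (UTF-16 where wchar_t is 2 bytes, UTF-32 elsewhere) could be sketched roughly as below. This is an illustration only, not the code from Boris's tool and not a Xerces-C++ API: utf16_to_wide is a hypothetical name, and char16_t stands in for XMLCh on the assumption that XMLCh is a 16-bit UTF-16 code unit.

```cpp
#include <cstddef>
#include <string>

// Hypothetical converter; char16_t is a stand-in for XMLCh (assumed 16-bit).
// With a 2-byte wchar_t the UTF-16 code units are copied unchanged.
// With a 4-byte wchar_t each surrogate pair is combined into one code point.
inline std::wstring utf16_to_wide(const std::u16string& in) {
    std::wstring out;
    out.reserve(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) {
        const char32_t hi = in[i];
        if (sizeof(wchar_t) >= 4 && hi >= 0xD800 && hi <= 0xDBFF &&
            i + 1 < in.size()) {
            const char32_t lo = in[i + 1];
            if (lo >= 0xDC00 && lo <= 0xDFFF) {
                // Standard UTF-16 decoding of a high/low surrogate pair.
                const char32_t cp =
                    0x10000 + ((hi - 0xD800) << 10) + (lo - 0xDC00);
                out.push_back(static_cast<wchar_t>(cp));
                ++i;  // the low surrogate has been consumed
                continue;
            }
        }
        // BMP unit, unpaired surrogate, or 2-byte wchar_t: copy as-is.
        out.push_back(static_cast<wchar_t>(hi));
    }
    return out;
}
```

Only a boundary function like this needs to know which encoding ends up in the wchar_t string; an unpaired surrogate is passed through untouched here, which a stricter converter might prefer to reject.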
