In a message dated 2001-11-08 21:09:35 Pacific Standard Time, 
[EMAIL PROTECTED] writes:

>  Anybody willing to check this for me?
>
>  CString sUTF32ToUTF8( LONG lUTF32 )

I haven't run it through a compiler, but most of it looks fine.  However, the 
algorithm would be a lot more transparent if the constants were hex instead 
of decimal (e.g. 0x1000 instead of 4096).

Also, I would have written it to use bit shifts instead of divisions and 
modulo operations (lUTF32 >> 12 instead of lUTF32 / 4096).
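
As a rough illustration of both suggestions, here is a minimal sketch of the 
conversion written with hex range limits, shifts, and masks.  It is not the 
original poster's function: the name utf32_to_utf8 is hypothetical, and 
std::string and char32_t stand in for CString and LONG so that the example is 
self-contained.  Returning an empty string for values above 0x10FFFF is also 
an assumption, since the original's error handling is not shown here.

```cpp
#include <string>

// Hypothetical helper (not the code under review): encodes one code point
// as UTF-8.  Range limits are in hex so each one lines up with a bit count:
// 0x80 = 2^7, 0x800 = 2^11, 0x10000 = 2^16.
std::string utf32_to_utf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {                        // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {                // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {              // 3 bytes: 1110xxxx + 2 trailers
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp <= 0x10FFFF) {            // 4 bytes: 11110xxx + 3 trailers
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    // Assumption: anything above 0x10FFFF yields an empty string.
    return out;
}
```

For non-negative values, cp >> 12 gives the same result as cp / 4096 and 
cp & 0x3F the same as cp % 64; the shift-and-mask form simply makes the 
six-bit grouping visible.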

And I don't think you're supposed to exclude the surrogate code space (0xD800 
through 0xDFFF) from normal processing.  (This is the "D29 conundrum" -- all 
UTFs must support encoding of non-characters, including unpaired surrogates, 
even though UTF-16 cannot do this.)  The code you provided encodes unpaired 
surrogates in four bytes -- by pushing them down to the final "else" -- which 
is wrong in any event and almost certainly not what the programmer intended.
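
To make the range argument concrete: 0xD800 through 0xDFFF lie between 0x800 
and 0xFFFF, so a plain range test with no surrogate exclusion puts them in 
the three-byte branch.  The sketch below shows only the length selection, 
under the reading given above; utf8_length is a hypothetical name, and 
returning 0 for out-of-range values is an assumption.

```cpp
#include <cstddef>

// Hypothetical helper: number of UTF-8 bytes chosen by range alone,
// with no special case for the surrogate code space.
std::size_t utf8_length(char32_t cp)
{
    if (cp < 0x80)      return 1;
    if (cp < 0x800)     return 2;
    if (cp < 0x10000)   return 3;   // includes 0xD800 through 0xDFFF
    if (cp <= 0x10FFFF) return 4;
    return 0;                       // assumption: 0 signals "out of range"
}
```

With this ordering, 0xD800 takes three bytes (ED A0 80 under the usual 
three-byte bit layout) rather than falling through to a four-byte case.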

-Doug Ewell
 Fullerton, California
