In a message dated 2001-11-08 21:09:35 Pacific Standard Time, [EMAIL PROTECTED] writes:
> Anybody willing to check this for me?
>
> CString sUTF32ToUTF8( LONG lUTF32 )

I haven't run it through a compiler, but most of it looks fine. However, the algorithm would be a lot more transparent if the constants were hex instead of decimal (e.g. 0x1000 instead of 4096). Also, I would have written it to use bit shifts instead of divisions and modulos (lUTF32 >> 12 instead of lUTF32 / 4096).

And I don't think you're supposed to exclude the surrogate code space (0xD800 through 0xDFFF) from normal processing. (This is the "D29 conundrum" -- all UTFs must support encoding of non-characters, including unpaired surrogates, even though UTF-16 cannot do this.) The code you provided encodes unpaired surrogates in four bytes -- by pushing them down to the final "else" -- which is wrong in any event and almost certainly not what the programmer intended.

-Doug Ewell
 Fullerton, California
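
For illustration only, the suggestions in the message above (hex constants, shifts and masks rather than divisions and modulos, and no special case for the surrogate range) might look roughly like the sketch below. It is not the code that was posted for review, which is not reproduced here. The name utf32_to_utf8 is hypothetical, std::string stands in for CString so the sketch compiles without MFC, and returning an empty string for values above 0x10FFFF is an assumption about error handling.

```cpp
#include <cassert>   // only needed for the usage checks mentioned after this block
#include <cstdint>
#include <string>

// Hypothetical stand-in for the sUTF32ToUTF8 function under review.
// Encodes one UTF-32 code unit as a UTF-8 byte sequence.
std::string utf32_to_utf8(std::uint32_t cp) {
    std::string out;
    if (cp < 0x80) {
        // 1 byte: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        // 0xD800..0xDFFF is deliberately not special-cased, as the message
        // suggests, so an unpaired surrogate comes out as three bytes.
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x110000) {
        // 4 bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    // Assumption: values above 0x10FFFF fall through and yield an empty string.
    return out;
}
```

With this sketch, assert(utf32_to_utf8(0x20AC) == "\xE2\x82\xAC") holds, and 0xD800 yields the three bytes ED A0 80 rather than a four-byte sequence. Whether a caller should reject surrogate values before encoding is a separate policy decision that the sketch leaves open.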

