Theodore H. Smith <delete at softhome dot net> wrote:

> My code I took from Uniconv.c fails on a roundtrip, converting 1114048
> from UTF32 to UTF8, then back again.
>
> I did modify the code however to make it faster. So can anyone here
> who uses Uniconv.c tell me if a roundtrip on 1114048 works fine?

I don't know what Uniconv.c is either. I know there is a Basis Technology product called Uniconv, but I'm pretty sure it doesn't come with source code.

Anyway, decimal 1114048 is hex 10FFC0. (Theodore, please try to use hexadecimal to refer to Unicode code points. Decimal is not conventionally used for that purpose and will probably confuse people.)

The UTF-8 bytes corresponding to U+10FFC0 are F4 8F BF 80. So you should check your UTF-8 encoding code first to ensure it yields the correct bytes. If the encoding stage is OK, then the problem must lie in the decoding stage.

-Doug Ewell
 Fullerton, California
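
For anyone who wants to test the two stages separately, here is a minimal sketch in Python of the standard UTF-8 bit layout for a single code point. It is not Uniconv.c or Theodore's modified code, neither of which appears in this thread; the names encode_utf8 and decode_utf8 are hypothetical and exist only for illustration. Comparing a converter's output against these two functions should show which stage goes wrong.

```python
def encode_utf8(cp):
    """Encode one Unicode code point (an int) as UTF-8 bytes."""
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("not a Unicode scalar value: %X" % cp)
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6),
                      0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def decode_utf8(data):
    """Decode exactly one UTF-8 sequence back to its code point."""
    if len(data) == 0:
        raise ValueError("empty input")
    lead = data[0]
    # The lead byte fixes the sequence length, the payload bits it carries,
    # and the smallest code point that may legally use that length.
    if lead < 0x80:
        length, cp, minimum = 1, lead, 0x0
    elif 0xC2 <= lead <= 0xDF:
        length, cp, minimum = 2, lead & 0x1F, 0x80
    elif 0xE0 <= lead <= 0xEF:
        length, cp, minimum = 3, lead & 0x0F, 0x800
    elif 0xF0 <= lead <= 0xF4:
        length, cp, minimum = 4, lead & 0x07, 0x10000
    else:
        raise ValueError("invalid lead byte: %02X" % lead)
    if len(data) != length:
        raise ValueError("expected %d bytes, got %d" % (length, len(data)))
    for b in data[1:]:
        if (b & 0xC0) != 0x80:
            raise ValueError("invalid continuation byte: %02X" % b)
        cp = (cp << 6) | (b & 0x3F)
    # Reject overlong forms, surrogates, and values above U+10FFFF.
    if cp < minimum or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError("ill-formed sequence")
    return cp


cp = 0x10FFC0                      # decimal 1114048
encoded = encode_utf8(cp)
print(" ".join("%02X" % b for b in encoded))   # prints: F4 8F BF 80
print(decode_utf8(encoded) == cp)              # prints: True
```

If the encoder under test produces anything other than F4 8F BF 80 for this input, the bug is on the encoding side; if the bytes match but the value that comes back is not 10FFC0, look at how the decoder handles the F4 lead byte and the upper limit of 10FFFF.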

