Re: any unicode conversion tools?

Stefan Persson Fri, 07 May 2004 11:50:42 -0700

Clark Cox wrote:

Note also that UTF-8 encoded sequences can be up to 5 bytes long...
How is that possible? I was under the impression that a UTF-8 sequence could never be more than 4 bytes (i.e. U+10FFFF becomes F4 8F BF BF).

Unicode and ISO/IEC 10646 define UTF-8 differently: Unicode stops at 4 bytes, while ISO/IEC 10646 allows longer sequences (up to 6 bytes in its original definition). However, every sequence longer than 4 bytes is either an illegal sequence or decodes to an illegal code point, i.e. one above U+10FFFF.
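
For anyone who wants to check this themselves, here is a rough Python sketch, not taken from either standard's text. The helper names utf8_length and is_valid_utf8 are made up for illustration, and the checks rely on Python's built-in "utf-8" codec following the Unicode (4-byte) definition.

```python
def utf8_length(code_point: int) -> int:
    """Bytes needed to encode one code point under the Unicode definition of UTF-8."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("outside the Unicode range U+0000..U+10FFFF")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate code points cannot be encoded in UTF-8")
    if code_point <= 0x7F:
        return 1
    if code_point <= 0x7FF:
        return 2
    if code_point <= 0xFFFF:
        return 3
    return 4


def is_valid_utf8(data: bytes) -> bool:
    """True if Python's strict UTF-8 decoder accepts the byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# The highest code point, U+10FFFF, takes exactly 4 bytes.
highest = chr(0x10FFFF).encode("utf-8")
print(utf8_length(0x10FFFF), highest.hex(" ").upper())

# A 4-byte pattern that would decode to U+110000 is rejected,
# and so is any sequence starting with a 5-byte lead byte (F8..FB).
print(is_valid_utf8(bytes([0xF4, 0x90, 0x80, 0x80])))
print(is_valid_utf8(bytes([0xF8, 0x88, 0x80, 0x80, 0x80])))
```

The first print should show the F4 8F BF BF encoding quoted above; the other two show that the decoder refuses anything beyond U+10FFFF, whether it is written with 4 bytes or with 5.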

Stefan
