RE: UTF-8 and UTF-16

Marco . Cimarosti Fri, 06 Oct 2000 01:26:47 -0700

I muttered this incomprehensible paragraph:
> - UTF-16 has 16-bit units ("words") and uses 1 or 2 units per 
> character. Characters 000000 to 00FFFF use the corresponding 
> word; higher values use a pair of "surrogates", the first one 
> ("high") being in . It too exists in the same 3 variants as 
> bove: little-endian, high-endian, and BOM-marked.

(The passage above demonstrates that even the FAQ of FAQs may be puzzling
if you cut random chunks out of it. ;-) Sorry, I'm under a bit of
pressure; this is what I meant:

- UTF-16 has 16-bit units ("words") and uses 1 or 2 units per character.
Characters 000000 to 00FFFF use the corresponding word; higher values use a
pair of "surrogates", the first one ("high") being in range D800 to DBFF,
the second one ("low") in range DC00 to DFFF. It too exists in the same 3
variants as above: little-endian, big-endian, and BOM-marked.
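
To make the arithmetic concrete, here is a minimal sketch in Python of the rule described above. The function names (utf16_units, to_bytes) are made up for illustration, and the BOM value FEFF is not stated in the paragraph above; it is the standard byte order mark. Treat this as one possible reading of the rule, not a reference implementation.

```python
def utf16_units(code_point):
    """Return the 16-bit units ("words") that encode one code point."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the Unicode range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogate values are not characters themselves")
    if code_point <= 0xFFFF:
        # 000000 to 00FFFF: the character is its own word.
        return [code_point]
    # Higher values: subtract 10000 (hex), then split the remaining
    # 20 bits into two 10-bit halves.
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # lands in D800..DBFF
    low = 0xDC00 + (offset & 0x3FF)   # lands in DC00..DFFF
    return [high, low]


def to_bytes(units, little_endian=False, with_bom=False):
    """Serialize 16-bit units in one of the three variants."""
    order = "little" if little_endian else "big"
    if with_bom:
        # Assumption: the BOM-marked variant starts with the word FEFF.
        units = [0xFEFF] + list(units)
    return b"".join(unit.to_bytes(2, order) for unit in units)


if __name__ == "__main__":
    units = utf16_units(0x1D11E)
    print([format(u, "04X") for u in units])
    print(to_bytes(units).hex())
    print(to_bytes(units, little_endian=True).hex())
```

For example, utf16_units(0x1D11E) gives the pair D834, DD1E, while utf16_units(0x41) gives the single word 0041.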

_ Marco
