The protocol uses UTF-8 but not UTF-16.
That's a valid point, but there are no semantic differences between the two: both encodings can express the full Unicode character set. UTF-8 is just more compact (at least for Latin-script text; most CJK text is actually smaller in UTF-16), backward-compatible with ASCII, and never contains an embedded null byte unless the text itself includes U+0000.
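To make the comparison concrete, here is a small Python sketch. The helper name `encoding_summary` is made up for illustration, and it uses `utf-16-le` so that no byte-order mark skews the byte counts. It is only meant to show the size and null-byte behavior of the two encodings, not anything specific to the protocol.

```python
def encoding_summary(text: str) -> dict:
    """Compare how UTF-8 and UTF-16 (little-endian, no BOM) encode the same text."""
    utf8 = text.encode("utf-8")
    utf16 = text.encode("utf-16-le")  # explicit byte order, so no BOM is added
    return {
        "utf8_len": len(utf8),
        "utf16_len": len(utf16),
        # A zero byte only appears in UTF-8 if the text itself contains U+0000.
        "utf8_has_null": b"\x00" in utf8,
        # UTF-16 has a zero byte for any code point below U+0100 (e.g. all ASCII).
        "utf16_has_null": b"\x00" in utf16,
        # Both encodings decode back to the identical string.
        "round_trips": utf8.decode("utf-8") == text
        and utf16.decode("utf-16-le") == text,
    }


if __name__ == "__main__":
    for sample in ("hello", "日本語", "😀"):
        print(repr(sample), encoding_summary(sample))
```

For plain ASCII such as "hello", UTF-8 needs half the bytes and the UTF-16 form is full of zero bytes. For the Japanese sample UTF-16 is the smaller one, and for a character outside the Basic Multilingual Plane both need four bytes. In every case the decoded text is the same.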
--Jens
