On Tue, 13 Oct 2015 15:23:36 +0000 David Starner <[email protected]> wrote:
> A UTF-16 string could delete one surrogate, or add a fractional
> character. A Unicode string (not a "UTF-16 string"), which could be
> stored internally in, say, a Python-like format which is Latin-1,
> UCS-2, or UTF-32, conversions made as needed and differences hidden
> from the user, can't.

Confusingly, the Unicode definitions are the other way round. A UTF-16 string is a string of UTF-16 code units in which all surrogates are paired. Any string of UTF-16 code units is a Unicode 16-bit string.

Richard.
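
To illustrate the distinction between the two terms, here is a minimal sketch in Python. It assumes the string is given as a list of 16-bit integers; the function name is_well_formed_utf16 is hypothetical and is not taken from the Unicode Standard or any library. Any such list counts as a Unicode 16-bit string, but only a list that passes this check would count as a UTF-16 string.

```python
def is_well_formed_utf16(units):
    """Return True if every surrogate in the code unit sequence is paired.

    `units` is a sequence of integers in the range 0x0000..0xFFFF
    (an assumption of this sketch; the range is not checked here).
    """
    i = 0
    n = len(units)
    while i < n:
        u = units[i]
        if 0xD800 <= u <= 0xDBFF:
            # A high surrogate must be immediately followed by a low surrogate.
            if i + 1 < n and 0xDC00 <= units[i + 1] <= 0xDFFF:
                i += 2
                continue
            return False
        if 0xDC00 <= u <= 0xDFFF:
            # A low surrogate with no preceding high surrogate is unpaired.
            return False
        i += 1
    return True


# "A" followed by a surrogate pair: a Unicode 16-bit string and a UTF-16 string.
paired = [0x0041, 0xD83D, 0xDE00]
# The same sequence with the low surrogate deleted: still a Unicode 16-bit
# string, but no longer a UTF-16 string.
truncated = [0x0041, 0xD83D]
```

Deleting one code unit from paired gives truncated, which is the "delete one surrogate" case from the quoted message.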

