Well, first, it is 17 planes (or have we switched to using hexadecimal
numbers on the Unicode list already?)
Second, of course this is in connection with UTF-16. I wasn't involved
when UTF-16 was created, but it must have become clear that 2^16 (^
denotes exponentiation (to the power of))
That's a valid computation if the extension was limited to use only
2-surrogate encodings for supplementary planes.
If we could use 3-surrogate encodings, you'd need
3*2^n surrogates
to encode
2^(3*n)
new codepoints.
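The arithmetic above can be checked with a short sketch. It assumes that each position in a k-surrogate sequence draws from its own disjoint range of 2^n code units, the way UTF-16 keeps its high and low surrogates apart; the function name `surrogate_budget` and the constant `PLANE` are made up for illustration and do not come from any standard or library.

```python
def surrogate_budget(k: int, n: int = 10) -> tuple[int, int]:
    """Return (surrogates reserved, code points reachable).

    Assumption: every supplementary code point is written as a sequence
    of k surrogates, and each position has its own range of 2**n values.
    """
    reserved = k * 2**n       # k disjoint ranges of 2**n surrogates each
    reachable = 2**(k * n)    # one independent choice per position
    return reserved, reachable


PLANE = 2**16  # code points per plane

for k in (2, 3):
    reserved, reachable = surrogate_budget(k)
    # Columns: k, surrogates reserved, code points reachable, planes covered
    print(k, reserved, reachable, reachable // PLANE)
```

With k=2 and n=10 this gives the 2048 surrogates and 16 supplementary planes of today's UTF-16; with k=3 it gives 3072 surrogates and 2^30 code points.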
With n=10 (like today), this requires a total of 3072 surrogates, and you
Note that the **current bet** that the existing 17 planes will be sufficient
is valid only if there's no international desire to encode something other
than just what is in the current focus of Unicode.
Say (for example) that the WIPO absolutely wants to encode corporate logos.
Or ISO or the IETF
On Tuesday 27 November 2012, Philippe Verdy verd...@wanadoo.fr wrote:
This is not complicated to parse in the forward direction, but for the
backward direction, it means that when you see the final low surrogate, you
still need to roll back to the previous position: it can only be a
There isn't an actual problem here which needs a solution, satisfactory or
otherwise. The persistence of the "17 planes may not be enough" meme on this
list is an interesting phenomenon in itself, but has no practical impact on any
of the actual ongoing work on maintenance of the encoding