On Wednesday, 6 December 2017 at 15:12:22 UTC, Steven
Schveighoffer wrote:
On 12/6/17 4:34 AM, Ola Fosheim Grøstad wrote:
On Wednesday, 6 December 2017 at 09:24:33 UTC, Jonathan M
Davis wrote:
UTF-32 on the other hand is guaranteed to have a code unit be
a full code point.
I don't think the standard says that? Isn't this only because
the current set is small enough to fit? So this may change as
Unicode grows?
The current Unicode encoding has 2 million different code
points.
2,097,152 values fit in 21 bits, but the standard caps the
codespace at U+10FFFF, which gives 1,114,112 possible code
points. As of [Unicode 10] only 136,690 of them have been
assigned.
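
For reference, here is a rough sketch of the arithmetic in Python (not D, it is only meant to check the numbers). The constant names are my own; the 136,690 figure is the one quoted above for Unicode 10, and the U+10FFFF upper bound is the limit of the Unicode codespace.

```python
# Illustrative arithmetic only; the names below are made up for this post.
MAX_CODE_POINT = 0x10FFFF          # highest code point in the Unicode codespace
CODESPACE = MAX_CODE_POINT + 1     # code points U+0000 .. U+10FFFF
BITS_21 = 2 ** 21                  # what 21 bits could hold in principle
ASSIGNED_UNICODE_10 = 136_690      # figure quoted above for Unicode 10

fraction = ASSIGNED_UNICODE_10 / CODESPACE

print(f"codespace size:   {CODESPACE:,}")
print(f"21-bit capacity:  {BITS_21:,}")
print(f"assigned share:   {fraction:.1%}")
```

So roughly an eighth of the codespace is in use, and every value in it fits comfortably in a single 32-bit code unit.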
I'd say we'll all be dead, and so will our great great
great grandchildren, by the time Unicode amasses more than 2
billion code points :)
So there's plenty of time before even the current range is
filled.
Also, UTF-8 has been standardized to use at most 4 code
units per code point. The encoding scheme allows more, but the
standard restricts it.
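
A small Python sketch to illustrate both points (one UTF-32 code unit per code point, and at most 4 UTF-8 code units). The helper name `code_unit_counts` is hypothetical, and the sample characters are just ones I picked to hit each UTF-8 length.

```python
def code_unit_counts(ch):
    """Return (UTF-8, UTF-16, UTF-32) code unit counts for one code point."""
    assert len(ch) == 1, "expects a single code point"
    utf8_units = len(ch.encode("utf-8"))            # 8-bit code units
    utf16_units = len(ch.encode("utf-16-le")) // 2  # 16-bit code units
    utf32_units = len(ch.encode("utf-32-le")) // 4  # 32-bit code units
    return utf8_units, utf16_units, utf32_units


# Samples chosen to cover 1, 2, 3 and 4 UTF-8 code units,
# plus the very last code point of the codespace.
samples = ["A", "\u00e9", "\u20ac", "\U0001F600", chr(0x10FFFF)]

for ch in samples:
    u8, u16, u32 = code_unit_counts(ch)
    print(f"U+{ord(ch):06X}: utf-8={u8} utf-16={u16} utf-32={u32}")
```

Even U+10FFFF needs only 4 UTF-8 code units and a single UTF-32 code unit, which is why the 4-unit limit and the codespace limit go together.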
[Unicode 10]: http://www.unicode.org/versions/Unicode10.0.0/