Every once in a while, someone decides that there are not enough
Unicode code points (in general, for emoji, or in the PUA) and
proposes a scheme in which complicated combinations of code points
are assigned identities as characters, emoji, or other concepts. This
always amounts to reinventing Unicode on top of itself, except worse.
There have been several such proposals on this list.

On Sun, Mar 22, 2026 at 11:17 PM Martin J. Dürst via Unicode
<[email protected]> wrote:
>
> In addition to what Michel and others said: If Unicode would ever have
> to plan for running out of code points, I'm sure that they would do it
> by switching to a larger code space

This is the solution, plain and simple. Unicode is essentially ready
to support 2^31 code points. The limit of 17 planes is entirely
artificial and meant to cater to the design limits of UTF-16.
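
For anyone who wants the arithmetic behind the 17-plane figure, here is
a small sketch. The surrogate ranges are the standard ones; the variable
names are just mine for illustration.

```python
# Arithmetic behind the 17-plane limit imposed by UTF-16 surrogate pairs.
HIGH_SURROGATES = 0xDBFF - 0xD800 + 1   # 1024 high surrogates
LOW_SURROGATES = 0xDFFF - 0xDC00 + 1    # 1024 low surrogates
PLANE_SIZE = 0x10000                    # 65,536 code points per plane

# One pair selects one of 1024 * 1024 = 2**20 supplementary code points.
pair_code_points = HIGH_SURROGATES * LOW_SURROGATES
supplementary_planes = pair_code_points // PLANE_SIZE

# The BMP is addressed directly; the pairs add the remaining planes.
total_planes = 1 + supplementary_planes
max_code_point = total_planes * PLANE_SIZE - 1

print(total_planes, hex(max_code_point))
```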

> Most probably, the definition of UTF-8 would be changed to allow more
> than 3 bytes, and the definition of UTF-32 would be changed to allow the
> actual use of the newly allowed code points. The definition of UTF-16,
> if still needed, would be a bit more tricky.

UTF-32 and UTF-8 are trivial to extend to 2^31 code points, of course.
The extended versions might need new names like XTF-8 and XTF-32 to
formally distinguish them from the current schemes.
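
As a rough illustration of how little changes for UTF-8, here is a
sketch of an encoder that simply continues the existing lead-byte
pattern to 5 and 6 bytes, as the original UTF-8 definition (RFC 2279)
did. The name xtf8_encode is hypothetical, and the sketch assumes an
extended scheme would reuse that old layout; it also does not reject
surrogate code points.

```python
def xtf8_encode(cp: int) -> bytes:
    """Encode a code point below 2**31 with the 1- to 6-byte UTF-8 pattern."""
    if not 0 <= cp < 2**31:
        raise ValueError("code point out of range")
    if cp < 0x80:
        return bytes([cp])
    # (exclusive upper bound, sequence length, lead-byte marker)
    forms = (
        (0x800, 2, 0xC0),
        (0x10000, 3, 0xE0),
        (0x200000, 4, 0xF0),      # today's UTF-8 stops within this form
        (0x4000000, 5, 0xF8),     # assumed extension: 26 payload bits
        (0x80000000, 6, 0xFC),    # assumed extension: 31 payload bits
    )
    for limit, length, lead in forms:
        if cp < limit:
            break
    out = bytearray(length)
    # Fill continuation bytes from the end, 6 bits at a time.
    for i in range(length - 1, 0, -1):
        out[i] = 0x80 | (cp & 0x3F)
        cp >>= 6
    out[0] = lead | cp
    return bytes(out)
```

For code points up to U+10FFFF (surrogates aside) this should produce
the same bytes as ordinary UTF-8; only the 5- and 6-byte forms are new.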

For UTF-16, there has always been (to me) an obvious method as well.
In addition to the current HS/LS (high surrogate/low surrogate) pairs,
allow triples: HS/HS/LS and HS/LS/LS. Each triple starts with an HS and
ends with an LS. A stream of XTF-16 triples is self-synchronizing,
though an interrupted stream might look like it ends or begins with a
valid UTF-16 pair, resulting in a single-character error.

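This has the advantage of not needing any new surrogate code points.
Each surrogate carries 10 free bits, so a triple carries 30, and the
choice between the two triple forms supplies one more. That makes
exactly 31 free bits, which means the same code space can be accessed.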
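
To make the triple idea concrete, here is a minimal sketch of an
encoder and decoder. Everything about the mapping is my assumption,
since nothing above fixes one: triples start at 0x110000 (TRIPLE_BASE),
HS/HS/LS is form 0 and HS/LS/LS is form 1, and the three 10-bit fields
are filled most significant first. The function names are hypothetical
too, and lone surrogates are not checked on the encoding side.

```python
TRIPLE_BASE = 0x110000   # assumed: triples begin right after U+10FFFF
HS, LS = 0xD800, 0xDC00  # first high and first low surrogate


def is_hs(u: int) -> bool:
    return 0xD800 <= u <= 0xDBFF


def is_ls(u: int) -> bool:
    return 0xDC00 <= u <= 0xDFFF


def xtf16_encode(cp: int) -> list[int]:
    """Encode one code point below 2**31 as 1, 2, or 3 code units."""
    if not 0 <= cp < 2**31:
        raise ValueError("code point out of range")
    if cp < 0x10000:
        return [cp]
    if cp < TRIPLE_BASE:  # ordinary UTF-16 surrogate pair
        v = cp - 0x10000
        return [HS | (v >> 10), LS | (v & 0x3FF)]
    v = cp - TRIPLE_BASE
    form = v >> 30        # 0 -> HS/HS/LS, 1 -> HS/LS/LS
    a, b, c = (v >> 20) & 0x3FF, (v >> 10) & 0x3FF, v & 0x3FF
    return [HS | a, (LS if form else HS) | b, LS | c]


def xtf16_decode(units: list[int]) -> list[int]:
    """Decode a list of code units; raise ValueError on malformed runs."""
    out, i, n = [], 0, len(units)
    while i < n:
        if not (is_hs(units[i]) or is_ls(units[i])):
            out.append(units[i])
            i += 1
            continue
        # A character is a run of high surrogates followed by a run of lows.
        j = i
        while j < n and is_hs(units[j]):
            j += 1
        k = j
        while k < n and is_ls(units[k]):
            k += 1
        highs, lows = j - i, k - j
        bits = [u & 0x3FF for u in units[i:k]]
        if (highs, lows) == (1, 1):
            cp = 0x10000 + (bits[0] << 10 | bits[1])
        elif (highs, lows) in ((2, 1), (1, 2)):
            form = 1 if lows == 2 else 0
            cp = TRIPLE_BASE + (form << 30 | bits[0] << 20 | bits[1] << 10 | bits[2])
            if cp >= 2**31:
                raise ValueError("triple beyond the 2**31 code space")
        else:
            raise ValueError("malformed surrogate run")
        out.append(cp)
        i = k
    return out
```

The decoder leans on the fact that a low surrogate followed by a high
surrogate can only occur at a character boundary, which is what makes
the stream self-synchronizing. With this particular mapping the triples
reach slightly past 2^31 - 1, so the top few values are rejected.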

All of that is just a creative exercise. As has been said many times
over, the current code space is in no danger of running out any time
soon.

Cheers,
Sławomir Osipiuk
