>11 Digit Boy asked:
>> Why does Unicode only have space for 1114112 glyphs?
>
>BMP = 256 × 256 = 65536
>HI_SURROGS = 1024
>LO_SURROGS = 1024
>
>UNICODE = BMP + HI_SURROGS × LO_SURROGS = 1114112

There are other ways to calculate:

17 * 65536 = 1,114,112
0x10FFFF + 1 = 1,114,112 (decimal)
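
As a quick sanity check, here is a minimal Python sketch showing that the
three calculations agree. The variable names are only illustrative and are
not taken from any specification.

```python
# Three ways to count the code points in the Unicode codespace.
BMP = 256 * 256          # 65,536 code points in plane 0
HI_SURROGATES = 1024     # D800..DBFF
LO_SURROGATES = 1024     # DC00..DFFF

# BMP plus every high/low surrogate pairing
via_surrogates = BMP + HI_SURROGATES * LO_SURROGATES
# 17 planes of 65,536 code points each
via_planes = 17 * 65536
# highest code point plus one (counting from zero)
via_max = 0x10FFFF + 1

print(via_surrogates, via_planes, via_max)
```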

But we really should do a little extra arithmetic to arrive at a more
useful number:

    65,536
 *      17
-----------
 1,114,112
-    2,048  (surrogate code points D800 - DFFF)
-       34  (non-characters nFFFE and nFFFF for 0 <= n <= 16)
-       32  (non-characters FDD0 - FDEF)
-----------
 1,111,998

That's the current number of usable code points in the Unicode codespace.
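
The subtraction can also be checked by enumerating the excluded code points
rather than trusting the counts. This is only a sketch: the names are my own,
and it assumes a Python 3 build where sys.maxunicode is 0x10FFFF.

```python
import sys

# Surrogate code points D800..DFFF
surrogates = range(0xD800, 0xE000)

# nFFFE and nFFFF for each plane n, 0 <= n <= 16
plane_end_nonchars = [
    plane * 0x10000 + low
    for plane in range(17)
    for low in (0xFFFE, 0xFFFF)
]

# FDD0..FDEF
fdd0_nonchars = range(0xFDD0, 0xFDF0)

# The three groups do not overlap, so the set union loses nothing.
excluded = set(surrogates) | set(plane_end_nonchars) | set(fdd0_nonchars)

codespace = sys.maxunicode + 1      # size of the whole codespace
usable = codespace - len(excluded)  # what is left for characters

print(len(excluded), usable)
```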



- Peter


---------------------------------------------------------------------------
Peter Constable

Non-Roman Script Initiative, SIL International
7500 W. Camp Wisdom Rd., Dallas, TX 75236, USA
Tel: +1 972 708 7485
E-mail: <[EMAIL PROTECTED]>


