On Tue, 2 May 2017 05:08:27 +0200 Philippe Verdy via Unicode <[email protected]> wrote:
> Consider also that the BMP is almost full, the remaining few holes
> are kept for isolated characters that may be added to existing
> scripts, or permanently reserved to avoid clashes with legacy
> softwares using simple code remappings between distinct blocks, or to
> perform simple case conversions (e.g. in Greek) for internal purposes
> (these positions are not interoperable and may clash with future
> versions of the UCS and I18n tools/libraries like ICU)
>
> You should abstain using any currently unassigned positions in the
> existing Unicode blocks: use PUA if you have nothing else; there are
> plenty of space available, in the BMP (most common usage in fonts
> that need to map additional glyphs) or in the two last planes.

It isn't codepoints that is the constraint; one must consider the
number of glyphs without dedicated one-character codes. For example,
U+1000 MYANMAR LETTER KA needs glyphs for:

1000
1000 FE00
1039 1000 (and probably at two different widths)
1039 1000 FE00 (do.)

There are a few CJK ideographs with similar needs:

537F
537F FE00 (= CJK COMPATIBILITY IDEOGRAPH-2F831)
537F FE01 (= CJK COMPATIBILITY IDEOGRAPH-2F832)
537F FE02 (= CJK COMPATIBILITY IDEOGRAPH-2F833)

There's also the Japanese ideographic variation sequence <U+5375
U+E0100>, which should probably have its own glyph even if it's the
same as one of the above.

The Arabic script (and other cursively connected scripts) has similar
expansions, even if one goes for a typewritten style. Devanagari
explodes when one considers just the conjuncts prescribed for Hindi.

I think it's also necessary to avoid splitting likely grapheme clusters
between fonts. Which of the three fonts will support U+1F3F4 U+E0067
U+E0062 U+E0065 U+E006E U+E0067 U+E007F (English flag) and which U+261D
U+1F3FF (index pointing up: dark skin tone)?

Now, the BMP has headroom provided by the surrogate characters and the
PUA, which will not have mappings, but I'm not sure that it's enough.
That's why I asked the question.

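To put rough numbers on the headroom argument, here is a minimal Python sketch. It only counts the BMP surrogate and PUA code points, and the multi-code-point sequences listed above for U+1000 and U+537F. The names `bmp_headroom`, `extra_glyphs` and `SEQUENCES` are made up for illustration, and the count ignores the "two different widths" remark, so treat it as a lower bound rather than a real glyph inventory.

```python
# BMP ranges that need no glyph mapping of their own.
SURROGATES = range(0xD800, 0xE000)  # U+D800..U+DFFF
BMP_PUA = range(0xE000, 0xF900)     # U+E000..U+F8FF


def bmp_headroom() -> int:
    """Number of BMP code points that are surrogates or private use."""
    return len(SURROGATES) + len(BMP_PUA)


# Sequences from the message; each one wants its own glyph.
SEQUENCES = {
    "U+1000 MYANMAR LETTER KA": [
        (0x1000,),
        (0x1000, 0xFE00),
        (0x1039, 0x1000),
        (0x1039, 0x1000, 0xFE00),
    ],
    "U+537F": [
        (0x537F,),
        (0x537F, 0xFE00),
        (0x537F, 0xFE01),
        (0x537F, 0xFE02),
    ],
}


def extra_glyphs(sequences: dict) -> int:
    """Count sequences longer than one code point, i.e. glyphs
    that have no dedicated one-character code."""
    return sum(
        1
        for seqs in sequences.values()
        for seq in seqs
        if len(seq) > 1
    )


if __name__ == "__main__":
    print("BMP headroom (surrogates + PUA):", bmp_headroom())
    print("Extra glyphs for the two examples:", extra_glyphs(SEQUENCES))
```

Two base characters already account for six glyphs without one-character codes, which is the ratio that matters once Arabic joining forms and Devanagari conjuncts are added in.
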
Richard.

