On Mon, 12 Dec 2016 09:30:31 -0800 Ken Whistler <[email protected]> wrote:
> On 12/12/2016 6:59 AM, Karl Williamson wrote:
> > These are currently GCB Other, but when assigned, don't we know that
> > they will be Extended?  So this could be done now.

> Any proposal like this then also has hidden costs on the committees,
> because it sets up implied requirements for what can be encoded where
> and what properties it has to have. Every time such defaults are set
> up, it makes the documentation of what is already "pre-assigned" more
> complicated and fragile. Already, a large proportion of the
> participants in the maintenance committees have very murky
> understandings about what can and cannot be put where in the future,
> and why. And that is a recipe for mistakes in encoding.

How does this differ from U+0816 SAMARITAN MARK IN changing from
bidi_class=R to bidi_class=NSM upon assignment?  The idea is to reduce
the damage done by the use of obsolete versions of the Unicode database.

> Finally, like it or not, there currently is no actual contract
> guaranteeing that the remaining open ranges in blocks "reserved" for
> combining marks will all end up gc=Mn or gc=Me, anyway. The relevant
> ranges are 1ABF..1AFF, 1DF6..1DFA, and 20F1..20FF. There is nothing to
> prevent the committees from deciding that one (or more) spacing
> combining marks might be appropriate to encode there, or possibly even
> spacing non-combining marks of some strange sort, like the spacing
> Arabic letter diacritics that ended up at FBB2..FBC1. Trying to keep
> those ranges free of characters that would not be Grapheme_Extend=Yes
> would require some guy on the committee to be aware of the arcane
> dependencies for segmentation properties, and then to police such
> decisions in perpetuity -- or at least until the blocks in question
> filled up with non-problematical characters.

What is the downside of a code point changing from Grapheme_Extend=Yes
to Grapheme_Extend=No when it is assigned?

Richard.

