I have not proposed moving these characters elsewhere (or re-encoding them). Why do you think that?
I am only challenging your statement that a block cannot be discontiguous. That would make the block property unique among all Unicode properties, and the notion is completely absent from ISO 10646, which does not define any real properties beyond a name at a specific code point, an informative glyph, and historical reference links documenting the intended usage.

Where is it written in the Unicode-only stability rules that a block is contiguous, when the allocation of code points within these blocks has always been discontiguous? The other properties are much more important than this legacy one, which, as you stated, has absolutely no use in regexps. Even the set of noncharacters is discontiguous, as are the blocks for the Arabic script, the blocks for presentation forms, and the blocks for compatibility characters. Every property in Unicode is fragmented over multiple ranges (ranges that are very frequently discontiguous within a block, or even within the same encoding column).

In other words, IsInArabicPresentation(x) would still remain true for all assigned characters in that block. It would only be false for the noncharacters, which would be considered outside of it, but noncharacters have no useful property other than being noncharacters (the block in which they are allocated does not matter at all).

The alternative is to stop restricting these code points as noncharacters and to allow them to be present in files without raising any error, i.e. to treat them like PUA, possibly with a few default properties. This would still make them somewhat interoperable under limited private agreements, possibly implicit in the transport interface or envelope format.

2014-06-01 4:15 GMT+02:00 Asmus Freytag <[email protected]>:
> More importantly, while a regex that uses an expression that is
> equivalent to "IsInArabicPresentation(x)" may or may not be well-defined,
> there is no reason to break it by splitting the block.
>
> As blocks cannot be discontiguous (unlike other properties), some Arabic
> Presentation forms would have to be put into a new block (Arabic
> Presentation Forms C). This is what would break such expressions - it has,
> in fact, nothing to do with the status of the noncharacters.
>
> There's no reason to contemplate breaking changes of any kind at this
> point.
>
> A./
>
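
As a footnote to the IsInArabicPresentation(x) discussion in this thread, here is a minimal Python sketch of the two readings of that predicate. It is an illustration only, not anything defined by Unicode or proposed as normative: the function names `in_block_contiguous` and `in_block_without_noncharacters` are hypothetical, and the sketch assumes the predicate refers to the Arabic Presentation Forms-A block (U+FB50..U+FDFF), which contains the noncharacters U+FDD0..U+FDEF.

```python
import unicodedata

# Arabic Presentation Forms-A spans U+FB50..U+FDFF in the published block list.
BLOCK_START, BLOCK_END = 0xFB50, 0xFDFF


def is_noncharacter(cp: int) -> bool:
    """True for the 66 noncharacters: U+FDD0..U+FDEF plus the last two
    code points (U+xxFFFE, U+xxFFFF) of each of the 17 planes."""
    return 0xFDD0 <= cp <= 0xFDEF or (cp <= 0x10FFFF and (cp & 0xFFFE) == 0xFFFE)


def in_block_contiguous(cp: int) -> bool:
    """Block membership as published: one contiguous range."""
    return BLOCK_START <= cp <= BLOCK_END


def in_block_without_noncharacters(cp: int) -> bool:
    """Hypothetical variant: same range, with the noncharacters carved out."""
    return in_block_contiguous(cp) and not is_noncharacter(cp)


# Code points on which the two predicates disagree.
disagreements = [
    cp for cp in range(0x110000)
    if in_block_contiguous(cp) != in_block_without_noncharacters(cp)
]

# Of those, the ones that are assigned (general category other than Cn).
assigned_disagreements = [
    cp for cp in disagreements
    if unicodedata.category(chr(cp)) != "Cn"
]
```

Under these assumptions the two predicates differ only on U+FDD0..U+FDEF, and `assigned_disagreements` is empty, so a test applied to assigned characters gives the same answer either way. The sketch says nothing about whether redefining the block would be permitted by the stability policies; that is the question under dispute in this thread.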
_______________________________________________ Unicode mailing list [email protected] http://unicode.org/mailman/listinfo/unicode

