From: Kent Karlsson <kent.karlsson14_at_telia.com>
On 2011-11-05 04:23, "António Martins-Tuválkin" <tuvalkin_at_gmail.com> wrote:
> > I'm going through N4106 ( http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4106.pdf ),
> ...
>
> I see the following characters being put forward for encoding:
>
> 1ABB COMBINING PARENTHESES ABOVE
> 1ABC COMBINING DOUBLE PARENTHESES ABOVE
> 1ABD COMBINING PARENTHESES BELOW
> 1ABE COMBINING PARENTHESES OVERLAY
>
> Well, COMBINING DOUBLE PARENTHESES ABOVE seems to be the same as
> <COMBINING PARENTHESES ABOVE, COMBINING PARENTHESES ABOVE>. And
> COMBINING PARENTHESES OVERLAY seems to be just a tiny parenthesis before
> and a tiny parenthesis after; no need for a combining mark, especially
> one with a splitting behaviour.
>
> Otherwise, I think COMBINING ((DOUBLE)) PARENTHESES ABOVE/BELOW are an
> entirely new brand of characters in Unicode (if accepted as proposed).
> They are supposed to split (ok, we have split vowels in some Indic
> scripts, more on that below), but these split around *another combining
> mark*. So despite being given (as proposed) vanilla above/below mark
> properties, they do not "stack" the way such characters normally do, but
> are supposed to invoke an entirely new behaviour.

I agree, except that if we give them any ccc other than 220/230, then
canonical reordering will separate them from the modifier letters that
they are attached to. I think this is one of those cases where a
definition needs to expand in order to accommodate architecture. We do
already have some non-stacking behaviour defined for these characters in
order to accommodate polytonic Greek, so we do have some experience with
disparate appearances of consecutive marks.

> That supposedly stacking combining marks *sometimes* (more a font
> dependence than a character dependence) don't stack but instead are
> laid out linearly is not new. But to *require* non-stacking behaviour
> for certain characters is new.

Then think of it as the "non-spacing" version of stacking behaviour.

> So we have a combination of:
>
> 1. Splitting.
>    (Normally only used for some Indic scripts).
>
> 2. Indeed splitting with no other characters to use for the
>    decomposition, thus requiring the use of PUA characters, to stay
>    compliant, for representing the result of the split at the character
>    level. (This is entirely new, as far as I can tell.)

I cannot see how this would require PUA characters.

> 3. The split is entirely *within* the sequence of combining characters
>    (except for COMBINING PARENTHESES OVERLAY, which behaves as split
>    vowels normally do, but still with issue 2), not around the combining
>    sequence including the base. (This is entirely new.)
>
> 4. Requiring (if at all supported) to use linear layout of combining
>    characters instead of stacking. (This is entirely new.)

If I were designing a font, I would simply make the in/out mark
attachment point near the top/middle of the parentheses, so that it
drops down around the "base" mark, and then attach any subsequent marks
as if the parentheses weren't there. I think you're making this too
complicated.

> This makes these proposed characters entirely unique in their display
> behaviour, IMO.

I do, however, agree totally with this assessment; I just believe it is
more manageable than you paint it.

[snip]

> /Kent K

I do, myself, have a couple of concerns with regard to several proposed
characters in N4106 as well. Namely, I believe that U+1DF2, U+1DF3, and
U+1DF4 should require significant justification as to why they should
not be encoded as U+0363 + U+0308, U+0366 + U+0308, and U+0367 + U+0308.
I have similar concerns about U+A799, U+AB30, U+AB33, U+AB38, U+AB3E,
U+AB3F, etc.

Van A
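
For what it's worth, the canonical reordering concern above can be
checked with a minimal Python sketch. It relies on assumptions: already
encoded characters stand in for the proposed parentheses (which have no
assigned properties to query), namely U+0363 COMBINING LATIN SMALL
LETTER A (ccc=230), U+0308 COMBINING DIAERESIS (ccc=230), and U+0323
COMBINING DOT BELOW (ccc=220). The helper names are made up for
illustration.

```python
import unicodedata


def ccc_list(s):
    """Return the canonical combining class of each code point in s."""
    return [unicodedata.combining(ch) for ch in s]


def nfd_codepoints(s):
    """Return the code points of s after NFD normalization, as U+XXXX strings."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", s)]


# Stand-in for "mark followed by a second mark of the SAME class":
# both marks are ccc=230, so canonical reordering keeps them in typed order.
same_class = "a\u0363\u0308"

# Stand-in for "mark followed by a second mark of a DIFFERENT class":
# ccc=220 sorts before ccc=230, so the second mark is moved ahead of the first.
mixed_class = "a\u0363\u0323"

if __name__ == "__main__":
    for label, text in (("same class", same_class), ("mixed class", mixed_class)):
        print(label, ccc_list(text), "->", nfd_codepoints(text))
```

The same-class sequence is also the <U+0363, U+0308> spelling mentioned
above for U+1DF2; under this sketch it survives normalization in the
order typed, while the mixed-class sequence does not.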