Re: N4106

vanisaac Mon, 07 Nov 2011 00:46:02 -0800

From: Kent Karlsson <kent.karlsson14_at_telia.com>
Den 2011-11-05 04:23, skrev "António Martins-Tuválkin" <tuvalkin_at_gmail.com>:


> > I'm going through N4106 ( http://std.dkuug.dk/jtc1/sc2/wg2/docs/n4106.pdf ),
> ...
> 
> I see the following characters being proposed for encoding:
> 
> 1ABB COMBINING PARENTHESES ABOVE
> 1ABC COMBINING DOUBLE PARENTHESES ABOVE
> 1ABD COMBINING PARENTHESES BELOW
> 1ABE COMBINING PARENTHESES OVERLAY
> 
> Well, COMBINING DOUBLE PARENTHESES ABOVE seems to be the same as
> <COMBINING PARENTHESES ABOVE, COMBINING PARENTHESES ABOVE>. And COMBINING
> PARENTHESES OVERLAY seems to be just a tiny parenthesis before and a tiny
> parenthesis after; no need for a combining mark, especially one with a
> splitting behaviour.
> 
> Otherwise, I think COMBINING ((DOUBLE)) PARENTHESES ABOVE/BELOW are an
> entirely new brand of characters in Unicode (if accepted as proposed). They
> are supposed to split (ok, we have split vowels in some Indic scripts, more
> on that below), but these split around *another combining mark*. So despite
> being given (as proposed) vanilla above/below mark properties, they do not
> "stack" the way such characters normally do, but are supposed to invoke an
> entirely new behaviour.

I agree, except that if we give them anything other than ccc=220/230, then 
canonical reordering will separate them from the modifier letters that they 
are attached to. I think this is one of those cases where a definition needs 
to expand in order to accommodate the architecture. We already have some 
non-stacking behaviour defined for these characters in order to accommodate 
polytonic Greek, so we do have some experience with disparate appearances of 
consecutive marks.
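
To illustrate the reordering concern, here is a minimal sketch in Python using 
the standard unicodedata module. It uses long-encoded marks (U+0301, U+0308, 
U+0323) as stand-ins, since the N4106 code points are only proposed; the base 
letter "x" and the helper name are my own choices for the example.

```python
import unicodedata

ACUTE = "\u0301"      # COMBINING ACUTE ACCENT, ccc=230 (above)
DIAERESIS = "\u0308"  # COMBINING DIAERESIS, ccc=230 (above)
DOT_BELOW = "\u0323"  # COMBINING DOT BELOW, ccc=220 (below)


def canonical_order(text):
    """Return text in NFD, which applies canonical reordering of marks."""
    return unicodedata.normalize("NFD", text)


# Two marks in the same combining class keep the order they were typed in.
same_class = "x" + ACUTE + DIAERESIS
same_class_after = canonical_order(same_class)

# Marks in different classes are sorted by class, so the ccc=220 mark
# moves in front of the ccc=230 mark even though it was typed second.
mixed_class = "x" + ACUTE + DOT_BELOW
mixed_class_after = canonical_order(mixed_class)

print([unicodedata.combining(c) for c in same_class_after])
print([unicodedata.combining(c) for c in mixed_class_after])
```

So a parenthesis mark that shares ccc=230 with the mark it surrounds stays next 
to it, while one given a different non-zero class could be moved away from it.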

> That supposedly stacking combining marks *sometimes* (more a font dependence
> than a character dependence) don't stack but instead are laid out linearly
> is not new. But to *require* non-stacking behaviour for certain characters
> is new.

Then think of it as the "non-spacing" version of stacking behaviour.

> So we have a combination of:
> 
> 1. Splitting. (Normally only used for some Indic scripts).
> 
> 2. Indeed splitting with no other characters to use for the decomposition,
>    thus requiring the use of PUA characters, to stay compliant, for
>    representing the result of the split at the character level.
>    (This is entirely new, as far as I can tell.)

I cannot imagine in any way how this requires PUA characters.

> 3. The split is entirely *within* the sequence of combining characters
>    (except for COMBINING PARENTHESES OVERLAY, which behaves as split vowels
>    normally do, but still with issue 2), not around the combining sequence
>    including the base. (This is entirely new.)
> 
> 4. Requiring (if at all supported) to use linear layout of combining
>    characters instead of stacking.
>    (This is entirely new.)

If I were designing a font, I would simply make the in/out mark attachment 
point near the top/middle of the parentheses, so that it drops down around the 
"base" mark, and then attaches any subsequent marks as if the parentheses 
weren't there. I think you're making this too complicated.

> This makes these proposed characters entirely unique in their display
> behaviour, IMO.

I do, however, agree totally with this assessment; I just believe it is more 
manageable than you paint it.

[snip]
>     /Kent K 

I do, myself, have a couple of concerns regarding several of the proposed 
characters in N4106 as well. Namely, I believe that U+1DF2, U+1DF3, and U+1DF4 
should require significant justification as to why they should not be 
represented as the sequences U+0363 + U+0308, U+0366 + U+0308, and 
U+0367 + U+0308. I have similar concerns about U+A799, U+AB30, U+AB33, U+AB38, 
U+AB3E, U+AB3F, etc.
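
For reference, a small Python sketch of what those three sequences consist of, 
again using the standard unicodedata module. The dictionary labels, the helper 
name, and the base letter "e" are only my own choices for the illustration.

```python
import unicodedata

# The already-encoded sequences suggested in place of U+1DF2..U+1DF4.
SEQUENCES = {
    "a with diaeresis": "\u0363\u0308",
    "o with diaeresis": "\u0366\u0308",
    "u with diaeresis": "\u0367\u0308",
}


def describe(seq):
    """List (code point, character name, combining class) for each character."""
    return [
        ("U+%04X" % ord(ch), unicodedata.name(ch), unicodedata.combining(ch))
        for ch in seq
    ]


for label, seq in SEQUENCES.items():
    print(label, describe(seq))

# Both marks are ccc=230, so normalization leaves the sequence as typed.
sample = "e" + SEQUENCES["a with diaeresis"]
stable_nfc = unicodedata.normalize("NFC", sample) == sample
stable_nfd = unicodedata.normalize("NFD", sample) == sample
print(stable_nfc, stable_nfd)
```

Since both marks in each pair are in the same combining class, the sequences 
survive normalization unchanged, which is part of why I think they deserve 
consideration before new code points are assigned.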

Van A
