From: "Asmus Freytag" <[EMAIL PROTECTED]>
On the other hand, all aspects to *coloring* of characters
do not belong in the plain text stream - but that was not
the question.

I think suggested solutions that define markup that apply to
combining characters but place that markup outside of the
combining sequence would be a better answer than protocols
trying to put markup inside the combining character sequence.

My personal take is that the UTC might make a recommendation
to that effect, but it's not part of the standard proper.
It's not clear that the issue has practical urgency - if
I should be mistaken on that, I'd like to find out how and why.

Placing markup outside the combining sequence looks attractive at first, but it raises another difficulty: how to refer to parts of a combining sequence when the sequence itself is subject to transformations such as normalization. (I did not say "parts of characters", because I agree that combining characters are not parts of characters; they are true abstract characters per the Unicode definition.)


A solution would be to specify in the markup which normalization to apply to the combining sequence before referring to its component characters, with some syntax like:
<font style="color:red nfd(2,1);">e&combining-acute;</font>
which would survive a normalization of the document, such as NFC, that turns it into:
<font style="color:red nfd(2,1);">&e-with-acute;</font>
Here some syntax in the style value indicates an explicit NFD normalization to apply to the plain-text fragment encoded in the text element, before specifying the range of characters to which the style applies. (Here it says that color:red applies to only 1 character, starting at the second one in the enclosed text fragment, after that fragment has been forced to NFD.)
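
To make the nfd(2,1) idea concrete, here is a minimal Python sketch of what a renderer might do. The function name nfd_range and its 1-based (start, length) convention are my own assumptions for illustration, not part of any existing style language; only unicodedata.normalize is a real library call.

```python
import unicodedata

def nfd_range(text, start, length):
    """Normalize text to NFD, then pick `length` code points
    beginning at the 1-based position `start` (hypothetical convention)."""
    decomposed = unicodedata.normalize("NFD", text)
    begin = start - 1
    return decomposed, decomposed[begin:begin + length]

# The decomposed and the precomposed spellings select the same character.
for source in ("e\u0301", "\u00e9"):
    decomposed, selected = nfd_range(source, 2, 1)
    code_points = [f"U+{ord(c):04X}" for c in decomposed]
    print(code_points, "->", [f"U+{ord(c):04X}" for c in selected])
```

Both inputs decompose to U+0065 U+0301, so the range (2,1) designates the combining acute in either case, which is why the style would be unaffected by an NFC pass over the document.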


This may seem tricky, but simpler solutions could be implemented in a style language, for example by providing more basic restrictions through new markup attributes:
<font style="combining-color:red">&e-with-acute;</font>
where the new "combining-color" attribute implies such prenormalization and the automatic selection of the character ranges to which the coloring applies. Maybe there are better solutions that do not require augmenting the style language schema with lots of new attribute names, such as:
<font style="color:combining(red)">&e-with-acute;</font>
Here too, Unicode itself is not affected. But markup languages and renderers would have to be significantly modified to take the new markup property names or values into account.
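
As a rough sketch of the "automatic selection of character ranges" that a combining-color or color:combining(red) style would imply, a renderer could prenormalize to NFD and split the text into base and combining runs. The function name combining_runs is hypothetical, and testing for a non-zero canonical combining class is a simplifying assumption on my part: some marks have class 0, so a real implementation would probably look at the general category as well.

```python
import unicodedata

def combining_runs(text):
    """Split NFD-normalized text into (is_combining, run) pairs.

    A character counts as combining here when its canonical combining
    class is non-zero (a simplification, see the note above).
    """
    runs = []
    for ch in unicodedata.normalize("NFD", text):
        is_mark = unicodedata.combining(ch) != 0
        if runs and runs[-1][0] == is_mark:
            # Extend the current run of the same kind.
            runs[-1] = (is_mark, runs[-1][1] + ch)
        else:
            runs.append((is_mark, ch))
    return runs

# Only the runs flagged True would receive the combining color.
for is_mark, run in combining_runs("\u00e9a"):
    print(is_mark, [f"U+{ord(c):04X}" for c in run])
```

The color would then be applied only to the runs flagged as combining, whatever normalization form the document was stored in.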




