William,

Rather than having the user insert the VS14 after every character, the editor might allow the user to select a span of text for italicization.  Then it would be up to the editor/app to insert the VS14s where appropriate.

For Andrew’s example of “fête”, the user would either type the string:
“f” + “ê” + “t” + “e”
or the string:
“f” + “e” + <U+0302 COMBINING CIRCUMFLEX ACCENT> + “t” + “e”.

If the latter, the application would insert VS14 characters after the “f”, “e”, “t”, and “e”.  It would not insert a VS14 after the combining circumflex, because the specification does not allow VS characters after combining marks; they may only follow base characters.

In the first ‘spelling’, the specification forbids a VS character after any character which is not a base character, and for this purpose a character which has a canonical decomposition, such as “ê”, does not count as one.  So the application would first need to convert the string to the second ‘spelling’, and then proceed as above.  This conversion is known as normalizing to NFD (Normalization Form D, canonical decomposition).

So in order for VS14 to be a viable approach, any application would ① need to convert any selected span to NFD, and ② only insert VS14 after each base character.  And those are two operations which are quite possible, although they do add slightly to the programmer’s burden.  I don’t think it’s a “deal-killer”.
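To make those two steps concrete, here is a rough Python sketch of what an editor might do with a selected span.  The function names `italicize_span` and `is_base` are made up for illustration, and the test for "base character" (not a mark, separator, or control/format character, and combining class 0) is my own approximation of the rule, not wording from any specification.  Dropping any VS14 already in the span before re-inserting is likewise just a design choice so that applying the operation twice is harmless.

```python
import unicodedata

VS14 = "\uFE0D"  # VARIATION SELECTOR-14


def is_base(ch: str) -> bool:
    """Approximate test for a character that may take a VS (an assumption)."""
    # Exclude marks (M*), separators such as spaces (Z*), and
    # control/format characters (C*).
    if unicodedata.category(ch)[0] in ("M", "Z", "C"):
        return False
    # Exclude anything with a non-zero canonical combining class.
    return unicodedata.combining(ch) == 0


def italicize_span(text: str) -> str:
    """Step 1: convert to NFD.  Step 2: put VS14 after each base character."""
    # Remove existing VS14s first so the operation can be repeated safely.
    decomposed = unicodedata.normalize("NFD", text.replace(VS14, ""))
    out = []
    for ch in decomposed:
        out.append(ch)
        if is_base(ch):
            out.append(VS14)
    return "".join(out)


# Both 'spellings' of the word end up as the same marked-up string.
precomposed = "f\u00EAte"   # f + ê + t + e
decomposed = "fe\u0302te"   # f + e + combining circumflex + t + e
print(italicize_span(precomposed) == italicize_span(decomposed))
```

Note that the VS14 lands between the “e” and the combining circumflex, directly after the base character, which is where the rule described above requires it.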

Of course, the user might insert VS14s without application assistance.  In which case hopefully the user knows the rules.  The worst case scenario is where the user might insert a VS14 after a non-base character, in which case it should simply be ignored by any application.  It should never “break” the display or the processing; it simply makes the text for that document non-conformant.  (Of course putting a VS14 after “ê” should not result in an italicized “ê”.)
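An application that wants to ignore such stray VS14s first has to find them.  The following is only a sketch of one way to do that in Python; `can_take_vs` and `find_stray_vs14` are hypothetical names, and the eligibility test is again my approximation (not a mark, separator, or control/format character; combining class 0; no canonical decomposition).

```python
import unicodedata

VS14 = "\uFE0D"  # VARIATION SELECTOR-14


def can_take_vs(ch: str) -> bool:
    """Approximate test for a character that VS14 may follow (an assumption)."""
    if unicodedata.category(ch)[0] in ("M", "Z", "C"):
        return False
    if unicodedata.combining(ch) != 0:
        return False
    # A character that NFD changes has a canonical decomposition (e.g. "ê").
    return unicodedata.normalize("NFD", ch) == ch


def find_stray_vs14(text: str) -> list[int]:
    """Return the indices of every VS14 that does not follow an eligible base."""
    stray = []
    for i, ch in enumerate(text):
        if ch != VS14:
            continue
        if i == 0 or not can_take_vs(text[i - 1]):
            stray.append(i)
    return stray


# VS14 after "f" is fine; VS14 after precomposed "ê" is reported.
print(find_stray_vs14("f\uFE0D\u00EA\uFE0Dte"))
```

What the application then does with the reported positions (skip them when rendering, or flag them to the user) is a separate choice; the point is only that the text itself is left alone and nothing breaks.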

Cheers,

James
