On Wednesday, August 06, 2003 11:48 PM, Peter Kirk <[EMAIL PROTECTED]> wrote:

> OK, what kind of markup should I use, in any well-known markup
> language, to ensure that an isolated diacritic is centred in the
> space between the words before and after it?

In plain text, I think that this encoding:
    ...endOfWord1, SPACE, SPACE, diacritic, SPACE,
    startOfWord2...
is what you need, as it creates the following combining sequences:
    <...endOfWord1>, <SPACE>, <SPACE, diacritic>, <SPACE>,
    <startOfWord2...>
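
As a rough illustration of that grouping, here is a minimal Python sketch,
assuming that a combining sequence is simply a base character followed by
any characters of general category M (Mn, Mc, Me). The function name
combining_sequences and the choice of U+0301 COMBINING ACUTE ACCENT as the
diacritic are assumptions made for the example, not anything defined by
the standard.

```python
import unicodedata

def combining_sequences(text):
    """Group each base character with the combining marks that follow it."""
    sequences = []
    for ch in text:
        # General categories Mn, Mc and Me all start with "M".
        if sequences and unicodedata.category(ch).startswith("M"):
            sequences[-1] += ch      # attach the mark to the current base
        else:
            sequences.append(ch)     # start a new combining sequence
    return sequences

# Hypothetical sample: U+0301 stands in for "diacritic".
sample = "ab" + " " + " \u0301" + " " + "cd"
for seq in combining_sequences(sample):
    print([unicodedata.name(c) for c in seq])
```

With this input, the SPACE followed by the accent comes out as one
sequence, with a plain SPACE on either side of it. Each letter of the two
words is its own sequence here, since none of them carries a mark.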

If you don't want any space around the diacritic, which must be displayed
in isolation but in the middle of a word, the following would work:
    ...endOfWord1, SPACE, diacritic, startOfWord2...
Here the SPACE is not a break opportunity, but just the base character
for the inserted diacritic. What the standard does not define is the
properties of such a SPACE+diacritic sequence: normally a combining
sequence inherits the properties of its base character, and the
properties of the diacritics are ignored.

But when the base character is a SPACE or NBSP, new properties may
be needed. If there's still a break opportunity on the base SPACE of a
combining sequence, it is not clear where the break occurs: before the
SPACE (i.e. before the combining sequence), or after the diacritic (i.e.
after the combining sequence)?

I think that the second option applies here, i.e. the base SPACE would
create a break opportunity at the end of the whole combining sequence
formed by the SPACE and the combining characters that follow it
(including CGJ if needed to fix canonical ordering).
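
To make that second option concrete, here is a minimal Python sketch of
the rule as proposed in this message. It is not the Unicode line breaking
algorithm, and the names combining_sequences and break_offsets are
hypothetical. It assumes that combining marks are the characters of
general category M, that only a SPACE base creates a break opportunity,
and that an NBSP base never does.

```python
import unicodedata

SPACE = "\u0020"
NBSP = "\u00A0"

def combining_sequences(text):
    """Group each base character with the combining marks that follow it."""
    sequences = []
    for ch in text:
        if sequences and unicodedata.category(ch).startswith("M"):
            sequences[-1] += ch
        else:
            sequences.append(ch)
    return sequences

def break_offsets(text):
    """Code point offsets where a break is allowed under the proposed rule:
    after the whole combining sequence whose base is SPACE."""
    offsets = []
    pos = 0
    for seq in combining_sequences(text):
        pos += len(seq)
        # The break goes after the last mark, never between SPACE and mark.
        # The end of the text is not reported as a break opportunity.
        if seq[0] == SPACE and pos < len(text):
            offsets.append(pos)
    return offsets

# Hypothetical samples: U+0301 COMBINING ACUTE ACCENT as the diacritic.
print(break_offsets("ab" + SPACE + "\u0301" + "cd"))
print(break_offsets("ab" + NBSP + "\u0301" + "cd"))
```

In the first sample the only offset reported falls after the accent, not
between the SPACE and the accent. In the second, the NBSP base yields no
break opportunity at all.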

Another similar case would be the use of an isolated nukta (which
normally modifies a following base character): the sequence
<nukta, SPACE> is a single combining sequence with a break
opportunity. So a sequence like <nukta, SPACE, acute accent>
would be unbreakable internally but would include a break opportunity
at its end, unless it is followed by a NBSP.
And the sequence <nukta, NBSP, acute accent> would be unbreakable
both in the middle and at both ends.

-- 
Philippe.
Spam is not tolerated: any unsolicited message will be
reported to your Internet service providers.