At 06:30 AM 4/24/2004, Peter Constable replied to Peter Kirk:
> problems do arise if there is more than one combining character
> between the base character and the VS and they are not in canonical
> order. But this is a marginal case which can be avoided by ensuring that
> canonical order is always used.

If data is always encoded in canonical order, then having a VS within
the combining mark sequence wouldn't create any normalization problems,
that's true. But you well know that people do not want their Hebrew data
in canonical order. Even if they did, it couldn't be guaranteed.

More simply put, if all data were always normalized, we wouldn't need normalization ;-).


Having character sequences that can't be normalized is not a 'marginal case'.
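
To make the reordering issue concrete, here is a minimal sketch using Python's standard unicodedata module. The Hebrew sequence and the use of VS1 (U+FE00) after it are purely hypothetical illustrations, not standardized variation sequences. The sketch only shows that canonical reordering changes which mark the VS immediately follows, and that a VS placed between the marks (combining class 0) blocks the reordering.

```python
import unicodedata

# Hypothetical example: bet, dagesh (ccc 21), patah (ccc 17), and VS1.
BET, DAGESH, PATAH, VS1 = "\u05D1", "\u05BC", "\u05B7", "\uFE00"

def nfd(text):
    """Return the canonical decomposition (NFD) of text."""
    return unicodedata.normalize("NFD", text)

def char_before_vs(text):
    """Return the character immediately preceding the first VS1."""
    return text[text.index(VS1) - 1]

# Marks typed in non-canonical order, with the VS at the end.
typed = BET + DAGESH + PATAH + VS1
normalized = nfd(typed)

# Normalization reorders the marks by combining class, so the VS
# now follows the dagesh instead of the patah.
print(char_before_vs(typed) == PATAH)
print(char_before_vs(normalized) == DAGESH)

# A VS placed between the marks has combining class 0, so it
# blocks reordering: these two strings stay distinct under NFD,
# although they would be equivalent without the VS.
inside_a = BET + DAGESH + VS1 + PATAH
inside_b = BET + PATAH + VS1 + DAGESH
print(nfd(inside_a) == nfd(inside_b))
```

Without the VS, the two mark orders normalize to the same string; with it, whether they do depends on where the VS sits.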

Furthermore, one of the defining characteristics of a VS character is that there must be a sufficiently large number of circumstances where it's OK to ignore its presence altogether. If there isn't, and if there's a strong semantic distinction between the character and its variation, then it's really not a good case for a VS - one should propose a new character.
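
As a rough sketch of what "ignoring its presence" can mean in practice, a process that does not care about variants could strip variation selectors before comparing strings. The helper names below are made up for illustration, and the ranges cover only the two general-purpose variation selector blocks (U+FE00..U+FE0F and U+E0100..U+E01EF), which is an assumption about what the process wants to ignore.

```python
def is_variation_selector(ch):
    """True for VS1..VS16 and VS17..VS256 (hypothetical helper)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF

def strip_variation_selectors(text):
    """Return text with all variation selectors removed."""
    return "".join(ch for ch in text if not is_variation_selector(ch))

def equal_ignoring_vs(a, b):
    """Compare two strings while ignoring variation selectors."""
    return strip_variation_selectors(a) == strip_variation_selectors(b)

# U+2268 followed by VS1 compares equal to plain U+2268 here.
print(equal_ignoring_vs("\u2268\uFE00", "\u2268"))
```

If a comparison like this would routinely give the wrong answer for the proposed variant, that suggests the distinction is too strong for a VS.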

A./




