On Tue, 07 Jan 2003 06:16:43 -0800 (PST), "Robert R. Chilton" wrote:

> I understand your interest in preserving the semantic or lexical
> distinction between an instance of a contracted series of single vowels
> and a true usage of the double vowel.  However, the procedure of
> normalization is designed to collapse all the variant encodings for a
> particular presentation form into a single, "normalized" encoding.
> ...
> Canonical combining classes are defined for combining characters (such
> as macron and dot-under, or the vowel signs of Tibetan) in order to
> support normalization of identical presentation forms to a single
> encoding.  So in the cases you cite, of "graphically identical but
> semantically different" instances, consistency in searching, sorting,
> etc. requires that all "graphically identical" presentation forms be
> normalized to a single normalized encoding.
> 

O.K. Your explanation of normalisation makes sense, and I'll change the encoding
of double and triple E and O vowel signs accordingly on my web pages. The only
query I still have is why a triple E vowel sign should be normalised to
<U+0F7B, U+0F7A> rather than <U+0F7A, U+0F7B>. What determines that the former
sequence is better than the latter?
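
One way to see how a normaliser actually treats the two orderings is to ask it
directly. The following is only a rough sketch in Python using the standard
unicodedata module; the names KA, E, EE and the two helper functions are my own,
chosen for illustration, and the results depend on the version of the Unicode
data that ships with the interpreter.

```python
import unicodedata

KA = "\u0f40"  # TIBETAN LETTER KA, used here only as a base letter
E = "\u0f7a"   # TIBETAN VOWEL SIGN E
EE = "\u0f7b"  # TIBETAN VOWEL SIGN EE


def combining_classes(text):
    """Return (code point, canonical combining class) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]


def survives_normalisation(text, form="NFC"):
    """True if normalising to the given form leaves the sequence unchanged."""
    return unicodedata.normalize(form, text) == text


ee_then_e = KA + EE + E  # <U+0F7B, U+0F7A> on a base letter
e_then_ee = KA + E + EE  # <U+0F7A, U+0F7B> on a base letter

for label, seq in (("EE+E", ee_then_e), ("E+EE", e_then_ee)):
    print(label, combining_classes(seq))
    for form in ("NFC", "NFD"):
        print("  unchanged under", form, "->", survives_normalisation(seq, form))
```

Canonical reordering only swaps two adjacent combining marks when the first has
a higher combining class than the second. So if the two vowel signs report the
same non-zero class, both sequences come through normalisation unchanged, and
the preference for one order over the other would have to rest on convention
rather than on the normalisation algorithm itself.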

Andrew
