On Tue, 07 Jan 2003 06:16:43 -0800 (PST), "Robert R. Chilton" wrote:
> I understand your interest in preserving the semantic or lexical
> distinction between an instance of a contracted series of single vowels
> and a true usage of the double vowel. However, the procedure of
> normalization is designed to collapse all the variant encodings for a
> particular presentation form into a single, "normalized" encoding.
> ...
> Canonical combining classes are defined for combining characters (such
> as macron and dot-under, or the vowel signs of Tibetan) in order to
> support normalization of identical presentation forms to a single
> encoding. So in the cases you cite, of "graphically identical but
> semantically different" instances, consistency in searching, sorting,
> etc. requires that all "graphically identical" presentation forms be
> normalized to a single normalized encoding.
>

O.K. Your explanation of normalisation makes sense, and I'll change the
encoding of double and triple E and O vowel signs accordingly on my web
pages.

The only query I still have is why a triple E vowel sign should be
normalised to <U+0F7B, U+0F7A> rather than <U+0F7A, U+0F7B>? What
determines that the former sequence is better than the latter sequence?

Andrew
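
One way to inspect the behaviour discussed above is to ask a Unicode library for the canonical combining class of each mark and to compare the normalized form of both orderings. The following is only a minimal sketch using Python's standard `unicodedata` module. The helper names `combining_classes` and `nfd_labels` are hypothetical, the base letter U+0F40 (TIBETAN LETTER KA) is chosen here purely for illustration, and the values reported depend on the Unicode version bundled with the Python build. Canonical ordering only swaps adjacent combining marks when the first has a higher non-zero class than the second, so the printed classes show whether either Tibetan sequence would be reordered.

```python
import unicodedata


def combining_classes(text):
    """Return (code point label, canonical combining class) for each character."""
    return [(f"U+{ord(ch):04X}", unicodedata.combining(ch)) for ch in text]


def nfd_labels(text):
    """Return the code point labels of the NFD-normalized form of text."""
    return [f"U+{ord(ch):04X}" for ch in unicodedata.normalize("NFD", text)]


# Latin case from the quoted message: macron (U+0304, class 230) typed
# before dot-under (U+0323, class 220). Canonical ordering moves the
# lower class first.
print(combining_classes("a\u0304\u0323"))
print(nfd_labels("a\u0304\u0323"))  # ['U+0061', 'U+0323', 'U+0304']

# Tibetan case from the question: both orderings of the E (U+0F7A) and
# EE (U+0F7B) vowel signs on an assumed base letter KA (U+0F40).
for seq in ("\u0f40\u0f7b\u0f7a", "\u0f40\u0f7a\u0f7b"):
    print(combining_classes(seq), "->", nfd_labels(seq))
```

Running the script prints the combining class of every character next to the normalized sequence, which makes it possible to check directly whether the two Tibetan orderings end up as the same encoding.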

