On Thu, 02 Jan 2003 04:43:23 -0800 (PST), "Chris Fynn" wrote:
>
> ----- Original Message -----
> From: "Robert R. Chilton" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Sunday, December 29, 2002 9:34 AM
> Subject: [tibex] Re: PRC asking for 956 precomposed characters
>
> > I had heard some rumors about this proposal over the past year and I was
> > interested to finally see n2558. Sadly, this proposal is flawed on many
> > counts. It seems that this proposal is motivated solely by
> > typographical considerations without concern for broader character data
> > processing needs. Although this character set might be fine for
> > computer-based typesetting of the modern Tibetan materials now being
> > printed in the Peoples' Republic of China, it is somewhat lacking as a
> > basis for interchange and processing of Tibetan-script data.
> >
> > Most notably this proposal represents the repertoire of a particular
> > sub-language (modern Tibetan as used in the PRC) rather than a script.
> > There are many examples of Tibetan-script words in classical Tibetan
> > works, as well as in Dzongkha and other Tibetan-script languages of
> > South Asia, that cannot be represented by this character set.
>
...

Whilst I agree in general with Robert's point-by-point refutation of document n2558, I still think that the Chinese proposal is being misrepresented when he states that it only "represents the repertoire of a particular sub-language (modern Tibetan as used in the PRC)".

It is true that the PRC proposal is biased towards PRC Tibetan orthography. For example, it includes glyphs for representing the non-Tibetan sounds FA, FI, FU, FE and FO as used in the PRC, but not the glyphs that are created by adding the TSA -PHRU mark [U+0F39] to the consonants PHA and BA, which are used outside the PRC for representing the sounds FA etc. and VA etc. Even so, it seems to me that the glyph repertoire covers not only Modern Tibetan, but also includes glyphs that are normally only found in early Tibetan texts (note for example the large number of Reversed I glyphs), as well as the vast majority of commonly encountered Sanskrit-usage stacks. Admittedly the proposal does not cover all conceivable consonant-vowel stacks, but I still maintain that it has fairly comprehensive coverage of the glyphs that are likely to be encountered in the vast majority of Tibetan texts, both secular and religious, ancient and modern.

Nevertheless, whether or not the Chinese proposal includes certain transliteration letters, obscure Sanskrit-usage stacks, or special letters used for writing Dzongkha is largely irrelevant. (As far as I know Dzongkha is just a dialect of Tibetan, or a separate language for political reasons, and written Dzongkha is much the same as written Tibetan; no doubt someone will correct me on this.) The proposal could easily be expanded to include the non-PRC usage letters, or a separate "Extended Brdarten" block could be proposed. The key point is that the existing Tibetan encoding model works just fine for all varieties of Tibetan, and there is simply no need for precomposed Tibetan characters.
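To make the point about the existing encoding model concrete, here is a minimal Python sketch of how the non-PRC FA and VA letters can be written as ordinary sequences of already-encoded characters. The code points are the ones named above (PHA is U+0F55, BA is U+0F56, TSA -PHRU is U+0F39); the helper name `describe` is hypothetical and only there for illustration.

```python
import unicodedata

# Code points for the existing characters named in the message.
PHA = "\u0F55"       # TIBETAN LETTER PHA
BA = "\u0F56"        # TIBETAN LETTER BA
TSA_PHRU = "\u0F39"  # TIBETAN MARK TSA -PHRU


def describe(seq):
    """Return a 'U+XXXX NAME' string for each code point in seq (hypothetical helper)."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in seq]


# Base consonant followed by the mark: no precomposed character is needed.
fa = PHA + TSA_PHRU
va = BA + TSA_PHRU

for label, seq in (("FA", fa), ("VA", va)):
    print(label, describe(seq))
```

Each letter is simply a base consonant followed by the mark, so the same pattern extends to any other consonant without adding new code points.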
I've posted my analysis of document n2558, together with a table mapping the proposed glyphs to existing Unicode sequences, at
http://uk.geocities.com/babelstone1357/Tibetan/brdarten.html

These are my main observations:

1. The proposal includes a single, apparently arbitrary, example of a consonant plus triple E vowel (Glyph 107) that is found only in Tibetan shorthand abbreviations, but many other consonant plus multiple vowel sign shorthand abbreviations that are frequently encountered in prayer flags and elsewhere are not covered by this proposal. (See http://uk.geocities.com/babelstone1357/Tibetan/shorthand.html for some illustrated examples of shorthand abbreviations.)

2. The proposal includes two examples of letters (KA and KHA) with a superfixed TIBETAN SIGN LCE TSA CAN [U+0F88] (Glyphs 029 and 100). This sign is most commonly used in Kalachakra literature, and there are presumably other instances of its usage combined with different letters that are not covered by this proposal. I'm not entirely sure how these glyphs should be encoded using the existing Unicode character encoding model. I assume that the sign LCE TSA CAN [U+0F88] should be encoded immediately following the base consonant with which it is associated (i.e. <U+0F40, U+0F88> for Glyph 029 and <U+0F41, U+0F88> for Glyph 100). Please correct me if I'm wrong.

3. The proposal includes two examples of letters (PA and PHA) with a superfixed TIBETAN MARK PALUTA [U+0F85] (Glyphs 435 and 486). Presumably there are other instances of its usage combined with different letters that are not covered by this proposal. Again I'm not entirely sure how these glyphs should be encoded using the existing Unicode character encoding model. I assume that the paluta [U+0F85] should be encoded immediately following the base consonant with which it is associated (i.e. <U+0F54, U+0F85> for Glyph 435 and <U+0F55, U+0F85> for Glyph 486). Please correct me if I'm wrong.

4. Glyph 687 [Tibetan BrdaRten Character ZHA], Glyph 698 [Tibetan BrdaRten Character ZA] and Glyph 713 [Tibetan BrdaRten Character AHA] in the proposal are respectively the letters ZHA [U+0F5E], ZA [U+0F5F] and -A [U+0F60] with a dot slightly right of centre over the top of the letter. I do not recognise this dot-like mark, and the names given in Document N2558 do not explain what it signifies. Can anyone enlighten me?

Andrew
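For reference, the candidate sequences from observations 2 and 3 can be written out and checked against the Unicode character names with a short Python sketch. The mapping below only records the assumption stated in those observations (sign or mark placed immediately after the base consonant); it is not a confirmed encoding, and the names `CANDIDATES`, `format_sequence` and `names` are hypothetical.

```python
import unicodedata

# Assumed (unconfirmed) sequences: base consonant, then the sign or mark.
CANDIDATES = {
    "Glyph 029": (0x0F40, 0x0F88),  # KA  + LCE TSA CAN
    "Glyph 100": (0x0F41, 0x0F88),  # KHA + LCE TSA CAN
    "Glyph 435": (0x0F54, 0x0F85),  # PA  + PALUTA
    "Glyph 486": (0x0F55, 0x0F85),  # PHA + PALUTA
}


def format_sequence(code_points):
    """Render code points in the <U+XXXX, U+YYYY> notation used above."""
    return "<" + ", ".join(f"U+{cp:04X}" for cp in code_points) + ">"


def names(code_points):
    """Look up the Unicode character name of each code point."""
    return [unicodedata.name(chr(cp)) for cp in code_points]


for glyph, cps in CANDIDATES.items():
    print(glyph, format_sequence(cps), names(cps))
```

Whether a renderer actually positions U+0F88 or U+0F85 above the preceding consonant depends on the font and shaping engine, which this sketch does not test.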

