Re: PRC asking for 956 precomposed Tibetan characters

Andrew C. West Thu, 02 Jan 2003 05:45:37 -0800

On Thu, 02 Jan 2003 04:43:23 -0800 (PST), "Chris Fynn" wrote:

> 
> ----- Original Message ----- 
> From: "Robert R. Chilton" <[EMAIL PROTECTED]>
> To: <[EMAIL PROTECTED]>
> Sent: Sunday, December 29, 2002 9:34 AM
> Subject: [tibex] Re: PRC asking for 956 precomposed characters
> 
> 
> > I had heard some rumors about this proposal over the past year and I was
> > interested to finally see n2558.  Sadly, this proposal is flawed on many
> > counts.  It seems that this proposal is motivated solely by
> > typographical considerations without concern for broader character data
> > processing needs.  Although this character set might be fine for
> > computer-based typesetting of the modern Tibetan materials now being
> > printed in the Peoples' Republic of China, it is somewhat lacking as a
> > basis for interchange and processing of Tibetan-script data.
>  
> > Most notably this proposal represents the repertoire of a particular
> > sub-language (modern Tibetan as used in the PRC) rather than a script. 
> > There are many examples of Tibetan-script words in classical Tibetan
> > works, as well as in Dzongkha and other Tibetan-script languages of
> > South Asia, that cannot be represented by this character set.
>


...

Whilst I agree in general with Robert's point-by-point refutation of document
n2558, I still think that he misrepresents the Chinese proposal when he states
that it only "represents the repertoire of a particular sub-language (modern
Tibetan as used in the PRC)". It is true that the PRC proposal is biased towards
PRC Tibetan orthography: it includes glyphs for representing the non-Tibetan
sounds FA, FI, FU, FE and FO as used in the PRC, but not the glyphs that are
created by adding the TSA -PHRU mark [U+0F39] to the consonants PHA and BA,
which are used outside the PRC for representing the sounds FA etc. and VA etc.
Even so, it seems to me that the glyph repertoire covers not only Modern
Tibetan, but also glyphs that are normally only found in early Tibetan texts
(note for example the large number of Reversed I glyphs), as well as the vast
majority of commonly encountered Sanskrit-usage stacks. Admittedly the proposal
does not cover all conceivable consonant-vowel stacks, but I still maintain that
it has fairly comprehensive coverage of the glyphs that are likely to be
encountered in the vast majority of Tibetan texts, both secular and religious,
ancient and modern.

Nevertheless, whether the Chinese proposal fails to include certain
transliteration letters, obscure Sanskrit-usage stacks or special letters used
for writing Dzongkha is largely irrelevant. (As far as I know Dzongkha is just a
dialect of Tibetan, or a separate language for political reasons, and written
Dzongkha is much the same as written Tibetan; no doubt someone will correct me
on this.) The proposal could easily be expanded to include the non-PRC usage
letters, or a separate "Extended Brdarten" block could be proposed. The key
point is that the existing Tibetan encoding model works just fine for all
varieties of Tibetan, and there is simply no need for precomposed Tibetan
characters.
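
To make the point about the existing encoding model concrete, here is a minimal
sketch in Python that spells out a few stacks as sequences of existing Unicode
characters. The choice of examples is an assumption made for illustration and is
not taken from n2558; the helper name describe() is hypothetical. The PHA and BA
sequences assume that TSA -PHRU [U+0F39] directly follows the consonant it
modifies.

```python
import unicodedata

# Illustrative sequences only (an assumption, not a list from n2558).
EXAMPLES = {
    # A three-letter stack: base SA, then subjoined GA, then subjoined RA.
    "stack SA + GA + RA": "\u0F66\u0F92\u0FB2",
    # Non-PRC spellings of FA and VA: consonant followed by TSA -PHRU.
    "FA as PHA + TSA -PHRU": "\u0F55\u0F39",
    "VA as BA + TSA -PHRU": "\u0F56\u0F39",
}


def describe(sequence):
    """Return one 'U+XXXX NAME' string per character in the sequence."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in sequence]


for label, sequence in EXAMPLES.items():
    print(label)
    for line in describe(sequence):
        print("   ", line)
```

Each stack is just a base consonant followed by subjoined letters or marks, so
no precomposed character is involved at any point.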

I've posted my analysis of document n2558, together with a table mapping the
proposed glyphs to existing Unicode sequences, at
http://uk.geocities.com/babelstone1357/Tibetan/brdarten.html

These are my main observations:

1. The proposal includes a single, apparently arbitrary, example of a consonant
plus triple E vowel (Glyph 107) that is found only in Tibetan shorthand
abbreviations, but many other consonant plus multiple vowel sign shorthand
abbreviations that are frequently encountered in prayer flags and elsewhere are
not covered by this proposal. (See
http://uk.geocities.com/babelstone1357/Tibetan/shorthand.html for some
illustrated examples of shorthand abbreviations.)

2. The proposal includes two examples of letters (KA and KHA) with a superfixed
TIBETAN SIGN LCE TSA CAN [U+0F88] (Glyphs 029 and 100). This sign is most
commonly used in Kalachakra literature, and there are presumably other instances
of its usage combined with different letters that are not covered by this
proposal. I'm not entirely sure how these glyphs should be encoded using the
existing Unicode character encoding model - I assume that the sign LCE TSA CAN
[U+0F88] should be encoded immediately following the base consonant with which
it is associated (i.e. <U+0F40, U+0F88> for Glyph 029 and <U+0F41, U+0F88> for
Glyph 100). Please correct me if I'm wrong.

3. The proposal includes two examples of letters (PA and PHA) with a superfixed
TIBETAN MARK PALUTA [U+0F85] (Glyphs 435 and 486). Presumably there are
other instances of its usage combined with different letters that are not
covered by this proposal. Again I'm not entirely sure how these glyphs should be
encoded using the existing Unicode character encoding model - I assume that the
paluta [U+0F85] should be encoded immediately following the base consonant with
which it is associated (i.e. <U+0F54, U+0F85> for Glyph 435 and <U+0F55, U+0F85>
for Glyph 486). Please correct me if I'm wrong.

4. Glyph 687 [Tibetan BrdaRten Character ZHA], Glyph 698 [Tibetan BrdaRten
Character ZA] and Glyph 713 [Tibetan BrdaRten Character AHA] in the proposal are
respectively the letters ZHA [U+0F5E], ZA [U+0F5F] and -A [U+0F60] with a dot
slightly right of centre over the top of the letter. I do not recognise this
dot-like mark, and the names given in Document N2558 do not explain what it
signifies. Can anyone enlighten me?
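
For anyone who wants to check the sequences suggested in points 2 and 3, the
following Python sketch lists them with their Unicode character names. The glyph
numbers are those of n2558; the ordering (mark immediately after the base
consonant) is only my assumption as stated above, not a confirmed encoding, and
the names ASSUMED_SEQUENCES, sequence_names() and to_text() are hypothetical.

```python
import unicodedata

# Glyph number in n2558 -> assumed code point sequence (base consonant, mark).
# The mark-after-base ordering is an unconfirmed assumption.
ASSUMED_SEQUENCES = {
    29: (0x0F40, 0x0F88),   # KA + LCE TSA CAN
    100: (0x0F41, 0x0F88),  # KHA + LCE TSA CAN
    435: (0x0F54, 0x0F85),  # PA + PALUTA
    486: (0x0F55, 0x0F85),  # PHA + PALUTA
}


def sequence_names(code_points):
    """Return the Unicode character name of each code point, in order."""
    return [unicodedata.name(chr(cp)) for cp in code_points]


def to_text(code_points):
    """Build the string that a font or rendering engine would be given."""
    return "".join(chr(cp) for cp in code_points)


for glyph, code_points in ASSUMED_SEQUENCES.items():
    hex_form = ", ".join(f"U+{cp:04X}" for cp in code_points)
    names = " + ".join(sequence_names(code_points))
    print(f"Glyph {glyph:03d}: <{hex_form}> {names}")
```

Whether a given font actually displays these sequences as the glyphs shown in
the proposal is a separate question that this sketch does not address.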

Andrew
