Antoine Leca
Sun, 12 Jan 2003 12:17:47 -0800
Edmund GRIMLEY EVANS wrote:
> (1) Are these words[*] encoded correctly?
>
> DEVANAGARI LETTER HA
> DEVANAGARI VOWEL SIGN AI
> DEVANAGARI SIGN ANUSVARA
>
> DEVANAGARI LETTER MA
> DEVANAGARI VOWEL SIGN E
> DEVANAGARI SIGN ANUSVARA
From the point of view of Devanagari (more precisely, of Unicode), yes. From the point of view of a particular language (which one?), I do not know.
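As a quick check of the two sequences quoted in (1), here is a minimal Python sketch that builds both words from the character names with the standard unicodedata module. The helper names from_names and code_points are made up for illustration, and the code says nothing about which language the words belong to.

```python
import unicodedata

def from_names(*names):
    """Build a string from Unicode character names (hypothetical helper)."""
    return "".join(unicodedata.lookup(name) for name in names)

def code_points(text):
    """List the code points of a string in U+XXXX notation."""
    return ["U+%04X" % ord(ch) for ch in text]

word1 = from_names(
    "DEVANAGARI LETTER HA",
    "DEVANAGARI VOWEL SIGN AI",
    "DEVANAGARI SIGN ANUSVARA",
)
word2 = from_names(
    "DEVANAGARI LETTER MA",
    "DEVANAGARI VOWEL SIGN E",
    "DEVANAGARI SIGN ANUSVARA",
)

print(word1, code_points(word1))
print(word2, code_points(word2))
```

Each word is a consonant followed by a dependent vowel sign and then the anusvara, which is the order the question lists them in.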
> (2) Do these three characters represent the same consonant in Hindi,
No. The ones with "VOCALIC" in the name are... vowels, not consonants!
> and is it non-retroflex?
> U+0930 "LETTER RA"
Non-retroflex. The retroflex versions are at U+0931, and also at U+095C, as I understand things.
> U+090B "LETTER VOCALIC R"
> U+0943 "VOWEL SIGN VOCALIC R"
(These two are the same sound.) I understand they may carry retroflex aspects. But I also understand there are multiple ways to pronounce them, particularly since the original (Vedic/Sanskrit) pronunciation seems to have been lost.
> My book transcribes U+090B as 'r' with a dot under it, which is also used to transcribe the retroflex R of U+095C and U+095D, so I'm a bit confused.
U+090B as r_underdot is the old way, which was chosen initially (in 1894!). Nowadays (ISO standard 15919, "transliteration of Indic scripts") one prefers r_underring, that is, r with a ring below. Similarly for U+090C, vocalic l.
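To make the two transliteration conventions concrete, here is a small Python sketch (the variable names are mine, not from any standard). It shows that r with a dot below has a precomposed code point, while r with a ring below has to be written as r followed by U+0325 COMBINING RING BELOW.

```python
import unicodedata

# Older convention: r + COMBINING DOT BELOW, which NFC composes to one code point.
r_underdot = unicodedata.normalize("NFC", "r\u0323")

# ISO 15919 convention: r + COMBINING RING BELOW; NFC leaves it as two code points.
r_underring = unicodedata.normalize("NFC", "r\u0325")

for label, text in (("r_underdot", r_underdot), ("r_underring", r_underring)):
    points = " ".join("U+%04X" % ord(ch) for ch in text)
    print(label, text, points)
```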
> (3) Why is U+095C called DDDHA in the Unicode character name? As far as I know it's not aspirated, so where does the H come from?
No idea. By the way, the Unicode names are only indicative; you should not follow them too blindly. This character is not "genuine" in Devanagari: it is used for transliterating Gurmukhi and Oriya, and in both languages the character is named RRA (but RRA in Devanagari is another thing, U+0931, which comes from Dravidian languages). I think this character would have been better named DDDA. But it is too late to change this, so go on with DDDHA.
> Is it in fact the non-aspirated version of U+095D (called "RHA")?
Yes, it is.
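The names and base letters discussed in (3) can be looked up directly. Below is a minimal Python sketch using the standard unicodedata module; describe is a hypothetical helper name. The decomposition field shows which base letter each character is built on, which is one way to see that U+095C sits on the unaspirated DDA and U+095D on the aspirated DDHA.

```python
import unicodedata

def describe(cp):
    """Return (name, decomposition) for a code point; decomposition may be ''."""
    ch = chr(cp)
    return unicodedata.name(ch), unicodedata.decomposition(ch)

# Devanagari RRA, DDDHA, RHA, then the Gurmukhi and Oriya letters named RRA.
for cp in (0x0931, 0x095C, 0x095D, 0x0A5C, 0x0B5C):
    name, decomp = describe(cp)
    print("U+%04X  %-24s  %s" % (cp, name, decomp or "(none)"))

# The base letters behind U+095C and U+095D.
print(unicodedata.name("\u0921"), "/", unicodedata.name("\u0922"))
```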
> DEVANAGARI LETTER DA
> DEVANAGARI VOWEL SIGN VOCALIC R
> DEVANAGARI LETTER DDHA
> DEVANAGARI SIGN NUKTA
>
> (4) Why is this word[*] written with DDHA + NUKTA instead of with
> U+095D ("RHA", which is used elsewhere in the same document)?
No idea. Unicode says both are equivalent (in the same way that writing U+00E0 is the same as writing U+0061 U+0300).
Hope it helps,
Antoine
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
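For the equivalence mentioned under (4), here is a minimal Python sketch, assuming that "equivalent" means canonical equivalence as tested by Unicode normalization. The function name canonically_equivalent is made up for illustration.

```python
import unicodedata

def canonically_equivalent(a, b):
    """True if two strings have the same canonical decomposition (NFD)."""
    return unicodedata.normalize("NFD", a) == unicodedata.normalize("NFD", b)

rha = "\u095d"                 # DEVANAGARI LETTER RHA
ddha_nukta = "\u0922\u093c"    # DEVANAGARI LETTER DDHA + DEVANAGARI SIGN NUKTA

# The two spellings differ as raw code points but compare equal after normalization.
print(rha == ddha_nukta, canonically_equivalent(rha, ddha_nukta))

# The Latin analogy from the reply: a-grave versus a + COMBINING GRAVE ACCENT.
print(canonically_equivalent("\u00e0", "a\u0300"))
```

As far as I know, U+095D is excluded from composition, so even NFC turns it into the two-character DDHA + NUKTA sequence. That could be one reason a normalized document contains that spelling, though it does not explain a document that mixes both.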