Namaste!

Quoting Edmund GRIMLEY EVANS <[EMAIL PROTECTED]>:
> (1) Are these words[*] encoded correctly?
>
> DEVANAGARI LETTER HA
> DEVANAGARI VOWEL SIGN AI
> DEVANAGARI SIGN ANUSVARA

Yes; this is the plural present tense form of the verb "to be".

> DEVANAGARI LETTER MA
> DEVANAGARI VOWEL SIGN E
> DEVANAGARI SIGN ANUSVARA

Yes; this is the postposition meaning "in".

> My book says these words contain candrabindu, but it is written as
> anusvara because of the vowel superscript. So it's not clear to me how
> the sign should be encoded.

From a linguistic point of view, in modern-day Hindi "candrabindu"
basically means the nasalization of the vowel it is attached to, while
"anusvaara" marks a nasal consonant homorganic with the sound that
follows it. In these two examples a nasalized vowel is meant, but an
"anusvaara" is written, since it looks nicer next to the other
above-combining diacritic. This is, however, just a modern convention
for the use of these two marks, and many Sanskrit texts do not follow
these rules. Therefore, concerning these diacritics, you should encode
what you see, not what you think their semantics should be.

> (2) Do these three characters represent the same consonant in Hindi,
> and is it non-retroflex?
>
> U+090B "LETTER VOCALIC R"

In Sanskrit this used to be a sign for the regular, non-retroflexed "r"
sound (an alveolar flap or a trill) when no real vowel was pronounced
in the same syllable. Cf. the common English pronunciation of "bottle",
where there are two syllables -- "bOt" and "l" (or "bO" and "tl") --
but just one real vowel; the "l" here functions as a vowel (it forms a
syllable), although it really is a consonant. In modern-day Hindi this
is usually pronounced "ri" (consonant + vowel).

> U+0930 "LETTER RA"

This is the regular non-syllabic consonant "r", with the same original
pronunciation as the previous one. Nowadays, however, it is always
pronounced just "r", never "ri".

> U+0943 "VOWEL SIGN VOCALIC R"

This is just the dependent form of "LETTER VOCALIC R", which is used
when the syllabic "r" has another consonant preceding it. This is also
pronounced "ri" in modern-day Hindi. An example word for these would be

  DEVANAGARI LETTER SA
  DEVANAGARI SIGN ANUSVARA
  DEVANAGARI LETTER SA
  DEVANAGARI SIGN VIRAMA
  DEVANAGARI LETTER KA
  DEVANAGARI VOWEL SIGN VOCALIC R
  DEVANAGARI LETTER TA

which used to be pronounced disyllabically "sans-krt", but in
modern-day Hindi is pronounced "sans-krit".

> (3) Why is U+095C called DDDHA in the Unicode character name? As far
> as I know it's not aspirated, so where does the H come from? Is it in
> fact the non-aspirated version of U+095D (called "RHA")?

Many Unicode character names have no relation to the character's real
pronunciation in any one particular language (although in this case I
don't know of a single language where this would be aspirated). In
Hindi, U+095C is the non-aspirated retroflex flap corresponding to its
aspirated version U+095D, as you quite correctly suppose.

Best regards,
Miikka-Markus Alhonen

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
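
For readers who want to check the sequences discussed above, here is a minimal Python sketch that builds the three words from their Unicode character names using the standard `unicodedata` module. The helper names `from_names` and `code_points` are made up for this illustration and are not part of the original message. The last part looks at the canonical decompositions of U+095C and U+095D, which is offered only as a side observation on the naming question and not as an explanation of how the names were chosen.

```python
import unicodedata

def from_names(*names):
    """Build a string from short names; the DEVANAGARI prefix is added here."""
    return "".join(unicodedata.lookup("DEVANAGARI " + n) for n in names)

def code_points(s):
    """Return the code points of s in U+XXXX notation."""
    return ["U+%04X" % ord(c) for c in s]

# Question (1): both words are encoded with ANUSVARA, as written.
hain = from_names("LETTER HA", "VOWEL SIGN AI", "SIGN ANUSVARA")
men = from_names("LETTER MA", "VOWEL SIGN E", "SIGN ANUSVARA")

# Question (2): the example word uses the dependent VOWEL SIGN VOCALIC R.
# The formal name of U+094D is DEVANAGARI SIGN VIRAMA.
sanskrit = from_names(
    "LETTER SA", "SIGN ANUSVARA", "LETTER SA", "SIGN VIRAMA",
    "LETTER KA", "VOWEL SIGN VOCALIC R", "LETTER TA",
)

for label, word in (("hain", hain), ("men", men), ("sanskrit", sanskrit)):
    print(label, " ".join(code_points(word)))

# Question (3): canonical decompositions of U+095C and U+095D.
def decomposition_names(cp):
    """Names of the characters in the NFD form of the given code point."""
    return [unicodedata.name(c) for c in unicodedata.normalize("NFD", chr(cp))]

for cp in (0x095C, 0x095D):
    print(unicodedata.name(chr(cp)), "->", decomposition_names(cp))
```

Only code points and character names are printed, so the output stays ASCII and does not depend on the terminal having a Devanagari font.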
