On Sun, 14 Aug 2011 19:59:30 +0530 Shriramana Sharma <[email protected]> wrote:
> On 08/14/2011 06:02 PM, Richard Wordingham wrote:
> > On Fri, 24 Jun 2011 18:24:01 +0530
> > Shriramana Sharma<[email protected]> wrote:
> >
> >> The point is that the sequence:
> >>
> >> <la, virama, candrabindu, la>
> >>
> >> is strictly speaking *the* sequence recommended *across* Indic
> >> scripts for representation of Sanskrit clusters involving a nasal
> >> and non-nasal "semivowel".
> >
> > Could you please quote me chapter and verse for this from the TUS or
> > other relevant ruling.
>
<snip>
> However, people working with Indic rendering in a major operating
> system support the concept (see
> http://www.unicode.org/mail-arch/unicode-ml/y2011-m06/0153.html).

Thanks, that's useful as a reference - it helps me find it later.

> To make it official I'll submit a document for this matter to be
> included in the published Standard.

To me, the issue is the relative ordering of candrabindu and virama.

> For Indian Indic scripts we have attestations for both scripts using
> C1-conjoining forms and C2-conjoining forms.

The issue is the relative ordering of candrabindu and virama. For a
C1-conjoining form (i.e. C2 relatively unmodified),
<la virama candrabindu la> is easier to handle. For a C2-conjoining
form, <la candrabindu virama la> is easier to work with.

Vowels and the like already occur within Tai Tham and Khmer consonant
clusters with C2-conjoining forms. Normally the virama equivalent
(canonical combining class 9) occurs immediately before C2, but it can
already be displaced in normalised text, e.g. the Northern Thai loan
word ᩈᩮᩥᩁ᩠᩺ᨷ (from English 'serve') normalised to ᩈᩮᩥᩁ᩠᩺ᨷ
<U+1A48 TAI THAM LETTER HIGH SA, U+1A6E TAI THAM VOWEL SIGN E,
U+1A65 TAI THAM VOWEL SIGN I, U+1A41 TAI THAM LETTER RA,
U+1A60 TAI THAM SIGN SAKOT, U+1A7A TAI THAM SIGN RA HAAM,
U+1A37 TAI THAM LETTER BA>.
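The displacement described above can be checked with a short Python sketch using the standard `unicodedata` module. The "typed" order below (RA HAAM directly after RA, SAKOT immediately before BA) is my assumption about the unnormalised spelling, not something quoted from a source; the variable names are purely illustrative. The second half shows why the Devanagari question is not settled by normalisation: candrabindu has combining class 0, so the two candidate orders are not canonically equivalent.

```python
import unicodedata

def describe(text):
    """Return (code point, canonical combining class) for each character."""
    return [(f"U+{ord(c):04X}", unicodedata.combining(c)) for c in text]

# Assumed typed order: HIGH SA, E, I, RA, RA HAAM, SAKOT, BA
# (SAKOT, the virama equivalent, sits immediately before C2 = BA).
typed = "\u1A48\u1A6E\u1A65\u1A41\u1A7A\u1A60\u1A37"
normalised = unicodedata.normalize("NFC", typed)

# SAKOT has ccc 9 and RA HAAM has ccc 230, so canonical reordering
# moves SAKOT in front of RA HAAM and away from BA.
print(describe(typed))
print(describe(normalised))

# Devanagari: LA, VIRAMA (ccc 9), CANDRABINDU (ccc 0).
la, virama, candrabindu = "\u0932", "\u094D", "\u0901"
order_a = la + virama + candrabindu + la   # <la, virama, candrabindu, la>
order_b = la + candrabindu + virama + la   # <la, candrabindu, virama, la>

# Candrabindu is a starter, so normalisation never swaps it with virama:
# each order is left unchanged, and the two remain distinct strings.
print(unicodedata.normalize("NFC", order_a) == order_a)
print(unicodedata.normalize("NFC", order_b) == order_b)
print(order_a != order_b)
```

So in Tai Tham the normalised order is forced by the combining classes, whereas for candrabindu and virama the choice of order has to come from a ruling rather than from normalisation.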
The rendering on p155 of Bunkhit Watcharasat's 'Northern Thai
Teach-Yourself Book' (Siamese: ภาษาเมืองล้านนา ฉบับเรียนด้วยตนเอง) makes
it clear that the ra haam (vowel killer, here acting as a consonant
killer) acts on the letter ra.

I've seen a claim that vowels within Tibetan consonant stacks can be
handled sensibly within the confines of Unicode - I didn't investigate
it.

I think an official ruling should cover all Indic scripts, ideally even
those encoded in writing order, such as Thai. (I'm presuming the Thai
script's subscript consonants will be supported one day, and Lao
already has one unambiguously subscript consonant.)

Richard.

