Thai, Khmer, Lao and Tai Viet are already exceptions to the Unicode character encoding model. This exception should remain confined to the native scripts of this region of Indochina. All other Indic scripts that use the logical encoding order (including those of Burma and the Philippines) should behave in the same coherent way.
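To make the difference between the two encoding orders concrete, here is a small Python sketch. It is only an illustration, not taken from any specification: the helper name `describe` and the two sample syllables are my own choices. It lists the stored code points of a Thai syllable, where the preposed vowel is stored first, and of a Devanagari syllable, where the vowel sign is drawn on the left but stored after the consonant.

```python
import unicodedata


def describe(text):
    """Return (code point, Unicode name) pairs in stored (memory) order."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in text]


# Thai: the vowel SARA E is written to the left of the consonant
# and is also stored before it (visual order).
thai = "\u0E40\u0E01"

# Devanagari: the vowel sign I is drawn to the left of the consonant,
# but it is stored after the consonant (logical order).
devanagari = "\u0915\u093F"

for label, text in (("Thai", thai), ("Devanagari", devanagari)):
    print(label)
    for code_point, name in describe(text):
        print("   ", code_point, name)
```

In both cases the vowel appears to the left of the consonant when rendered; only the stored order differs, which is the distinction at issue here.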
So the Khmer Coeng is not a good example, as Khmer does not behave like, and is not classified as, a regular Indic script, despite its historic origin (there is a similar split of representation between the Semitic scripts and Greek/Coptic, even though they share a common historic origin).

The split between the Indochinese scripts and the other scripts of Brahmic origin is probably much more recent (and justified by compatibility with legacy encodings), but it is justifiable to consider those Indochinese scripts as a class separate from the "Indic" scripts, within the same large "Brahmic" family.

The so-called "Unicode character model" already distinguishes classes of alphabetic scripts, abjads, abugidas (Indic), syllabaries, and sinographic scripts, within the phonographic family, plus logographic scripts. This just adds another class for Indochinese abugidas (using the visual encoding order), which should probably be formalized officially.

Philippe.

2011/8/14 Richard Wordingham <richard.wording...@ntlworld.com>:
> On Fri, 24 Jun 2011 18:24:01 +0530
> Shriramana Sharma <samj...@gmail.com> wrote:
>
>> The point is that the sequence:
>>
>> <la, virama, candrabindu, la>
>>
>> is strictly speaking *the* sequence recommended *across* Indic
>> scripts for representation of Sanskrit clusters involving a nasal and
>> non-nasal "semivowel".
>
> Could you please quote me chapter and verse for this from the TUS or
> other relevant ruling. It contradicts TUS 6.0 Section 11.4 Ordering of
> Syllable Components (p367), which treats U+17D2 KHMER SIGN COENG and
> its following consonant (or independent vowel) as inseparable.
>
> It also creates the further oddity that when using a 'consonant sign'
> (Tibetan, possibly Myanmar, and Tai Tham) one would have the sequence
> <la, candrabindu, subjoined la>. (Alas, I don't have any relevant
> Sanskrit examples in those scripts.)
>
> The problem may be what is meant by an 'Indic script'? Do you include
> Tibetan and Further Indian Indic scripts (e.g. Myanmar, Tai Tham and
> Khmer), or do you just mean Indian Indic scripts?
>
> Richard.