On Tue, 14 May 2019 00:58:07 +0000 Andrew Glass via Unicode <email@example.com> wrote:
> Here is the essence of the initial changes needed to support CV+C.
> Open to feedback.
>
>   * Create new SAKOT class
>     SAKOT (Sk) based on UISC = Invisible_Stacker
>   * Reduced HALANT class
>     Now only HALANT (H) based on UISC = Virama
>   * Updated Standard cluster mode
>
> [< R | CS >] < B | GB > [VS] (CMAbv)* (CMBlw)* (< < H | Sk > B | SUB >
> [VS] (CMAbv)* (CMBlw)*)* [MPre] [MAbv] [MBlw] [MPst] (VPre)*
> (VAbv)* (VBlw)* (VPst)* (VMPre)* (VMAbv)* (VMBlw)* (VMPst)* (Sk B)*
> (FAbv)* (FBlw)* (FPst)* [FM]

This comes a lot closer to supporting Tai Tham monosyllabic clusters.
Although this shouldn't affect Tai Tham, some of those medials need to
be made repeatable; I believe this has already been done in HarfBuzz.

I trust you'll be reclassifying U+1A55 TAI THAM CONSONANT SIGN MEDIAL RA
and U+1A56 TAI THAM CONSONANT SIGN MEDIAL LA into the category SUB so
that we can write about bananas forever (ᨠᩖ᩠ᩅ᩠᩶ᨿᨲᩕ᩠ᩃᩬᨯ):

<HIGH KA, MEDIAL LA, SAKOT, WA, TONE-2, SAKOT, LOW YA> /kluai/ 'banana'

<HIGH TA, MEDIAL RA, SAKOT, LA, SIGN OA BELOW, DA> /tʰalɔːt/ 'for ever'

The issues here are that WA in a medial rôle is indistinguishable from
a coda ('sakot') consonant and that MEDIAL RA can act as a consonant
aspirator.

Unfortunately, we didn't define a consonant HIGH RATTHA with a
canonical decomposition to <U+1A2D RATA, U+1A5B SIGN HIGH RATHA OR LOW
PA>. The problem is that 'HIGH RATTHA', widely seen as an alternative
form of HIGH RATHA, can act as a subscript coda consonant. There are
also a couple of words in the Northern Thai Dictionary of Palm-Leaf
Manuscripts where MEDIAL LA acts as a coda consonant. Together, these
call for (Sk B)* to be replaced by (<Sk B | SUB>).

This next question does not, I believe, affect HarfBuzz. Will NFC code
render as well as unnormalised code? In the first example above,
<TONE-2, SAKOT, LOW YA> normalises to <SAKOT, TONE-2, LOW YA>, which
does not match any portion of the regular expression.

Richard.
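
The normalisation point can be checked with a minimal Python sketch. It is an illustration only, not the shaping engine's logic. The code points follow the character names used above. The class table and the reduced pattern `B (<Sk B | SUB>)* (VMAbv)* (Sk B)*` are assumptions made for the sketch: MEDIAL LA is treated as SUB (the reclassification asked for above) and TONE-2 as VMAbv. The one-letter codes and the helper names `classes` and `is_single_cluster` are hypothetical.

```python
import re
import unicodedata

# Code points taken from the Unicode character names used in the message.
HIGH_KA, WA, LOW_YA = "\u1A20", "\u1A45", "\u1A3F"
MEDIAL_LA, SAKOT, TONE_2 = "\u1A56", "\u1A60", "\u1A76"

# <HIGH KA, MEDIAL LA, SAKOT, WA, TONE-2, SAKOT, LOW YA>, as written in the message.
typed = HIGH_KA + MEDIAL_LA + SAKOT + WA + TONE_2 + SAKOT + LOW_YA
nfc = unicodedata.normalize("NFC", typed)

# ASSUMPTION: simplified class assignment for this sketch only.
# B = base, S = SUB, K = Sk (SAKOT), T = VMAbv (tone mark).
CLASSES = {
    HIGH_KA: "B", WA: "B", LOW_YA: "B",
    MEDIAL_LA: "S",   # assumes the requested reclassification to SUB
    SAKOT: "K",
    TONE_2: "T",      # assumed to fall in VMAbv
}

# ASSUMPTION: reduced form of the quoted pattern,
#   B (<Sk B | SUB>)* (VMAbv)* (Sk B)*
# with every class not needed for this word left out.
CLUSTER = re.compile(r"B(?:KB|S)*T*(?:KB)*")


def classes(text):
    """Map each character to its one-letter class code."""
    return "".join(CLASSES[ch] for ch in text)


def is_single_cluster(text):
    """True if the whole string matches the reduced cluster pattern."""
    return CLUSTER.fullmatch(classes(text)) is not None


if __name__ == "__main__":
    # Canonical combining classes explain the reordering under normalisation.
    print(unicodedata.combining(TONE_2), unicodedata.combining(SAKOT))
    print(classes(typed), is_single_cluster(typed))
    print(classes(nfc), is_single_cluster(nfc))
```

Under these assumptions the sequence as typed fits the reduced pattern as one cluster, while its NFC form does not, because canonical reordering moves SAKOT (combining class 9) in front of TONE-2 (combining class 230), leaving a tone mark between SAKOT and the following base.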