On Thu, 17 May 2018 23:38:27 -0800 James Kass via Unicode <[email protected]> wrote:
> I wrote,
>
> > Changing the entry order to:
> > ᨽᩮᩣᨾᨶᩣᩮ
> > <LOW PHA, SIGN E, SIGN AA, MA, NA, SIGN AA, SIGN E>
> > ... forms the NAA ligature and the vowel re-ordering matches the
> > Lamphun graphic you sent.  But that kludge probably breaks the
> > preferred encoding model/order.
>
> On the other hand, do the script users normally input the NAA ligature
> sequence first and then add any additional signs or marks?  If the
> users consider NAA to be a distinct "letter", then that might explain
> why a font developed by a user accommodates the ligation for the string
> "NA" + "AA" only when nothing else appears between them.  If, for
> example, there's a popular input method or keyboard driver which puts
> "NAA" on its own key, then the users will be producing data which is
> "NA" plus "AA" plus anything else.

There was a keyboard map in the zip file that you may have got the font
from, http://www.kengtung.org/font-download/Tai-Tham-Unicode-for-PC.zip .
It has three key symbols per key - plain, shift and capslock.  Every
combination corresponds to a single character.

There's also a zip file for a non-Unicode font,
http://www.kengtung.org/font-download/Tai-Tham-Non-Unicode-for-PC.zip
and that has a corresponding keyboard.  Now, while I haven't looked at
the font, it looks like a direct key-to-glyph mapping, and as I would
have expected from the pre-Unicode Wat Inn hack encoding, the English
keystroke for 'o' (the keystroke for THAI CHARACTER NO NU) yields NA and
the keystroke for 'O' yields the NAA ligature.  I may be wrong about the
relationship - the top vowel + tone ligatures seem to be missing from
the keyboard.

So, the evidence is ambiguous.  The dictionaries I have seen do not
treat NAA as an indivisible character - NAA plus subscript is treated
differently depending on whether the subscript consonant phonetically
precedes or follows the vowel.  However, the rule that a homorganic
subscript precedes the vowel and others follow it works pretty well.

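For anyone who wants to check the ligation condition James describes against their own data, here is a minimal Python sketch. It only tests whether NA is immediately followed by SIGN AA in the encoded string; it says nothing about what any particular font or renderer does. The function name and the second test string (with SIGN E placed between NA and SIGN AA) are my own, purely for illustration, and are not a claim about the preferred encoding order.

```python
import unicodedata

NA = "\u1A36"       # TAI THAM LETTER NA
SIGN_AA = "\u1A63"  # TAI THAM VOWEL SIGN AA


def naa_candidate_positions(text):
    """Return the indices where NA is immediately followed by SIGN AA.

    Hypothetical helper: this is the condition under which a font that
    ligates only the bare pair NA + AA could form the NAA ligature.
    """
    return [
        i
        for i in range(len(text) - 1)
        if text[i] == NA and text[i + 1] == SIGN_AA
    ]


def describe(text):
    """List code points and Unicode character names, one per line."""
    return "\n".join(f"U+{ord(c):04X} {unicodedata.name(c)}" for c in text)


# The reordered sequence quoted above:
# <LOW PHA, SIGN E, SIGN AA, MA, NA, SIGN AA, SIGN E>
reordered = "\u1A3D\u1A6E\u1A63\u1A3E\u1A36\u1A63\u1A6E"

# Illustrative only: the same characters with SIGN E between NA and SIGN AA.
separated = "\u1A3D\u1A6E\u1A63\u1A3E\u1A36\u1A6E\u1A63"

if __name__ == "__main__":
    print(describe(reordered))
    print(naa_candidate_positions(reordered))
    print(naa_candidate_positions(separated))
```

Running it over a corpus would give a rough count of how often NA + AA is split by an intervening mark, which bears on the keyboard question above.
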
Now, the chanting of Pali declensions, if related to writing, should
bring home via the participles in -nt- that there is a close
relationship between <NA, subscript HIGH TA> and <NAA, subscript HIGH
TA>.  It would be interesting to see how often ligation fails in
participles.

However, I think there is a different explanation for the sequence.
There are suggestions around that aksharas should be encoded with left
matras in second place.  This makes it easier for fonts.  I think we're
seeing an encoding based on ease of font design.

Now, one doesn't need this.  If feature ss02 is enabled, the fonts of my
Da Lekh family will convert a transliteration of Tai Tham letters,
numbers and marks to ASCII back to the original Tai Tham text.  All I
need is a feature activation, which ASCII normally has the privilege of
receiving.  I believe I could do it all by ccmp, but this feature is a
fallback for when the renderer does not support Tai Tham.

At present, Tai Tham seems to be in grave danger of breaking up into a
number of font encodings - one chooses the rendering system, and that
determines the allowed sequences, even for fairly simple words.

The Xishuangbanna News appears to be using a visual order encoding.  I
suspect this works because syllables are separated by spaces, so they
don't have to worry about Indic rearrangement being applied despite the
lack of lookups for OTL script "lana".

Richard.

