On Tue, 24 Feb 2015 09:26:41 -0800 Roozbeh Pournader <[email protected]> wrote:
> On Tue, Feb 24, 2015 at 5:03 AM, Richard Wordingham <
> [email protected]> wrote:
>
> > Are we still left with IndicSyllabicCategory.txt as the only
> > functional definition of the properties?

> Not necessarily. USE seems to use a combination of Indic syllabic,
> Indic positional, and general categories, with some codepoints as
> exceptions. HarfBuzz has been using some very similar techniques too,
> with tables automatically derived from the Unicode data files and
> then some exceptions in code.

That's what I'd call a *formal* definition.  The definition of
well-formed clusters by USE provides what I would regard as a
*functional* definition.  One can then classify a character by where it
occurs.  Of course, USE need not have captured all combinations, and
indeed I say it has not.

> > 1. Is <consonant><dependent_vowel>_<dependent_vowel> an allowed
> > context for a 'Consonant_Medial' if it is allowed for an invisible
> > stacker plus consonant?
<snip>
> > 3. Are they allowed contexts for 'Consonant_Subjoined' if they are
> > allowed for an invisible stacker plus consonant?

> They could be, as soon as we have evidence that there is need for
> allowing them (if we don't allow them at the moment). Generally, give
> us the character sequence that should work and doesn't, and why your
> sequence is correct according to Unicode encoding of a script, and
> HarfBuzz will get the patterns fixed to allow the character sequence.

That's circular!

The USE makes very little distinction between a consonant_medial and a
consonant_subjoined.  One distinction is that a consonant_medial cannot
be followed by <invisible_stacker, consonant>.  The MFL Revision 1
p801 (I need this reference for the UTC) has eight words starting with
the cluster <HIGH HA, MEDIAL LA, SAKOT, WA> /lw/, so U+1A56 TAI THAM
CONSONANT SIGN MEDIAL LA should be 'Consonant_Subjoined'.  I've also
seen it after a vowel in another dictionary.
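
To make the distinction concrete, here is a toy sketch in Python.  It
assumes a deliberately simplified cluster rule and is not the actual
USE grammar or HarfBuzz code; the class labels, the lookup table, and
the function names `classify` and `is_well_formed` are hypothetical.
The only rule it models is the one stated above: <invisible_stacker,
consonant> may not follow a consonant_medial.

```python
# Toy model only: NOT the real USE cluster grammar or HarfBuzz code.
BASE, MEDIAL, SUBJOINED, STACKER = "B", "CM", "SUB", "H"

# Hypothetical classification table for the four characters in the example.
DEFAULT_CLASSES = {
    0x1A49: BASE,     # TAI THAM LETTER HIGH HA
    0x1A45: BASE,     # TAI THAM LETTER WA
    0x1A56: MEDIAL,   # TAI THAM CONSONANT SIGN MEDIAL LA
    0x1A60: STACKER,  # TAI THAM SIGN SAKOT
}


def classify(text, overrides=None):
    """Map each character to a toy class, with optional per-codepoint overrides."""
    table = dict(DEFAULT_CLASSES)
    table.update(overrides or {})
    return [table[ord(ch)] for ch in text]


def is_well_formed(classes):
    """Accept one cluster: a base, then medials, subjoined signs, or
    <stacker, base> pairs, where no <stacker, base> may follow a medial."""
    if not classes or classes[0] != BASE:
        return False
    seen_medial = False
    i = 1
    while i < len(classes):
        c = classes[i]
        if c == STACKER:
            # The one rule modelled: no stacked consonant after a medial.
            if seen_medial:
                return False
            if i + 1 >= len(classes) or classes[i + 1] != BASE:
                return False
            i += 2
            continue
        if c == MEDIAL:
            seen_medial = True
        elif c != SUBJOINED:
            return False  # a bare base here would start a new cluster
        i += 1
    return True


# <HIGH HA, MEDIAL LA, SAKOT, WA>
word = "\u1A49\u1A56\u1A60\u1A45"

as_medial = is_well_formed(classify(word))
as_subjoined = is_well_formed(classify(word, {0x1A56: SUBJOINED}))
print(as_medial, as_subjoined)
```

Under these assumptions the cluster is rejected while U+1A56 is
classified as a medial, and accepted once it is reclassified as
subjoined.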
There is a word in which U+1A55 TAI THAM CONSONANT SIGN MEDIAL RA
phonetically follows a written vowel, so that eliminates the medial
consonants as a Tai Tham category!

Richard.
_______________________________________________
HarfBuzz mailing list
[email protected]
http://lists.freedesktop.org/mailman/listinfo/harfbuzz
