Peter Jacobi wrote,

> So, which codepoint sequence will imply the disjoint form and
> which will imply the ligated form? If 'Indic unification' still
> holds, the conjunct form always is the default and the disjoint
> form needs ZWNJ.
>
> IMHO this doesn't fit well actual Tamil use and raises a lot of
> practical problems.
>
> Either there must be an accepted list of these ligatures (but
> lists of archaic usage tend to grow), or one is bound to put a
> preemptive ZWNJ after every SHA VIRAMA in modern use, to prevent
> conjunct consonant forming.
>
> If this archaic ligature problem extends to other grantha
> consonants, even more preemptive ZWNJs are necessary for
> contemporary Tamil.
The Unicode string U+0BB2, U+0BC8 (லை) will display differently, depending on which font is used. Code2000 will display an old-fashioned ligature glyph, Latha will show a more modern alternative, and TabAvarangal2 ( http://www.geocities.com/avarangal ) will render the string in a proposed Tamil script-reform style. Yet, the underlying encoded character string is constant.

It may be possible and desirable to treat these archaic ligature forms similarly. Fonts designed for modern Tamil simply won't include these archaic ligature glyphs, so it shouldn't be necessary to insert ZWNJs all over the place in existing files.

Anyone seeking to reproduce a Tamil classic would need to specify an appropriate font which includes the archaic ligatures. Users whose systems lacked the appropriate font would still be able to read the document, however.

IMHO, it's important to preserve options for users to explicitly control ligation in plain text. With these archaic Tamil ligatures, an author *may* elect to insert ZWNJs and other appropriate formatting characters to preserve such distinctions where desired.

I'm still concerned about the SHRII ligature encoding, though. Of course, it makes sense to treat the ligature as a conjunct of SHA + RA + II, but since SA + RA + II seems to have been the "official" way to encode the ligature -- the proposed change will break existing implementations.

It might be best to add the new SHA character without changing the existing SHRII encoding (SA + RA + II).

Best regards,

James Kass
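P.S. For anyone who wants to inspect the sequences under discussion, here is a rough Python sketch. It is only an illustration, not a recommendation. The names `describe` and `block_conjuncts` are made up for this note. The sketch assumes that SHA ends up at U+0BB6 and that the SHRII sequences carry an explicit VIRAMA (U+0BCD) between the consonants, which the shorthand "SA + RA + II" above leaves out. `block_conjuncts` shows what the "preemptive ZWNJ after every SHA VIRAMA" approach from the quoted message would amount to in plain text.

```python
import re
import unicodedata

ZWNJ = "\u200C"    # ZERO WIDTH NON-JOINER
VIRAMA = "\u0BCD"  # TAMIL SIGN VIRAMA

# Sequences discussed in this thread. The explicit VIRAMA in the two
# SHRII spellings is an assumption of this sketch.
LAI = "\u0BB2\u0BC8"                 # LA + vowel sign AI
SHRII_SA = "\u0BB8\u0BCD\u0BB0\u0BC0"   # SA + VIRAMA + RA + II (existing practice)
SHRII_SHA = "\u0BB6\u0BCD\u0BB0\u0BC0"  # SHA + VIRAMA + RA + II (proposed)


def describe(text):
    """List each code point of `text` with its Unicode character name."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]


def block_conjuncts(text, consonant):
    """Insert ZWNJ after every `consonant` + VIRAMA pair in `text`.

    Pairs already followed by ZWNJ are left alone, so applying the
    function twice gives the same result as applying it once.
    """
    pair = consonant + VIRAMA
    pattern = re.escape(pair) + "(?!" + ZWNJ + ")"
    return re.sub(pattern, pair + ZWNJ, text)


if __name__ == "__main__":
    for label, seq in [("LAI", LAI), ("SHRII (SA)", SHRII_SA),
                       ("SHRII (SHA)", SHRII_SHA)]:
        print(label)
        for line in describe(seq):
            print("  " + line)
    print(describe(block_conjuncts(SHRII_SHA, "\u0BB6")))
```

Whether a given font actually forms or suppresses the ligature for any of these strings is a rendering question that the script cannot answer; it only shows that the encoded text stays the same regardless of the font chosen.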

