Thomas Milo wrote:

What do you think of my example of the Pakistani tanween with small
meem, indicating tanween + iqlaab, which from the grapheme point of
view is in addition to and offset from the tanween?


(http://kprayertime.sourceforge.net/calligraphy/tanween-dammataan-iqlaab.png)

Doesn't this indicate that iqlaab should be encoded as such, and not
incorporated into the tanween?


Well, in my view this is an example of how not to identify graphemes. The
Egyptian and Saudi editions express iqlaab with a ligature of vowel and
small meem; your example shows a tanween ligature with small meem. But the
underlying grapheme is identical: tanween+iqlaab.

I would strongly urge you not to construe these as "ligatures". "Ligature" is a term of art in modern computational typography. I don't believe a calligrapher writing a Quran would say a vowel followed by a small meem is a single unit, let alone a ligature. In fact, the language itself indicates this: the operation of iqlaab has nothing to do with the vowel; ditto for the operation of tanween and ikhfaa.


The first thing to agree on is to encode iqlaab as a separate grapheme. What
remains then is how to encode tanween. Unicode adopted the tanween ligatures
as separate codes. My opinion is that the ligatures fathatan, dhammatan and
kasratan are not graphemes, but ligatures consisting of exactly what their
Arabic names indicate: two fathas, two dhammas and two kasras.

My understanding is that Unicode does not construe the -atan codepoints as ligatures but as single things. They were adopted because that's the way all the legacy encodings did things.
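
For what it's worth, this is easy to check against the Unicode character database. Below is a rough sketch using Python's standard unicodedata module; the TANWEEN table and the describe() helper are names I made up for illustration, and I am assuming U+06E2 and U+06ED are the small-meem marks relevant to iqlaab. It only shows that the -atan codepoints carry no decomposition and that normalization never turns two vowels into a tanween or the reverse.

```python
import unicodedata

# Hypothetical lookup table: each tanween codepoint paired with the
# single vowel that its Arabic name suggests it is "two of".
TANWEEN = {
    "\u064B": "\u064E",  # FATHATAN -> FATHA
    "\u064C": "\u064F",  # DAMMATAN -> DAMMA
    "\u064D": "\u0650",  # KASRATAN -> KASRA
}

# Assumed to be the small-meem marks used to write iqlaab.
SMALL_MEEMS = ["\u06E2", "\u06ED"]


def describe(ch):
    """Return (name, general category, decomposition) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.category(ch),
        unicodedata.decomposition(ch),
    )


for tanween, vowel in TANWEEN.items():
    name, category, decomposition = describe(tanween)
    print(f"U+{ord(tanween):04X} {name}: category={category}, "
          f"decomposition={decomposition!r}")
    # Normalization leaves the tanween alone in every form ...
    for form in ("NFC", "NFD", "NFKC", "NFKD"):
        assert unicodedata.normalize(form, tanween) == tanween
    # ... and never composes a doubled vowel into a tanween.
    assert unicodedata.normalize("NFC", vowel * 2) == vowel * 2

for meem in SMALL_MEEMS:
    name, category, _ = describe(meem)
    print(f"U+{ord(meem):04X} {name}: category={category}")
```

So as far as the standard's own data goes, a tanween is one nonspacing mark with no internal structure, and the small meems are separate marks alongside it.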

Also I'd be careful about using "grapheme"; it may be the best and most accurate terminology, but that doesn't mean the Unicode crowd accepts it; in fact I predict that if you say "Unicode encodes graphemes" on the Unicode list you'll get a lot of howling. "Abstract character" is the Unicode way. My own preference is "semantic unit" or the like. Don't look for a lot of logical precision and consistency and simplicity in the language of Unicode. :(

-g