Waleed,

Thank you for your questions. I had left out some things that should have been clarified first.

> - Is there any complain from Arab/Persian users on the usability of the
> current implementation?

If you mean their daily usage, casual or business, I guess not.
# But I don't use Arabic in most of my life, so please, someone correct me.

> - Is there a shortcoming from the current specifications?

So, no in the above sense, but yes in another. Let me explain.

There was a discussion on this [email protected] list a month ago about how to encode the Qur'aan, or rather a Mushaf, under the subject "Questions about yeh, hamzah on yeh, alef maksura and dotless ba". [1] More than 60 messages were posted, so I don't recommend that you read them all. I describe the relevant points here.
# In fact, I simplemindedly assumed in my previous post that you had
# all read them. I was wrong.

The reasoning below takes the current Unicode at face value, in order to reveal its confusion. A summary is given afterwards.

(a) In today's Mushaf (the Qur'aan in book form, not the recited one), yeh is dotted in initial and medial form, and dotless in final form. So isn't it proper to encode yeh with U+06CC "arabic letter farsi yeh"? It is defined to wear and drop its dots in just the same manner as the Qur'aanic yeh. Today's secular Egyptian yeh behaves the same way. The character is called "farsi", but isn't it really both Persian and Arabic?

(b) You can't blame an Egyptian for encoding their language with Farsi Yeh, because it seems to implement "one yeh" correctly. But then there is a discrepancy with other Arabs, who spell the final yeh both dotted and undotted, according to the context. Is the current standard really acceptable, when it has two indistinguishable dotted yehs in initial/medial form, U+064A "yeh" and "farsi yeh", and two dotless yehs in isolated/final form, U+0649 "alif maksura" and "farsi yeh"? I would say it may cause trouble. [3]

(c) In the Mushaf, hamza under yeh appears. The yeh is dotless. With which yeh should it be encoded? There are three candidates: U+0649 "Alef maksura", U+064A "yeh", and U+06CC "Farsi yeh".

(d) There's U+0626 "yeh with hamza above", and it is specified to be equivalent to U+064A "yeh" followed by U+0654 "hamza above". How should these two hamzas after yeh, the superscript one and the subscript one, be related?

(e) In the Mushaf, there are occurrences of a superscript small alef over dotless yeh. Again, which yeh should be used? The current Unicode 4.1.0 says nothing about it, nor does it give the slightest hint.

To summarize, the Qur'aan, modern Egyptian and Persian use a single yeh throughout. They consider it to be just "yeh". This yeh comes both dotted and dotless, and there is a simple rule for when it is dotted and when it is not.

On the other hand, other modern Arabic speakers use both dotless and dotted yeh at the end of words. Such a dotless final yeh is sometimes called "alef maksura", although that is not correct. [2] You cannot decide automatically which yeh is dotted and which is not from the context in which it appears alone. Instead, you have to know the grammar and each word. Grammar and lexicon have nothing to do with Unicode.

My view is that the current specification lacks a distinction between the natural, intuitive concept of the "letter yeh" and a grapheme-wise description. U+064A claims to be the natural yeh, but it is incomplete. U+0649 is the grapheme "dotless yeh", but it too is granted only half of its due, under the peculiar name "alef maksura".

In the thread referred to above [1], historical texts are also considered. They involve further subtleties, and my proposal does not cover them.

Regards.
Oibane

------------------------------
[1] http://lists.arabeyes.org/archives/general/2005/December/msg00006.html

[2] "Alef maksura" is the name of a type of noun, which often ends in dotless yeh with an /aa/ sound (elongated /a/). It is wrong to call the final dotless yeh by this name because:
(a) Some alef maksura nouns just end in a plain alif letter.
(b) There are other words, say verbs, which end in dotless yeh with an /aa/ sound.
On the other hand, it is a grammarian's term, and you are not expected to know it.
Quite a few people casually call it "alif maksura" today. I was one of them until the discussion [1]. Pragmatically, it may be too severe to consider that usage illegitimate. Anyway, the name "alef maksura" for U+0649 is a source of confusion and should be amended. If its name were changed to "dotless yeh" and annotated as "... used as alef maksura in Arabic ...", that might be acceptable as a compromise. Well, my knowledge is thin; better to wait for others' words.

[3] See for an easy example:
http://www.google.com/search?q=cache:r0MGbwnb908J:www.isu.net.sa/archive/ainc-alc/2001-July/000166.html
It is cited in my first post, too.
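
For readers who want to check points (b) and (d) against their own Unicode data, here is a minimal Python sketch. It assumes only the standard unicodedata module. The helper names (describe, nfc, same_after_nfc) and the AIN + LAM carrier string are made up for illustration, and the results reflect whatever Unicode version your Python ships, which is not necessarily 4.1.0.

```python
import unicodedata

# The three "yeh" candidates named in this message, plus the marks from (c)-(d).
ALEF_MAKSURA = "\u0649"
YEH = "\u064A"
FARSI_YEH = "\u06CC"
HAMZA_ABOVE = "\u0654"
HAMZA_BELOW = "\u0655"
YEH_WITH_HAMZA_ABOVE = "\u0626"


def describe(ch):
    """Return 'U+XXXX NAME' for one character, from the installed Unicode data."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


def nfc(text):
    """Apply canonical composition (Normalization Form C)."""
    return unicodedata.normalize("NFC", text)


def same_after_nfc(a, b):
    """True if the two strings are canonically equivalent."""
    return nfc(a) == nfc(b)


if __name__ == "__main__":
    for ch in (ALEF_MAKSURA, YEH, FARSI_YEH, HAMZA_ABOVE, HAMZA_BELOW):
        print(describe(ch))

    # (d): U+0626 is canonically equivalent to U+064A + U+0654 ...
    print(same_after_nfc(YEH_WITH_HAMZA_ABOVE, YEH + HAMZA_ABOVE))
    # ... but the other two yeh candidates carrying the same mark are not.
    print(same_after_nfc(YEH_WITH_HAMZA_ABOVE, ALEF_MAKSURA + HAMZA_ABOVE))
    print(same_after_nfc(YEH_WITH_HAMZA_ABOVE, FARSI_YEH + HAMZA_ABOVE))

    # (b): a final dotless yeh typed as U+0649 or as U+06CC looks alike in
    # print, yet the two spellings remain different strings.
    ain_lam = "\u0639\u0644"  # AIN + LAM, only a carrier for the final letter
    print(same_after_nfc(ain_lam + ALEF_MAKSURA, ain_lam + FARSI_YEH))
```

Run as a script, it lists the character names and then prints one comparison per line. The only point is that canonical equivalence ties U+0626 to U+064A alone; normalization does nothing to relate U+0649 or U+06CC to it, or to each other, so any matching across the three has to be done by the application.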

