Meor Ridzuan Meor Yahaya wrote:
>> On side note, unicode also have Farsi yeh. At first, I though it was
>> strictly for Persian language. But in their document, it does mention
>> Arabic, the language. The characteristic of Farsi yeh is, in
>> initial/medial forms, it exist with dots, otherwise, no dots. More
>> like what it appear in Madinah Mushaf. However, I think it should be
>> kept for Persian Language only.

Meor,

Farsi yeh is another flaw in the Unicode naming system. "Farsi yeh" is simply the traditional Arabic yaa' as used in all mushafs, whether to denote ii, y or ae. In final forms the dots were ornamental, used indiscriminately in any of these instances. In the case of retroflex yaa' (registered in Unicode as U+06D2 Yeh Barree, a grapheme for Urdu) dots are always used in Arabic. As a result, calii and calae both have dots.

In my recently rewritten Arabic tutorial for Unicode Conferences, I gave an overview of this problem category:

www.decotype.com/publications/unicode-tutorial.pdf (page 7)

Generally speaking, the Unicode standard provides sound graphemic code points for office Arabic. However, for Classical and Qur'anic Arabic it only provides visual patches (such as superscript hamza). Where there are multiple solutions (e.g., yeh-hamza as a contextual allograph of a single code point or built from distinct code points), a protocol is lacking.

As for solutions, the only robust way out of the YEH-conundrum is the encoding of separate dots - throughout. All other solutions will remain what they are: a confused mess. Such a radical approach would also solve the YEH-HAMZA ambivalence: U+0649 NODOT-YEH with DOTS BELOW or HAMZA (ABOVE or BELOW, according to context).

A less ambitious, and possibly more practical, solution would be to use regular U+064A YEH throughout, side-by-side with U+0626 YEH-HAMZA. The first one should drop its dots in final position according to a QUR'ANIC LOCALE; the latter one should then intelligently shift the hamza below according to the same locale.

By the same token, U+0621 hamza should be treated as a transparent grapheme when used in a Qur'anic or Classical Arabic context. Font technology will have to deal with the resulting positioning issues: those of a loose hamza between letters are exactly identical to those of SUPERSCRIPT ALEF when preceded by FATHA.
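
To make the less ambitious option described above a bit more concrete, here is a minimal Python sketch. It first prints the official Unicode names of the code points mentioned in this message (U+06CC, the character Unicode calls "Farsi yeh", is added for comparison), and then simulates the effect of a Qur'anic locale on word-final U+064A. Note the assumptions: the function name drop_final_dots is hypothetical, and the substitution of U+0649 for U+064A is only a stand-in to make the effect visible. In the scheme proposed here the stored text would keep U+064A and the font would choose the dotless shape.

```python
import unicodedata

# Code points discussed in this thread, plus U+06CC ("Farsi yeh").
YEH_FAMILY = [0x0621, 0x0626, 0x0649, 0x064A, 0x06CC, 0x06D2]

YEH = "\u064A"          # regular dotted yeh
DOTLESS_YEH = "\u0649"  # dotless form, named ALEF MAKSURA in Unicode


def describe(cp):
    """Return the code point together with its official Unicode name."""
    return f"U+{cp:04X} {unicodedata.name(chr(cp))}"


def is_mark(ch):
    """True for non-spacing combining marks (vowel signs, sukun, etc.)."""
    return unicodedata.category(ch) == "Mn"


def drop_final_dots(text):
    """Hypothetical stand-in for a Qur'anic-locale font rule.

    Replaces U+064A with U+0649 when it is the last letter of a word.
    Combining marks that follow the yeh are skipped when checking
    whether another letter comes after it.
    """
    chars = list(text)
    for i, ch in enumerate(chars):
        if ch != YEH:
            continue
        j = i + 1
        while j < len(chars) and is_mark(chars[j]):
            j += 1
        # Word-final: end of text, or next non-mark character is not a letter.
        if j == len(chars) or not chars[j].isalpha():
            chars[i] = DOTLESS_YEH
    return "".join(chars)


if __name__ == "__main__":
    for cp in YEH_FAMILY:
        print(describe(cp))
    # Final yeh loses its dots; medial yeh keeps them.
    print(drop_final_dots("\u0641\u064A \u0628\u064A\u062A"))
```

A real implementation would live in the font or shaping layer rather than in the text itself, so that the underlying code stays interchangeable and searchable.
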
Without such font solutions, Qur'anic encoding will remain what it is today: a confused - and incompatible - mess.

This approach, a combination of locales and font technology, will result in clean, interchangeable and universally searchable code for Qur'anic Arabic - with the drawback that its quality will depend on the available local resources. But that is a general limitation of today's rendering technology, for which solutions are under way.

t
_______________________________________________
General mailing list
[email protected]
http://lists.arabeyes.org/mailman/listinfo/general

