Mohammed Yousif wrote:
> On Saturday 31 December 2005 17:22, Thomas Milo wrote:
>> Mohammed Yousif wrote:
>>
>>> - If they are to identify the "e" sound Yeh not like the rest of
>>> dots, then I really don't know how this should be handled because
>>> these two dots in this context can not be considered part of the
>>> final shape of the letter Yeh since they add more information to
>>> the already known letter Yeh and that's not the scope of dots in
>>> Arabic. Maybe in that case U+064A should be used as a Yeh
>>> specialization, but then I wouldn't like that solution.

I am not sure what you are describing here. If you mean the explicit
/y/ or /ii/ function as expressed by two dots, then I understand. For
that sort of problem, IMHO, the best theoretical solution would be to
consider the dot patterns (not just for yeh, but _all_ five of them) as
separate graphemes entitled to their own code points in Unicode. If that
principle were extended to all Arabic-derived characters, it would
also simplify font design for Unicode dramatically. A couple of years
ago I described this in a well-established scientific journal:

www.decotype.com/publications/Manuscripta_Orientalia.pdf

>> The word /stay'asuw/ in Q12:80 is rather a spanner in the works: its
>> existence implies that there can be no rule that the sequence Yeh,
>> Hamza can be trusted to be Yeh+Hamza_above/below.
>>
>
> The well established Qur'an sciences can be employed to know if the
> hamza is above/below or standalone. Namely, the Rasm science, it
> disambiguates clearly this type of situations and identifies the various
> variations that can exist with other types of Masahef "Maghribi
> Mushaf...etc".

If it's not a simple, straightforward rule, it cannot be expected to be
built into a font. So our earlier idea, that any YEH followed by a hamza
could be substituted by a single - ligature! - YEH+HAMZA (above or below
according to Qur'anic rules and locale), turns out to be false.

> Along with that, there is the fact that Qur'an has been taught over
> the years from generation to generation by the mouth. Which means
> that Mushaf is not the only way to know the nature of a specific
> element or letter.

Again, this fact is only relevant for digitizing the Qur'an if this
knowledge can be formulated as a set of objective rules and built into
a font.

>> Meor's present encoding of this transparent Hamza as U+0640 TATWEEL.
>> U+0654 SUPERSCRIPT HAMZA is IMHO untenable from both a linguistic
>> and calligraphic point of view: tatweel is not a grapheme (i.e., not
>> part of any orthography), and it is totally font and calligraphy
>> dependent. It happens to be used in such positions by the recent
>> Egyptian and Saudi editions, but a robust encoding and rendering
>> solution should not depend on such ad hoc innovations. I prefer to
>> encode this hamza as U+0621 and then make sure the hamza is
>> positioned between the surrounding characters rather than on top of
>> the first one - without having to resort to a tatweel.
>>
>
> Can't agree more. Actually, the original ArabeyesQr implements this
> by using only U+0640 for this type of standalone Hamza but that made
> the font quite complex because no direct support from Unicode exists
> for this type of situation and maybe Meor thought it's easier to just
> use Tatweel.

I admire Meor's efficiency in creating a first workable Qur'an using
Unicode and OpenType components. But there are still a couple of loose
ends that are not his fault; they are the consequence of limitations
in font technology.

Best wishes for the New Year,

t
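
P.S. For anyone who wants to check the code points discussed above, here is a minimal Python sketch using only the standard `unicodedata` module. It is an illustration of the encoding side only, not anything Meor's font or ArabeyesQr actually does. The names `THREAD_CODE_POINTS`, `describe`, `nfc`, `TATWEEL_HAMZA` and `PLAIN_HAMZA` are hypothetical labels made up for this example. Note that Unicode's own name for U+0654 is ARABIC HAMZA ABOVE.

```python
import unicodedata

# Code points named in this thread.
THREAD_CODE_POINTS = (0x0621, 0x0640, 0x0654, 0x064A, 0x0626)


def describe(cp: int) -> str:
    """Return 'U+XXXX  NAME  (general category)' for one code point."""
    ch = chr(cp)
    return f"U+{cp:04X}  {unicodedata.name(ch)}  ({unicodedata.category(ch)})"


def nfc(text: str) -> str:
    """Apply Unicode canonical composition (NFC)."""
    return unicodedata.normalize("NFC", text)


# Hypothetical labels for the two encodings of the standalone hamza.
TATWEEL_HAMZA = "\u0640\u0654"  # tatweel carrying a combining hamza
PLAIN_HAMZA = "\u0621"          # the hamza letter on its own

if __name__ == "__main__":
    for cp in THREAD_CODE_POINTS:
        print(describe(cp))
    # Yeh + combining hamza composes to the single letter U+0626 ...
    print(nfc("\u064A\u0654") == "\u0626")
    # ... but Yeh followed by the hamza letter stays as two letters.
    print(nfc("\u064A\u0621") == "\u064A\u0621")
```

The point of the last two lines: at the encoding level, Yeh followed by U+0621 is already distinct from Yeh+Hamza_above, so the distinction survives in the text itself and does not have to be guessed by a font rule. Positioning the hamza between its neighbours remains a rendering problem that this sketch does not address.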

