Dear Abdulhaq,

1 - Characters vs glyphs

> Nevertheless, the different styles of dammataan and the إقلاب tanweens
> (with the meem) for instance _do_ carry real meaning. They seem to me to
> be _real characters_ (and therefore should be in the Unicode standard)
> and not merely combinations of glyphs.

In my analysis the different styles of tanween should not be considered separate characters, but contextually conditioned modulations of the basic tanween. I have two reasons for this.

The first is that the visual variations correspond to phonetic, i.e. linguistically meaningless, variation, as opposed to phonemic, i.e. linguistically meaningful, variation. After all, the affected words do not change in meaning; they merely anticipate the next consonant phonetically.

The second reason is that I am in favour of looking at the larger picture of Islam as a great civilization and community, within which traditions coexist that are minimally different. For instance, Mamluk, Persian, Indian and Ottoman Qur'ans only use regular tanween, even where the contemporary Arabic Qur'an uses the specialized tanween variation presently under discussion. We would do well to define qur'anic use of Unicode in such a way that the essential singularity of the text is maintained.

The way to do this is to use mark-up or specialized characters like SMALL MEEM following regular tanween to indicate a phonetic modulation. (Please note that my preceding email describes a solution for tamweem etc. that does not yet implement this idea. Seen in this light, my present solution has a trait in common with your approach: it is pragmatic in using what is available in the way of code points and feasible in the way of font technology. After all, it works.)

> If a renderer wants (for whatever reason) to represent a dammataan that
> indicates إظهار, why should it _have_ to use two adjacent dammas? No,
> the dammataan that carries one meaning should have one code, and the
> other dammataan another.
> Then the _font_ will provide the appropriate (single) glyph.

The use of repeated damma/fatha/kasra is just an example of how it could be done. For modern font technology, internal glyph substitution is a trivial matter. Internally, two Unicode code points can be made a single glyph (a ligature), or one code point can produce many glyphs (for instance multiple pen strokes to build one letter). As I indicated above, I believe the cleanest way to handle tanween variation is to use one and the same solution for basic tanween, followed by a modulation character, one for tamween (already available and not yet unambiguously defined in Unicode) and another, new code for rendering them sequentially.

> As another example, I've read your pdfs on the internet and I think I
> agree with every single word, amongst which you say that ligatures do
> not belong in Unicode. I agree entirely. However, you yourself mention
> that even basic letters like baa' and taa' are ligatures in a certain
> way in that they are composed of the basal stroke and then the nuqaaT
> are added later. Does this mean that you are proposing that the baa'
> character should be removed and that we should have one code for the
> single-tooth stroke, one for the single dot, and that from now on to
> render a baa' the text must contain the tooth character followed by a
> nuqTa?

This is in fact exactly how I analyse Arabic script and why I consider the existing legacy code industrial trash. However, in our present discussion we are looking for ways to make the best of the existing Arabic block in Unicode.

2 - Technological considerations

> What you are proposing means the representation of one 'semantic load'
> as two other quite different characters (e.g. two dammas) that then
> need to be replaced with a combined character to be rendered/placed by
> the renderer.

Having agreed on the plain text coding format and conventions, the renderer can apply any substitution it deems necessary.
This can be internal character substitution and internal glyph substitution. This all happens inside a black box and is of no relevance to our present discussion: I have been arguing on the assumption that the topic is the plain text coding format and conventions.

> I'm fairly sure from your previous emails that you intend
> implementations to adopt the second approach. However, as Muhammad
> implied, this means adding significant code to numerous code bodies
> over which we have no control and no knowledge either. On linux alone
> (never mind the Mac, Solaris, PalmOS, BSD, etc.) there are a number of
> font renderers in current use. Each of these is under the control of
> groups that will not be inclined to make these changes themselves and
> may not even let others change things (Apple and also the X consortium
> with their xfs springs to mind). Hence we all (everyone except arabic
> windows users) would be forced into waiting for the diachronic page you
> mentioned, and who knows when that will arrive?

Unicode must be supplemented by adequate font technology. If these entities do not have it, they cannot handle Unicode.

> > The only possibility to accomplish a robust solution for encoding the
> > Qur'an - or any Classical Arabic for that matter - in the Unicode
> > format would be designing a code set from scratch and apply for its
> > inclusion in the second plane as Historic or Diachronic Arabic. This
> > is exactly what I am working on, including the conversion schemes to
> > upgrade the Arabic industrial rubbish in Unicode or interchange with
> > it.
> This will be a good step forward, but these characters are not just
> historic but in widespread current use. Thinking about it, they are in
> one of the most common books in the world!

Diachronic would be the better term: not just going back to the past, but also looking towards the future.

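As a minimal sketch of the plain-text convention argued for in sections 1 and 2 (a regular tanween followed by a modulation character, with any substitution left to the renderer), consider the Python below. The code points are existing Unicode characters, but treating U+06E2 and U+06ED as the "SMALL MEEM" modulation marks is an assumption made for illustration, and the function names and glyph names are hypothetical rather than taken from any real font or renderer.

```python
# Sketch only: real Unicode code points, hypothetical convention and names.
TANWEEN = {"\u064B": "fathatan", "\u064C": "dammatan", "\u064D": "kasratan"}
# Assumed modulation marks: ARABIC SMALL HIGH MEEM ISOLATED FORM and
# ARABIC SMALL LOW MEEM.
MODULATION = {"\u06E2": "high_meem", "\u06ED": "low_meem"}


def strip_modulation(text: str) -> str:
    """Drop a modulation mark that directly follows a tanween,
    leaving the text as a tradition with only regular tanween writes it."""
    out = []
    for ch in text:
        if ch in MODULATION and out and out[-1] in TANWEEN:
            continue  # phonetic modulation only: the base text is unchanged
        out.append(ch)
    return "".join(out)


def to_glyphs(text: str) -> list:
    """Toy renderer step: fuse tanween + modulation into one glyph name."""
    glyphs = []
    for ch in text:
        if ch in MODULATION and glyphs and glyphs[-1] in TANWEEN.values():
            # Two code points become a single (hypothetical) glyph.
            glyphs[-1] = glyphs[-1] + "." + MODULATION[ch]
        else:
            glyphs.append(TANWEEN.get(ch, "u%04X" % ord(ch)))
    return glyphs
```

Under these assumptions, a baa' carrying dammatan plus small high meem and a baa' carrying plain dammatan reduce to the same base string through `strip_modulation`, while `to_glyphs` shows the renderer choosing a single combined glyph inside its black box.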
> Is the Unicode standard ultimately some sort of Platonic ideal that
> cannot be violated by everyday pragmatism, or is it a tool to help
> people communicate and share precise information albeit in a
> non-perfect way?

No. It suffers from the same type of bureaucratic and practical limitations and obstructions as the platforms you enumerated above.

3 - Phased approach

> I concur with Nadim about correcting of the situation in two phases,
> the first which allows all operating systems (not just Windows) to
> output these extra characters without changes to core services such as
> font renderers. OK it may in your view be a bit of a kludge, but by
> your own understanding the whole thing is already a kludge. The second
> phase can get it _right_ in the ways you are proposing. Even then
> though I believe that the tajweed markers are characters with genuine
> meaning and not just glyph combinations.

I agree with the phased approach, but we need to agree on the phases. If your operating systems cannot handle fonts the way the Unicode Standard assumes, then I would agree with you that you need to opt for kludges. Just count me out.

4 - Industry View

> > As a general remark I would like to point out that, due to the by
> > definition conservative character (sic!) of Industry Standards, there
> > is no hope in the world of getting a structurally clean solution for
> > Qur'anic Arabic - or even for Arabic at large. The reason is, that
> > none of the Arabic encoding patterns or font designs were researched
> > by and for scholars and calligraphers, but by employees of
> > engineering companies with the short term commercial objective of
> > arabizing as cheaply and as fast as possible whatever product they
> > had that was originally made on the assumption that Latin characters
> > rule the world. It was from the junk yard of trashed legacy code
> > patterns that Unicode picked its Arabic code.

> I take on board totally what you say about this being an industry body
> that has purely commercial interests. These characters/glyphs
> combinations, whatever you want to call them, relate only to qur'aan
> and islamic books (of which there are a huge number) which are almost
> entirely produced in the arab world which does not have a loud voice in
> these commercial global organisations.

The real discussions are not based on loud voices but on competent ones. Unfortunately the really competent people in the field of Qur'an and Islamic books have either been too modest to participate in these discussions, or their hearts are elsewhere.

> Nevertheless it is in their interests to facilitate the production of
> these books, particularly bearing in mind the way the IT world is
> changing in the arab world. Copyright law is now being taken much more
> seriously and the time will come when people have to actually pay for
> products such as Windows and PageMaker. While perhaps they cannot get
> away without the DTP software, they would like to use Open Source
> (read: Free) operating systems. If the OS cannot support the output of
> these characters then it will hobble take-up of paid-for software
> products. If these characters are not added then there is a strong
> chance that these OSes will never support the tajweed marks for the
> technical reasons I mentioned above and will not be able to be used for
> preparing a large body of books. This is of prime importance and should
> be given the highest possible priority.

By far the majority of the competent people in the industry - most likely including you yourself - want to be remunerated for their work and skills, or they won't be concerned. I do not yet understand how Open Source, without some kind of funding, can do anything but push Arabic technology into the farthest possible margin of the computer globalization drive.

> Although the chances of these changes as you say may be slim, we have a
> far greater hope with your support. Please consider it.

I sympathize with the idea; in fact I am actively looking for ways to get this off the ground myself. But without a generous party funding it, I would just shoot myself in the foot.

Regards,
t

_______________________________________________
General mailing list
[EMAIL PROTECTED]
http://lists.arabeyes.org/mailman/listinfo/general

