Abdulhaq Lynch wrote:
The thing is that contemporary Qur'an printings are almost completely
renderable today with Unicode using a character-based (not glyph-based)
encoding scheme; only a few more codepoints need to be added, that's it.
The XML elements and other such high-level semantics we are talking about
address what lies beyond rendering, i.e. text analysis. So the rendering
problem is almost solved, IMHO.
Hi Mete,
One problem is that the rendering problem has been 'almost solved' for a long
time now.
Hi,
Also, keep in mind that rendering is not the only purpose of text
encoding. Machine manipulation of the text is equally important. We
want to search for stuff, sort things, etc. based on the "natural"
semantics of written Arabic. The most fundamental problem with Unicode
is precisely that it is optimized for certain classes of language. It's
a surface encoding, which works great for languages like English, which
have a surface orthography. But for a language like Arabic, with a more
complex relation between orthography and lexical structure, such an
encoding design falls far short of what could be done. The restriction
of Unicode to visual abstract semantics represents a subtle (and no
doubt unintentional) bias. That's why I recommend disregarding Unicode
and designing from the ground up to satisfy the needs of the
Arabic-speaking community.
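
To make the search point concrete, here is a minimal sketch in Python. It
is only an illustration: the helper name strip_marks and the sample word
are my own assumptions, and folding away every combining mark is a crude
stand-in for real Arabic-aware matching. It shows that a plain substring
search for the bare letters k-t-b fails against the vocalized spelling
until an extra folding layer is put on top of the encoded text.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop nonspacing combining marks (category Mn), e.g. Arabic harakat.

    Hypothetical helper for illustration only. NFD also splits letters
    such as alef-with-hamza, so this folding removes the hamza as well.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")


# "kataba" written with a fatha (U+064E) after each consonant.
vocalized = "\u0643\u064E\u062A\u064E\u0628\u064E"
# The same three consonants (kaf, teh, beh) with no vowel marks.
bare = "\u0643\u062A\u0628"

# Raw code point comparison: the marks sit between the letters.
plain_match = bare in vocalized

# Comparison after folding the marks away.
folded_match = bare in strip_marks(vocalized)

print(plain_match, folded_match)
```

Every application that wants this behaviour has to decide for itself which
marks to ignore, which is the kind of knowledge the encoding itself does
not carry.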
-gregg
_______________________________________________
General mailing list
[email protected]
http://lists.arabeyes.org/mailman/listinfo/general