Re: Proposal for the Basis of a Codepoint Extension to Unicode for the Encoding of the Quranic Manuscripts

Gregg Reynolds Wed, 22 Jun 2005 00:03:58 -0700

Abdulhaq Lynch wrote:

The thing is that the contemporary Qur'an printings are almost completely
renderable today with Unicode using a character-based (not glyph-based)
encoding scheme; only a few more codepoints need to be added, that's it.
The XML elements and other such high level semantics we are talking about
address what is beyond the rendering, i.e. text analysis. So the rendering
problem is almost solved, IMHO.


Hi Mete,

One problem is that the rendering problem has been 'almost solved' for a long time now.

Hi,

Also, keep in mind that rendering is not the only purpose of text encoding. Machine manipulation of the text is equally important. We want to search for stuff, sort things, etc. based on the "natural" semantics of written Arabic. The most fundamental problem with Unicode is precisely that it is optimized for certain classes of language. It's a surface encoding, which works great for languages like English, which have a surface orthography. But for a language like Arabic, with a more complex relation between orthography and lexical structure, such an encoding design falls far short of what could be done. The restriction of Unicode to visual abstract semantics represents a subtle (and no doubt unintentional) bias. That's why I recommend disregarding Unicode and designing from the ground up to satisfy the needs of the Arabic-speaking community.
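To make the searching point concrete, here is a minimal Python sketch of one small piece of it: matching an unvocalized search term against vocalized text. It is only an illustration under stated assumptions, not a proposal for how the encoding should work. The function names strip_marks and contains_ignoring_marks and the sample word are made up for this example; the only library call used is unicodedata.combining from the Python standard library. Treating every character with a non-zero combining class as ignorable is itself an assumption, and it is too blunt for real Quranic text.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Drop every character whose canonical combining class is non-zero.

    Assumption for this sketch: all combining marks (fatha, kasra,
    tanwin, shadda, ...) are ignorable when searching. Real text
    analysis would need a finer rule.
    """
    return "".join(ch for ch in text if unicodedata.combining(ch) == 0)


def contains_ignoring_marks(haystack: str, needle: str) -> bool:
    """Substring test performed on the mark-free forms of both strings."""
    return strip_marks(needle) in strip_marks(haystack)


# Hypothetical sample: the word "kitab" written with and without vowel marks.
# kaf, kasra, teh, fatha, alef, beh, dammatan
VOCALIZED = "\u0643\u0650\u062A\u064E\u0627\u0628\u064C"
# kaf, teh, alef, beh
BARE = "\u0643\u062A\u0627\u0628"

if __name__ == "__main__":
    # A plain code-point comparison sees two different strings.
    print(BARE in VOCALIZED)
    # After dropping the marks, the same word is found.
    print(contains_ignoring_marks(VOCALIZED, BARE))
```

The search has to undo part of the surface form before it can compare words at all, and anything closer to lexical structure (roots, affixes, orthographic variants) would need considerably more than this.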


-gregg

_______________________________________________
General mailing list
[email protected]
http://lists.arabeyes.org/mailman/listinfo/general
