---------- Original Message ----------------------------------
From: Abdulhaq Lynch <[EMAIL PROTECTED]>
>I don't see why we should battle with an encoding that was invented when there 
>was no clear separation between semantic characters and glyphs, and by 
>someone who didn't even under those circumstances think the whole thing 
>through. I also suspect that they did not understand the science of tajweed 
>but simply had a look at a couple of mashafs and made certain incorrect 
>assumptions about the glyphs they saw.
>
>Adding some new codepoints has the great benefit of totally separating tajweed 
>marks (which have nothing to do with grammar, by the way) from the actual text, 
>making searching trivial. It allows the rendering application to apply 
>whatever local rules apply for that rule (a meem here, a circle there, two 
>staggered or horizontal fathas etc).

When I referred to the tajweed rules as aspects of grammar, I meant that they 
are aspects of grammar in the same way that i`raab is an aspect of grammar. 
With the Unicode Arabic block it is already almost possible to encode the 
Quran cleanly from a graphemic perspective. But the kind of higher-level 
encoding scheme you are describing gets too involved in the specifics of the 
Arabic language rather than the Arabic script; it is language encoding rather 
than script encoding. Since Unicode is intended for encoding scripts, not 
languages, this kind of high-level encoding would be outside the scope of 
Unicode anyway. Nonetheless, the Private Use Area could be used for such a 
project. Please see:
http://www.unicode.org/standard/supported.html
"The Unicode Character Standard primarily encodes scripts rather than 
languages. That is, where more than one language shares a set of symbols that 
have a historically related derivation, the union of the set of symbols of each 
such language is unified into a single collection identified as a single 
script."
Also http://www.unicode.org/faq/basic_q.html#17

>I agree about the XML too, in fact that was my first thought, but the other 
>great benefit of the new codepoints is that the text stream can be passed 
>directly to an OpenType renderer without processing XML.

I really think XML is the way to go for language encoding. Script-encoded text 
would then be marked up with higher-level, language-specific XML tags that 
carry the language-specific semantics. This seems to be where the text 
industry is heading: XML over Unicode, with Unicode for script-level encoding 
and XML for higher-level semantics. So I am biased towards the XML option.
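
As a rough sketch of what I mean by XML over Unicode: the element name "w", the 
attribute "rule" and the value "example-rule" below are made up purely for 
illustration and do not come from any existing schema. The text itself stays 
plain Unicode; the markup only adds a semantic layer that can be read or 
thrown away as needed.

```python
import unicodedata
import xml.etree.ElementTree as ET

# Hypothetical markup: <verse>, <w> and rule="example-rule" are invented names.
# The Arabic text is written with \u escapes so the code points are explicit.
MARKED_UP = (
    "<verse>"
    "<w>\u0645\u0650\u0646</w> "
    '<w rule="example-rule">\u0628\u064E\u0639\u0652\u062F\u0650</w>'
    "</verse>"
)


def plain_text(xml_source: str) -> str:
    """Drop all tags and return only the script-level Unicode text."""
    root = ET.fromstring(xml_source)
    return "".join(root.itertext())


def strip_marks(text: str) -> str:
    """Remove nonspacing combining marks (category Mn) to get a search key."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Mn")


def tagged_spans(xml_source: str) -> list[tuple[str, str]]:
    """Return (rule, text) pairs for every <w> element that carries a rule."""
    root = ET.fromstring(xml_source)
    return [
        (el.get("rule"), "".join(el.itertext()))
        for el in root.iter("w")
        if el.get("rule") is not None
    ]


if __name__ == "__main__":
    text = plain_text(MARKED_UP)
    print(text)               # text with vowel marks, no tags
    print(strip_marks(text))  # bare letters, convenient for searching
    print(tagged_spans(MARKED_UP))
```

Searching then only ever touches the output of plain_text() or strip_marks(), 
while a renderer or a tajweed-aware tool can look at tagged_spans() instead.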

Kind regards,
Mete

--
Mete Kural
Touchtone Corporation
714-755-2810
--
_______________________________________________
General mailing list
[email protected]
http://lists.arabeyes.org/mailman/listinfo/general
