At 22:45 +0000 2004-11-07, Peter Kirk wrote:
> You have indeed stated an intention to encode "significant nodes".
Yes. Based on the scholarly taxonomy of writing systems.
> But the official documentation, the Unicode Standard, does not say anything like this.
Alarm! Alarm! I detect a desire on your part to consider informative, explanatory text as normative.
> Rather, it states that Unicode encodes "Characters, Not Glyphs", and that "Characters are the abstract representations of the smallest components of written language that have SEMANTIC value" (TUS section 2.2 p.15, my emphasis on "SEMANTIC").
Yes. ARABIC LETTER SHEEN is a different letter, and a different character from SYRIAC LETTER SHIN. DEVANAGARI LETTER KA is a different letter, and a different character, from ORIYA LETTER KA. PHOENICIAN LETTER NUN is a different letter, and a different character, from HEBREW LETTER NUN.
> And, Michael, I think you have agreed with me, and so with many scholars of Semitic languages, that the distinction between corresponding Phoenician and Hebrew letters (like that between corresponding Devanagari and Gujarati letters) is not a semantic one.
LETTERS differ by semantics. SCRIPTS differ by other criteria WHETHER OR NOT TEXT AFFIRMING THIS HAS BEEN WRITTEN INTO THE UNICODE STANDARD YET.
> The conclusion we reach from reading the Standard is that these distinctions are glyph distinctions and so should not be encoded.
You're wrong. You ignore the historical node-based distinctions which differentiate the Indic scripts one from the other, and which apply equally to Phoenician and Hebrew. And no, Fraktur and Sütterlin are not the same sort of thing.
> If it is indeed the position of the UTC that corresponding characters in "significant node" scripts should be encoded despite the lack of semantic distinctiveness,
This is YOUR requisite.
> I would like to suggest an amendment to the standard to make this principle clear. This would of course have to be agreed with WG2. Until such an amendment has been put in place, there will continue to be opposition to encoding of any new scripts which do not show clear semantic distinctiveness and so appear to be in breach of the principles of the Standard.
You're mistaken in your application of the concept of "semantic distinctiveness" with regard to script identity.
--
Michael Everson * * Everson Typography * * http://www.evertype.com

