Arcane Jill posted:

(A) A proposed character will be rejected if its glyph is identical in
appearance to that of an extant glyph, regardless of its semantic
meaning,

Obviously not.


Unicode encodes characters, not glyphs. That particular glyphs of one character are normally indistinguishable from particular glyphs of another character (though perhaps in a different style) does not mean that the two characters could usefully be unified.

Examples from the recent past are the deunification of Coptic from Greek and the introduction of numerous Latin alphabet letter forms in various styles as mathematical characters.
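As a rough illustration of the point, here is a minimal Python sketch using the standard `unicodedata` module. The choice of look-alike letters and the helper name `describe` are mine, picked only for illustration, and the Coptic and mathematical characters need a Python build whose Unicode database includes them (any current Python 3 should).

```python
import unicodedata

# Capitals whose usual glyphs look much alike, yet which are encoded
# as separate characters: Latin, Greek, Cyrillic, Coptic, math bold.
LOOKALIKES = ["\u0041", "\u0391", "\u0410", "\u2C80", "\U0001D400"]


def describe(ch):
    """Return the code point label and character name for one character."""
    return f"U+{ord(ch):04X}", unicodedata.name(ch)


for ch in LOOKALIKES:
    print(*describe(ch))

# Compatibility normalization (NFKC) folds the mathematical bold letter
# back to plain Latin A, but leaves the other scripts' letters untouched.
folded = [unicodedata.normalize("NFKC", ch) for ch in LOOKALIKES]
print(folded)
```

Only the mathematical letter folds to Latin A under NFKC; the Greek, Cyrillic and Coptic letters stay distinct characters however similar their glyphs may be in a given font.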

(B) A proposed character will be rejected if its semantic meaning is
identical to that of an extant character, regardless of the appearance
of its glyph,

Obviously not.


For example, that a proposed character has the approximate semantic value of IPA _b_ doesn't mean that it should be taken as just a variant glyph of IPA _b_ and coded as U+0062. By that rule a large number of uncoded scripts could easily be coded by assigning their glyphs to already encoded characters of approximately the same meaning and using a font change to render the script.

But changing to a different script by a font change (as opposed to a different style of the same script) is not Unicode philosophy except in the case of cipher character sets.
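To make that concrete, here is a small Python sketch listing letters from several already encoded scripts that all have roughly the sound value of IPA _b_ and yet have their own code points. The selection of letters and the helper `name_prefix` are my own, for illustration only, and taking the first word of the character name is just a rough stand-in for the script, not a real script lookup.

```python
import unicodedata

# Letters with approximately the sound value of IPA b, one per script.
# The selection is illustrative, not exhaustive.
B_LIKE = ["\u0062", "\u0431", "\u05D1", "\u0628"]


def name_prefix(ch):
    """First word of the character name (a rough proxy for its script)."""
    return unicodedata.name(ch).split()[0]


for ch in B_LIKE:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))

# Same approximate meaning, four different scripts, four code points.
scripts = {name_prefix(ch) for ch in B_LIKE}
print(sorted(scripts))
```

None of these was coded as U+0062 with a font change, even though their approximate meaning is the same.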

(C) A proposed character will be rejected if either (A) or (B) are true

A redundant suggestion.


However, if both (A) *and* (B) were true, there would be less likelihood that a new encoded character would be of value, especially if users are already *happily* using a character already coded in Unicode.

That is, if the normal glyphs of a proposed new character were mostly identical to the normal glyphs of an already encoded character, and the meanings associated with the proposed character mostly corresponded to those associated with that same encoded character, then it is quite likely that no need would be seen to encode the proposed new character.

But even that would not be a rule.

If, for example, in a particular script that has yet to be encoded it chanced that the character used for the normal sound indicated by IPA _b_ actually looked like Latin letter _b_, it would still likely be encoded as part of that script.

The separate encoding of Coptic characters is one precedent not forced by compatibility with previous character encodings.

By another precedent, in the case of punctuation characters and diacritical marks, similarity of form with already encoded characters carries more weight than it does with other kinds of characters.

(D) None of the above

True.


Though of course these are points that would be considered in coming to a decision.

There is a debated area here, which comes to the fore on occasion, for example in regard to old Semitic scripts and whether particular Semitic scripts should be lumped together or distinguished by separate encodings.

When the question of unifying or distinguishing between characters is considered, it seems to me that the most important question is how confusing or useful it would be to unify or distinguish between those particular characters from the point of view of current users or expected users.

Unicode should do what is most useful.

Honest debate does arise, because what is useful in one sphere or from one point of view may cause problems in another sphere or from another point of view. Sometimes there is no definite correct answer.

Jim Allan


