On 2015/02/19 20:47, Julian Bradfield wrote:
On 2015-02-19, Eli Zaretskii <[email protected]> wrote:
Does anyone know why the UCD defines compatibility decompositions
for Arabic initial, medial, and final forms, but doesn't do the same
for Hebrew final letters, like U+05DD HEBREW LETTER FINAL MEM? Or for
that matter, for U+03C2 GREEK SMALL LETTER FINAL SIGMA?
As far as I understand it:
In Arabic, the variant of a letter is determined entirely by its
position, so there is no compelling need to represent the forms separately
(as characters rather than glyphs) save for the existence of legacy
standards (and if there is, you can use the ZWJ/ZWNJ hacks). Thus the
forms would not have been encoded but for the legacy standards.
Whereas in Hebrew, non-final forms appear finally in certain contexts
in normal text; and in Greek, while Greek text may have a determinate
choice between σ and ς, there are many contexts where the two symbols
are distinguished (not least maths).
Digging a bit deeper, the phenomenon of a letter changing shape
depending on position is pervasive in Arabic, and involves complicated
interdependencies across multiple characters in good-quality typography.
But in Hebrew this phenomenon is minor, and in Greek it is marginal;
in both scripts, typographic interactions are also very limited.
That led to (after some initial tries with alternatives) different
encoding models. In Arabic, shaping is the job of the rendering engine,
whereas in Hebrew and Greek, it's part of the encoding.
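The difference shows up directly in the character data. Here is a minimal sketch, assuming Python's standard unicodedata module (which reflects whichever UCD version the interpreter ships with). The sample code points and the helper name describe() are my own choices for illustration: U+FE91 is one Arabic presentation form, and the other two are the final letters from the original question.

```python
import unicodedata

# One Arabic presentation form, plus the Hebrew and Greek final letters
# mentioned in the question.
SAMPLES = ["\uFE91", "\u05DD", "\u03C2"]

def describe(ch):
    """Return (name, decomposition field, NFKC form) for one character."""
    return (
        unicodedata.name(ch),
        unicodedata.decomposition(ch),      # empty string if none is defined
        unicodedata.normalize("NFKC", ch),  # compatibility normalization
    )

for ch in SAMPLES:
    name, decomp, nfkc = describe(ch)
    print(f"U+{ord(ch):04X} {name}: decomposition={decomp!r}, "
          f"NFKC=U+{ord(nfkc):04X}")
```

The Arabic form should report a decomposition tagged <initial> that maps to the nominal letter U+0628, so NFKC folds it away. The Hebrew and Greek finals have no decomposition and survive normalization unchanged.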
As for determinate choice between σ and ς, John Cowan once gave an
example of a Greek word (composed of two original words) with a final
sigma in the middle.
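To see why such a word matters, here is a small sketch of what a purely positional rule does. It assumes CPython 3.3 or later, whose str.lower() picks ς for a capital sigma at the end of a word and σ elsewhere. The test string with ς in the middle is artificial, made up for illustration; it is not the word John Cowan cited.

```python
FINAL_SIGMA = "\u03C2"  # ς
SIGMA = "\u03C3"        # σ

def roundtrip(text):
    """Uppercase, then lowercase, letting the positional rule pick the sigma."""
    return text.upper().lower()

# Ordinary case: σ at the start, ς at the end of the word.
regular = SIGMA + "οφο" + FINAL_SIGMA

# Artificial string (not a real word) with ς in the middle.
mid = "α" + FINAL_SIGMA + "α"

print(roundtrip(regular) == regular)  # does the rule restore the spelling?
print(roundtrip(mid) == mid)          # does it restore a mid-word ς?
```

Both sigmas uppercase to the same Σ, so the distinction is lost on the way up, and the positional rule can only restore ς at the end of a word. A final sigma in the middle comes back as σ, which is why the choice cannot be left entirely to a rule.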
Regards, Martin.