Andries Brouwer wrote:
Arabic needs tagging of glyphs as being `initial', `medial', `final',
and `isolated', as specified in the Unicode book. Since this is
identical for all fonts, the OpenType designers have decided not to make
this information part of the font itself.
I just had to struggle with this a little.
The ARABIC LETTER HEH (U+0647) is a letter with 4 glyph forms.
In Kurdish (written in the Sorani, essentially Arabic, alphabet)
one has two letters (let me call them Kurdish H and Kurdish E)
and these 4 glyph forms become the two forms of Kurdish H
and the two forms of Kurdish E.
Now these four glyphs are tagged with `initial', `medial', `final',
and `isolated', and that is correct if the glyphs are used to write
Arabic, but incorrect when precisely the same glyphs are used
to write Kurdish.
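
To make the four tagged forms concrete, here is a minimal Python sketch that
looks them up in the Unicode character database. It reads the compatibility
decomposition tags of the four HEH code points in the Arabic Presentation
Forms-B block; it does not inspect any OpenType font tables. The names FORMS
and form_tag are made up for this illustration.

```python
import unicodedata

HEH = "\u0647"  # ARABIC LETTER HEH

# The four positional forms of HEH as encoded in the
# Arabic Presentation Forms-B block (illustrative table).
FORMS = {
    "isolated": "\uFEE9",
    "final": "\uFEEA",
    "initial": "\uFEEB",
    "medial": "\uFEEC",
}


def form_tag(ch):
    """Return (tag, base character) from the compatibility decomposition."""
    decomp = unicodedata.decomposition(ch)  # e.g. '<initial> 0647'
    tag, _, base = decomp.partition(" ")
    return tag.strip("<>"), chr(int(base, 16))


for expected, ch in FORMS.items():
    tag, base = form_tag(ch)
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}: "
          f"tag={tag}, base=U+{ord(base):04X}")
```

All four code points map back to the single letter U+0647 with an Arabic
positional tag, which is the tagging that fits Arabic but not the Kurdish
use of the same glyphs.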
I wonder what the correct way is to write Kurdish in Unicode
(without using language tagging).
Are new Unicode code points needed? Do these exist already?
A similar situation exists in Uyghur. The current workaround is
to use the characters in the Arabic Presentation Forms blocks of Unicode.
I will have to check if it is possible to use Unicode to write about
Uyghur in Kurdish, or about Kurdish in Uyghur without language tags.
But this is not relevant to the console font topic. All I can say to
Rich is that his best course of action will be to go off and implement
his vision. There is no better way to understand why the font situation
is the way it is today.
I have believed for many years that software in general is getting
hideously complicated and clumsy. I don't complain about it much any
more because I know the effort required to find simpler solutions. And
simpler solutions are often ignored in favor of something that works.
--
---------------------------------------------------------------------------
Mark Leisher
Computing Research Lab
New Mexico State University
Box 30001, MSC 3CRL
Las Cruces, NM 88003

Nowadays, the common wisdom is to celebrate diversity - as long as you
don't point out that people are different. -- Colin Quinn
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/