Andries Brouwer wrote:
Arabic needs tagging of glyphs as being `initial', `medial', `final',
and `isolated', as specified in the Unicode book.  Since this is
identical for all fonts, the OpenType designers decided not to make
this information part of the font itself.

I just had to struggle with this a little.

The ARABIC LETTER HEH (U+0647) is a letter with 4 glyph forms.
In Kurdish (written in the Sorani, essentially Arabic, alphabet)
one has two letters (let me call them Kurdish H and Kurdish E)
and these 4 glyph forms become the two forms of Kurdish H
and the two forms of Kurdish E.
Now these four glyphs are tagged with `initial', `medial', `final',
and `isolated', and that is correct if the glyphs are used to write
Arabic, but incorrect when precisely the same glyphs are used
to write Kurdish.

I wonder what the correct way is to write Kurdish in Unicode
(without using language tagging).
Are new Unicode code points needed? Do these exist already?

A similar situation exists in Uyghur. The way it is solved currently is to use the characters in the Arabic Presentation Form blocks of Unicode. I will have to check if it is possible to use Unicode to write about Uyghur in Kurdish, or about Kurdish in Uyghur without language tags.
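
To make the presentation-form approach concrete, here is a minimal Python sketch that looks up the four positional forms of ARABIC LETTER HEH (U+0647) in the Arabic Presentation Forms-B block. The helper name presentation_forms is made up for this illustration; the only library call is unicodedata.decomposition from the standard library, and the sketch assumes each positional form carries a compatibility decomposition of the shape "<final> 0647".

```python
import unicodedata

HEH = "\u0647"  # ARABIC LETTER HEH


def presentation_forms(base):
    """Return {joining-form tag: presentation-form character} for `base`.

    Hypothetical helper for illustration only. It scans Arabic
    Presentation Forms-B (U+FE70..U+FEFF) and keeps the characters whose
    compatibility decomposition is a single letter equal to `base`.
    """
    forms = {}
    for cp in range(0xFE70, 0xFF00):
        # e.g. "<final> 0647"; empty string for unassigned code points
        parts = unicodedata.decomposition(chr(cp)).split()
        if len(parts) != 2 or not parts[0].startswith("<"):
            continue  # skip ligatures and characters with no decomposition
        if int(parts[1], 16) == ord(base):
            forms[parts[0].strip("<>")] = chr(cp)
    return forms


if __name__ == "__main__":
    for tag, ch in sorted(presentation_forms(HEH).items()):
        print(f"{tag:9} U+{ord(ch):04X} {unicodedata.name(ch)}")
```

Writing one of these code points directly pins the glyph form in the text itself instead of leaving the choice to the renderer's joining logic, which is what lets the same four glyphs be used with a different meaning.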

But this is not relevant to the console font topic. All I can say to Rich is that his best course of action will be to go off and implement his vision. There is no better way to understand why the font situation is the way it is today.

I have believed for many years that software in general is getting hideously complicated and clumsy. I don't complain about it much any more because I know the effort required to find simpler solutions. And simpler solutions are often ignored in favor of something that works.
--
---------------------------------------------------------------------------
Mark Leisher
Computing Research Lab             Nowadays, the common wisdom is to
New Mexico State University        celebrate diversity - as long as you
Box 30001, MSC 3CRL                don't point out that people are
Las Cruces, NM  88003              different.    -- Colin Quinn

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
