On Thursday, 4. October 2001 17:48, Alexander Voropay wrote:
> From: Edmund GRIMLEY EVANS <[EMAIL PROTECTED]> wrote:
> >The first problem is to provide a good transliteration
> >table: glibc and libiconv don't transliterate Cyrillics, I
> > think, so can anyone recommend such a table?
>
> You could use the Unicode to KOI8-R table and clear the 8th
> bit. The KOI8-* charsets were invented to work on equipment that
> was not 8-bit clean in the mid-80s. Practically all Russians can
> read such a transliteration.
Edmund was talking about a "good transliteration" ;o)
I don't know anything about transliterating legacy characters,
such as Yat', so I can only talk about the Russian alphabet.
АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ
абвгдеёжзийклмнопрстуфхцчшщъыьэюя
I can't find the relevant ISO standard, but IIRC, the
transliterated alphabet should look like:
ABVGDEËŽZIJKLMNOPRSTUFHCČŠŜ"Y'ĖÛÂ
abvgdeëžzijklmnoprstufhcčšŝ"y'ėûâ
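A minimal Python sketch of such a table, assuming the mapping quoted
above (recalled from memory, not checked against the ISO standard).
The names CYRILLIC, LATIN, TABLE and transliterate are made up for
this illustration:

```python
import unicodedata

# Assumption: the mapping is the one quoted in this mail, not a
# verified copy of any ISO standard.
CYRILLIC = unicodedata.normalize("NFC", "АБВГДЕЁЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ")
LATIN = unicodedata.normalize("NFC", "ABVGDEËŽZIJKLMNOPRSTUFHCČŠŜ\"Y'ĖÛÂ")

# One table for both cases; each Cyrillic letter maps to exactly
# one Latin character, so the two strings must have equal length.
TABLE = str.maketrans(CYRILLIC + CYRILLIC.lower(), LATIN + LATIN.lower())


def transliterate(text):
    """Replace each Russian letter by its Latin counterpart."""
    return text.translate(TABLE)


print(transliterate("Москва"))
```

Characters outside the Russian alphabet pass through unchanged, so
mixed text needs no special handling.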
It is not possible to encode this in ISO-8859-1, because Ž, Č,
Š, Ŝ and Ė, as well as their lowercase representations, are not
available. ISO-8859-2 looks much better: it lacks only Ŝ, Ė and Û.
Maybe Ŝ and Ė can be replaced by Ş(ş) and Ę(ę), and Û by Ű(ű), or
by some other similar characters. Maybe Č(č) should be represented
as Ç(ç), which is also present in ISO-8859-1.
When the ISO-8859-2 encoded transliteration is interpreted as
ISO-8859-1, some characters will be misrepresented and look like
this:
ABVGDEË®ZIJKLMNOPRSTUFHCÇ©ª"Y'ÊÛÂ
abvgdeë¾zijklmnoprstufhcç¹º"y'êûâ
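The misreading can be reproduced with a short Python sketch: encode
the text as ISO-8859-2, then decode the same bytes as ISO-8859-1.
The function name misread_latin2_as_latin1 is made up for this
illustration, and the sample uses the substitutes suggested above
(Ç for Č, Ş for Ŝ):

```python
def misread_latin2_as_latin1(text):
    """Show how ISO-8859-2 bytes look to an ISO-8859-1 reader."""
    raw = text.encode("iso8859_2")   # bytes as the sender wrote them
    return raw.decode("iso8859_1")   # same bytes, wrong charset


sample = "ŽÇŠŞ žçšş"
print(misread_latin2_as_latin1(sample))
```

Ç(ç) survives because it has the same code in both charsets; the
other letters turn into symbols such as ® and ©.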
However, this still looks better than decapitated KOI8-R:
abwgde3vzijklmnoprstufhc~{}yx|`q
ABWGDE#VZIJKLMNOPRSTUFHC^[]_YX\@Q
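For comparison, the "decapitation" is just clearing the top bit of
every KOI8-R byte. A minimal Python sketch, with strip_high_bit as a
made-up name for this illustration:

```python
def strip_high_bit(text):
    """Encode as KOI8-R and clear bit 8 of every byte."""
    raw = text.encode("koi8_r")
    # Masking with 0x7F leaves a 7-bit value, so the result is ASCII.
    return bytes(b & 0x7F for b in raw).decode("ascii")


print(strip_high_bit("Москва"))
```

Uppercase Cyrillic comes out as lowercase Latin and vice versa, which
is the case swap visible in the two lines above.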
BTW: Why the hell are upper and lower case interchanged in
KOI8-R???
OK, HTH
Thomas
}:o{#
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/