Hi,

these are my further thoughts about the different Cyrillic charsets in dosemu.

CP1125
------
Andy Shevchenko <[EMAIL PROTECTED]> kindly reported that there is a nice DOS encoding for Ukrainian called CP1125. It contains all Ukrainian letters and is approved by the Ukrainian government. Great work on supporting it has been done in ASPLinux's dosemu RPM package.
I didn't find a better visual description of CP1125, so I used this page for reference:

  http://www.ic-chernobyl.kiev.ua/~porokh/cyr/index.htm

It seems to be quite correct. CP1125 differs from CP866 in the upper characters with codes 0xF2-0xF9:

  0x0490, /* 0xF2 - CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
  0x0491, /* 0xF3 - CYRILLIC SMALL LETTER GHE WITH UPTURN */
  0x0404, /* 0xF4 - CYRILLIC CAPITAL LETTER UKRAINIAN IE */
  0x0454, /* 0xF5 - CYRILLIC SMALL LETTER UKRAINIAN IE */
  0x0406, /* 0xF6 - CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
  0x0456, /* 0xF7 - CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
  0x0407, /* 0xF8 - CYRILLIC CAPITAL LETTER YI */
  0x0457, /* 0xF9 - CYRILLIC SMALL LETTER YI */

So I made cp1125.c by changing the Unicode values for these characters in cp866.c.

KOI8-U
------
KOI8-U is described in RFC2319:

  http://rfc.net/rfc2319.html

According to it, the Perl Unicode::Map8 module gives a wrong value for character 0xB4: 0x0403, when it must be 0x0404 (CYRILLIC CAPITAL LETTER UKRAINIAN IE).

CP1251, CP866
-------------
cp866.c and cp1251.c are also generated by Unicode::Map8 and I hope they are correct =). You can find listings at:

  http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/PC/CP866.TXT
  http://www.unicode.org/Public/MAPPINGS/VENDORS/MICSFT/WINDOWS/CP1251.TXT

Again, I vote for restoring characters 0xF2-0xF7, 0xFC and 0xFD in cp866, because we have to comply with some common rules (Unicode in this case).

KOI8-RU
-------
KOI8-RU is described in an RFC draft:

  http://cad.ntu-kpi.kiev.ua/multiling/koi8-ru/rfc-draft-koi8-ru.txt

The table is derived from koi8-r.c by replacing the changed codes. The Unicode::CharName Perl module was used for the Unicode names. Character 0xB4 points to 0x0403 while it must point to 0x0404.
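The "derive a table by patching a base table" approach used above for cp1125.c and the KOI8 tables could be sketched roughly as follows. This is only an illustration, not the actual dosemu code: the struct, the function name and the 128-entry upper-half layout (codes 0x80-0xFF) are my assumptions, and only the code/Unicode pairs come from the listings above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical patch record: a byte code and the Unicode value it maps to. */
struct override {
    uint8_t  code;
    uint16_t unicode;
};

/* CP1125 differences from CP866 (codes 0xF2-0xF9), as listed above. */
static const struct override cp1125_overrides[] = {
    { 0xF2, 0x0490 }, /* CYRILLIC CAPITAL LETTER GHE WITH UPTURN */
    { 0xF3, 0x0491 }, /* CYRILLIC SMALL LETTER GHE WITH UPTURN */
    { 0xF4, 0x0404 }, /* CYRILLIC CAPITAL LETTER UKRAINIAN IE */
    { 0xF5, 0x0454 }, /* CYRILLIC SMALL LETTER UKRAINIAN IE */
    { 0xF6, 0x0406 }, /* CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I */
    { 0xF7, 0x0456 }, /* CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I */
    { 0xF8, 0x0407 }, /* CYRILLIC CAPITAL LETTER YI */
    { 0xF9, 0x0457 }, /* CYRILLIC SMALL LETTER YI */
};

/* The 0xB4 correction for the generated KOI8 tables (0x0403 -> 0x0404). */
static const struct override koi8_b4_fix[] = {
    { 0xB4, 0x0404 }, /* CYRILLIC CAPITAL LETTER UKRAINIAN IE */
};

/*
 * Patch an upper-half table in place. Assumed layout: upper[i] holds the
 * Unicode value for byte code 0x80 + i, so the table has 128 entries.
 */
static void apply_overrides(uint16_t upper[128],
                            const struct override *ov, size_t n)
{
    assert(upper != NULL && ov != NULL);
    for (size_t i = 0; i < n; i++) {
        assert(ov[i].code >= 0x80); /* only the upper half is patched */
        upper[ov[i].code - 0x80] = ov[i].unicode;
    }
}
```

With a copy of the cp866 upper half in hand, calling apply_overrides() with cp1125_overrides and a count of 8 would give the cp1125 values; every entry not named in the override list keeps whatever the base table had.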
External/internal
-----------------
The encodings above combine into the following charset pairs:

              $_external_char_set   $_internal_char_set
  Russian:    koi8-r                cp866
              cp1251                cp866
              cp866                 cp866
  Ukrainian:  cp1251                cp1125
              cp1125                cp1125
              koi8-u                cp1125
              koi8-ru               cp1125

Files
-----
cp866.tar.bz2  - changes in the cp866 table and fonts,
cyr_ua.tar.bz2 - other tables and cp1125 Xfonts derived from the cp866 Xfonts.

-- 
Grigory Batalov.
cp866.tar.bz2
Description: BZip2 compressed data
cyr_ua.tar.bz2
Description: BZip2 compressed data
