Hi! I was asked about Ukrainian support in dosemu. If I understand correctly (I'm not sure =)), we need specific tables in extra_charsets.
The main idea is to get the specific characters that koi8-r lacks because Russian does not use them. According to Roman Czyborra's great document (http://czyborra.com/charsets/cyrillic.html) and the Unicode chart (http://www.unicode.org/charts/PDF/U0400.pdf), we can have the following characters in the same cp866 internal_charset:

  8bit - Unicode
  --------------
  0xF2 - 0x0404 CYRILLIC CAPITAL LETTER UKRAINIAN IE
  0xF3 - 0x0454 CYRILLIC SMALL LETTER UKRAINIAN IE
  0xF4 - 0x0407 CYRILLIC CAPITAL LETTER YI
  0xF5 - 0x0457 CYRILLIC SMALL LETTER YI

simply by restoring them (they are currently substituted with other helpful characters, but I don't think that is correct).

We miss

  0x0491 CYRILLIC SMALL LETTER GHE WITH UPTURN
  0x0490 CYRILLIC CAPITAL LETTER GHE WITH UPTURN

as there is no place for these symbols in cp866, and

  0xF6 - 0x040E CYRILLIC CAPITAL LETTER SHORT U
  0xF7 - 0x045E CYRILLIC SMALL LETTER SHORT U

as they are not present in the koi8-u charset (but they are in cp1251, so this is open to discussion).

The symbols

  0x0406 CYRILLIC CAPITAL LETTER BYELORUSSIAN-UKRAINIAN I
  0x0456 CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I

in koi8-u can be mapped to

  0x0049 LATIN CAPITAL LETTER I
  0x0069 LATIN SMALL LETTER I

as they look very similar and aren't present in cp866.

I made the needed tables for cp1251 and koi8-u with the attached Perl script. It generates an exact copy of koi8-r.c, so I think it's correct =) (checked with 'diff -Bubi'). The koi8-u table was corrected by hand a little. The tables are in the attached patch.

The patch also changes two Cyrillic VGA fonts so that they can display symbols 0xF2, 0xF3, 0xF4, 0xF5. As I mentioned above, it would be nice to implement symbols 0xF6 and 0xF7 there too, but I have had no request for them, so I'm not sure they are really needed.

--
Grigory Batalov.
dosemu-1.1.4.13-extra_charsets.diff.bz2
Description: BZip2 compressed data
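#!/usr/bin/perl -w
# Generate a dosemu charset table (C source) from a Unicode::Map8 mapping.
use strict;
use Unicode::Map8;

# Charset to generate; $_locale is the same name made usable as a C identifier.
my ($locale, $_locale);
$_locale = $locale = "koi8-r";
$_locale =~ s/-/_/g;
my $map = Unicode::Map8->new($locale)
    or die "No Unicode::Map8 mapping found for $locale\n";
my ($i, $j);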
print "\x23include \"translate.h\"
static const t_unicode ${_locale}_c1_chars[] = {\n";
# C1 area: 32 code points (0x80-0x9F), eight per output row.
for ($i = 0; $i < 4; $i++) {
    for ($j = 0; $j < 8; $j++) {
        printf("0x%04X, ", $map->to_char16(0x80 + $i*8 + $j));
    }
    printf("/* 0x%02X-0x%02X */\n", 0x80 + $i*8, 0x80 + $i*8 + 7);
}
print "};
struct char_set ${_locale}_c1 = {
1,
CHARS(${_locale}_c1_chars),
0, \"\", 0, 32,
};
static const t_unicode ${_locale}_g1_chars[] = {\n";
# G1 area: 96 code points (0xA0-0xFF), eight per output row.
for ($i = 0; $i < 12; $i++) {
    for ($j = 0; $j < 8; $j++) {
        printf("0x%04X, ", $map->to_char16(0xA0 + $i*8 + $j));
    }
    printf("/* 0x%02X-0x%02X */\n", 0xA0 + $i*8, 0xA0 + $i*8 + 7);
}
print "};
struct char_set ${_locale}_g1 = {
1,
CHARS(${_locale}_g1_chars),
0, \"\", 1, 96,
};
struct char_set $_locale = {
.c0 = &ascii_c0,
.g0 = &ascii_g0,
.c1 = &${_locale}_c1,
.g1 = &${_locale}_g1,
.names = { \"$locale\", 0 },
};
struct char_set ${_locale}_safe = {
.c0 = &ascii_c0,
.g0 = &ascii_g0,
.c1 = &ascii_c1,
.g1 = &${_locale}_g1,
.names = { \"${locale}-safe\", 0 },
};
CONSTRUCTOR(static void init(void))
{
register_charset(&$_locale);
register_charset(&${_locale}_safe);
}\n";
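
The cp866 assignments listed in the message can be cross-checked without Unicode::Map8. This is only a minimal sketch, assuming that Perl's core Encode module provides the cp866, koi8-u and cp1251 tables on your system. The helper names cp866_to_unicode and can_encode are made up for illustration and are not part of dosemu or the patch.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(decode encode);

# Illustrative helper: Unicode code point of a single cp866 byte.
sub cp866_to_unicode {
    my ($byte) = @_;
    return ord(decode("cp866", chr($byte)));
}

# Illustrative helper: true if $charset has a one-byte slot for $codepoint.
sub can_encode {
    my ($charset, $codepoint) = @_;
    my $text = chr($codepoint);
    # FB_QUIET stops at the first unmappable character instead of
    # substituting a replacement, so a miss yields an empty result.
    my $bytes = encode($charset, $text, Encode::FB_QUIET());
    return length($bytes) == 1;
}

# The six cp866 slots discussed in the message.
for my $byte (0xF2 .. 0xF7) {
    printf("0x%02X - 0x%04X\n", $byte, cp866_to_unicode($byte));
}

# Which of the three charsets can hold each of the problem letters.
for my $cp (0x0490, 0x0491, 0x040E, 0x045E, 0x0406, 0x0456) {
    printf("U+%04X:", $cp);
    for my $charset ("cp866", "koi8-u", "cp1251") {
        printf(" %s=%s", $charset, can_encode($charset, $cp) ? "yes" : "no");
    }
    print "\n";
}
```

The first loop prints rows in the same "8bit - Unicode" form as the table in the message, so the two can be compared directly. The second loop shows which letters have no slot in a given charset and would therefore need a fallback.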
