"Arnd Hanses" <[EMAIL PROTECTED]> wrote:
> -
> + name.subst('ß', 'z'); /* AHanses: no more trouble with 'umlauts'? */
> + name.subst('ä', 'a');
> + name.subst('ö', 'o');
> + name.subst('ü', 'u');
> + name.subst('æ', 'e');
> + name.subst('ñ', 'n');
> + name.subst('Ä', 'A'); /* AHanses: For OS/2 'lowercase()' won't help */
> + name.subst('Ö', 'O');
> + name.subst('Ü', 'U');
> + name.subst('Æ', 'A');
> + name.subst('Ñ', 'U');
> +#warning AHanses: List is incomplete. Please find a more generic solution.
> +
As I said, don't do this here. It is much more efficient to bitwise-AND
the whole string with 0x7f:
LString& LString::discardSign()
{
    // Clear bit 7 of every character so only 7-bit codes remain.
    for (int i = 0; i < length(); i++)
        p->s[i] &= 0x7f;
    return *this;
}
ISO-8859-x and EUC (Extended Unix Code) work just fine with this.
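
To make the effect concrete, here is a minimal stand-alone sketch of the
same masking on a std::string. It does not use LString; the free function
below is a hypothetical stand-in for the member function, written only
for illustration.

```cpp
#include <string>

// Hypothetical stand-alone variant of discardSign(): clears bit 7 of
// every byte, so each result is a 7-bit code.
std::string discardSign(std::string s)
{
    for (char& c : s)
        c = static_cast<char>(static_cast<unsigned char>(c) & 0x7f);
    return s;
}
```

With this sketch the ISO-8859-1 byte 0xe4 comes out as 0x64 ('d'): a
printable 7-bit character, though not a transliteration of the original.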
And as I also said, this must be called before the special-character
handling is performed here, *BUT* after 0xaf, 0xba and 0xdc have been
converted to something different (masked with 0x7f they would become
0x2f '/', 0x3a ':' and 0x5c '\').
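
The ordering can be sketched roughly as follows. The function name and
the replacement character '_' are assumptions made for this example; the
point is only that the three bytes are replaced first and the mask is
applied afterwards.

```cpp
#include <string>

// Hypothetical helper: replaces 0xaf, 0xba and 0xdc first, then clears
// bit 7. The replacement '_' is an assumption, not part of the patch.
std::string toSevenBit(std::string s)
{
    for (char& c : s) {
        unsigned char u = static_cast<unsigned char>(c);
        // These bytes would otherwise mask to 0x2f, 0x3a and 0x5c
        // (slash, colon and backslash).
        if (u == 0xaf || u == 0xba || u == 0xdc)
            u = '_';
        c = static_cast<char>(u & 0x7f);
    }
    return s;
}
```

Plain 7-bit input, including a real '/', passes through unchanged; only
the three 8-bit bytes are diverted before masking.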
The only remaining problem is that Microsoft/IBM codepages, Shift-JIS
and Big5 use the region 0x80-0x9f, which would be converted to
non-printable control characters (0x00-0x1f). E.g. for cp850:
# Format: Three tab-separated columns (Sorry tabs are expanded)
# Column #1 is the cp850_DOSLatin1 code (in hex)
# Column #2 is the Unicode (in hex as 0xXXXX)
# Column #3 is the Unicode name (follows a comment sign, '#')
0x80 0x00c7 #LATIN CAPITAL LETTER C WITH CEDILLA
0x81 0x00fc #LATIN SMALL LETTER U WITH DIAERESIS
0x82 0x00e9 #LATIN SMALL LETTER E WITH ACUTE
0x83 0x00e2 #LATIN SMALL LETTER A WITH CIRCUMFLEX
0x84 0x00e4 #LATIN SMALL LETTER A WITH DIAERESIS
0x85 0x00e0 #LATIN SMALL LETTER A WITH GRAVE
0x86 0x00e5 #LATIN SMALL LETTER A WITH RING ABOVE
0x87 0x00e7 #LATIN SMALL LETTER C WITH CEDILLA
0x88 0x00ea #LATIN SMALL LETTER E WITH CIRCUMFLEX
0x89 0x00eb #LATIN SMALL LETTER E WITH DIAERESIS
0x8a 0x00e8 #LATIN SMALL LETTER E WITH GRAVE
0x8b 0x00ef #LATIN SMALL LETTER I WITH DIAERESIS
0x8c 0x00ee #LATIN SMALL LETTER I WITH CIRCUMFLEX
0x8d 0x00ec #LATIN SMALL LETTER I WITH GRAVE
0x8e 0x00c4 #LATIN CAPITAL LETTER A WITH DIAERESIS
0x8f 0x00c5 #LATIN CAPITAL LETTER A WITH RING ABOVE
0x90 0x00c9 #LATIN CAPITAL LETTER E WITH ACUTE
0x91 0x00e6 #LATIN SMALL LIGATURE AE
0x92 0x00c6 #LATIN CAPITAL LIGATURE AE
0x93 0x00f4 #LATIN SMALL LETTER O WITH CIRCUMFLEX
0x94 0x00f6 #LATIN SMALL LETTER O WITH DIAERESIS
0x95 0x00f2 #LATIN SMALL LETTER O WITH GRAVE
0x96 0x00fb #LATIN SMALL LETTER U WITH CIRCUMFLEX
0x97 0x00f9 #LATIN SMALL LETTER U WITH GRAVE
0x98 0x00ff #LATIN SMALL LETTER Y WITH DIAERESIS
0x99 0x00d6 #LATIN CAPITAL LETTER O WITH DIAERESIS
0x9a 0x00dc #LATIN CAPITAL LETTER U WITH DIAERESIS
0x9b 0x00f8 #LATIN SMALL LETTER O WITH STROKE
0x9c 0x00a3 #POUND SIGN
0x9d 0x00d8 #LATIN CAPITAL LETTER O WITH STROKE
0x9e 0x00d7 #MULTIPLICATION SIGN
0x9f 0x0192 #LATIN SMALL LETTER F WITH HOOK
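
One way to see the problem in code is a check for bytes in that region
before masking. This is only a sketch; the function name is hypothetical
and nothing like it exists in the patch.

```cpp
#include <string>

// Hypothetical check: true if any byte lies in 0x80-0x9f, i.e. would
// become a control character (0x00-0x1f) once bit 7 is cleared.
bool hasHighControlRangeByte(const std::string& s)
{
    for (unsigned char u : s)
        if (u >= 0x80 && u <= 0x9f)
            return true;
    return false;
}
```

For instance, the cp850 byte 0x84 from the table above is flagged, since
0x84 & 0x7f is 0x04, while the ISO-8859-1 byte 0xe4 is not.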
Regards,
SMiyata