> OK, so let's go through the font: when I decode it with ftdump, I
> get the following entries for name and family:
>
> font family (ID 1) [Microsoft] (language=0x0804):
> "\U+009E\U+004F"
> full name (ID 4) [Microsoft] (language=0x0804):
> "\U+009E\U+004F"
>
> When I read the related data via the freetype-functions, I get back
>
> string=žÑOSýýýýØ8hW

Uh, oh, please tell us the byte values (in '0xXX' notation)!
Everything else won't survive e-mail encoding/decoding without
distortions.

> string_len=4
>
> ...means the žÑOS-part of the string is valid.  But this in no case
> decodes to 0x9E 0x00 0x4F 0x00!

Welcome to Mojibake hell.  The following possibilities come to my
mind; there are certainly even more possibilities to screw up.

  (1) Wrong byte order.

  (2) Wrong encoding, for example interpreting GB2312 characters as
      UTF-8.

  (3) Ditto, but mixing up with UCS4 – or vice versa.

It can also be a combination of (1) to (3).

My advice: Forget it.  Either suppress invalid data, or simply follow
the 'garbage in, garbage out' principle.


    Werner
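
P.S.  A minimal sketch of how such a byte dump could look, written in
Python rather than C only for brevity.  The helper names `hex_dump`
and `try_decodings` are made up for this illustration, and the sample
bytes are a hypothetical value, not data taken from the font under
discussion.  It prints the raw bytes in '0xXX' notation and then shows
what a few candidate encodings would make of them, which is one way to
narrow down cases (1) and (2).

```python
def hex_dump(data: bytes) -> str:
    """Render bytes in '0xXX' notation, which survives e-mail transport."""
    return " ".join(f"0x{b:02X}" for b in data)


def try_decodings(data: bytes,
                  encodings=("utf-16-be", "utf-16-le", "gb2312", "utf-8")):
    """Map each candidate encoding to the decoded text, or None on failure."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = data.decode(enc)
        except UnicodeDecodeError:
            results[enc] = None
    return results


# Hypothetical 4-byte sample (assumption, NOT read from the actual font).
sample = bytes([0x9E, 0xD1, 0x4F, 0x53])

print(hex_dump(sample))
for enc, text in try_decodings(sample).items():
    if text is None:
        shown = "not decodable"
    else:
        # Show code points instead of glyphs so the mail stays ASCII-safe.
        shown = " ".join(f"U+{ord(c):04X}" for c in text)
    print(f"{enc:10} -> {shown}")
```

Reporting code points instead of the decoded glyphs keeps the report
pure ASCII, so nothing gets mangled on the way through the mail system.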