OK, so let's go through the font: when I decode it with ftdump, I get the following entries for name and family:
font family (ID 1) [Microsoft] (language=0x0804): "\U+009E\U+004F"
full name   (ID 4) [Microsoft] (language=0x0804): "\U+009E\U+004F"

When I read the related data via the FreeType functions, I get back

  string=žÑOSýýýýØ8hW
  string_len=4

...which means the žÑOS part of the string is valid. But that in no case decodes to 0x9E 0x00 0x4F 0x00!

> Sent: Thursday, 02 September 2021 at 15:21
> From: "Werner LEMBERG" <w...@gnu.org>
> To: virtual_wor...@gmx.de
> Cc: freetype@nongnu.org
> Subject: Re: Aw: Re: Native TTF name sometimes contains crap
>
> > I _only_ make use of data where the encoding ID is set to
> > TT_MS_ID_UNICODE_CS. From this I would assume that all data in the
> > related "string" member comes with the same encoding and therefore
> > has to be decoded in the same way. Is this correct?
>
> In theory, this is correct. However, ...
>
> > But what I notice is that this is true for all Russian fonts I have
> > and for about 50% of the Chinese fonts. When decoding the "string"
> > data of the remaining 50% of Chinese fonts (which also have the
> > encoding ID TT_MS_ID_UNICODE_CS), I get the mentioned crap. So it
> > seems like there is some other property one has to check when
> > decoding the names?
>
> ... some old fonts (and I guess you only have problems with old
> fonts) don't follow those rules. In other words, they are buggy.
> This might be due to bad font generators, or to intentionally faking
> entries to be compatible with old, buggy software, etc., etc.
>
> I suggest that you analyze the 'name' tables of the problematic
> fonts with the `ttx` disassembler from the 'fonttools' bundle. If
> ttx produces correct results while you get incorrect results with
> FreeType, please give more details.
>
>     Werner
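For completeness, here is a stripped-down sketch of how I read those entries (the font path and the plain printf dump are placeholders for this example, not my actual code): it only keeps Microsoft-platform records whose encoding ID is TT_MS_ID_UNICODE_CS and interprets the string bytes as big-endian UTF-16.

#include <stdio.h>
#include <ft2build.h>
#include FT_FREETYPE_H
#include FT_SFNT_NAMES_H
#include FT_TRUETYPE_IDS_H

int main(void)
{
  FT_Library  library;
  FT_Face     face;
  FT_UInt     count, i;

  if (FT_Init_FreeType(&library) ||
      FT_New_Face(library, "problematic.ttf", 0, &face))  /* placeholder path */
    return 1;

  count = FT_Get_Sfnt_Name_Count(face);
  for (i = 0; i < count; i++)
  {
    FT_SfntName  name;
    FT_UInt      j;

    if (FT_Get_Sfnt_Name(face, i, &name))
      continue;

    /* only Microsoft platform, UCS-2/UTF-16 encoding */
    if (name.platform_id != TT_PLATFORM_MICROSOFT ||
        name.encoding_id != TT_MS_ID_UNICODE_CS)
      continue;

    printf("name_id=%u language=0x%04x:", name.name_id, name.language_id);

    /* 'string' is NOT NUL-terminated and 'string_len' is a byte count; */
    /* the data is big-endian UTF-16, so take the bytes in pairs        */
    /* (surrogate pairs are not handled in this sketch).                */
    for (j = 0; j + 1 < name.string_len; j += 2)
    {
      unsigned int  cp = ((unsigned int)name.string[j] << 8) |
                         name.string[j + 1];
      printf(" U+%04X", cp);
    }
    printf("\n");
  }

  FT_Done_Face(face);
  FT_Done_FreeType(library);
  return 0;
}

The byte-pair handling is the part I care about here: since string_len is a length in bytes and the string is not zero-terminated, the loop above should print exactly the code points that ftdump reports, if I understand the encoding correctly.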