Aw: Re: Re: Re: Native TTF name sometimes contains crap

virtual_worlds Sat, 04 Sep 2021 10:41:40 -0700

Yes, for sure, hex values are more accurate.

So ftdump returns "\U+009E\U+004F", which is the correct name, so ftdump must 
be doing something I do not know about.


When I call the get-name function as shown, the returned bytes are 0x7e 0xd1 
0x4f 0x53. So if this is a Mojibake problem, is ftdump working around it? If 
yes, how?
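
For reference, a minimal sketch of how such a byte sequence can be inspected. It assumes the bytes come from a Microsoft-platform 'name' entry, whose strings are stored as UTF-16BE, and that string_len counts bytes, so four bytes would be two characters. The helper names decode_ms_name and as_code_points are made up for illustration and are not part of FreeType or ftdump.

```python
def decode_ms_name(raw: bytes) -> str:
    """Decode a Microsoft-platform name string, assumed to be UTF-16BE."""
    if len(raw) % 2:
        raise ValueError("UTF-16BE data must have an even byte length")
    return raw.decode("utf-16-be")


def as_code_points(text: str) -> list[str]:
    """Render each character as U+XXXX so it survives e-mail transport."""
    return [f"U+{ord(ch):04X}" for ch in text]


# The four bytes reported in this message.
raw = bytes([0x7E, 0xD1, 0x4F, 0x53])

# Pairing the bytes big-endian gives two 16-bit code units.
code_points = as_code_points(decode_ms_name(raw))
print(code_points)  # ['U+7ED1', 'U+4F53']

# Printing the same bytes as a single-byte string gives four unrelated
# characters, which is one way Mojibake appears.
misread = raw.decode("latin-1")
print(misread)  # ~ÑOS
```

Comparing the U+XXXX values obtained this way with the ftdump output shows whether both tools see the same bytes, independent of how a terminal or mail client renders them.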


> Sent: Friday, 03 September 2021 at 15:16
> From: "Werner LEMBERG" <[email protected]>
> To: [email protected]
> Cc: [email protected]
> Subject: Re: Aw: Re: Re: Native TTF name sometimes contains crap
>
> > OK, so let's go through the font: when I decode it with ftdump, I
> > get the following entries for name and family:
> >
> >   font family (ID 1) [Microsoft] (language=0x0804):
> >     "\U+009E\U+004F"
> >   full name (ID 4) [Microsoft] (language=0x0804):
> >     "\U+009E\U+004F"
> >
> > When I read the related data via the freetype-functions, I get back
> >
> > string=žÑOSýýýýØ8hW
>
> Uh, oh, please tell us the byte values (in '0xXX' notation)!
> Everything else won't survive e-mail encoding/decoding without
> distortions.
>
> > string_len=4
> >
> > ...means the žÑOS-part of the string is valid. But this in no case
> > decodes to 0x9E 0x00 0x4F 0x00!
>
> Welcome to Mojibake hell. The following possibilities come to my
> mind; there are certainly even more possibilities to screw up.
>
> (1) Wrong byte order.
>
> (2) Wrong encoding, for example interpreting GB2312 characters as
>     UTF-8.
>
> (3) Ditto, but mixing up with UCS4 – or vice versa.
>
> It can also be a combination of (1) to (3).
>
> My advice: Forget it. Either suppress invalid data, or simply follow
> the 'garbage in, garbage out' principle.
>
>     Werner
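
The possibilities (1) to (3) listed above can be explored with a small sketch that runs the same bytes through several candidate interpretations. The candidate list and the helper name try_decodings are assumptions chosen for illustration; they are neither exhaustive nor what ftdump does internally.

```python
# Candidate interpretations: both UTF-16 byte orders, a legacy Chinese
# encoding, UTF-8, and UCS4 (UTF-32). Illustrative, not exhaustive.
CANDIDATES = ["utf-16-be", "utf-16-le", "gb2312", "utf-8", "utf-32-be"]


def try_decodings(raw: bytes, encodings=CANDIDATES) -> dict:
    """Map each encoding to the decoded text, or None if decoding fails."""
    results = {}
    for enc in encodings:
        try:
            results[enc] = raw.decode(enc)
        except UnicodeDecodeError:
            # Not valid in this encoding: suppress instead of guessing.
            results[enc] = None
    return results


raw = bytes([0x7E, 0xD1, 0x4F, 0x53])
for enc, text in try_decodings(raw).items():
    shown = "invalid" if text is None else " ".join(f"U+{ord(c):04X}" for c in text)
    print(f"{enc:10} {shown}")
```

An interpretation that fails to decode can be ruled out; one that decodes is only a candidate, since wrong encodings often yield valid but meaningless text.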
