Around 1 o'clock on Jun 30, Pablo Saratxaga wrote:

> What are those glyphs? (I'm quite surprised, I would have expected the
> opposite: fonts generally have more glyphs than the standard encodings of
> the iso-8859 family for example)

My definition of language tag is coloured by the OS/2 table
codePageRange bits from which it was originally defined in fontconfig.
Those bits are defined to map to specific Windows code pages; the
Latin-1 case doesn't map to ISO 8859-1, but rather to code page 1252,
for which many fonts are missing a few random entries. Similarly for
the other tags, the existing fonts that I have don't generally seem to
cover the complete Windows code page from which the codePageRange bit
was derived.

> No, the tolerance for missing glyphs in CJK tests should be the same or
> even smaller. The difference is that it isn't needed to test all the glyphs
> for CJK coverages; testing only a set of 256 chosen glyphs would be enough
> (if they are correctly chosen, testing that 256 glyphs are present in a
> font is enough to assure, with 99.99% of confidence, that it covers a given
> CJK language).

I'm not confident enough of this approach; I fear that any set of 256
glyphs that must appear in a simplified Chinese font may well appear in
many traditional Chinese (or even Japanese) fonts. Certainly we could
experimentally determine a reasonable subset, and it's completely
trivial to change the matching table used in the code.

> Of course, complete checking can also be done, but I wonder if it is
> actually useful (I mean, is there a font suitable for simplified chinese
> out there that doesn't encode all the characters of gb2312?)

It seems that this must be the case -- I set the '500' number so high
because all of the fonts which I have that advertise support for
simplified Chinese are missing over 200 glyphs from GB2312. I got
similar results for Japanese fonts, Korean Wansung fonts and
traditional Chinese fonts.
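
To make the two tests under discussion concrete, here is a minimal
sketch in Python. It is not fontconfig's C code: the function names,
the code point ranges and the toy fonts are all made up for
illustration. It shows a full-repertoire check with a tolerance for
missing glyphs, and then how a small sample set can pass for a font
that covers very little of the repertoire.

```python
def missing_count(font_chars, required_chars):
    """Number of required code points the font does not cover."""
    return len(set(required_chars) - set(font_chars))


def supports(font_chars, required_chars, max_missing):
    """True if the font lacks at most max_missing required code points."""
    return missing_count(font_chars, required_chars) <= max_missing


# Hypothetical data: a 1000-character repertoire and a font missing 200 of it.
repertoire = set(range(0x4E00, 0x4E00 + 1000))
font = set(range(0x4E00, 0x4E00 + 800))

print(missing_count(font, repertoire))  # 200
print(supports(font, repertoire, 0))    # False: a strict test rejects it
print(supports(font, repertoire, 500))  # True: the generous tolerance accepts it

# A 256-character sample, and a second font that covers the whole sample
# but very little of the rest of the repertoire.
sample = set(range(0x4E00, 0x4E00 + 256))
other = set(range(0x4E00, 0x4E00 + 300))

print(supports(other, sample, 0))        # True: the sample test passes
print(missing_count(other, repertoire))  # 700: yet most glyphs are absent
```

Whether a sample test is safe depends entirely on how the sample is
chosen; the toy sample here is deliberately a poor one.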

I would need a significantly larger set of fonts than I currently have
access to if I wanted to generate smaller test char sets. Now that the
tests stand in isolation, perhaps those skilled with particular
languages can develop more specific tests.

> But to handle such case, I think it would be better to choose a given
> definition of "big5" (or several of them) and stick to it, rather than
> allowing a so tremendously big hole as 500 possible missing chars.

Missing 500 from a repertoire of nearly 20000 doesn't seem to render
most of these fonts unusable.

Keith Packard        XFree86 Core Team        HP Cambridge Research Lab
