Many TrueType fonts include an OS/2 table which holds codePageRange bits.
These bits indicate the old OS/2 code pages supported by the font, and
hence indirectly indicate which languages the font is intended to support.
These bits, however, are quite primitive: with only 64 bits in total,
they can indicate support for only a very few languages.
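
For reference, here is a minimal sketch of how those 64 bits might be
decoded. It assumes the two 32-bit fields have already been read out of
the OS/2 table; the names range1 and range2 are placeholders of mine, and
the bit map is only a partial subset of the assignments in the OpenType
OS/2 table specification, not the full list.

```python
# Partial map from codePageRange bit number to code page number.
# Only a subset of the assigned bits is listed here (assumption: the
# caller has already extracted the two 32-bit fields from the OS/2 table).
CODE_PAGE_BITS = {
    0: 1252,   # Latin 1
    1: 1250,   # Latin 2 (Eastern Europe)
    2: 1251,   # Cyrillic
    3: 1253,   # Greek
    4: 1254,   # Turkish
    5: 1255,   # Hebrew
    6: 1256,   # Arabic
    7: 1257,   # Baltic
    16: 874,   # Thai
    17: 932,   # Japanese (Shift-JIS)
    18: 936,   # Simplified Chinese
    19: 949,   # Korean (Wansung)
    20: 950,   # Traditional Chinese
}


def code_pages(range1, range2=0):
    """Return the code pages flagged in the two 32-bit fields."""
    # Bits 0-31 come from range1, bits 32-63 from range2.
    bits = (range2 << 32) | range1
    return [cp for bit, cp in sorted(CODE_PAGE_BITS.items())
            if (bits >> bit) & 1]
```

With only 64 slots, and several of them spent on OEM and symbol code
pages, none of the scripts listed further down can be flagged this way.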
My question is whether I should take these TrueType fonts and test them
against my new coverage tables, at least for languages which aren't
covered by the codePageRange bits.
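
The test I have in mind could be sketched roughly as follows. This is
only an illustration, not what fontconfig actually does: the font is
assumed to be available as a plain set of code points (say, pulled from
its cmap), and the function names and the max_missing tolerance are
hypothetical.

```python
def expand(ranges):
    """Expand inclusive (first, last) ranges into a set of code points."""
    return {cp for first, last in ranges for cp in range(first, last + 1)}


def missing(font_codepoints, required):
    """Return the required code points the font lacks, in sorted order."""
    return sorted(set(required) - set(font_codepoints))


def covers(font_codepoints, required, max_missing=0):
    """True if the font lacks at most max_missing required code points."""
    return len(missing(font_codepoints, required)) <= max_missing


# Hypothetical usage: a font with basic Latin only, tested against a
# made-up requirement of A-Z plus one accented letter.
font = expand([(0x41, 0x5A), (0x61, 0x7A)])
required = expand([(0x41, 0x5A)]) | {0xC9}
```

Here covers(font, required) is false because U+00C9 is absent, while
covers(font, required, max_missing=1) would accept the font.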
I now have coverage information for 76 of the 139 ISO 639-1 language
names; I used the Unicode code charts to mark coverage for the Indic
languages and a few other scripts:
Bengali (BN)
Tibetan (BO)
Gujarati (GU)
Khmer (KM)
Kannada (KN)
Lao (LO)
Malayalam (ML)
Mongolian (MN)
Oriya (OR)
Sinhala (Sinhalese) (SI)
Tamil (TA)
Telugu (TE)
Tagalog (TL)
Given that these languages have unique alphabets, this method seems
relatively sound. I'm still missing several Indic languages and
all of the non-Arabic African languages.
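
The chart-based approach for the languages listed above can be
approximated like this. It is a sketch under two assumptions: that each
language maps to a single Unicode block (a simplification), and that
Python's unicodedata module is an acceptable stand-in for reading the
code charts by hand. Its notion of "assigned" follows whatever Unicode
version the interpreter ships with, so the sets will not match the
charts I used exactly.

```python
import unicodedata

# One Unicode block per language (inclusive ranges).
BLOCKS = {
    "bn": (0x0980, 0x09FF),  # Bengali
    "bo": (0x0F00, 0x0FFF),  # Tibetan
    "gu": (0x0A80, 0x0AFF),  # Gujarati
    "km": (0x1780, 0x17FF),  # Khmer
    "kn": (0x0C80, 0x0CFF),  # Kannada
    "lo": (0x0E80, 0x0EFF),  # Lao
    "ml": (0x0D00, 0x0D7F),  # Malayalam
    "mn": (0x1800, 0x18AF),  # Mongolian
    "or": (0x0B00, 0x0B7F),  # Oriya
    "si": (0x0D80, 0x0DFF),  # Sinhala
    "ta": (0x0B80, 0x0BFF),  # Tamil
    "te": (0x0C00, 0x0C7F),  # Telugu
    "tl": (0x1700, 0x171F),  # Tagalog
}


def assigned(lang):
    """Return the assigned code points in the block for a language code."""
    first, last = BLOCKS[lang]
    # Category "Cn" marks unassigned code points; skip those.
    return {cp for cp in range(first, last + 1)
            if unicodedata.category(chr(cp)) != "Cn"}
```

Marking every assigned character is stricter than a language really
needs (a block holds digits, rare signs and so on), so a font test built
on these sets would probably want some tolerance for missing glyphs.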
I did remove the @ and ` characters from the Latin-script orthographies;
each of them should now include only the letters of its alphabet.
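
As an illustration of that cleanup, here is a small sketch working on an
in-memory set of code points. The ranges are made up to show how such
characters can ride along with a loosely specified range (U+0040 sits
directly before A, U+0060 directly before a); they are not copied from
the actual coverage files, and remove_marks is a hypothetical helper.

```python
def remove_marks(codepoints, marks="@`"):
    """Return the coverage set without the given non-letter characters."""
    return set(codepoints) - {ord(ch) for ch in marks}


# Made-up coverage: 0x40-0x5A and 0x60-0x7A each start one code point
# too early, so they pick up '@' and '`' along with the letters.
latin = set(range(0x40, 0x5B)) | set(range(0x60, 0x7B))
cleaned = remove_marks(latin)
```

After the call, cleaned holds exactly A-Z and a-z.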
I've also committed this whole mess to XFree86 CVS; the coverage
files can be found in xc/lib/fontconfig/fc-lang/*.orth
Keith Packard, XFree86 Core Team, HP Cambridge Research Lab