As Chris said, the CFF font doesn't contain the encoding information, so decompressing the font won't help you.
CID CFF fonts are indexed using "CIDs" (character IDs), which usually come from a particular "character collection". (This is all Adobe terminology.) In your case, the character collection, as specified in the CIDSystemInfo dictionary, is Adobe-Japan1. You can get the relevant specs here:

https://www.adobe.com/devnet/font.html

For PDF, you'll also need to look at the font's encoding, which may be "Identity", meaning that the character codes are the CIDs; or it may be a CMap, which maps (possibly variable-length) character codes to CIDs. The CMap can be either the name of a predefined CMap (those CMap files are available at the Adobe link above) or a CMap stream. Either way, you'll need to parse the CMap -- the CMap format spec is also on that web page.

To summarize:

* get the character code from the PDF content stream
* map the character code to a CID, using a CMap
* use the CID to get a glyph from the font

If you want to map to Unicode (rather than drawing a character from the font), replace that last step with: map the CID to Unicode, using the character collection data.

- Derek

On 2017 Mar 22, tony smith wrote:
> Thanks for the quick reply.
>
> I've attached an example pdf file, which displays a single Chinese
> character.
> The font details and the string in the pdf data are both readable.
>
> The pdf string contains the single character <07F4>.
> Also the relevant font has the following entry
> /CIDSystemInfo << /Registry (Adobe)/Ordering (Japan1)/Supplement 4 >>
>
> Fontforge has a file which maps these CID values to the equivalent Unicode
> values.
> So in this case the Chinese character is \u9AD8
> But Freetype has no way to work out what the Unicode mapping is for a CFF.
>
> So if my understanding is correct I need to uncompress the CFF file and
> work out the relevant mapping from the CMap information.
>
> Thanks for your time, Tony Smith.
>
>
> On 22/03/17 14:08, tony smith wrote:
>> Hello
>>
>> I'm using Freetype to display fonts embedded in pdf data.
>>
>> My problem seems to be with an embedded CFF file, where the CID
>> mappings are only defined in the pdf data.
>> I load a Compact Font Format (CFF) stream into memory.
>> But when I try to select a character map, using FT_Select_CharMap, I
>> get error 0x26, Invalid_CharMap_Handle.
>> I've attached the CFF, font.cff, and a simple c++ program, cff.c,
>> which highlights the problem.
>>
>> From the pdf file I know that the font is a CID one.
>> So you can work out the mapping from the PDF strings to Unicode strings.
>> But this information doesn't seem to be available in the CFF stream.
>> So my question is how do I handle this situation?
>> I can't see any way in Freetype to add the missing character encoding.
>> Do I have to decompress the CFF stream, which should be a type1 font
>> and then explicitly add the missing character encoding information?
>
> CFF != compressed type 1. Although the imaging models, and many concepts
> are shared, they are not the same.
>
> CFF CIDFonts don't contain any encoding information, and there is no set
> glyph ordering, and there is no built-in character ID to glyph index
> mapping (like, for example, a TTF/OTF cmap table).
>
> To actually use a CIDFont you need to use the CMap (NOTE: not "cmap", but
> "CMap") specified by the PDF - that can either be custom, embedded in the
> PDF, or a named reference to a standard CMap (available from Adobe, but I
> can't remember right now where).
>
> Parsing the CMap table will give you a mapping from character code to CID,
> which you can then use to pull the glyph from the font. (In PostScript
> terms you "compose" the CIDFont with the CMap to create a usable instance
> of the CIDFont).
>
> CMaps are defined in a highly restricted variant of PostScript, so should
> be fairly easy to parse.
>
> Chris
>
> _______________________________________________
> Freetype mailing list
> [email protected]
> https://lists.nongnu.org/mailman/listinfo/freetype
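
For anyone finding this thread later, here is a minimal sketch in C of the lookup chain described in the replies: read the character code from the PDF string, map it to a CID with a CMap range, then map the CID to Unicode. It does not use FreeType at all. Every name in it (CidRange, parse_cidrange_line, code_to_cid, and so on) is hypothetical and made up for illustration. The parser handles only a single range line of the form `<lo> <hi> cid`, not a complete CMap file. The Identity range assumes the example PDF uses an Identity encoding, which the thread does not actually state, and the one-entry CID-to-Unicode table only restates the pair mentioned in the thread (code <07F4>, i.e. CID 2036, giving U+9AD8); a real table would be built from the Adobe-Japan1 character collection data.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* One CMap range: codes lo..hi map to cid_start, cid_start + 1, ... */
typedef struct {
    uint32_t lo, hi;
    uint32_t cid_start;
} CidRange;

/* One CID -> Unicode pair (hypothetical layout for this sketch). */
typedef struct {
    uint32_t cid;
    uint32_t unicode;
} CidToUnicode;

/* Assumption: Identity encoding, so every 2-byte code is its own CID. */
const CidRange identity_ranges[] = { { 0x0000, 0xFFFF, 0 } };

/* Excerpt only: the single pair quoted in the thread. */
const CidToUnicode japan1_excerpt[] = { { 2036, 0x9AD8 } };

/* Read a big-endian 2-byte character code from a PDF string. */
uint32_t read_code_be16(const unsigned char *s, size_t len)
{
    assert(s != NULL && len >= 2);
    return ((uint32_t)s[0] << 8) | (uint32_t)s[1];
}

/* Parse one range line such as "<0000> <ffff> 0".
   Returns 1 on success, 0 if the line does not match. */
int parse_cidrange_line(const char *line, CidRange *out)
{
    unsigned int lo, hi, cid;

    if (sscanf(line, " <%x> <%x> %u", &lo, &hi, &cid) != 3)
        return 0;
    if (lo > hi)
        return 0;
    out->lo = lo;
    out->hi = hi;
    out->cid_start = cid;
    return 1;
}

/* Map a character code to a CID. Returns 1 and stores the CID on
   success, 0 if no range covers the code. */
int code_to_cid(const CidRange *ranges, size_t n, uint32_t code,
                uint32_t *cid)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (code >= ranges[i].lo && code <= ranges[i].hi) {
            *cid = ranges[i].cid_start + (code - ranges[i].lo);
            return 1;
        }
    }
    return 0;
}

/* Map a CID to Unicode by linear search. Returns 1 and stores the
   code point on success, 0 if the CID is not in the table. */
int cid_to_unicode(const CidToUnicode *table, size_t n, uint32_t cid,
                   uint32_t *unicode)
{
    size_t i;

    for (i = 0; i < n; i++) {
        if (table[i].cid == cid) {
            *unicode = table[i].unicode;
            return 1;
        }
    }
    return 0;
}
```

Under those assumptions, feeding the two bytes 0x07 0xF4 through read_code_be16, then code_to_cid with identity_ranges, then cid_to_unicode with japan1_excerpt follows the same path as the example in the thread. To draw the glyph rather than extract text, the last step would instead look the CID up in the font, which this sketch leaves out.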
