As Chris said, the CFF font doesn't contain the encoding information, so
decompressing the font won't help you.

CID CFF fonts are indexed using "CIDs" (character IDs), which usually
come from a particular "character collection".  (This is all Adobe
terminology.)

In your case, the character collection, as specified in the
CIDSystemInfo dictionary, is Adobe-Japan1.  You can get the relevant
specs here:

https://www.adobe.com/devnet/font.html

For PDF, you'll also need to look at the font's encoding, which may be
"Identity" (Identity-H or Identity-V), meaning that the two-byte
character codes are the CIDs; or it may be a CMap, which maps (possibly
variable length) character codes to CIDs.
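
For the Identity case, here is a minimal sketch in Python.  The function
name is made up for illustration, and it assumes an Identity-H/Identity-V
encoding, where every code is two bytes, big-endian, and equal to the CID:

```python
def identity_codes_to_cids(hex_string):
    """Decode a PDF hex string such as '<07F4>' under an Identity encoding."""
    # Drop the angle brackets and any whitespace inside the string.
    digits = "".join(hex_string.strip().strip("<>").split())
    # A PDF hex string with an odd number of digits is padded with a 0.
    if len(digits) % 2:
        digits += "0"
    data = bytes.fromhex(digits)
    if len(data) % 2:
        raise ValueError("Identity encoding expects two bytes per code")
    # Each big-endian byte pair is one character code, which is also the CID.
    return [int.from_bytes(data[i:i + 2], "big") for i in range(0, len(data), 2)]

print(identity_codes_to_cids("<07F4>"))  # the string from Tony's PDF
```

If the encoding is anything other than Identity, the codes have to go
through the CMap instead.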

The CMap can either be the name of a predefined CMap (those CMap files
are available at the Adobe link above), or a CMap stream.  Either way,
you'll need to parse the CMap -- the CMap format spec is also on that
web page.
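
As a rough illustration of what that parsing involves, here is a sketch in
Python.  It is not a full CMap parser: it only reads the codespacerange,
cidrange and cidchar sections, ignores usecmap and notdefrange, and sends
unmapped codes to CID 0.  The function names and the sample CMap fragment
are made up for illustration; the values in the fragment are not copied
from any real CMap file.

```python
import re

HEX = r"<([0-9A-Fa-f]+)>"

def parse_cmap(text):
    """Return (codespace, cidranges) read from CMap source text."""
    codespace, cidranges = [], []
    # Codespace ranges are <lo> <hi> pairs; keep them as bytes so that
    # they can be matched byte by byte when splitting the input.
    for block in re.findall(r"begincodespacerange(.*?)endcodespacerange", text, re.S):
        for lo, hi in re.findall(HEX + r"\s*" + HEX, block):
            codespace.append((bytes.fromhex(lo), bytes.fromhex(hi)))
    # CID ranges are <lo> <hi> first_cid triples.
    for block in re.findall(r"begincidrange(.*?)endcidrange", text, re.S):
        for lo, hi, cid in re.findall(HEX + r"\s*" + HEX + r"\s*(\d+)", block):
            cidranges.append((len(lo) // 2, int(lo, 16), int(hi, 16), int(cid)))
    # Single mappings are <code> cid pairs; store them as one-code ranges.
    for block in re.findall(r"begincidchar(.*?)endcidchar", text, re.S):
        for code, cid in re.findall(HEX + r"\s*(\d+)", block):
            cidranges.append((len(code) // 2, int(code, 16), int(code, 16), int(cid)))
    return codespace, cidranges

def decode(data, codespace, cidranges):
    """Split bytes into codes using the codespace, then map each to a CID."""
    cids, pos = [], 0
    while pos < len(data):
        # Find the codespace range whose bytes bracket the next bytes.
        for lo, hi in codespace:
            chunk = data[pos:pos + len(lo)]
            if len(chunk) == len(lo) and all(
                    l <= b <= h for l, b, h in zip(lo, chunk, hi)):
                break
        else:
            raise ValueError(f"no codespace range matches at offset {pos}")
        code = int.from_bytes(chunk, "big")
        cid = 0  # simplification: unmapped codes go to CID 0
        for nbytes, first, last, base in cidranges:
            if nbytes == len(chunk) and first <= code <= last:
                cid = base + (code - first)
                break
        cids.append(cid)
        pos += len(chunk)
    return cids

# Made-up fragment mixing one-byte and two-byte codes.
SAMPLE = """
2 begincodespacerange
<00> <80>
<8140> <9FFC>
endcodespacerange
2 begincidrange
<20> <7E> 1
<889F> <88FC> 1125
endcidrange
"""

codespace, cidranges = parse_cmap(SAMPLE)
print(decode(b"A\x88\x9f", codespace, cidranges))
```

The codespace ranges are what tell you how many bytes the next code
occupies, which is why they are parsed separately from the CID mappings.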

To summarize:
* get the character code from the PDF content stream
* map the character code to CID, using a CMap
* use the CID to get a glyph from the font

If you want to map to Unicode (rather than drawing a character from the
font), replace that last step with: map the CID to Unicode, using the
character collection data.
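
The steps above, with the Unicode variant of the last step, might be
sketched like this in Python.  The table is a one-entry stand-in for the
Adobe-Japan1 character collection data; its single entry is the value Tony
reports for his example.  The names are made up, and the code assumes an
Identity encoding, which the thread does not confirm:

```python
# Stand-in for the Adobe-Japan1 CID -> Unicode data.  A real table would be
# loaded from the character collection files, not written by hand.
ADOBE_JAPAN1_TO_UNICODE = {0x07F4: 0x9AD8}  # entry taken from Tony's example

def cids_to_text(cids, cid_to_unicode):
    """Map CIDs to a string, using U+FFFD where the table has no entry."""
    return "".join(chr(cid_to_unicode.get(cid, 0xFFFD)) for cid in cids)

# Step 1: character code from the content stream (<07F4> in Tony's PDF).
code = bytes.fromhex("07F4")
# Step 2: code -> CID.  Under an Identity encoding the code is the CID;
# otherwise this is where the CMap lookup would go.
cid = int.from_bytes(code, "big")
# Step 3 (Unicode variant): CID -> Unicode via the collection data.
text = cids_to_text([cid], ADOBE_JAPAN1_TO_UNICODE)
print(f"CID {cid} -> U+{ord(text):04X}")
```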

- Derek


On 2017 Mar 22, tony smith wrote:
> Thanks for the quick reply.
> 
> I've attached an example pdf file, which displays a single Chinese
> character.
> The font details and the string in the pdf data are both readable.
> 
> The pdf string contains the single character <07F4>.
> Also the relevant font has the following entry
>   /CIDSystemInfo << /Registry (Adobe)/Ordering (Japan1)/Supplement 4 >>
> 
> Fontforge has a file which maps these CID values to the equivalent Unicode
> values.
> So in this case the Chinese character is \u9AD8
> But Freetype has no way to work out what the Unicode mapping is for a CFF.
> 
> So if my understanding is correct I need to uncompress the CFF file and
> work out the relevant mapping from the CMap information.
> 
> Thanks for your time, Tony Smith.
> 
> 
> On 22/03/17 14:08, tony smith wrote:
>> Hello
>>
>> I'm using Freetype to display fonts embedded in pdf data.
>>
>> My problem seems to be with an embedded CFF file, where the CID
>> mapping is only defined in the pdf data.
>> I load a Compact Font Format (CFF) stream into memory.
>> But when I try to select a character map, using FT_Select_CharMap, I
>> get error 0x26, Invalid_CharMap_Handle.
>> I've attached the CFF, font.cff, and a simple c++ program, cff.c,
>> which highlights the problem.
>>
>> From the pdf file I know that the font is a CID one.
>> So you can work out the mapping from the PDF strings to Unicode strings.
>> But this information doesn't seem to be available in the CFF stream.
>> So my question is how do I handle this situation?
>> I can't see any way in Freetype to add the missing character encoding.
>> Do I have to decompress the CFF stream, which should be a type1 font
>> and then explicitly add the missing character encoding information?
> 
> CFF != compressed type 1. Although the imaging models, and many concepts
> are shared, they are not the same.
> 
> CFF CIDFonts don't contain any encoding information, and there is no set
> glyph ordering, and there is no built-in character ID to glyph index
> mapping (like, for example, a TTF/OTF cmap table).
> 
> To actually use a CIDFont you need to use the CMap (NOTE: not "cmap", but
> "CMap") specified by the PDF - that can either be custom, embedded in the
> PDF, or a named reference to a standard CMap (available from Adobe, but I
> can't remember right now where).
> 
> Parsing the CMap table will give you a mapping from character code to CID,
> which you can then use to pull the glyph from the font. (In Postscript
> terms you "compose" the CIDFont with the CMap to create a usable instance
> of the CIDFont).
> 
> CMaps are defined in a highly restricted variant of Postscript, so should
> be fairly easy to parse.
> 
> Chris
> 
> _______________________________________________
> Freetype mailing list
> [email protected]
> https://lists.nongnu.org/mailman/listinfo/freetype

