Hi,
2012/3/8 Andreas Lehmkuehler <[email protected]>

> Hi,
>
> On 07.03.2012 09:15, Leleu Eric wrote:
>
>> Hi all,
>>
>> I'm currently working on the preflight issue PDFBOX-1236 [1]
>>
>> The error seems to come from the management of the "toUnicode" CMap
>> in a Type0 font.
>>
>> The "toUnicode" CMap overrides the "Encoding" CMap of the font. Due
>> to this behaviour, the preflight validator receives the Unicode value
>> for each character code present in a Text operator instead of the CID
>> value present in the Encoding CMap.
>>
> Can you give me a pointer to where exactly in the preflight code that
> happens?

You can find the Text validation in the
"org.apache.padaf.preflight.contentstream.ConstentStreamWrapper" class.
The method is validText(byte[] string).
We call the font.encode method on the character to find out how many
bytes are used to describe the CID.
Once we have the CID, checkCID on
"org.apache.padaf.preflight.font.CFFType2FontContainer" is called, and an
exception occurs when we look up the glyph ID for this CID.

If I comment out the initialization of the toUnicode map, I find the
right glyphs. The first one is the 'W', glyph 58, linked to CID 1.
(If I extract the font and open it with fontforge, glyph 58 is the 'W'
too.)

>> So I have two questions:
>> - Is the "Encoding overriding" the right thing to do?
>> - Why is the "toUnicode" CMap used to display text? According to my
>> understanding of the PDF Reference v1.7, the toUnicode CMap is used
>> to extract text from a PDF file and to create a text file with
>> Unicode characters. To display the text in a PDF reader, the font
>> content and the Encoding CMap seem to be enough.
>>
> PDFBox uses Graphics2D#drawString and, more recently,
> java.awt.Font#createGlyphVector to render the text. The text has to be
> provided as a Unicode string when calling those methods.
> IMO we have to change that in the long run. It would be better to
> create the glyphs using the font directly instead of converting it to
> an AWT font.

I don't need to render the text in the preflight component; I only check
that the glyph is present and that its width is consistent.
Bypassing the AWT font would be great, but it is a huge amount of work.

>> What is your point of view about these two points?
>>
> Probably we can find a workaround for your issue, but I need some more
> details on how the preflight code works (see above).
>
>> BR,
>> Eric
>>
>> [1] https://issues.apache.org/jira/browse/PDFBOX-1236
>
> BR
> Andreas Lehmkühler

BR
Eric
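
PS: To make the two lookup paths concrete, here is a minimal sketch of the mismatch described above. It does not use the PDFBox or preflight API; the class, the method names and the three tables are hypothetical. The only value taken from the issue is "CID 1 maps to glyph 58 ('W')"; the character code 0x0001 is an assumption made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the font tables; not the PDFBox/preflight API.
class CidLookupSketch {
    // Encoding CMap: character code -> CID (code 0x0001 is assumed)
    static final Map<Integer, Integer> ENCODING_CMAP = new HashMap<>();
    // ToUnicode CMap: character code -> Unicode text
    static final Map<Integer, String> TO_UNICODE_CMAP = new HashMap<>();
    // Font program: CID -> glyph ID (CID 1 -> glyph 58 is from the issue)
    static final Map<Integer, Integer> CID_TO_GID = new HashMap<>();

    static {
        ENCODING_CMAP.put(0x0001, 1);
        TO_UNICODE_CMAP.put(0x0001, "W");
        CID_TO_GID.put(1, 58);
    }

    // Path the validator needs: character code -> CID -> glyph ID.
    static Integer glyphForValidation(int charCode) {
        Integer cid = ENCODING_CMAP.get(charCode);
        return cid == null ? null : CID_TO_GID.get(cid);
    }

    // Faulty path: the Unicode value (0x57 for 'W') is treated as a CID.
    static Integer glyphViaUnicode(int charCode) {
        String text = TO_UNICODE_CMAP.get(charCode);
        return text == null ? null : CID_TO_GID.get(text.codePointAt(0));
    }

    public static void main(String[] args) {
        System.out.println(glyphForValidation(0x0001)); // 58
        System.out.println(glyphViaUnicode(0x0001));    // null: no glyph for "CID" 0x57
    }
}
```

In this sketch the second lookup fails because the font has no entry for CID 0x57, which is the kind of failure I would expect if checkCID is handed a Unicode value instead of a CID.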
