[ https://issues.apache.org/jira/browse/PDFBOX-1137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-1137.
----------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.7.0
         Assignee: Andreas Lehmkühler

That's a good point! I added the patch as proposed in revision 1183015.

Thanks!

> PDSimpleFont.determineEncoding will never parse embedded CMAPs
> --------------------------------------------------------------
>
>                 Key: PDFBOX-1137
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1137
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.6.0
>            Reporter: Antoni Mylka
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.7.0
>
>         Attachments: pdfbox-1137.patch
>
>
> The encoding of a PDSimpleFont is determined in determineEncoding. It
> contains a series of ifs. Most notably, at the end there is:
>     if (encoding instanceof COSDictionary) { ... }
>     else if (encoding instanceof COSStream) { ... }
> This is wrong because COSStream is a subclass of COSDictionary: every
> COSStream also matches the first test, so the program will never get
> into the COSStream-specific block, which is responsible for the parsing
> of embedded CMAPs. The solution would be to reverse the order of those
> ifs, so that the subclass is tested first.
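
The ordering problem described in the report can be reproduced in isolation. What follows is a minimal sketch, not the PDFBox source and not the attached patch: Dict and Stream are hypothetical stand-ins for COSDictionary and COSStream, and the method names are invented for illustration. Only the inheritance relationship stated in the report (the stream type extends the dictionary type) is assumed.

```java
// Hypothetical stand-ins for COSDictionary and COSStream; only the
// "Stream extends Dict" relationship from the report is modelled here.
class InstanceofOrderDemo {
    static class Dict {}
    static class Stream extends Dict {}

    // Order described in the report: the superclass is tested first.
    static String classifyBuggy(Object encoding) {
        if (encoding instanceof Dict) {
            return "dictionary";
        } else if (encoding instanceof Stream) {
            // Never reached: every Stream is also a Dict.
            return "stream";
        }
        return "other";
    }

    // Reversed order, as the report proposes: the subclass is tested first.
    static String classifyFixed(Object encoding) {
        if (encoding instanceof Stream) {
            return "stream";
        } else if (encoding instanceof Dict) {
            return "dictionary";
        }
        return "other";
    }

    public static void main(String[] args) {
        System.out.println(classifyBuggy(new Stream())); // prints "dictionary"
        System.out.println(classifyFixed(new Stream())); // prints "stream"
        System.out.println(classifyFixed(new Dict()));   // prints "dictionary"
    }
}
```

With the superclass test first, a Stream instance is classified as a plain dictionary and the stream branch is dead code. Testing the more specific type first lets both branches be reached.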