[
https://issues.apache.org/jira/browse/PDFBOX-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14977261#comment-14977261
]
Tilman Hausherr commented on PDFBOX-3066:
-----------------------------------------
There's this piece of code in PDType1CFont:
{code}
        // extract from Type1 font/substitute
        if (genericFont instanceof EncodedFont)
        {
            //FIXME dead instanceof
            return Type1Encoding.fromFontBox(((EncodedFont) genericFont).getEncoding());
        }
        else
        {
            // default (only happens with TTFs)
            return StandardEncoding.INSTANCE;
        }
{code}
My IDE shows a warning "Dead instanceof. FontBoxFont cannot be assigned to
EncodedFont". However, genericFont here is a CFFType1Font, which implements both
EncodedFont and FontBoxFont, so that branch isn't dead code.
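
To illustrate why the warning is misleading, here is a minimal sketch. The
nested types are hypothetical stand-ins that only borrow the names from above;
they are not the real FontBox or PDFBox classes. It shows that an instanceof
test between two unrelated interfaces can still be true when the concrete class
implements both:

```java
import java.util.Collections;
import java.util.Map;

class DeadInstanceofDemo {
    // Hypothetical stand-ins for the two interfaces; NOT the real FontBox API.
    interface FontBoxFont { String getName(); }
    interface EncodedFont { Map<Integer, String> getEncoding(); }

    // Stand-in for CFFType1Font: implements both interfaces.
    static class CffLikeFont implements FontBoxFont, EncodedFont {
        public String getName() { return "CffLike"; }
        public Map<Integer, String> getEncoding() {
            return Collections.singletonMap(48, "parenright");
        }
    }

    // Stand-in for a font that carries no encoding of its own.
    static class PlainFont implements FontBoxFont {
        public String getName() { return "Plain"; }
    }

    // The declared type is FontBoxFont, which is unrelated to EncodedFont,
    // yet the check depends on the runtime class and can be true.
    static boolean hasOwnEncoding(FontBoxFont font) {
        return font instanceof EncodedFont;
    }

    public static void main(String[] args) {
        System.out.println(hasOwnEncoding(new CffLikeFont())); // true
        System.out.println(hasOwnEncoding(new PlainFont()));   // false
    }
}
```

So the first branch is reachable whenever the object passed in also implements
the second interface, regardless of the declared type.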
The embedded CFFType1Font has an encoding table that says code 48 is
"parenright". For some reason Adobe ignores it and we don't.
> Text getting garbled in this file, was Ok in 1.8
> ------------------------------------------------
>
> Key: PDFBOX-3066
> URL: https://issues.apache.org/jira/browse/PDFBOX-3066
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 2.0.0
> Reporter: Joel Hirsh
> Attachments: PDFBOX-3066-reduced.pdf, garbled.pdf
>
>
> Attached file, PrintTextLocations shows text garbled, like *,%-))’))
> Acrobat copy/paste shows accurate text, and was also fine in 1.8.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)