[
https://issues.apache.org/jira/browse/PDFBOX-897?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934110#action_12934110
]
Martijn Brinkers commented on PDFBOX-897:
-----------------------------------------
What happens is this:
encoding stays null until a line of the form "/Encoding ... array" is found
and initializes it. If that line is not matched, encoding remains null. This
happens, for example, with the line "/Encoding 256 array 0 1 255 {1 index exch
/.notdef put} for", because the line does not end with "array". When a line
that starts with "dup" is then found, a NullPointerException is thrown because
encoding is still null. The patch makes sure that the encoding line is detected
even when "array" appears somewhere in the middle of the line.
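The difference between the two checks can be illustrated with a small
standalone sketch. This is not the actual PDFBox code: the class name
EncodingLineCheck and both helper methods are hypothetical, and the
startsWith("/Encoding") condition is an assumption about how the line is
recognized. Only the endsWith("array") and contains("array") comparisons come
from the issue itself.

```java
// Hypothetical illustration only; not taken from the PDFBox sources.
class EncodingLineCheck {

    // Check as described in the issue before the patch:
    // the line must end with "array".
    static boolean matchesBeforePatch(String line) {
        // startsWith("/Encoding") is an assumption made for this sketch.
        return line.startsWith("/Encoding") && line.endsWith("array");
    }

    // Check as described in the issue after the patch:
    // "array" may appear anywhere in the line.
    static boolean matchesAfterPatch(String line) {
        return line.startsWith("/Encoding") && line.contains("array");
    }

    public static void main(String[] args) {
        String simple = "/Encoding 256 array";
        String inline =
            "/Encoding 256 array 0 1 255 {1 index exch /.notdef put} for";

        System.out.println(matchesBeforePatch(simple)); // prints "true"
        System.out.println(matchesBeforePatch(inline)); // prints "false"
        System.out.println(matchesAfterPatch(simple));  // prints "true"
        System.out.println(matchesAfterPatch(inline));  // prints "true"
    }
}
```

With the endsWith check the second line is skipped, so encoding is never
initialized before the first "dup" line is processed.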
> NullPointerException PDFFont#getEncodingFromFont with a PDF book because
> Type1Encoding is null
> ----------------------------------------------------------------------------------------------
>
> Key: PDFBOX-897
> URL: https://issues.apache.org/jira/browse/PDFBOX-897
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.3.1
> Reporter: Martijn Brinkers
> Attachments: PDFBOX-897.patch
>
>
> A NullPointerException was thrown while extracting text from a PDF ebook. The
> exception was thrown at this line in PDFFont#getEncodingFromFont:
> [snip]
> encoding.addCharacterEncoding(index, name.replace("/", ""));
> [snip]
> encoding was null. The line that was scanned was "/Encoding 256 array 0 1 255
> {1 index exch /.notdef put} for". The array check, however, only tests
> line.endsWith("array"). The NPE went away when line.contains("array") was
> used instead.
> I have added a patch. The PDF is a copyrighted book, so it cannot be
> attached as an example. The metadata of the document was:
> Acrobat Distiller 7.0 (Windows)
> PScript5.dll Version 5.2.2
> PDF-1.6