Alex created PDFBOX-1535:
----------------------------
Summary: Extracting text from a PDF causes a NullPointerException in
PDFStreamEngine.processEncodedText method
Key: PDFBOX-1535
URL: https://issues.apache.org/jira/browse/PDFBOX-1535
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 1.7.1
Environment: jdk 1.7_17
Reporter: Alex
Priority: Critical
Attachments: 1.pdf

pdftotext.exe from xpdfbin-win-3.03 works fine with this PDF file.
I also tried PDFBox version 1.2.1, but got the same error.

[org.apache.pdfbox.util.PDFStreamEngine] java.lang.NullPointerException
java.lang.NullPointerException
    at org.apache.pdfbox.util.PDFStreamEngine.processEncodedText(PDFStreamEngine.java:357)
    at org.apache.pdfbox.util.operator.ShowTextGlyph.process(ShowTextGlyph.java:62)
    at org.apache.pdfbox.util.PDFStreamEngine.processOperator(PDFStreamEngine.java:556)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:270)
    at org.apache.pdfbox.util.PDFStreamEngine.processSubStream(PDFStreamEngine.java:237)
    at org.apache.pdfbox.util.PDFStreamEngine.processStream(PDFStreamEngine.java:217)
    at org.apache.pdfbox.util.PDFTextStripper.processPage(PDFTextStripper.java:448)
    at org.apache.pdfbox.util.PDFTextStripper.processPages(PDFTextStripper.java:372)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:328)