[ https://issues.apache.org/jira/browse/PDFBOX-3536?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jay Askren updated PDFBOX-3536:
-------------------------------
Attachment: UnknownDirObject.pdf
We are getting a similar error with the attached PDF:
java.io.IOException: Unknown dir object c=')' cInt=41 peek=')' peekInt=41 at offset 6313406
    at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:954)
    at org.apache.pdfbox.pdfparser.BaseParser.parseCOSArray(BaseParser.java:654)
    at org.apache.pdfbox.pdfparser.PDFStreamParser.parseNextToken(PDFStreamParser.java:175)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStreamOperators(PDFStreamEngine.java:502)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processStream(PDFStreamEngine.java:469)
    at org.apache.pdfbox.contentstream.PDFStreamEngine.processPage(PDFStreamEngine.java:150)
    at org.apache.pdfbox.text.LegacyPDFStreamEngine.processPage(LegacyPDFStreamEngine.java:139)
    at org.apache.pdfbox.text.PDFTextStripper.processPage(PDFTextStripper.java:391)
    at org.apache.pdfbox.text.PDFTextStripper.processPages(PDFTextStripper.java:319)
    at org.apache.pdfbox.text.PDFTextStripper.writeText(PDFTextStripper.java:266)
    .
    .
    .
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:748)
Running outputter XRIF 2.0 Outputter
> IOException "Invalid dictionary, found: 'r' but expected: '/' at offset 1148"
> on a valid PDF
> --------------------------------------------------------------------------------------------
>
> Key: PDFBOX-3536
> URL: https://issues.apache.org/jira/browse/PDFBOX-3536
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.3
> Environment: Windows 7 x64, JVM 1.8.0_101
> Reporter: Seva Alekseyev
> Attachments: resulprovao.pdf, UnknownDirObject.pdf
>
>
> On the attached file, which loads fine with Adobe Reader, the
> PDDocument.load() method throws the following error:
> java.io.IOException: Unknown dir object c='>' cInt=62 peek='>' peekInt=62 at offset 1196
>     at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:982)
>     at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryValue(BaseParser.java:153)
>     at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionaryNameValuePair(BaseParser.java:277)
>     at org.apache.pdfbox.pdfparser.BaseParser.parseCOSDictionary(BaseParser.java:210)
>     at org.apache.pdfbox.pdfparser.BaseParser.parseDirObject(BaseParser.java:885)
>     at org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:757)
>     at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:726)
>     at org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:657)
>     at org.apache.pdfbox.pdfparser.COSParser.parseTrailerValuesDynamically(COSParser.java:2092)
>     at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:203)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:252)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:957)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:913)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:861)
>     at Temp.PDFTemp.App.main(App.java:19)
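
One way to see what the parser tripped over is to dump the raw bytes around the reported offset (1196 in the trace above). The following is only a sketch using the JDK, not part of PDFBox: the class name OffsetContext and its methods are hypothetical, and it assumes the reported offset is a byte position in the PDF file itself. That seems plausible for errors raised during PDDocument.load(), but it may not hold for errors raised while parsing a decoded content stream, as in the PDFTextStripper trace.

```java
import java.io.IOException;
import java.io.RandomAccessFile;

// Hypothetical diagnostic helper (not a PDFBox class): shows the bytes
// surrounding a parser-reported offset so the offending token is visible.
class OffsetContext {

    // Renders data[from, to) as text; non-printable bytes become '.'.
    static String printable(byte[] data, int from, int to) {
        StringBuilder sb = new StringBuilder();
        for (int i = from; i < to; i++) {
            int b = data[i] & 0xFF;
            sb.append(b >= 0x20 && b < 0x7F ? (char) b : '.');
        }
        return sb.toString();
    }

    // Returns up to `radius` bytes on each side of `offset`,
    // with the byte at `offset` wrapped in [ ].
    static String around(byte[] data, int offset, int radius) {
        if (offset < 0 || offset >= data.length) {
            throw new IllegalArgumentException("offset out of range: " + offset);
        }
        int from = Math.max(0, offset - radius);
        int to = Math.min(data.length, offset + radius + 1);
        return printable(data, from, offset)
                + "[" + printable(data, offset, offset + 1) + "]"
                + printable(data, offset + 1, to);
    }

    // Same as above, but reads only the needed window from a file.
    static String around(String path, long offset, int radius) throws IOException {
        try (RandomAccessFile file = new RandomAccessFile(path, "r")) {
            if (offset < 0 || offset >= file.length()) {
                throw new IllegalArgumentException("offset out of range: " + offset);
            }
            long from = Math.max(0, offset - radius);
            long to = Math.min(file.length(), offset + radius + 1);
            byte[] window = new byte[(int) (to - from)];
            file.seek(from);
            file.readFully(window);
            return around(window, (int) (offset - from), radius);
        }
    }

    // Usage: java OffsetContext <file.pdf> <offset>
    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java OffsetContext <file> <offset>");
            return;
        }
        System.out.println(around(args[0], Long.parseLong(args[1]), 40));
    }
}
```

Running it as "java OffsetContext resulprovao.pdf 1196" would print roughly 40 bytes on each side of that position, which may help show whether the dictionary at that location is malformed or merely unusual.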
--
This message was sent by Atlassian JIRA (v6.3.15#6346)