[ 
https://issues.apache.org/jira/browse/PDFBOX-4052?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler resolved PDFBOX-4052.
----------------------------------------
       Resolution: Fixed
    Fix Version/s: 2.0.9

The parser doesn't stumble over the garbage at the beginning of the PDF; the 
problem is that it fails to find the object when performing the brute-force 
search. The parts of the object identifier aren't separated by spaces 
(1 0 obj) but by newlines. I've adjusted the brute-force search and now 
everything looks fine.
PDFBOX-4049 still doesn't work; something else is wrong there besides the 
garbage at the beginning/end of the file.
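
To illustrate the kind of scan described above, here is a minimal sketch of a brute-force search that accepts any PDF whitespace (space, tab, CR, LF, form feed, NUL) between the object number, the generation number and the "obj" keyword. It is not the actual PDFBox code: the class name BruteForceObjectScan, the method findObjectOffsets and the regex-based approach are assumptions made only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
public class BruteForceObjectScan {

    // "<object number> <generation> obj", where the tokens may be separated
    // by any run of PDF whitespace, not only by single spaces.
    private static final Pattern OBJ_HEADER = Pattern.compile(
            "(?<![0-9])(\\d{1,10})[\\x00\\t\\n\\f\\r ]+(\\d{1,5})[\\x00\\t\\n\\f\\r ]+obj(?![A-Za-z0-9])");

    // Returns a map from "objNumber generation" to the byte offset at which
    // the object number starts. Later occurrences overwrite earlier ones.
    public static Map<String, Integer> findObjectOffsets(byte[] data) {
        // ISO-8859-1 maps every byte to exactly one char, so a char index
        // in the decoded string equals the byte offset in the file.
        String text = new String(data, StandardCharsets.ISO_8859_1);
        Map<String, Integer> offsets = new LinkedHashMap<>();
        Matcher m = OBJ_HEADER.matcher(text);
        while (m.find()) {
            String key = Long.parseLong(m.group(1)) + " " + Integer.parseInt(m.group(2));
            offsets.put(key, m.start(1));
        }
        return offsets;
    }
}
```

With this pattern, an identifier written as "1", "0" and "obj" on three separate lines is found in the same way as "1 0 obj", and leading garbage such as '------------06836305' is skipped because no "obj" keyword follows it.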

> Number '------------06836305' is getting too long, stop reading at offset 36
> ----------------------------------------------------------------------------
>
>                 Key: PDFBOX-4052
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4052
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.8
>         Environment: Windows 10
>            Reporter: savan patel
>            Assignee: Andreas Lehmkühler
>            Priority: Major
>             Fix For: 2.0.9
>
>         Attachments: b14f8bd0-e9d9-4d0c-97b8-2bad2c20e250.pdf
>
>
> Bug in parsing the pdf.
> {code}
> java.io.IOException: Number '------------06836305' is getting too long, stop reading at offset 36
>     org.apache.pdfbox.pdfparser.BaseParser.readStringNumber(BaseParser.java:1388)
>     org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1349)
>     org.apache.pdfbox.pdfparser.BaseParser.readObjectNumber(BaseParser.java:1286)
>     org.apache.pdfbox.pdfparser.COSParser.parseFileObject(COSParser.java:822)
>     org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:804)
>     org.apache.pdfbox.pdfparser.COSParser.parseObjectDynamically(COSParser.java:743)
>     org.apache.pdfbox.pdfparser.COSParser.parseTrailerValuesDynamically(COSParser.java:2676)
>     org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:193)
>     org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:240)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
