[ 
https://issues.apache.org/jira/browse/PDFBOX-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076557#comment-15076557
 ] 

Tilman Hausherr commented on PDFBOX-3179:
-----------------------------------------

That PDF file is broken. Open it in Notepad++ and you'll see cross-reference
table lines like this:
{code}
0000000016 00000 n
0000000944 00000 n
0000001087 00000 n
0000001423 00000 n
0000001743 00000 n
0000001882 00000 n
0000001995 00000 n
0000002106 00000 n
0000002175 00000 n
0000002255 00000 n
0000003075 00000 n
0000003351 00000 n
0000003520 00000 n
0000003545 00000 n
0000000777 00000 n
0000000616 00000 n
{code}
Now press CTRL-G and choose "go to position" (not "go to line"). Then enter one
of the values from the left column, e.g. 16. That should place the cursor right
before "15 0 obj", but it doesn't. Either the software that created the PDF file
is buggy (which I doubt, it is Adobe), or you (or your client) transferred the
file in ASCII mode over FTP or another communication method, which rewrites line
endings and so shifts the byte offsets.
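
The manual check above can also be scripted. The following is only a rough
sketch, not part of PDFBox: the class and method names (XrefOffsetCheck,
pointsAtObject, countBadOffsets) are made up for illustration. It assumes the
file has a classic cross-reference table with entries like the ones shown, and
it simply scans the whole file for such entries, so it could be fooled by
similar-looking bytes inside a stream.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not a PDFBox class.
final class XrefOffsetCheck {
    // An indirect object header such as "15 0 obj".
    private static final Pattern OBJ_HEADER = Pattern.compile("\\d+\\s+\\d+\\s+obj");
    // An in-use entry of a classic xref table: 10-digit offset, 5-digit generation, 'n'.
    private static final Pattern XREF_ENTRY = Pattern.compile("(\\d{10}) \\d{5} n");

    /** Returns true if an object header starts exactly at the given byte offset. */
    public static boolean pointsAtObject(byte[] pdf, long offset) {
        if (offset < 0 || offset >= pdf.length) {
            return false;
        }
        int start = (int) offset;
        int length = Math.min(32, pdf.length - start);
        // ISO-8859-1 maps every byte to exactly one char, so offsets are preserved.
        String window = new String(pdf, start, length, StandardCharsets.ISO_8859_1);
        return OBJ_HEADER.matcher(window).lookingAt();
    }

    /** Counts in-use xref entries whose offset does not land on an object header. */
    public static int countBadOffsets(byte[] pdf) {
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        Matcher entry = XREF_ENTRY.matcher(text);
        int bad = 0;
        while (entry.find()) {
            if (!pointsAtObject(pdf, Long.parseLong(entry.group(1)))) {
                bad++;
            }
        }
        return bad;
    }

    public static void main(String[] args) throws IOException {
        byte[] pdf = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("xref entries not pointing at an object: " + countBadOffsets(pdf));
    }
}
```

Run it with the path to the PDF as the only argument. If most entries are
reported as bad, the offsets have probably been shifted by a line-ending
conversion somewhere along the way.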

PDFBox can handle a lot of corrupt files, but obviously not this one.

> PDDocument.load() Error: Expected a long type at offset 2, instead got 
> 'DF-1.4'
> -------------------------------------------------------------------------------
>
>                 Key: PDFBOX-3179
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3179
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 2.0.0
>         Environment: Mac OS X 10.10.5
>            Reporter: Jedrzej Majko
>
> Simple PDDocument.load failed to heal attached PDF (pdfbox 2.0.0 RC2):
> Exception in thread "main" java.io.IOException: Error: Expected a long type at offset 2, instead got 'DF-1.4'
>       at org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1340)
>       at org.apache.pdfbox.pdfparser.BaseParser.readObjectNumber(BaseParser.java:1268)
>       at org.apache.pdfbox.pdfparser.COSParser.parseXrefObjStream(COSParser.java:321)
>       at org.apache.pdfbox.pdfparser.COSParser.parseXref(COSParser.java:287)
>       at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:189)
>       at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:246)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:855)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:811)
>       at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:765)
> Reproduced using 2.0.0 RC2 from maven and with code from trunk svn.
> File in question:
> http://coobers.com/bucket/ikona_free.pdf


