[
https://issues.apache.org/jira/browse/PDFBOX-3179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15076557#comment-15076557
]
Tilman Hausherr commented on PDFBOX-3179:
-----------------------------------------
That PDF file is broken. Open it with Notepad++ and you'll see lines like this:
{code}
0000000016 00000 n
0000000944 00000 n
0000001087 00000 n
0000001423 00000 n
0000001743 00000 n
0000001882 00000 n
0000001995 00000 n
0000002106 00000 n
0000002175 00000 n
0000002255 00000 n
0000003075 00000 n
0000003351 00000 n
0000003520 00000 n
0000003545 00000 n
0000000777 00000 n
0000000616 00000 n
{code}
Now press CTRL-G and choose "go to position" (not "go to line"). Then enter one
of the values in the left column, e.g. 16. That should place the cursor right
before "15 0 obj", but it doesn't. Either the software that created the PDF
file is buggy (which I doubt, since it is Adobe), or you (or your client)
transferred the file in ASCII mode over FTP or another communication method.
PDFBox can handle a lot of corrupt files, but obviously not this one.
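
The manual check described above can also be scripted. The following is only a
minimal sketch, not part of PDFBox: the class name XrefOffsetCheck and the
method pointsAtObject are hypothetical, and it assumes a classic xref table
whose "n" entries hold the byte offset of an "N G obj" header. It reads the
file as raw bytes and reports whether each given offset lands on such a header.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not a PDFBox class: checks xref offsets by hand.
public class XrefOffsetCheck {

    /**
     * Returns true if the bytes starting at the given offset look like an
     * object header of the form "<number> <generation> obj".
     */
    public static boolean pointsAtObject(byte[] pdf, long offset) {
        if (offset < 0 || offset >= pdf.length) {
            return false;
        }
        int start = (int) offset;
        // Only a short window is needed to recognise the header.
        int length = Math.min(32, pdf.length - start);
        // ISO-8859-1 maps every byte to exactly one char, so offsets are preserved.
        String head = new String(pdf, start, length, StandardCharsets.ISO_8859_1);
        return head.matches("(?s)\\d+ \\d+ obj.*");
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: XrefOffsetCheck <file.pdf> <offset> [<offset> ...]");
            return;
        }
        byte[] pdf = Files.readAllBytes(Paths.get(args[0]));
        for (int i = 1; i < args.length; i++) {
            long offset = Long.parseLong(args[i]);
            String verdict = pointsAtObject(pdf, offset) ? "ok" : "MISMATCH";
            System.out.println(offset + " -> " + verdict);
        }
    }
}
```

Run it with the file name followed by values from the left column, e.g.
"java XrefOffsetCheck ikona_free.pdf 16 944 1087". If line endings were
rewritten during transfer, the offsets would be expected to drift and show up
as mismatches.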
> PDDocument.load() Error: Expected a long type at offset 2, instead got
> 'DF-1.4'
> -------------------------------------------------------------------------------
>
> Key: PDFBOX-3179
> URL: https://issues.apache.org/jira/browse/PDFBOX-3179
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.0
> Environment: Mac OS X 10.10.5
> Reporter: Jedrzej Majko
>
> Simple PDDocument.load failed to heal attached PDF (pdfbox 2.0.0 RC2):
> Exception in thread "main" java.io.IOException: Error: Expected a long type at offset 2, instead got 'DF-1.4'
>     at org.apache.pdfbox.pdfparser.BaseParser.readLong(BaseParser.java:1340)
>     at org.apache.pdfbox.pdfparser.BaseParser.readObjectNumber(BaseParser.java:1268)
>     at org.apache.pdfbox.pdfparser.COSParser.parseXrefObjStream(COSParser.java:321)
>     at org.apache.pdfbox.pdfparser.COSParser.parseXref(COSParser.java:287)
>     at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:189)
>     at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:246)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:855)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:811)
>     at org.apache.pdfbox.pdmodel.PDDocument.load(PDDocument.java:765)
> Reproduced using 2.0.0 RC2 from maven and with code from trunk svn.
> File in question:
> http://coobers.com/bucket/ikona_free.pdf