[
https://issues.apache.org/jira/browse/PDFBOX-2733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14385378#comment-14385378
]
Tilman Hausherr edited comment on PDFBOX-2733 at 3/28/15 7:19 PM:
------------------------------------------------------------------
The file is malformed. Here's an excerpt of the validation from PDF-Tools (this
is for PDF/A-1b, I've deleted the parts only relevant to PDF/A):
{quote}
Validating file "PDFBOX-2733.pdf" for conformance level pdfa-1b
The 'xref' keyword was not found or the xref table is malformed.
The file trailer dictionary is missing or invalid.
The comment, classifying the file as containing 8-bit binary data, is missing.
The file trailer dictionary must have an id key.
The file format (header, trailer, objects, xref, streams) is corrupted.
{quote}
One of the causes is this:
{code}
<< /Prev 0 /Root 5 0 R /Size 6 >>
{code}
Definition of Prev in the PDF specification:
{quote}
The byte offset from the beginning of the file to the beginning of the previous
cross-reference section.
{quote}
So a value of 0 makes no sense: offset 0 is where the file header starts, not
a cross-reference section. Adobe Reader also offers to save the file when
closing, which it does when the file is broken.
I'll test a small fix. But if you can, you should return the scanner to the
seller :-)
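For illustration, the kind of sanity check described above could be sketched as follows. This is not the PDFBox fix itself: the class PrevOffsetCheck and its methods findPrev and isPlausiblePrev are hypothetical names, the trailer dictionary is treated as plain text, and the rule "Prev must be greater than 0 and smaller than the file length" is an assumption derived from the definition quoted above.

```java
import java.util.OptionalLong;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox: checks whether a /Prev value
// found in a trailer dictionary can point at a cross-reference section.
public class PrevOffsetCheck {

    // Matches "/Prev <integer>"; capped at 18 digits so parseLong cannot overflow.
    private static final Pattern PREV = Pattern.compile("/Prev\\s+(\\d{1,18})");

    // Returns the /Prev value if the dictionary text contains one.
    public static OptionalLong findPrev(String trailerDict) {
        Matcher m = PREV.matcher(trailerDict);
        return m.find() ? OptionalLong.of(Long.parseLong(m.group(1)))
                        : OptionalLong.empty();
    }

    // Assumption: offset 0 holds the file header, so a usable /Prev must be
    // strictly positive and must lie inside the file.
    public static boolean isPlausiblePrev(long prev, long fileLength) {
        return prev > 0 && prev < fileLength;
    }

    public static void main(String[] args) {
        String trailer = "<< /Prev 0 /Root 5 0 R /Size 6 >>";
        long fileLength = 100_000L; // made-up length, for the demo only
        OptionalLong prev = findPrev(trailer);
        if (prev.isPresent() && !isPlausiblePrev(prev.getAsLong(), fileLength)) {
            System.out.println("Ignoring implausible /Prev " + prev.getAsLong());
        }
    }
}
```

A lenient parser could skip the previous cross-reference section when this check fails instead of trying to read one at that offset.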
> Nullpointer exception in PDFXrefStreamParser.parse
> --------------------------------------------------
>
> Key: PDFBOX-2733
> URL: https://issues.apache.org/jira/browse/PDFBOX-2733
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.8.9, 1.8.10, 2.0.0
> Environment: windows 7
> Reporter: jerome girardini
> Attachments: scan-canon-windows8.pdf
>
>
> With some PDFs, a NullPointerException is thrown during parsing.
> +{quote}
> Here is the trace:
> Caused by: java.lang.NullPointerException
> at org.apache.pdfbox.pdfparser.PDFXrefStreamParser.parse(PDFXrefStreamParser.java:91)
> at org.apache.pdfbox.pdfparser.COSParser.parseXrefStream(COSParser.java:1836)
> at org.apache.pdfbox.pdfparser.COSParser.parseXrefObjStream(COSParser.java:320)
> at org.apache.pdfbox.pdfparser.COSParser.parseXref(COSParser.java:280)
> at org.apache.pdfbox.pdfparser.PDFParser.initialParse(PDFParser.java:314)
> at org.apache.pdfbox.pdfparser.PDFParser.parse(PDFParser.java:373)
> at ch.ge.afc.ael.commun.piecejointe.UtiPdf.loadDocument(UtiPdf.java:439)
> {quote}+