PDFParser fails if object/xref starts at same line as endobj of a stream object
-------------------------------------------------------------------------------
Key: PDFBOX-818
URL: https://issues.apache.org/jira/browse/PDFBOX-818
Project: PDFBox
Issue Type: Bug
Components: Parsing
Affects Versions: 1.3.0
Reporter: Timo Boehme
If an object or xref starts on the same line as the 'endobj' token, directly
after it, and the closed object contains a stream, parsing of the next object
fails.
Example:
endstream
endobj xref
0 26
In PDFParser, if an object contains a stream, the 'endobj' token is read via
readLine(), so the line break is consumed as well. The 'endobj' followed by
another command is handled, but only 'xref' is pushed back and not the line
break. The next read therefore sees 'xref0' when trying to parse the next
object. A simple solution for this case is to push back a space byte before
the 'xref'.
I will add a patch for it.
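The effect can be reproduced outside of PDFBox with a plain
java.io.PushbackInputStream. The following is only a minimal sketch of the
mechanism, not the PDFParser code or the actual patch: the class name
EndobjPushbackSketch and the helpers readLine, readRest and restAfterEndobj
are hypothetical and exist just for this illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; it does not use any PDFBox classes.
class EndobjPushbackSketch {

    // Reads up to the next '\n' and consumes the line break as well,
    // which is the behaviour described for readLine() in this report.
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Reads everything that is still available in the stream.
    static String readRest(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Returns what the next read sees after 'endobj xref' has been handled.
    public static String restAfterEndobj(boolean pushBackSeparator) {
        byte[] data = "endobj xref\n0 26".getBytes(StandardCharsets.ISO_8859_1);
        try (PushbackInputStream in =
                 new PushbackInputStream(new ByteArrayInputStream(data), 16)) {
            String line = readLine(in);                                  // "endobj xref"
            String trailing = line.substring("endobj".length()).trim(); // "xref"
            if (pushBackSeparator) {
                // Unread the separator first so that it ends up after 'xref'.
                in.unread(' ');
            }
            in.unread(trailing.getBytes(StandardCharsets.ISO_8859_1));
            return readRest(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(restAfterEndobj(false)); // prints "xref0 26"
        System.out.println(restAfterEndobj(true));  // prints "xref 0 26"
    }
}
```

Without the separator the keyword and the first number of the xref section
run together; with the extra space they are read as separate tokens again.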
Part of the problem can be seen in the PDF from
http://onlinelibrary.wiley.com/doi/10.1111/j.1399-6576.2009.02134.x/pdf at the
last 'endobj'. However, the last object in that file does not contain a
stream, and I was not able to produce a PDF that shows the full problem (the
PDFs I have that contain the problematic construct are unfortunately
confidential).