PDFParser fails if object/xref starts at same line as endobj of a stream object
-------------------------------------------------------------------------------

                 Key: PDFBOX-818
                 URL: https://issues.apache.org/jira/browse/PDFBOX-818
             Project: PDFBox
          Issue Type: Bug
          Components: Parsing
    Affects Versions: 1.3.0
            Reporter: Timo Boehme


If an object or xref starts on the same line, directly after the 'endobj'
token, and the closed object contains a stream, parsing of the next object
fails.
Example:
endstream
endobj xref
0 26
In PDFParser, if an object contains a stream, the 'endobj' token is read via
readLine(), so the line break is consumed as well. The case of 'endobj'
followed by another keyword is handled, but only 'xref' is pushed back and not
the line break, which results in 'xref0' when trying to read the next object.
A simple solution in this case is to push back a space byte before pushing
back 'xref', so that the two tokens stay separated.
I will add a patch for it.
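
The effect described above can be reproduced outside PDFBox with a plain
java.io.PushbackInputStream. The following is only a minimal sketch and not the
PDFParser code or the actual patch: the class EndobjPushbackSketch and its
methods readLine, readToken and nextTokenAfterEndobj are hypothetical names
made up for illustration, and the input is assumed to start with 'endobj'.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-in for the parser behaviour described in this issue.
public class EndobjPushbackSketch {

    // Reads up to the next line break and consumes the line break as well.
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n' && c != '\r') {
            sb.append((char) c);
        }
        if (c == '\r') {
            int next = in.read();
            if (next != '\n' && next != -1) {
                in.unread(next);
            }
        }
        return sb.toString();
    }

    // Skips leading whitespace, then reads up to the next whitespace or EOF.
    static String readToken(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && Character.isWhitespace(c)) {
            // skip
        }
        while (c != -1 && !Character.isWhitespace(c)) {
            sb.append((char) c);
            c = in.read();
        }
        return sb.toString();
    }

    // Assumes data starts with "endobj". Reads that line, pushes back whatever
    // followed 'endobj' on it, and returns the next token the reader would see.
    static String nextTokenAfterEndobj(String data, boolean pushBackSeparator)
            throws IOException {
        PushbackInputStream in = new PushbackInputStream(
                new ByteArrayInputStream(data.getBytes(StandardCharsets.ISO_8859_1)), 64);
        String line = readLine(in);                             // "endobj xref"
        String rest = line.substring("endobj".length()).trim(); // "xref"
        if (pushBackSeparator) {
            // Pushed back first, so it ends up after 'xref' in the stream.
            in.unread(' ');
        }
        in.unread(rest.getBytes(StandardCharsets.ISO_8859_1));
        return readToken(in);
    }

    public static void main(String[] args) throws IOException {
        String data = "endobj xref\n0 26\n";
        System.out.println(nextTokenAfterEndobj(data, false)); // prints "xref0"
        System.out.println(nextTokenAfterEndobj(data, true));  // prints "xref"
    }
}
```

Because unread() places bytes in front of everything pushed back earlier, the
space has to be pushed back before 'xref' for it to be read after 'xref'.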
Part of the problem can be seen in the PDF from
http://onlinelibrary.wiley.com/doi/10.1111/j.1399-6576.2009.02134.x/pdf at the
last 'endobj'. However, the last object there does not contain a stream, and I
was not able to produce a PDF that does (the PDFs I have that contain the
described problematic construct are unfortunately confidential).
