[jira] Commented: (PDFBOX-818) PDFParser fails if object/xref starts at same line as endobj of a stream object

Martijn Brinkers (JIRA) Sun, 21 Nov 2010 13:25:40 -0800

    [ 
https://issues.apache.org/jira/browse/PDFBOX-818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934353#action_12934353
 ]


Martijn Brinkers commented on PDFBOX-818:
-----------------------------------------

I have tried to replicate the problem with the PDF from the link, but I can 
extract text from that PDF without any problems.

> PDFParser fails if object/xref starts at same line as endobj of a stream 
> object
> -------------------------------------------------------------------------------
>
>                 Key: PDFBOX-818
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-818
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.3.1
>            Reporter: Timo Boehme
>         Attachments: pdfbox_issue818.patch
>
>
> If an object or xref starts on the same line, right after the 'endobj' token, 
> and the closed object contains a stream, parsing of the next object fails.
> Example:
> endstream
> endobj xref
> 0 26
> In PDFParser, if an object contains a stream, the 'endobj' token is read via 
> readLine(), so the line break is consumed as well. The 'endobj' with the 
> following keyword is then handled, but only 'xref' is pushed back and not the 
> line break, which results in 'xref0' when trying to read the next object. In 
> this case a simple solution is to push back a space byte before the 'xref'.
> I will add a patch for it.
> Part of the problem can be seen in the PDF from 
> http://onlinelibrary.wiley.com/doi/10.1111/j.1399-6576.2009.02134.x/pdf at 
> the last 'endobj'. However, the last object does not contain a stream, and I 
> was not able to produce such a PDF (the PDFs I have that contain the described 
> problematic construct are unfortunately confidential).
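
The push-back behaviour described in the quoted report can be reproduced outside PDFBox with a plain java.io.PushbackInputStream. The following is only a minimal sketch, not the PDFParser code or the attached patch: the class EndobjPushbackSketch and the helpers readLine, readToken and nextTokenAfterEndobj are hypothetical names made up for illustration. It assumes the line holding 'endobj' is read as a whole, and that whatever follows 'endobj' on that line is pushed back, either alone or together with a space byte.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.PushbackInputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone sketch; none of these names come from PDFBox.
class EndobjPushbackSketch {

    // Reads up to the next '\n' and consumes it; returns the line without it.
    static String readLine(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Skips leading whitespace, then reads one whitespace-delimited token.
    static String readToken(PushbackInputStream in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c = in.read();
        while (c != -1 && Character.isWhitespace(c)) {
            c = in.read();
        }
        while (c != -1 && !Character.isWhitespace(c)) {
            sb.append((char) c);
            c = in.read();
        }
        return sb.toString();
    }

    // Reads the "endobj ..." line, pushes back what follows 'endobj',
    // and returns the next token the parser would see.
    static String nextTokenAfterEndobj(String data, boolean pushBackSeparator) {
        try {
            PushbackInputStream in = new PushbackInputStream(
                    new ByteArrayInputStream(data.getBytes(StandardCharsets.ISO_8859_1)), 64);
            String line = readLine(in); // the line break is consumed here
            String rest = line.substring("endobj".length()).trim();
            if (!rest.isEmpty()) {
                if (pushBackSeparator) {
                    // Unread first, so the space ends up AFTER the keyword
                    // in the stream and replaces the consumed line break.
                    in.unread(' ');
                }
                in.unread(rest.getBytes(StandardCharsets.ISO_8859_1));
            }
            return readToken(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String data = "endobj xref\n0 26\n";
        System.out.println(nextTokenAfterEndobj(data, false)); // prints "xref0"
        System.out.println(nextTokenAfterEndobj(data, true));  // prints "xref"
    }
}
```

Because PushbackInputStream returns unread bytes in reverse order of the unread calls, the space has to be pushed back before the keyword so that it is read after it.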

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
