[jira] [Commented] (PDFBOX-1037) PDF with multiple %%EOF only parses one page

Thomas Chojecki (JIRA) Mon, 27 Jun 2011 23:43:08 -0700

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13056342#comment-13056342
 ]


Thomas Chojecki commented on PDFBOX-1037:
-----------------------------------------

If you have checked out the sources, you can try opening PDFParser.java and 
changing the if statement at line 880.

Change
if(document.getXrefTable().containsValue(offset))

to 
if(true)

and then try reading the document again.
If this helps, the xref table is broken: the offsets it records don't match 
the real object positions, so PDFBox skips those objects.
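
To illustrate what "the offsets don't match the real object position" means, 
here is a minimal standalone sketch. It is not PDFBox code; the class name 
XrefOffsetCheck and the method offsetPointsAtObject are made up for this 
example. It assumes the simplest case, where an object starts with exactly 
"<number> <generation> obj" separated by single spaces, and checks whether a 
given xref offset really lands on that header.

```java
import java.nio.charset.StandardCharsets;

public class XrefOffsetCheck {

    /**
     * Returns true if the bytes at the given offset start with
     * "<objNum> <gen> obj", the header of an indirect object.
     * Simplification: assumes single spaces between the tokens.
     */
    public static boolean offsetPointsAtObject(byte[] pdf, long offset, int objNum, int gen) {
        byte[] header = (objNum + " " + gen + " obj").getBytes(StandardCharsets.ISO_8859_1);
        // An offset outside the file can never point at an object.
        if (offset < 0 || offset + header.length > pdf.length) {
            return false;
        }
        for (int i = 0; i < header.length; i++) {
            if (pdf[(int) offset + i] != header[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Tiny made-up fragment for illustration, not a complete PDF.
        byte[] data = "%PDF-1.4\n1 0 obj\n<< >>\nendobj\n"
                .getBytes(StandardCharsets.ISO_8859_1);
        // Object 1 really starts at byte 9, right after the 9-byte header line.
        System.out.println(offsetPointsAtObject(data, 9, 1, 0));
        // A shifted offset, as a broken xref table might contain.
        System.out.println(offsetPointsAtObject(data, 12, 1, 0));
    }
}
```

An xref entry that fails a check like this refers to a position where the 
object does not actually start, which is the kind of mismatch described above.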

> PDF with multiple %%EOF only parses one page
> --------------------------------------------
>
>                 Key: PDFBOX-1037
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1037
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.5.0
>         Environment: Windows XP - Java SE 1.6
>            Reporter: Abraham Farris
>         Attachments: blankpageproblemmod.pdf, blankpageproblemmod.png
>
>
> Any type of page counts (getDocumentCatalog().getPages().getCount()) only 
> return int 1.  Doing a simple .load and .save will strip out all pages after 
> the first.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
