[ 
https://issues.apache.org/jira/browse/PDFBOX-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Thomas Chojecki updated PDFBOX-1037:
------------------------------------

    Attachment: blankpageproblemmod.png

This image shows part of the document opened in a text editor. On Windows, 
Notepad++ is a good choice for opening PDFs.

The first object, 1 0 obj, is the document catalog and the root object of the 
document. This object has a /Pages entry, which tells the parser where to find 
the page structure.

The /Pages entry refers to 5 0 obj. There you can see the page count, and you 
can also see that the count is 1. This one page is specified in 4 0 obj.
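
The chain described above (catalog, then /Pages, then /Count) can be sketched 
in plain Java without PDFBox. This is only an illustration: the SAMPLE string is 
a made-up fragment that reuses the object numbers from the screenshot, the class 
and method names are hypothetical, and the regex lookup is a simplification of 
what a real parser does with the xref table.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class PageTreeSketch {
    // Hypothetical fragment: object numbers follow the comment above,
    // the dictionary contents are assumed for illustration.
    static final String SAMPLE =
        "1 0 obj\n<< /Type /Catalog /Pages 5 0 R >>\nendobj\n" +
        "4 0 obj\n<< /Type /Page /Parent 5 0 R >>\nendobj\n" +
        "5 0 obj\n<< /Type /Pages /Kids [4 0 R] /Count 1 >>\nendobj\n";

    /** Returns the body of the last "num 0 obj" definition, or null if absent. */
    static String objectBody(String pdf, int num) {
        Pattern p = Pattern.compile("^" + num + " 0 obj\\s*(.*?)\\s*endobj",
                                    Pattern.DOTALL | Pattern.MULTILINE);
        Matcher m = p.matcher(pdf);
        String body = null;
        while (m.find()) {
            body = m.group(1); // a later definition replaces an earlier one
        }
        return body;
    }

    /** Follows catalog -> /Pages and returns the /Count value stored there. */
    static int declaredPageCount(String pdf, int catalogNum) {
        String catalog = objectBody(pdf, catalogNum);
        if (catalog == null) throw new IllegalArgumentException("no catalog object");
        Matcher ref = Pattern.compile("/Pages\\s+(\\d+)\\s+0\\s+R").matcher(catalog);
        if (!ref.find()) throw new IllegalArgumentException("no /Pages entry");
        String pages = objectBody(pdf, Integer.parseInt(ref.group(1)));
        if (pages == null) throw new IllegalArgumentException("no page tree object");
        Matcher count = Pattern.compile("/Count\\s+(\\d+)").matcher(pages);
        if (!count.find()) throw new IllegalArgumentException("no /Count entry");
        return Integer.parseInt(count.group(1));
    }
}
```

Calling PageTreeSketch.declaredPageCount(PageTreeSketch.SAMPLE, 1) walks from 
1 0 obj to 5 0 obj and returns the stored /Count value.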


You can modify a PDF without changing the old content by using an incremental 
update. To do that, the new writer needs to write a new document catalog that 
overrides the old one. This was never done in your PDF.

So the PDFBox count function reads the /Count value you can see and returns it 
to you without searching for the real number of pages.
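
To check a file for this kind of mismatch by hand, something like the 
following rough sketch may help. It does not use PDFBox. It only scans the raw 
text for %%EOF markers, /Type /Page objects and the last /Count value. The class 
name, method names and SAMPLE content are hypothetical, and the scan would miss 
anything stored in compressed object streams, so treat the numbers as a hint 
only.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class IncrementalUpdateSketch {
    // Hypothetical file: one original section plus an appended section that
    // adds a page object but no updated catalog or page tree.
    static final String SAMPLE =
        "%PDF-1.4\n" +
        "1 0 obj\n<< /Type /Catalog /Pages 5 0 R >>\nendobj\n" +
        "4 0 obj\n<< /Type /Page /Parent 5 0 R >>\nendobj\n" +
        "5 0 obj\n<< /Type /Pages /Kids [4 0 R] /Count 1 >>\nendobj\n" +
        "%%EOF\n" +
        "6 0 obj\n<< /Type /Page /Parent 5 0 R >>\nendobj\n" +
        "%%EOF\n";

    /** Reads a file byte-for-byte into a String (ISO-8859-1 keeps every byte). */
    static String readPdf(String path) throws IOException {
        return new String(Files.readAllBytes(Paths.get(path)), StandardCharsets.ISO_8859_1);
    }

    static int countMatches(String text, String regex) {
        Matcher m = Pattern.compile(regex).matcher(text);
        int n = 0;
        while (m.find()) n++;
        return n;
    }

    /** More than one %%EOF suggests the file was updated incrementally. */
    static int eofMarkers(String pdf) {
        return countMatches(pdf, "%%EOF");
    }

    /** Counts "/Type /Page" dictionaries, excluding "/Type /Pages". */
    static int pageObjects(String pdf) {
        return countMatches(pdf, "/Type\\s*/Page(?![A-Za-z])");
    }

    /** Returns the last /Count value in the text, or -1 if there is none. */
    static int lastDeclaredCount(String pdf) {
        Matcher m = Pattern.compile("/Count\\s+(\\d+)").matcher(pdf);
        int count = -1;
        while (m.find()) count = Integer.parseInt(m.group(1));
        return count;
    }
}
```

For a real file, pass the result of IncrementalUpdateSketch.readPdf(path) to 
the three methods. If pageObjects is larger than lastDeclaredCount and there 
is more than one %%EOF marker, the appended section probably never updated 
the page tree.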

> PDF with multiple %%EOF only parses one page
> --------------------------------------------
>
>                 Key: PDFBOX-1037
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1037
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.5.0
>         Environment: Windows XP - Java SE 1.6
>            Reporter: Abraham Farris
>         Attachments: blankpageproblemmod.pdf, blankpageproblemmod.png
>
>
> Any type of page counts (getDocumentCatalog().getPages().getCount()) only 
> return int 1.  Doing a simple .load and .save will strip out all pages after 
> the first.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
