[
https://issues.apache.org/jira/browse/PDFBOX-1037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Thomas Chojecki updated PDFBOX-1037:
------------------------------------
Attachment: blankpageproblemmod.png
This image shows part of the document opened in an editor. On Windows,
Notepad++ is a good choice for opening PDFs.
The first object, 1 0 obj, is the document catalog and the root object of the
document. It has a /Pages entry, which tells the parser about the page
structure.
This /Pages entry refers to 5 0 obj. There you can see the page count, and
that the count is 1. This one page is specified in 4 0 obj.
You can modify PDFs without changing the old content by using an incremental
update. For that, the new writer needs to write a new document catalog that
replaces the old one. This was never done in your PDF.
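Each incremental update appends a new section to the file that ends with its
own %%EOF marker, so counting the markers is a quick way to see whether a file
has been updated this way. The following is only a rough sketch: the class and
method names are hypothetical and not part of PDFBox, and the check is a
heuristic, because the byte sequence could also occur inside stream data.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of PDFBox.
class EofMarkerCount {

    // Counts non-overlapping occurrences of "%%EOF" in the raw file bytes.
    static int countEofMarkers(byte[] data) {
        byte[] marker = "%%EOF".getBytes(StandardCharsets.US_ASCII);
        int count = 0;
        for (int i = 0; i + marker.length <= data.length; i++) {
            boolean match = true;
            for (int j = 0; j < marker.length; j++) {
                if (data[i + j] != marker[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                count++;
                i += marker.length - 1; // skip past the marker just found
            }
        }
        return count;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java EofMarkerCount <file.pdf>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        // More than one marker suggests at least one incremental update.
        System.out.println(countEofMarkers(data) + " %%EOF marker(s)");
    }
}
```

A result greater than 1 only says that something was appended; it does not say
whether the appended section contains a new catalog or page tree.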
So the PDFBox page count function reads the count value you can see there and
returns it to you without searching for the actual number of pages.
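To illustrate the difference between trusting the stored count and walking the
page tree, here is a small self-contained sketch. PageTreeNode is a
hypothetical stand-in for a /Pages or /Page dictionary, not a PDFBox class, and
this is not how PDFBox itself is implemented. Walking /Kids only helps when the
stored /Count is stale; pages that are not reachable from the document catalog
at all will not be found this way either.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of a page tree node; not a PDFBox class.
class PageTreeNode {
    final boolean isPage;     // true for /Type /Page, false for /Type /Pages
    final int declaredCount;  // the /Count value as written in the file
    final List<PageTreeNode> kids = new ArrayList<PageTreeNode>();

    PageTreeNode(boolean isPage, int declaredCount) {
        this.isPage = isPage;
        this.declaredCount = declaredCount;
    }

    // A leaf page; leaf pages have no /Count entry.
    static PageTreeNode page() {
        return new PageTreeNode(true, 0);
    }

    // An intermediate /Pages node with its declared /Count and its /Kids.
    static PageTreeNode pages(int declaredCount, PageTreeNode... children) {
        PageTreeNode node = new PageTreeNode(false, declaredCount);
        for (PageTreeNode child : children) {
            node.kids.add(child);
        }
        return node;
    }

    // Walks /Kids recursively and counts the leaf pages actually reachable.
    int countLeafPages() {
        if (isPage) {
            return 1;
        }
        int total = 0;
        for (PageTreeNode kid : kids) {
            total += kid.countLeafPages();
        }
        return total;
    }
}
```

For a node built as PageTreeNode.pages(1, PageTreeNode.page(),
PageTreeNode.page()), declaredCount holds the stored value while
countLeafPages() counts the two kids that are actually listed.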
> PDF with multiple %%EOF only parses one page
> --------------------------------------------
>
> Key: PDFBOX-1037
> URL: https://issues.apache.org/jira/browse/PDFBOX-1037
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.5.0
> Environment: Windows XP - Java SE 1.6
> Reporter: Abraham Farris
> Attachments: blankpageproblemmod.pdf, blankpageproblemmod.png
>
>
> Any type of page counts (getDocumentCatalog().getPages().getCount()) only
> return int 1. Doing a simple .load and .save will strip out all pages after
> the first.
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira