[ https://issues.apache.org/jira/browse/PDFBOX-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539547#comment-13539547 ]
Andreas Lehmkühler commented on PDFBOX-1467:
--------------------------------------------
I can't reproduce the issue. Everything works fine for me.
Can you be more specific? Are you using your own code, or some of the PDFBox
tools like ExtractText?
> PDDocumentCatalog.getAllPages returns empty list for certain pdfs, affects
> many other methods as well
> ----------------------------------------------------------------------------------------------------
>
> Key: PDFBOX-1467
> URL: https://issues.apache.org/jira/browse/PDFBOX-1467
> Project: PDFBox
> Issue Type: Bug
> Affects Versions: 1.7.1
> Reporter: Peter Lehto
> Priority: Critical
> Attachments: cd17ac7f-675c-4cc8-859b-5bd9d509cb1a.pdf
>
>
> Originally found in PageExtractor. After some debugging, it seems that
> PDDocumentCatalog.getAllPages returns an empty list for certain PDFs.
> PDDocument.getNumberOfPages also returns 0, because it relies on the catalog
> for the page count. The cause traces all the way down to
> COSDictionary.getDictionaryObject, which returns null for COSName.PAGES.
> As a result, everything that depends on page numbers fails, for example
> saving the document to a stream.
> The problem occurs only with certain PDF documents. I suspect they have a
> different structure or different header information, or possibly even a
> corrupted header. Other PDF files do not show the problem. The affected
> files still open and work in other software such as Adobe Reader.
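
The structural suspicion in the description can be checked without PDFBox.
The sketch below is a hypothetical triage helper (the class PdfStructureProbe
and its methods are illustrative names, not part of PDFBox). It reads the raw
bytes of a file, reports the "%PDF-x.y" header version, and looks for the
plain-text tokens "/Pages" and "/ObjStm". It is a rough heuristic only: a
substring match can produce false positives, and nothing in this thread
confirms what is actually different about the attached file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical triage helper; it is NOT part of PDFBox and parses nothing.
final class PdfStructureProbe {

    // Returns the version from a leading "%PDF-x.y" header, or null if the
    // file does not start with that marker.
    static String headerVersion(byte[] data) {
        int len = Math.min(data.length, 16);
        String head = new String(data, 0, len, StandardCharsets.ISO_8859_1);
        if (!head.startsWith("%PDF-")) {
            return null;
        }
        int end = 5;
        while (end < head.length()
                && (Character.isDigit(head.charAt(end)) || head.charAt(end) == '.')) {
            end++;
        }
        return head.substring(5, end);
    }

    // True if the raw bytes contain the given ASCII token anywhere.
    // ISO-8859-1 maps every byte to one char, so binary data is safe here.
    static boolean containsToken(byte[] data, String token) {
        return new String(data, StandardCharsets.ISO_8859_1).contains(token);
    }

    public static void main(String[] args) throws IOException {
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("header version : " + headerVersion(data));
        System.out.println("plain /Pages   : " + containsToken(data, "/Pages"));
        System.out.println("object streams : " + containsToken(data, "/ObjStm"));
    }
}
```

If a file shows no plain "/Pages" token but does contain "/ObjStm", one
possible explanation (an assumption, not a diagnosis) is that the catalog or
page tree is stored inside a compressed object stream, which would be worth
mentioning in the report alongside the code used to load the document.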