[jira] [Commented] (PDFBOX-1467) PDocumentCatalog.getAllPages returns empty list for certain pdfs, affects many other methods as well

JIRA Wed, 26 Dec 2012 06:36:17 -0800

    [ 
https://issues.apache.org/jira/browse/PDFBOX-1467?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539547#comment-13539547
 ]


Andreas Lehmkühler commented on PDFBOX-1467:
--------------------------------------------

I can't reproduce the issue. Everything works fine for me.

Can you be more specific? Are you using your own code, or one of the PDFBox 
tools such as ExtractText?
                
> PDocumentCatalog.getAllPages returns empty list for certain pdfs, affects 
> many other methods as well
> ----------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-1467
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1467
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 1.7.1
>            Reporter: Peter Lehto
>            Priority: Critical
>         Attachments: cd17ac7f-675c-4cc8-859b-5bd9d509cb1a.pdf
>
>
> Originally found in PageExtractor. After some debugging, it seems that 
> PDDocumentCatalog.getAllPages returns an empty list for certain PDFs. 
> PDDocument.getNumberOfPages also returns 0, because it relies on the catalog 
> for the page count. The failure goes all the way down to 
> COSDictionary.getDictionaryObject, which returns null for COSName.PAGES.
> As a result, everything that has something to do with page numbers fails, 
> for example saving the document to a stream.
> This problem occurs only with certain PDF documents. I suspect they have a 
> different structure or header information, or possibly even a corrupted 
> header. With other PDF files the problem does not occur. The failing PDF 
> files can still be opened and work in other software such as Adobe Reader.
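
One way to test the reporter's suspicion about the header, without involving 
PDFBox at all, is to look at where the "%PDF-" marker sits in the file. The 
sketch below is only an illustration: the class and method names are made up 
for this example and are not part of PDFBox, and the 1024-byte window reflects 
the tolerance that Acrobat viewers are documented as applying to the header 
position. Whether PDFBox 1.7.1 accepts a header at a non-zero offset, and 
whether that is actually the cause here, is not established by this thread.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not a PDFBox class.
public class PdfHeaderCheck {
    // Every PDF header line begins with this marker, e.g. "%PDF-1.4".
    private static final byte[] MARKER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    /**
     * Returns the offset of the first "%PDF-" marker that lies entirely
     * within the first `limit` bytes of `data`, or -1 if there is none.
     */
    public static int headerOffset(byte[] data, int limit) {
        int end = Math.min(data.length, limit) - MARKER.length;
        for (int i = 0; i <= end; i++) {
            boolean match = true;
            for (int j = 0; j < MARKER.length; j++) {
                if (data[i + j] != MARKER[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) throws Exception {
        // args[0] is the path to the PDF under suspicion.
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        int offset = headerOffset(data, 1024);
        if (offset == 0) {
            System.out.println("Header starts at byte 0, as expected.");
        } else if (offset > 0) {
            System.out.println("Header found at offset " + offset
                    + "; leading bytes may confuse a strict parser.");
        } else {
            System.out.println("No %PDF- marker in the first 1024 bytes.");
        }
    }
}
```

Running it against both a working and a failing file (for example 
`java PdfHeaderCheck some-file.pdf`) and comparing the reported offsets would 
show whether the header position differs between the two groups.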

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
