[ 
https://issues.apache.org/jira/browse/PDFBOX-2715?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andreas Lehmkühler closed PDFBOX-2715.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.8.8
                   2.0.0
         Assignee: Andreas Lehmkühler

Both PDFs are malformed. I had a closer look at 
IT-11557_pdf_broken_pages_F150317DYCELZZ.pdf. It contains two duplicated 
objects (16 0 and 17 0), and two objects are missing (36 0 and 53 0). Comparing 
the offsets that the xref table lists for the missing objects with the 
positions of the duplicated objects makes it obvious that the xref entries for 
36 0 and 53 0 point at the duplicates. As both missing objects aren't needed, 
it's safe to skip them. The non-sequential parser already does so, as Maruan 
pointed out.
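
The duplicate-object part of this diagnosis can be roughly reproduced without 
PDFBox. The following is only a sketch under stated assumptions, not what 
either parser does internally: the class name DuplicateObjectScan and the 
method findDuplicates are hypothetical, and the code simply scans the raw bytes 
for "N G obj" headers. It may therefore report false positives inside binary 
streams, and it reads neither the xref table nor object streams.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
public class DuplicateObjectScan {

    // Matches an indirect object header such as "16 0 obj".
    private static final Pattern OBJ_HEADER =
            Pattern.compile("(?<![0-9])(\\d+)\\s+(\\d+)\\s+obj\\b");

    /**
     * Returns, for every "objNr genNr" key whose header occurs more than
     * once, the byte offsets of all occurrences in file order.
     */
    public static Map<String, List<Integer>> findDuplicates(byte[] pdf) {
        // ISO-8859-1 maps each byte to exactly one char, so match
        // positions in the string equal byte offsets in the file.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);

        Map<String, List<Integer>> offsets = new LinkedHashMap<String, List<Integer>>();
        Matcher m = OBJ_HEADER.matcher(text);
        while (m.find()) {
            String key = m.group(1) + " " + m.group(2);
            List<Integer> list = offsets.get(key);
            if (list == null) {
                list = new ArrayList<Integer>();
                offsets.put(key, list);
            }
            list.add(m.start());
        }

        // Keep only the keys that were seen at least twice.
        Map<String, List<Integer>> duplicates = new LinkedHashMap<String, List<Integer>>();
        for (Map.Entry<String, List<Integer>> e : offsets.entrySet()) {
            if (e.getValue().size() > 1) {
                duplicates.put(e.getKey(), e.getValue());
            }
        }
        return duplicates;
    }
}
```

The offsets reported for a duplicated object can then be compared by hand with 
the offsets the xref table lists for the objects that appear to be missing.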

Closed as fixed


> Pages in a PDF being dropped with just an error-log message
> -----------------------------------------------------------
>
>                 Key: PDFBOX-2715
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2715
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 1.8.8
>         Environment: Linux, Java 7. 
>            Reporter: Cecilie Fritzvold
>            Assignee: Andreas Lehmkühler
>             Fix For: 2.0.0, 1.8.8
>
>         Attachments: IT-11557_pdf_broken_pages.pdf, 
> IT-11557_pdf_broken_pages_F150317DYCELZZ.pdf
>
>
> I'm trying to extract pages from PDF documents like this:
> {code}
> PDDocument doc = PDDocument.load(new ByteArrayInputStream(pdf));
> List allPages = doc.getDocumentCatalog().getAllPages();
> {code}
> However, not all pages get read, and the only indication that something is 
> wrong is this error log message:
> {noformat}
> ERROR org.apache.pdfbox.pdmodel.PDPageNode.getAllKids()#202: No Kids found in 
> getAllKids(). Probably a malformed pdf.
> {noformat}
> I'm getting one of these error-lines for each page that isn't read. I'm 
> attaching two different files with this problem. One gives me 4 out of 6 
> pages, and the other gives me none of the 4 pages. Both documents read fine 
> in Acrobat Reader and in Okular where all the pages get shown.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
