[
https://issues.apache.org/jira/browse/PDFBOX-4049?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Andreas Lehmkühler closed PDFBOX-4049.
--------------------------------------
Resolution: Duplicate
The garbage at the beginning of the PDF is the root cause of the issue. PDFBox
detects that all offsets are wrong and triggers a brute-force search. Because
the PDF uses compressed and encrypted streams, the brute-force mechanism isn't
able to repair it; see PDFBOX-4097.
You should ensure that such garbage is removed before parsing the file, so that
the repair mechanism is not triggered.
Closed as a duplicate of PDFBOX-4097.
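
As an illustration of removing such garbage before parsing, here is a minimal
sketch that drops every byte in front of the first "%PDF-" marker. It uses only
the JDK; the class and method names (PdfHeaderTrimmer, indexOfHeader,
stripLeadingGarbage) are hypothetical and not part of PDFBox. It assumes the
offsets in the file were written relative to the "%PDF-" header, which has not
been verified for the attached file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of PDFBox.
final class PdfHeaderTrimmer {
    // Every PDF file is expected to start with this marker.
    private static final byte[] HEADER = "%PDF-".getBytes(StandardCharsets.US_ASCII);

    private PdfHeaderTrimmer() {
    }

    /** Returns the index of the first "%PDF-" marker, or -1 if there is none. */
    static int indexOfHeader(byte[] data) {
        for (int i = 0; i <= data.length - HEADER.length; i++) {
            boolean match = true;
            for (int j = 0; j < HEADER.length; j++) {
                if (data[i + j] != HEADER[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return i;
            }
        }
        return -1;
    }

    /**
     * Returns a copy of the data starting at the "%PDF-" marker. The input is
     * returned unchanged if it already starts with the marker or has none.
     */
    static byte[] stripLeadingGarbage(byte[] data) {
        int start = indexOfHeader(data);
        if (start <= 0) {
            return data;
        }
        return Arrays.copyOfRange(data, start, data.length);
    }
}
```

With PDFBox 2.0.x the cleaned bytes could then be passed to
PDDocument.load(byte[]). Whether that is enough for a particular file depends on
how its offsets were written, so treat this as a starting point rather than a
guaranteed fix.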
> IllegalArgumentException: root cannot be null
> ---------------------------------------------
>
> Key: PDFBOX-4049
> URL: https://issues.apache.org/jira/browse/PDFBOX-4049
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 2.0.8
> Environment: Windows 10
> Reporter: savan patel
> Assignee: Andreas Lehmkühler
> Priority: Major
> Labels: regression
> Attachments: 372d5dd7-d4b8-41b2-9f50-80c1353aee59.pdf
>
>
> I have a PDF for which PDFBox throws an error while parsing it.
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: root cannot be null
>     at org.apache.pdfbox.pdmodel.PDPageTree.<init>(PDPageTree.java:75)
>     at org.apache.pdfbox.pdmodel.PDDocumentCatalog.getPages(PDDocumentCatalog.java:129)
>     at org.apache.pdfbox.pdmodel.PDDocument.getNumberOfPages(PDDocument.java:1411)
> {code}
> This did not happen with 2.0.7.