[ https://issues.apache.org/jira/browse/PDFBOX-1202?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13180344#comment-13180344 ]
Ilija Pavlic commented on PDFBOX-1202:
--------------------------------------
I can also extract the text from the entire document without removing
security, just by invoking decrypt in a minimal working example (MWE).
The difference between the MWE and the buggy code is that in the buggy code I
have been appending to the PDPageContentStream, so perhaps that was the cause
of the error. I don't have the code at hand right now; I'll look into it later
today.
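
For context on the error message itself: the FlateFilter named in the log
decodes Flate (zlib/deflate) compressed streams. The sketch below is not
PDFBox code and is not a diagnosis of this issue. It uses only java.util.zip
to show that a Flate decoder rejects bytes that are not a valid zlib stream.
The class name, the method names, and the XOR "scramble" (a stand-in for any
byte-level damage, not the PDF cipher) are all hypothetical.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

public class FlateCorruptionSketch {

    /** Compresses the input into a zlib (Flate) stream. */
    static byte[] deflate(byte[] input) {
        Deflater deflater = new Deflater();
        deflater.setInput(input);
        deflater.finish();
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        while (!deflater.finished()) {
            int n = deflater.deflate(buffer);
            out.write(buffer, 0, n);
        }
        deflater.end();
        return out.toByteArray();
    }

    /** Decompresses a zlib stream; throws if the bytes are not valid. */
    static byte[] inflate(byte[] compressed) throws DataFormatException {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[256];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                // No progress and not finished: the stream is truncated
                // or needs a preset dictionary we do not have.
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new DataFormatException("truncated or unsupported stream");
                }
                out.write(buffer, 0, n);
            }
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    /** Returns the decompressed bytes, or null if the stream is corrupt. */
    static byte[] inflateOrNull(byte[] compressed) {
        try {
            return inflate(compressed);
        } catch (DataFormatException e) {
            return null;
        }
    }

    /** Hypothetical stand-in for byte-level damage: XOR every byte. */
    static byte[] scramble(byte[] data) {
        byte[] copy = data.clone();
        for (int i = 0; i < copy.length; i++) {
            copy[i] ^= 0x5A;
        }
        return copy;
    }

    public static void main(String[] args) {
        byte[] plain = "BT /F1 12 Tf (Hello) Tj ET".getBytes(StandardCharsets.UTF_8);
        byte[] packed = deflate(plain);

        System.out.println("intact stream round-trips: "
                + Arrays.equals(plain, inflateOrNull(packed)));
        System.out.println("scrambled stream decodes:  "
                + (inflateOrNull(scramble(packed)) != null));
    }
}
```

The scrambled copy fails at the zlib header check, before any content is
decoded, so a decoder in this position can only stop and report the stream
as corrupt.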
> org.apache.pdfbox.filter.FlateFilter decode SEVERE: Stop reading corrupt
> stream
> -------------------------------------------------------------------------------
>
> Key: PDFBOX-1202
> URL: https://issues.apache.org/jira/browse/PDFBOX-1202
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.6.0
> Reporter: Ilija Pavlic
> Priority: Minor
> Attachments: IATAUnitedStates.pdf
>
>
> Error "org.apache.pdfbox.filter.FlateFilter decode SEVERE: Stop reading
> corrupt stream" thrown when extracting text.
> The document was loaded with the following snippet:
> document = PDDocument.load("C:/Users/ilija.pavlic/Downloads/TestInput.pdf");
> if (document.isEncrypted()) {
>     try {
>         document.decrypt("");
>     } catch (InvalidPasswordException e) {
>         System.err.println("Error: Document is encrypted with a password.");
>         System.exit(1);
>     }
> }