[ https://issues.apache.org/jira/browse/PDFBOX-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934022#action_12934022 ]
Martijn Brinkers commented on PDFBOX-872:
-----------------------------------------
The exception is thrown because the PDF was encrypted with AES, which PDFBox
does not yet support out of the box. I have attached a patch that adds AES
support to the SecurityHandler. The StandardSecurityHandler enables AES when
the PDCryptFilterDictionary indicates that the PDF was encrypted with AES.
With the patch applied, the attached PDF can be read.
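For readers unfamiliar with what AES support involves here, the following is a
minimal sketch and not the attached patch. It assumes the scheme described in
the PDF specification for AES crypt filters (AESV2): a 128-bit file key, a
per-object key derived with MD5 and the "sAlT" suffix, AES in CBC mode with
PKCS#5 padding, and the first 16 bytes of each encrypted stream or string used
as the IV. The class PdfAesSketch and its method names are hypothetical and do
not exist in PDFBox; encryptStream is included only so the example can
round-trip.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

/** Hypothetical helper for illustration; not part of PDFBox or of the patch. */
final class PdfAesSketch {

    private static final int IV_LENGTH = 16;
    // The four bytes "sAlT" appended to the hash input when AES is in use.
    private static final byte[] AES_SALT = {0x73, 0x41, 0x6C, 0x54};

    /** Derives the per-object key. Assumes a 16-byte (128-bit) file key. */
    static byte[] objectKey(byte[] fileKey, int objNum, int genNum)
            throws GeneralSecurityException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        md5.update(fileKey);
        // Object number: 3 bytes, low-order first. Generation number: 2 bytes.
        md5.update(new byte[] {
            (byte) objNum, (byte) (objNum >> 8), (byte) (objNum >> 16),
            (byte) genNum, (byte) (genNum >> 8)});
        md5.update(AES_SALT);
        // Keep the first min(n + 5, 16) bytes of the digest.
        return Arrays.copyOf(md5.digest(), Math.min(fileKey.length + 5, 16));
    }

    /** Decrypts data whose first 16 bytes are the CBC initialization vector. */
    static byte[] decryptStream(byte[] key, byte[] data)
            throws GeneralSecurityException {
        if (data.length < IV_LENGTH) {
            throw new IllegalArgumentException("data is shorter than the IV");
        }
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(data, 0, IV_LENGTH));
        return cipher.doFinal(data, IV_LENGTH, data.length - IV_LENGTH);
    }

    /** Test-only counterpart: prepends a random IV to the ciphertext. */
    static byte[] encryptStream(byte[] key, byte[] plain)
            throws GeneralSecurityException {
        byte[] iv = new byte[IV_LENGTH];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(iv));
        byte[] body = cipher.doFinal(plain);
        byte[] out = Arrays.copyOf(iv, IV_LENGTH + body.length);
        System.arraycopy(body, 0, out, IV_LENGTH, body.length);
        return out;
    }

    public static void main(String[] args) throws GeneralSecurityException {
        byte[] fileKey = new byte[16]; // placeholder key, all zeros
        byte[] key = objectKey(fileKey, 12, 0);
        byte[] encrypted = encryptStream(key,
                "stream data".getBytes(StandardCharsets.UTF_8));
        byte[] decrypted = decryptStream(key, encrypted);
        System.out.println(new String(decrypted, StandardCharsets.UTF_8));
    }
}
```

In a real handler the choice between RC4 and a routine like decryptStream would
be driven by the crypt filter method read from the document, as described
above.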
> ERROR org.apache.pdfbox.filter.FlateFilter - Stop reading corrupt stream
> -------------------------------------------------------------------------
>
> Key: PDFBOX-872
> URL: https://issues.apache.org/jira/browse/PDFBOX-872
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.3.1
> Environment: Windows XP [Version 5.1.2600]
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
> Java HotSpot(TM) Client VM (build 17.1-b03, mixed mode, sharing)
> Reporter: Vladimir
> Priority: Critical
> Attachments: PDFBOX-872.patch
>
>
> This report:
> http://www2.goldmansachs.com/our-firm/press/press-releases/current/pdfs/2010-q2-earnings.pdf
> With this code:
> public static String getTransformed(InputStream inputStream) {
>     PDDocument pdDocument = null;
>     String document = null;
>     try {
>         PDFParser parser = new PDFParser(inputStream);
>         parser.parse();
>         pdDocument = parser.getPDDocument();
>         PDFText2HTML pdf2html = new PDFText2HTML("UTF-8");
>         document = pdf2html.getText(pdDocument);
>     } catch (IOException e) {
>         e.printStackTrace();
>     } finally {
>         if (pdDocument != null) {
>             try {
>                 pdDocument.getDocument().close();
>             } catch (IOException e) {
>                 e.printStackTrace();
>             }
>         }
>     }
>     return document;
> }
> returns:
> 17:01:15,609 [main] ERROR org.apache.pdfbox.filter.FlateFilter - Stop reading corrupt stream
> null
> java.io.IOException: Error: Expected an integer type, actual=''
>     at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1310)
>     at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parse(PDFObjectStreamParser.java:81)
>     at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:449)
>     at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1112)
>     at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:591)
>     at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:246)
>     at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:184)
> The same file opens normally in Foxit PDF.