[ 
https://issues.apache.org/jira/browse/PDFBOX-872?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12934022#action_12934022
 ] 

Martijn Brinkers commented on PDFBOX-872:
-----------------------------------------

The exception is thrown because the PDF is encrypted with AES, which PDFBox 
does not yet support out of the box. I have attached a patch that adds AES 
support to the SecurityHandler. The StandardSecurityHandler enables AES when 
the PDCryptFilterDictionary indicates that the PDF was encrypted with AES.

With the patch, the attached PDF can be read.
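
For anyone wondering what AES handling involves here, below is a minimal 
sketch of the scheme the PDF specification describes for the AESV2 crypt 
filter. It is not the code from the attached patch. The class and method names 
(AesPdfSketch, objectKey, decryptAes, encryptAes) are made up for illustration 
and are not PDFBox API. It assumes a 128-bit file encryption key has already 
been derived from the password: the per-object key is an MD5 hash over the 
file key, the object and generation numbers and the salt bytes "sAlT", and the 
stream data is AES-CBC with the IV stored in its first 16 bytes.

```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

// Hypothetical helper class, for illustration only (not part of PDFBox).
class AesPdfSketch {

    // The bytes "sAlT", appended to the hash input only when AES is used.
    private static final byte[] AES_SALT = {0x73, 0x41, 0x6C, 0x54};

    // Derives the per-object key from the file encryption key.
    // Assumes fileKey is 16 bytes, as AESV2 requires a 128-bit key.
    static byte[] objectKey(byte[] fileKey, int objNum, int genNum, boolean aes)
            throws GeneralSecurityException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        md5.update(fileKey);
        // Low-order 3 bytes of the object number, then 2 bytes of the
        // generation number, both low-order byte first.
        md5.update(new byte[] {
            (byte) objNum, (byte) (objNum >> 8), (byte) (objNum >> 16),
            (byte) genNum, (byte) (genNum >> 8)
        });
        if (aes) {
            md5.update(AES_SALT);
        }
        // Use the first min(n + 5, 16) bytes of the hash as the key.
        int length = Math.min(fileKey.length + 5, 16);
        return Arrays.copyOf(md5.digest(), length);
    }

    // Decrypts stream or string data: 16-byte IV followed by the ciphertext.
    static byte[] decryptAes(byte[] key, byte[] data)
            throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.DECRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(data, 0, 16));
        return cipher.doFinal(data, 16, data.length - 16);
    }

    // Inverse of decryptAes, included only so the sketch can be round-tripped.
    static byte[] encryptAes(byte[] key, byte[] plain)
            throws GeneralSecurityException {
        byte[] iv = new byte[16];
        new SecureRandom().nextBytes(iv);
        Cipher cipher = Cipher.getInstance("AES/CBC/PKCS5Padding");
        cipher.init(Cipher.ENCRYPT_MODE, new SecretKeySpec(key, "AES"),
                new IvParameterSpec(iv));
        byte[] body = cipher.doFinal(plain);
        // Prepend the IV so the result has the layout decryptAes expects.
        byte[] out = Arrays.copyOf(iv, iv.length + body.length);
        System.arraycopy(body, 0, out, iv.length, body.length);
        return out;
    }
}
```

Passing aes = false to objectKey omits the salt, which gives the key used for 
RC4. This is why a parser that treats an AES-encrypted file as RC4 ends up 
with garbage after "decryption" and then fails in the FlateFilter.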

> ERROR org.apache.pdfbox.filter.FlateFilter  - Stop reading corrupt stream
> -------------------------------------------------------------------------
>
>                 Key: PDFBOX-872
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-872
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.3.1
>         Environment: Windows XP [Version 5.1.2600]
> java version "1.6.0_22"
> Java(TM) SE Runtime Environment (build 1.6.0_22-b04)
> Java HotSpot(TM) Client VM (build 17.1-b03, mixed mode, sharing)
>            Reporter: Vladimir
>            Priority: Critical
>         Attachments: PDFBOX-872.patch
>
>
> This report: 
> http://www2.goldmansachs.com/our-firm/press/press-releases/current/pdfs/2010-q2-earnings.pdf
> With this code:
> public static String getTransformed(InputStream inputStream) {
>     PDDocument pdDocument = null;
>     String document = null;
>     try {
>         PDFParser parser = new PDFParser(inputStream);
>         parser.parse();
>         pdDocument = parser.getPDDocument();
>         PDFText2HTML pdf2html = new PDFText2HTML("UTF-8");
>         document = pdf2html.getText(pdDocument);
>     } catch (IOException e) {
>         e.printStackTrace();
>     } finally {
>         if (pdDocument != null) {
>             try {
>                 pdDocument.getDocument().close();
>             } catch (IOException e) {
>                 e.printStackTrace();
>             }
>         }
>     }
>     return document;
> }
> returns:
> 17:01:15,609 [main] ERROR org.apache.pdfbox.filter.FlateFilter  - Stop reading corrupt stream
> null
> java.io.IOException: Error: Expected an integer type, actual=''
>       at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1310)
>       at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parse(PDFObjectStreamParser.java:81)
>       at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:449)
>       at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1112)
>       at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:591)
>       at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:246)
>       at org.apache.pdfbox.util.PDFTextStripper.getText(PDFTextStripper.java:184)
> The same file opens normally in Foxit PDF.
