Hello,
I ran a few tests of how PDFBox 1.0.0 parses encrypted PDF files.
I got the results shown below.
I'd like to know whether the failures are a known functional/technical limitation (for
exceptional cases) or an unknown behaviour. I found nothing about this in JIRA, and I
haven't found a description of what is and isn't supported either.
In all the tests I used the same original file; the only difference is the encryption
level.
PDF version             Protection  Cipher        Result
1.5                     none        -             Text parsed
1.6                     password    RC 40 bits    Text parsed
1.6                     password    RC 128 bits   Text parsed
1.6                     password    RC 128 bits   Text parsed
1.6                     password    AES 128 bits  IOException (stack trace below)
1.7 Adobe ext. level 3  password    AES 256 bits  ArrayIndexOutOfBoundsException (stack trace below)
Here are the stack traces:

java.io.IOException: Error: Expected an integer type, actual=''
    at org.apache.pdfbox.pdfparser.BaseParser.readInt(BaseParser.java:1275)
    at org.apache.pdfbox.pdfparser.PDFObjectStreamParser.parse(PDFObjectStreamParser.java:81)
    at org.apache.pdfbox.cos.COSDocument.dereferenceObjectStreams(COSDocument.java:458)
    at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1100)
    at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:579)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:235)

java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.computeEncryptedKey(StandardSecurityHandler.java:538)
    at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.computeUserPassword(StandardSecurityHandler.java:574)
    at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.isUserPassword(StandardSecurityHandler.java:757)
    at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:186)
    at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1099)
    at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:579)
    at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:235)
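
For anyone who wants to reproduce the table, here is a minimal sketch of how such a result matrix could be collected. It is not the actual test code: the class name EncryptionMatrix, the methods outcome and run, and the two stand-in cases are hypothetical, and it uses only the JDK so it runs without PDFBox on the classpath. In a real run, each case would wrap the text extraction call on one of the test files (the PDFTextStripper step seen in the stack traces above).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;

public class EncryptionMatrix {

    // Runs one extraction attempt and maps it to the wording used in the
    // results table: "Text parsed" on success, otherwise the simple class
    // name of the exception that was thrown.
    public static String outcome(Callable<String> extraction) {
        try {
            extraction.call();
            return "Text parsed";
        } catch (Exception e) {
            return e.getClass().getSimpleName();
        }
    }

    // Applies outcome() to every labelled case, keeping insertion order.
    public static Map<String, String> run(Map<String, Callable<String>> cases) {
        Map<String, String> results = new LinkedHashMap<>();
        for (Map.Entry<String, Callable<String>> c : cases.entrySet()) {
            results.put(c.getKey(), outcome(c.getValue()));
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, Callable<String>> cases = new LinkedHashMap<>();
        // Hypothetical stand-ins. A real case would load one test file and
        // run the text extraction on it instead of returning a constant.
        cases.put("no encryption", () -> "extracted text");
        cases.put("simulated failure", () -> {
            throw new java.io.IOException("Expected an integer type");
        });
        for (Map.Entry<String, String> r : run(cases).entrySet()) {
            System.out.println(r.getKey() + " -> " + r.getValue());
        }
    }
}
```

Catching Exception rather than only IOException is deliberate here, so that unchecked failures such as the ArrayIndexOutOfBoundsException above are recorded as a result instead of aborting the remaining cases.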
Regards,
--
Benoît Dissert - Netiquette Sign. cf: http://tools.ietf.org/html/rfc1855
R&D Jalios - http://support.jalios.com/