[ https://issues.apache.org/jira/browse/PDFBOX-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219474#comment-14219474 ]

Ekaterina commented on PDFBOX-2510:
-----------------------------------

I am using the non-sequential parser in Tika with pdfbox-1.8.8-SNAPSHOT, and
parsing now fails with:

org.apache.tika.exception.TikaException: Unable to extract PDF content
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:146)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:159)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
        at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:244)
        at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:121)
        at com.majio.core.gate.utils.DocUtils.readInputStream(DocUtils.java:59)
        at com.majio.core.gate.utils.DocUtils.readDoc(DocUtils.java:30)
        at com.majio.core.gate.utils.DocUtils.main(DocUtils.java:83)
Caused by: org.apache.pdfbox.exceptions.WrappedIOException
        at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptData(SecurityHandler.java:377)
        at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptStream(SecurityHandler.java:475)
        at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decrypt(SecurityHandler.java:439)
        at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.decryptObject(SecurityHandler.java:409)
        at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.proceedDecryption(SecurityHandler.java:221)
        at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:158)
        at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1601)
        at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:947)
        at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:357)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:130)
        ... 7 more
Caused by: javax.crypto.IllegalBlockSizeException: Input length must be multiple of 16 when decrypting with padded cipher
        at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:913)
        at com.sun.crypto.provider.CipherCore.doFinal(CipherCore.java:824)
        at com.sun.crypto.provider.AESCipher.engineDoFinal(AESCipher.java:436)
        at javax.crypto.Cipher.doFinal(Cipher.java:2179)
        at org.apache.pdfbox.pdmodel.encryption.SecurityHandler.encryptData(SecurityHandler.java:355)
        ... 16 more
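
For reference, the innermost exception can be reproduced outside PDFBox with
the JDK alone. The sketch below is not PDFBox code: it assumes an
AES/CBC/PKCS5Padding transformation (suggested by the "padded cipher" wording
in the message), and the class name, the all-zero key and IV, and the helper
methods are hypothetical. It only shows that handing a padded AES cipher an
input whose length is not a multiple of 16 bytes raises
IllegalBlockSizeException, which may mean the encrypted stream reaching the
cipher here is truncated or has the wrong length.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.util.Arrays;
import javax.crypto.Cipher;
import javax.crypto.spec.IvParameterSpec;
import javax.crypto.spec.SecretKeySpec;

class AesBlockSizeDemo {
    // Illustrative all-zero key and IV; these are assumptions, not values from the report.
    private static final byte[] KEY = new byte[16];
    private static final byte[] IV = new byte[16];

    // Builds a padded AES/CBC cipher in the requested mode.
    private static Cipher cipher(int mode) throws GeneralSecurityException {
        Cipher c = Cipher.getInstance("AES/CBC/PKCS5Padding");
        c.init(mode, new SecretKeySpec(KEY, "AES"), new IvParameterSpec(IV));
        return c;
    }

    // Encrypts the input; the output length is always a multiple of 16 bytes.
    static byte[] encrypt(byte[] plain) {
        try {
            return cipher(Cipher.ENCRYPT_MODE).doFinal(plain);
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns "ok: <plaintext>" on success, or the simple name of the exception thrown.
    static String tryDecrypt(byte[] data) {
        try {
            byte[] plain = cipher(Cipher.DECRYPT_MODE).doFinal(data);
            return "ok: " + new String(plain, StandardCharsets.UTF_8);
        } catch (GeneralSecurityException e) {
            return e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        byte[] encrypted = encrypt("hello".getBytes(StandardCharsets.UTF_8));
        System.out.println(encrypted.length);                          // 16
        System.out.println(tryDecrypt(encrypted));                     // ok: hello
        // Dropping one byte leaves 15 bytes, which is not a whole AES block.
        System.out.println(tryDecrypt(Arrays.copyOf(encrypted, 15)));  // IllegalBlockSizeException
    }
}
```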


> Getting "Error: The supplied password does not match either the owner or user 
> password in the document." while trying to parse pdf without password in 
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-2510
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2510
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.8
>            Reporter: Ekaterina
>         Attachments: DV.pdf
>
>
> I have a PDF that had been parsed correctly for some time, but I suddenly got
> "javax.crypto.BadPaddingException: Given final block not properly padded"
> when I tried to parse it with pdfbox-1.8.7. I then tried
> pdfbox-1.8.8-SNAPSHOT and got "Error: The supplied password does not
> match either the owner or user password in the document." Here is the code
> I'm using:
> ContentHandler handler = new BodyContentHandler(400000);
> Metadata metadata = new Metadata();
> Parser parser = new AutoDetectParser();
> try (TikaInputStream stream = TikaInputStream.get(input)) {
>     parser.parse(stream, handler, metadata, new ParseContext());
> } catch (IOException | SAXException | TikaException e) {
>     LOG.error("Could not parse the input document", e);
> }
> return handler.toString();
> (I am using it with tika-parsers-1.6)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
