Hi,

When I try to parse a password-protected PDF using this code:

// resourceLocation is the path to the password-protected PDF
InputStream input = new FileInputStream(new File(resourceLocation));
ContentHandler textHandler = new BodyContentHandler();
Metadata metadata = new Metadata();
PDFParser parser = new PDFParser();
try {
    // the exception shown below is thrown from this call
    parser.parse(input, textHandler, metadata);
} finally {
    // close the stream even when parsing fails
    input.close();
}
out.println("Title: " + metadata.get("title"));
out.println("Author: " + metadata.get("Author"));
out.println("content: " + textHandler.toString());

I get this exception:

Could not parse document:class org.apache.tika.exception.TikaException:Unable to extract PDF content
org.apache.tika.exception.TikaException: Unable to extract PDF content
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:76)
        at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:96)
        at org.apache.tika.parser.AbstractParser.parse(AbstractParser.java:53)
        at com.lucidimagination.article.tika.TikaParsePdf.parse(TikaParsePdf.java:43)
        at com.lucidimagination.article.tika.TikaParsePdf.main(TikaParsePdf.java:28)
Caused by: org.apache.pdfbox.exceptions.WrappedIOException: Error decrypting document, details:
        at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:314)
        at org.apache.tika.parser.pdf.PDF2XHTML.process(PDF2XHTML.java:61)
        ... 4 more
Caused by: org.apache.pdfbox.exceptions.CryptographyException: Error: The supplied password does not match either the owner or user password in the document.
        at org.apache.pdfbox.pdmodel.encryption.StandardSecurityHandler.decryptDocument(StandardSecurityHandler.java:239)
        at org.apache.pdfbox.pdmodel.PDDocument.openProtection(PDDocument.java:1325)
        at org.apache.pdfbox.pdmodel.PDDocument.decrypt(PDDocument.java:796)
        at org.apache.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:310)
        ... 5 more
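
The last "Caused by" entry in the trace is the real failure (the password mismatch); the two exceptions above it only wrap it. In case it helps to report just that innermost message, here is a minimal sketch that walks the cause chain. The class name RootCause, the method rootCause and the stand-in exceptions in main are made up for illustration; they are not part of Tika or PDFBox.

```java
import java.io.IOException;

public class RootCause {
    // Follow getCause() links until the innermost exception is reached.
    static Throwable rootCause(Throwable t) {
        Throwable current = t;
        while (current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    public static void main(String[] args) {
        // Stand-in exceptions (hypothetical), nested three deep like the trace above.
        Exception inner = new IllegalStateException(
                "The supplied password does not match either the owner or user password in the document.");
        Exception middle = new IOException("Error decrypting document", inner);
        Exception outer = new Exception("Unable to extract PDF content", middle);

        Throwable root = rootCause(outer);
        System.out.println(root.getClass().getName() + ": " + root.getMessage());
    }
}
```

In the real program the argument would be the TikaException caught around parser.parse(...).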

Can anybody tell me how to parse a password-protected PDF?

Thanks and regards,
Chethan