[
https://issues.apache.org/jira/browse/PDFBOX-2510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14219067#comment-14219067
]
Tilman Hausherr commented on PDFBOX-2510:
-----------------------------------------
Forgot to mention: use the non-sequential parser. Tika has an option for this,
but it is turned off by default.
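A minimal sketch of how that option could be switched on from the reporter's code, assuming Tika 1.x built against PDFBox 1.8, where `PDFParserConfig` exposes `setUseNonSequentialParser(boolean)` (later Tika releases based on PDFBox 2.x no longer have this setter). The class name `NonSeqTikaExample` and the helper method names are illustrative only and do not come from the issue:

```java
import java.io.IOException;
import java.io.InputStream;

import org.apache.tika.exception.TikaException;
import org.apache.tika.io.TikaInputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.AutoDetectParser;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.Parser;
import org.apache.tika.parser.pdf.PDFParserConfig;
import org.apache.tika.sax.BodyContentHandler;
import org.xml.sax.ContentHandler;
import org.xml.sax.SAXException;

// Hypothetical helper class; the name is not from the issue.
public class NonSeqTikaExample {

    // Builds a ParseContext that asks Tika's PDF parser to use
    // PDFBox's non-sequential parser (off by default in Tika 1.x).
    static ParseContext nonSequentialContext() {
        PDFParserConfig config = new PDFParserConfig();
        config.setUseNonSequentialParser(true);

        ParseContext context = new ParseContext();
        context.set(PDFParserConfig.class, config);
        return context;
    }

    // Same flow as the reporter's snippet, but with the configured context
    // passed in place of a plain "new ParseContext()".
    static String extractText(InputStream input)
            throws IOException, SAXException, TikaException {
        ContentHandler handler = new BodyContentHandler(400000);
        Metadata metadata = new Metadata();
        Parser parser = new AutoDetectParser();
        try (TikaInputStream stream = TikaInputStream.get(input)) {
            parser.parse(stream, handler, metadata, nonSequentialContext());
        }
        return handler.toString();
    }
}
```

The only difference from the code in the report is the `ParseContext`: the PDF parser looks up a `PDFParserConfig` in the context and falls back to its defaults when none is set. Whether this resolves the password error for the attached DV.pdf still has to be confirmed against that file.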
> Getting "Error: The supplied password does not match either the owner or user
> password in the document." while trying to parse pdf without password in
> -------------------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: PDFBOX-2510
> URL: https://issues.apache.org/jira/browse/PDFBOX-2510
> Project: PDFBox
> Issue Type: Bug
> Components: Parsing
> Affects Versions: 1.8.8
> Reporter: Ekaterina
> Attachments: DV.pdf
>
>
> I have a PDF that parsed correctly for some time, but I suddenly started
> getting "javax.crypto.BadPaddingException: Given final block not properly
> padded" when parsing it with pdfbox-1.8.7. I then tried
> pdfbox-1.8.8-SNAPSHOT and got "Error: The supplied password does not
> match either the owner or user password in the document.". Here is the code
> I'm using:
> ContentHandler handler = new BodyContentHandler(400000);
> Metadata metadata = new Metadata();
> Parser parser = new AutoDetectParser();
> try (TikaInputStream stream = TikaInputStream.get(input)) {
>     parser.parse(stream, handler, metadata, new ParseContext());
> } catch (IOException | SAXException | TikaException e) {
>     LOG.error("Could not parse the input document", e);
> }
> return handler.toString();
> (I am using it with tika-parsers-1.6)