Hi,

Are you somehow using an older PDFBox version with a newer Tika version? canPrintFaithful() was introduced in 2022 because the old method canPrintDegraded(), which is now deprecated, was misleading.
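
One way to confirm such a mismatch is to ask the running JVM where it loaded AccessPermission from and whether that class has canPrintFaithful(). The following is only a minimal sketch: the class name PdfBoxVersionCheck and its two helper methods are made up for illustration and are not part of Tika or PDFBox. It uses reflection only, so it compiles without PDFBox on the classpath.

```java
import java.security.CodeSource;

public class PdfBoxVersionCheck {

    /** True if the named class can be loaded and has a public no-arg method with this name. */
    public static boolean hasMethod(String className, String methodName) {
        try {
            Class<?> cls = Class.forName(className);
            cls.getMethod(methodName);
            return true;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            return false;
        }
    }

    /** Jar or directory the class was loaded from, or a placeholder if that is not available. */
    public static String loadedFrom(String className) {
        try {
            Class<?> cls = Class.forName(className);
            CodeSource src = cls.getProtectionDomain().getCodeSource();
            return src == null ? "(bootstrap or unknown)" : String.valueOf(src.getLocation());
        } catch (ClassNotFoundException | LinkageError e) {
            return "(not on classpath)";
        }
    }

    public static void main(String[] args) {
        // Class and method names are taken from the NoSuchMethodError in the report.
        String cls = "org.apache.pdfbox.pdmodel.encryption.AccessPermission";
        System.out.println("AccessPermission loaded from: " + loadedFrom(cls));
        System.out.println("canPrintFaithful() present:   " + hasMethod(cls, "canPrintFaithful"));
    }
}
```

If this is run with the application's classpath and the method is reported as missing, the printed jar location should show which PDFBox jar is being picked up. Running mvn dependency:tree -Dincludes=org.apache.pdfbox may then help find which dependency pulls in that version.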

Tilman

On 08.02.2024 13:59, Morkus wrote:
Hello,

I recently tried to upgrade our Java Tika Maven libraries from 2.2.0 to 2.9.1, but the same Java code that worked before no longer extracts text from a PDF.

Now I get this error:

java.lang.NoSuchMethodError: 'boolean org.apache.pdfbox.pdmodel.encryption.AccessPermission.canPrintFaithful()'

at org.apache.tika.parser.pdf.PDFParser.extractMetadata(PDFParser.java:591)
at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:201)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:203)
.
.
.
The error stack seems to indicate a line of code that worked perfectly with 2.2.0.

Note the last line below "parser.parse(...)". This is one of the lines noted in the error stack.
Parser parser = new AutoDetectParser();
BodyContentHandler handler = new BodyContentHandler(-1); // handle larger files
Metadata metadata = new Metadata();
InputStream inputStream = new ByteArrayInputStream(decodedBodyData);
ParseContext context = new ParseContext();
parser.parse(inputStream, handler, metadata, context);
Could someone explain what I need to change so this code works again with 2.9.1?

Greatly appreciate any suggestions.

Thanks very much,

  * m
