Hi,
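Are you somehow using an older PDFBox version with a newer Tika version?
canPrintFaithful() was introduced in 2022 because the old method
canPrintDegraded(), which is now deprecated, was misleading. A
NoSuchMethodError at runtime means the PDFBox jar that actually gets
loaded does not contain that method.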
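One way to check this is to ask the JVM which jar the PDFBox class is loaded from and whether the method is there. The following is only a rough sketch using plain reflection: the class name PdfBoxVersionCheck and its helper methods are made up for illustration, and the PDFBox class name is taken from the stack trace quoted below. It is meant to be run with the same classpath as the failing application.

```java
import java.security.CodeSource;

// Hypothetical diagnostic helper, not part of Tika or PDFBox.
class PdfBoxVersionCheck {

    /** Returns true if the class has a public no-argument method with this name. */
    public static boolean hasMethod(Class<?> type, String methodName) {
        try {
            type.getMethod(methodName);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    /** Returns the jar or directory a class was loaded from, if the JVM exposes it. */
    public static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(unknown location)";
        }
        return source.getLocation().toString();
    }

    /** Builds a one-line report for the given class and method name. */
    public static String describe(String className, String methodName) {
        try {
            Class<?> type = Class.forName(className);
            return className + " loaded from " + locationOf(type)
                    + "; " + methodName + "() present: " + hasMethod(type, methodName);
        } catch (ClassNotFoundException e) {
            return className + " is not on the classpath";
        }
    }

    public static void main(String[] args) {
        // Class and method names as they appear in the NoSuchMethodError.
        System.out.println(describe(
                "org.apache.pdfbox.pdmodel.encryption.AccessPermission",
                "canPrintFaithful"));
    }
}
```

If the report shows an older pdfbox jar, or says the method is not present, something in the build is probably pinning PDFBox to an old version. Running "mvn dependency:tree -Dincludes=org.apache.pdfbox" should show where it comes from.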
Tilman
On 08.02.2024 13:59, Morkus wrote:
Hello,
I recently tried to upgrade our Tika Maven dependencies from 2.2.0
to 2.9.1, but with the same Java code that worked before,
extracting text from a PDF no longer works.
Now I get these errors:
java.lang.NoSuchMethodError: 'boolean org.apache.pdfbox.pdmodel.encryption.AccessPermission.canPrintFaithful()'
    at org.apache.tika.parser.pdf.PDFParser.extractMetadata(PDFParser.java:591)
    at org.apache.tika.parser.pdf.PDFParser.parse(PDFParser.java:201)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
    at org.apache.tika.parser.CompositeParser.parse(CompositeParser.java:298)
    at org.apache.tika.parser.AutoDetectParser.parse(AutoDetectParser.java:203)
    ...
The error stack seems to indicate a line of code that worked perfectly
with 2.2.0.
Note the last line below "parser.parse(...)". This is one of the lines
noted in the error stack.
Parser parser = new AutoDetectParser();
BodyContentHandler handler = new BodyContentHandler(-1); // handle larger files.
Metadata metadata = new Metadata();
InputStream inputStream = new ByteArrayInputStream(decodedBodyData);
ParseContext context = new ParseContext();
parser.parse(inputStream, handler, metadata, context);
Could someone explain what I need to change so this code works again
with 2.9.1?
Greatly appreciate any suggestions.
Thanks very much,
* m