On 16.07.2022 at 18:43, PGNet Dev wrote:

i don't get any more useful info on failure,

    --> https://pastebin.com/raw/DsrLxbeg

You didn't get the exception I mentioned, so set a breakpoint at parse() to get the fileLen. The current error message suggests that bytes have been changed or lost.

IIRC Tika saves the PDF to a file in the temp directory before parsing; maybe look there while the breakpoint is hit and compare that file's length and content with your own copy.
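
A rough sketch of how that comparison could be done outside the debugger, assuming you have located the temp file by hand. The two paths at the bottom are placeholders, not names Tika actually uses, and the helper names are made up for this example:

```python
import hashlib
from pathlib import Path


def describe(path):
    """Return (length in bytes, SHA-256 hex digest) of the file at path."""
    data = Path(path).read_bytes()
    return len(data), hashlib.sha256(data).hexdigest()


def first_difference(path_a, path_b):
    """Return the offset of the first differing byte, or None if identical.

    If one file is a strict prefix of the other, the returned offset is
    the length of the shorter file (the point where bytes are missing).
    """
    a = Path(path_a).read_bytes()
    b = Path(path_b).read_bytes()
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


if __name__ == "__main__":
    # Placeholder paths: substitute your original PDF and the temp copy
    # you found while the debugger is stopped at the breakpoint.
    original = "original.pdf"
    temp_copy = "/tmp/temp-copy.pdf"
    print("original :", describe(original))
    print("temp copy:", describe(temp_copy))
    print("first differing offset:", first_difference(original, temp_copy))
```

If the lengths differ, bytes were lost or added; if the lengths match but the digests differ, the offset shows where the content was changed.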

Tilman
