I'm also here 😂
You can also set a breakpoint in PDFBox itself, at
org.apache.pdfbox.pdfparser.PDFParser.parse()
When the debugger stops there (execution definitely passes that
point!), look in your /tmp directory for the file that is
mentioned in the Tika debug output and copy it somewhere else.
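
To make the "copy it somewhere else" step quick while the JVM is halted
at the breakpoint, a small helper along these lines might do. This is
only a sketch: the class name TmpFileRescue, the method rescue() and the
destination directory are made up for illustration, and the file name
prefix "apache-tika-" in the usage line is an assumption, so use
whatever name the Tika debug output actually shows.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Tika or PDFBox.
public class TmpFileRescue {

    /**
     * Copies every regular file in tmpDir whose name starts with prefix
     * into destDir (created if missing) and returns the copied paths.
     */
    public static List<Path> rescue(Path tmpDir, String prefix, Path destDir)
            throws IOException {
        Files.createDirectories(destDir);
        List<Path> copied = new ArrayList<>();
        // Filter by file name prefix; the prefix is supplied by the caller.
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(
                tmpDir, entry -> entry.getFileName().toString().startsWith(prefix))) {
            for (Path candidate : stream) {
                if (!Files.isRegularFile(candidate)) {
                    continue; // skip directories and other non-regular entries
                }
                Path target = destDir.resolve(candidate.getFileName());
                Files.copy(candidate, target, StandardCopyOption.REPLACE_EXISTING);
                copied.add(target);
            }
        }
        return copied;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("usage: TmpFileRescue <tmpDir> <prefix> <destDir>");
            return;
        }
        for (Path p : rescue(Paths.get(args[0]), args[1], Paths.get(args[2]))) {
            System.out.println("copied " + p);
        }
    }
}
```

With Java 11 or later this can be run from a second terminal without
compiling first, for example:
java TmpFileRescue.java /tmp apache-tika- /home/me/rescued
(the prefix and the destination path here are placeholders).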
Tilman
On 20.07.2022 at 00:45, PGNet Dev wrote:
hi,
i'm debugging a problem with email attachment scanning by tika-server.
dovecot imap server receives email+attachment, then hands off the
attachment (modified, or unmodified, dunno yet) via its 'fts-tika'
plugin.
with
dovecot 2.3.19.1
tika 2.4.2-snapshot
openjdk version "18.0.1" 2022-04-19
this used to work with earlier versions (haven't bisected the problem
yet).
with that version mix, it's failing.
it appears to fail at or near PDFParser.
i've been trying to debug in this thread,
https://lists.apache.org/thread/pztsq8tb8xqz3s4kmjpnt9p3zt07y05k
but have hit a current (temporary?) impasse.
on both the Tika & Dovecot mailing lists, it was suggested to capture
the /tmp file at the point of failure.
to do so, i've, per instruction, set a jdb breakpoint at
org.apache.tika.parser.pdf.PDFParser
but on execution, the offending file is not persisted.
one suggestion as to why not is,
"If the debugger didn't stop, then the breakpoint was at the wrong
place. Or it's not possible to debug."
seems *really* odd that it can't be debugged ... thought best to ask
_here_ first.
Q:
IS it possible to debug?
what's the RIGHT breakpoint to set to make sure to halt, & catch
the tmp file?
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscr...@pdfbox.apache.org
For additional commands, e-mail: users-h...@pdfbox.apache.org