On 7/19/22 12:24 PM, Tilman Hausherr wrote:
The checkstyle violation is about the coding style. You can delete that part in the tika-parent/pom.xml if you want, or add <skip>true</skip> below "<configuration>" in that plugin. Same for the ossindex-maven-plugin and the forbiddenapis plugin.
If the debugger didn't stop, then the breakpoint was at the wrong place. Or it's not possible to debug.
I'll give the pom mod a try in a bit. As to which breakpoint, I certainly don't know the tika/java internals well enough to say what is/isn't correct, yet.
Re "is there anything informative in that now-more-verbose DEBUG output?": well yes, the MD5 output. This proves that the file is different. (OK, the different length showed that too.)
I've asked over at the Dovecot ML what, specifically, dovecot 'sends' to the tika backend via their fts-tika plugin. Either it's the original/complete/unmodified attachment, in which case the file size / MD5 hash should be the same as what tika's trapping, or some modification is made to the file (trimmed, add'l headers, etc.), in which case the size/hash are not _expected_ to be the same. We'll see what I hear.
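
In the meantime, to compare the original attachment against whatever tika ends up receiving, something like the following could compute the same two numbers (size and MD5) on both sides. This is only a rough sketch using the plain JDK, not anything taken from tika or dovecot; the class name AttachmentFingerprint and its helper methods are made up for illustration, and the files to compare are whatever you manage to save from each end.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper (not part of tika or dovecot): prints size and MD5
// for each file given on the command line so two copies can be compared.
class AttachmentFingerprint {

    // Returns the lowercase hex MD5 digest of the given bytes.
    static String md5Hex(byte[] data) throws NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        byte[] digest = md.digest(data);
        StringBuilder sb = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }

    // Summarises the two values being compared in this thread: length and hash.
    static String describe(byte[] data) throws NoSuchAlgorithmException {
        return data.length + " bytes, md5=" + md5Hex(data);
    }

    public static void main(String[] args) throws IOException, NoSuchAlgorithmException {
        if (args.length == 0) {
            System.err.println("usage: java AttachmentFingerprint.java <file> [<file> ...]");
            return;
        }
        for (String arg : args) {
            // Reads the whole file into memory; fine for typical mail attachments.
            byte[] data = Files.readAllBytes(Paths.get(arg));
            System.out.println(arg + ": " + describe(data));
        }
    }
}
```

Run against the original attachment and a copy of what reached tika: if both size and hash match, the file went through untouched; if they differ, something between dovecot and tika changed it.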
