On 07.03.2021 at 06:04, Tilman Hausherr wrote:
Report is here:

http://home.snafu.de/tilman/tmp/reports_pdfbox_2.0.22_vs_2.0.23.tar.xz

Not much has changed. There are no new exceptions. As for content, the changes that seem important are all related to the "soft hyphen".

https://issues.apache.org/jira/browse/PDFBOX-5115

I am currently fixing this, and then I'll run the tests again. The text extraction differences will likely stay. It's possible that a change in tika-eval is needed too.

Tilman


