Thanks!

I created one issue (PDFBOX-5207) but I don't consider this a blocker.

The other files where column T has text have problems related to matrix multiplication. I suspect that some parser changes now produce larger numbers than before.

The file
bug_trackers/poppler/poppler-84988-0.zip-3.pdf
has a different problem but I suspect it is related:
/MediaBox [0 170141183460469231731687303715884105728 612 792]

In 2.0.23 rendering worked (it seems the number was skipped and the rectangle was then ignored), but in 2.0.24 it doesn't.
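
To illustrate why such a value could cause trouble once it is parsed instead of skipped, here is a minimal standalone sketch. It is not PDFBox code; the class name, the helper methods and the scale factor of 4 are made up for illustration, and it assumes the coordinate ends up in a Java float. The number is exactly 2^127, so it still fits in a float, but it is close enough to the float maximum that even a small multiplication overflows.

```java
public class LargeMediaBoxValue {
    // Second MediaBox entry from the file above; it equals 2^127.
    static final String RAW = "170141183460469231731687303715884105728";

    // Parse the raw token as a float (assumption: coordinates are held as float).
    static float asFloat() {
        return Float.parseFloat(RAW);
    }

    // Check whether the token could be read as a 64-bit integer instead.
    static boolean fitsInLong() {
        try {
            Long.parseLong(RAW);
            return true;
        } catch (NumberFormatException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        float y = asFloat();
        System.out.println("as float: " + y + ", finite: " + Float.isFinite(y));
        System.out.println("fits in long: " + fitsInLong());

        // Hypothetical scale factor, standing in for one step of a matrix multiplication.
        float scaled = y * 4f;
        System.out.println("scaled by 4: " + scaled + ", infinite: " + Float.isInfinite(scaled));
    }
}
```

So a parser that rejects the token as an integer but accepts it as a float would hand a finite but enormous value to later arithmetic, which would fit the matrix multiplication problems seen in the other files.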

Tilman

On 03.06.2021 at 14:24, Tim Allison wrote:
Reports are here:
https://corpora.tika.apache.org/base/reports/reports-pdfbox-2.0.24-SNAPSHOT.tgz

No new exceptions. Content looks better by a tiny amount. There are a
few files with some apparent regressions, but overall, the diffs are
negligible.

