On 06.06.2022 at 20:44, Tim Allison wrote:
All, Martin Thoma of pypdf2 has set up some comparison tests on text extraction: https://github.com/py-pdf/benchmarks
https://github.com/py-pdf/benchmarks/blob/main/read/results/tika/2201.00069.txt
Maybe it's because of the vertical text; he didn't use the detectAngles option.
Tilman
