This was faster than I expected: Tilman contributed changes to PDFBox [0] and Tabula [1], thus making us compatible with the newest version of PDFBox.
As soon as 2.0.21 is released, we'll release a new version of Tabula.

Thanks!

[0] https://svn.apache.org/viewvc/pdfbox/branches/2.0/pdfbox/src/main/java/org/apache/pdfbox/text/LegacyPDFStreamEngine.java?r1=1879751&r2=1879750&pathrev=1879751
[1] https://github.com/tabulapdf/tabula-java/pull/325#issuecomment-615896790

On Thu, Jul 9, 2020 at 12:24 AM Tilman Hausherr <thaush...@t-online.de> wrote:

> Yeah I remember that one, I even tried to find the problem and then did
> something else. Or maybe the IDE crashed so the window was no longer
> open and I forgot.
>
> I did not even go far enough to find out whether the old text extraction
> was the "good" one or the new one.
>
> Coincidentally, there is an issue
> https://issues.apache.org/jira/browse/PDFBOX-4909
> that may make it easier to get back to the old height calculation.
>
> Tilman (works for free here)
>
> On 09.07.2020 at 04:10, Manuel Aristarán wrote:
> > Hi!
> >
> > I'm one of the maintainers of Tabula [0].
> >
> > Due to some changes in PDFBox, we've been running on 2.0.15 for some time
> > now, and we would love to keep Tabula updated with the newest version of
> > our favorite library :)
> >
> > Last year, Tilman Hausherr graciously submitted a PR [1] that updated
> > PDFBox to 2.0.19, but unfortunately broke a few tests, as it seems that
> > there were changes in the font measurement heuristics. Text measurement is
> > a critical need of Tabula, so we had to choose to stick with the latest
> > compatible version.
> >
> > We want to offer a $200 USD bounty to fix the issue. We run entirely on
> > donations, and have funds available for this [2]. The goal is to update
> > Tabula to use PDFBox 2.0.20, and the requirement is that the test suite
> > passes in its entirety.
> >
> > If you're interested, please get in touch with me at man...@jazzido.com
> >
> > Thanks!
> >
> > [0] https://tabula.technology
> > [1] https://github.com/tabulapdf/tabula-java/pull/325
> > [2] https://opencollective.com/tabulapdf
> >
> > --
> > Manuel Aristarán
> > http://jazzido.com