This was faster than I expected: Tilman contributed changes to PDFBox [0]
and Tabula [1], thus making us compatible with the newest version of PDFBox.

As soon as PDFBox 2.0.21 is released, we'll release a new version of Tabula.

Thanks!

[0] https://svn.apache.org/viewvc/pdfbox/branches/2.0/pdfbox/src/main/java/org/apache/pdfbox/text/LegacyPDFStreamEngine.java?r1=1879751&r2=1879750&pathrev=1879751
[1] https://github.com/tabulapdf/tabula-java/pull/325#issuecomment-615896790

On Thu, Jul 9, 2020 at 12:24 AM Tilman Hausherr <thaush...@t-online.de>
wrote:

> Yeah I remember that one, I even tried to find the problem and then did
> something else. Or maybe the IDE crashed so the window was no longer
> open and I forgot.
>
> I did not even go far enough to find out whether the old text extraction
> was the "good" one or the new one.
>
> Coincidentally, there is an issue
> https://issues.apache.org/jira/browse/PDFBOX-4909
> that may make it easier to get back to the old height calculation.
>
> Tilman (works for free here)
>
> On 09.07.2020 at 04:10, Manuel Aristarán wrote:
> > Hi!
> >
> > I'm one of the maintainers of Tabula [0].
> >
> > Due to some changes in PDFBox, we've been running on 2.0.15 for some time
> > now, and we would love to keep Tabula updated with the newest version of
> > our favorite library :)
> >
> > Last year, Tilman Hausherr graciously submitted a PR [1] that updated
> > PDFBox to 2.0.19, but unfortunately broke a few tests, as it seems that
> > there were changes in the font measurement heuristics. Text measurement is
> > a critical need of Tabula, so we had to choose to stick with the latest
> > compatible version.
> >
> > We want to offer a $200 USD bounty to fix the issue. We run entirely on
> > donations, and have funds available for this [2]. The goal is to update
> > Tabula to use PDFBox 2.0.20, and the requirement is that the test suite
> > passes in its entirety.
> >
> > If you're interested, please get in touch with me at man...@jazzido.com
> >
> > Thanks!
> >
> >
> > [0] https://tabula.technology
> > [1] https://github.com/tabulapdf/tabula-java/pull/325
> > [2] https://opencollective.com/tabulapdf
> >
> > --
> > Manuel Aristarán
> > http://jazzido.com
> >
