I wouldn't. :D
On Thu, Jun 16, 2022 at 12:16 PM Tilman Hausherr
wrote:
> On 15.06.2022 at 12:19, Tim Allison wrote:
> > Reports are here:
> > https://corpora.tika.apache.org/base/reports/pdfbox-3-20220614.tgz
>
> govdocs1/372/372582.pdf
> commoncrawl3/KH/KHDACXIPFMWP632LZ3S4TRRSZPDGHGM5
>
On 15.06.2022 at 12:19, Tim Allison wrote:
Reports are here:
https://corpora.tika.apache.org/base/reports/pdfbox-3-20220614.tgz
govdocs1/372/372582.pdf
commoncrawl3/KH/KHDACXIPFMWP632LZ3S4TRRSZPDGHGM5
commoncrawl3/VN/VNCWMY6Y4C3XYWA65CQPPSNZSY6OQEEA
have lost text. But the first one is a
On 15.06.22 at 13:07, Tim Allison wrote:
In "parse_time_millis_details.xlsx", there are some files that took much longer
in 3.x during the multithreaded run but do not show much of a difference in the
single-threaded run; these are likely accidents of resources available at parse time.
Overall, the sum of processing
On 15.06.22 at 12:19, Tim Allison wrote:
Reports are here:
https://corpora.tika.apache.org/base/reports/pdfbox-3-20220614.tgz
@Tim thanks again
Looks like there aren't any new exceptions in 3.0.0 at all, ergo we are good to
target a new release :-)
Andreas
In "parse_time_millis_details.xlsx", there are some files that took much longer
in 3.x during the multithreaded run but do not show much of a difference in the
single-threaded run; these are likely accidents of resources available at parse time.
Overall, the sum of processing times across all files is very similar.
I had a chance to look at new_catastrophic_exceptions_in_b, and the three
files in there take roughly the same amount of time and resources. I think
they failed on trunk only because of the whims of multithreading and
available resources at the time.
This file is admittedly quite large, but it
Reports are here:
https://corpora.tika.apache.org/base/reports/pdfbox-3-20220614.tgz
On Mon, Jun 13, 2022 at 4:54 PM Tim Allison wrote:
> Just seeing this now. Y. I'll kick off the tests tomorrow morning (ET).
>
> On Sat, Jun 11, 2022 at 8:09 AM Andreas Lehmkuehler
> wrote:
>
>> I've fixed
Just seeing this now. Y. I'll kick off the tests tomorrow morning (ET).
On Sat, Jun 11, 2022 at 8:09 AM Andreas Lehmkuehler
wrote:
> I've fixed PDFBOX-5452 and found/fixed another one, see PDFBOX-5456
>
> @Tim is there any chance to rerun the regression tests?
>
> Thanks in advance
> Andreas
I've fixed PDFBOX-5452 and found/fixed another one, see PDFBOX-5456
@Tim is there any chance to rerun the regression tests?
Thanks in advance
Andreas
On 07.06.22 at 08:06, Andreas Lehmkuehler wrote:
I've found another regression, see PDFBOX-5452
Andreas
On 29.05.22 at 18:37, Andreas Lehmkuehler wrote:
Thanks Tim,
looks like there are some regressions, see PDFBOX-5444 and PDFBOX-5447.
Maybe there are more to come
Andreas
On 26.05.22 at 15:04, Tim Allison wrote:
Apologies for
Good to find them now! Let me know when I should rerun and thank you!
Best,
Tim
On Sun, May 29, 2022 at 12:37 PM Andreas Lehmkuehler
wrote:
> Thanks Tim,
>
> looks like there are some regressions, see PDFBOX-5444 and PDFBOX-5447.
>
> Maybe there are more to come
>
> Andreas
>
>
>
Thanks Tim,
looks like there are some regressions, see PDFBOX-5444 and PDFBOX-5447.
Maybe there are more to come
Andreas
On 26.05.22 at 15:04, Tim Allison wrote:
Apologies for my delay. I ran trunk/3.x on May 12 against 2.0.26. The
reports are here:
https://corpora.tika.apache.org/base/reports/reports_pdfbox_3x_20220512.tgz
Happy to rerun with a more recent version of trunk.
Cheers,
Tim
On Sun, May 8, 2022 at 1:21 PM Andreas Lehmkuehler wrote:
> On 06.05.22 at 14:30, Tim Allison wrote:
>> All,
>> Let me know when it makes sense to run the text extraction regression
>> tests for 3.x. I regret I haven't been
> Yes, it'd be useful to have some updated results.
> How about comparing 2.0.26 vs 3.0.0-alpha3 and maybe 3.0.0-alpha2 vs.
> 3.0.0-alpha3?
All,
Let me know when it makes sense to run the text extraction regression
tests for 3.x. I regret I haven't been following our mailing list as
closely as I should be.
Cheers,
Tim