Hi All,
I reran 2.0.23, with our added handling for Flash files, against the
3.0.0-SNAPSHOT that I ran yesterday. The diffs look almost the same
as those in yesterday's reports, so I think yesterday's reports are accurate:
https://corpora.tika.apache.org/base/reports/pdfbox-2.0.23-richmedia.tgz
A handful of files "lose" attachments going into 3.0.0-SNAPSHOT
because I haven't yet added the richmedia handling to our
3.0.0 branch.
Best,
Tim
On Thu, Apr 15, 2021 at 7:15 PM Tim Allison <[email protected]> wrote:
>
> Diffs look suspiciously small...I may have to rerun the analyses.
>
> On Thu, Apr 15, 2021 at 7:08 PM Tim Allison <[email protected]> wrote:
> >
> > Latest here:
> > https://corpora.tika.apache.org/base/reports/pdfbox-3.0.0-20210415_reports.tgz
> >
> > I haven't had a chance to look yet. Will dig in tomorrow.