Hi All,
 I reran 2.0.23, with our added handling for Flash files, against the
3.0.0-SNAPSHOT results from yesterday's run. The diffs look almost the
same as in the reports I created yesterday, so I think those are accurate:
https://corpora.tika.apache.org/base/reports/pdfbox-2.0.23-richmedia.tgz

There are a handful of files that "lose" attachments going into
3.0.0-SNAPSHOT because I haven't yet added the richmedia handling to our
3.0.0 branch.

     Best,

           Tim

On Thu, Apr 15, 2021 at 7:15 PM Tim Allison <[email protected]> wrote:
>
> Diffs look suspiciously small...I may have to rerun the analyses.
>
> On Thu, Apr 15, 2021 at 7:08 PM Tim Allison <[email protected]> wrote:
> >
> > Latest here: 
> > https://corpora.tika.apache.org/base/reports/pdfbox-3.0.0-20210415_reports.tgz
> >
> > I haven't had a chance to look yet.  Will dig in tomorrow.

