I fixed the hwp5 multithreading problem.

I looked into the tar files. In the handful I reviewed, there was a "skip
the rest of the final block with x bytes" step, but fewer than x bytes
actually remained.  This didn't harm extraction because it only happened on
the last block.  Folks will get more exceptions, but will get the same
content.  I think this is ok on balance given the improved safety we're
getting with the skip->skipFully change in TikaInputStream.
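
To make the trade-off concrete, here is a minimal sketch using only the JDK.
The skipFully helper below is hypothetical and is not the actual
TikaInputStream code; it assumes "skipFully" means "skip exactly n bytes or
throw". The 100-byte array stands in for a short final tar block.

```java
import java.io.ByteArrayInputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;

class SkipFullySketch {

    // Hypothetical helper (an assumption, not Tika's implementation):
    // skip exactly n bytes, or throw if the stream ends first.
    static void skipFully(InputStream in, long n) throws IOException {
        long remaining = n;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                // skip() made no progress and read() confirms end of stream
                throw new EOFException("wanted to skip " + n
                        + " bytes, but the stream ended with "
                        + remaining + " left");
            } else {
                // read() consumed one byte, so count it as skipped
                remaining--;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] shortBlock = new byte[100];

        // Plain skip() quietly skips what is there and reports the count.
        long skipped = new ByteArrayInputStream(shortBlock).skip(512);
        System.out.println("skip(512) skipped " + skipped + " bytes");

        // The strict version turns the same shortfall into an exception.
        try {
            skipFully(new ByteArrayInputStream(shortBlock), 512);
        } catch (EOFException e) {
            System.out.println("skipFully(512) threw: " + e.getMessage());
        }
    }
}
```

With a lenient skip the shortfall goes unnoticed; with the strict version the
same file raises an exception even though all of the content before the
final block has already been read.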

We do have more exceptions in mp4, but I think that is mostly on truncated
files.

In short, I _think_ we're ready to go for 1.24.1.  Please take a look at
the reports and let me know what you think.

Best,

         Tim

On Tue, Apr 14, 2020 at 10:36 AM Tim Allison <[email protected]> wrote:

> All,
>   We've made some important bug fixes since 1.24.  I recently ran the
> regression tests locally.  The reports are here:
>
>
> https://github.com/tballison/share/blob/master/tika_comparisons/tika_1_24_1_reports.tgz
>
>   We're getting more exceptions with .tar on "read the rest of the
> block".  I'll look into this; my initial impression is that these files are
> not truncated.
>
>   We're also getting more exceptions on mp4 with 0-length records, which,
> I think, is a side effect of truncation.
>
>   Let me know what else you see.
>
>        Cheers,
>
>                   Tim
>
