In the new_exceptions_in_B_details.xlsx file, commoncrawl3/43/43R5U3BXJUDJXDZ25OAE33ZU47362WLV is listed as "bmp", but it is a zip file, and I get no exception when trying to extract all attachments with the -z option.

Same for commoncrawl3/M4/M4J5KAPEC5F62UXFNCPRQATQWH3FSWPG
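
One quick way to double-check a "bmp" vs. zip mismatch like this, independent of Tika's own detection, is to look at the first bytes of the file. The following is only a minimal sketch: the helper names sniff_header and sniff_file are hypothetical, and it assumes the corpus file has been downloaded locally. It relies only on the well-known signatures ("PK" records for zip, "BM" for BMP).

```python
# Hypothetical helpers, not part of Tika: classify a file by its leading bytes.

# Zip local-file header, empty-archive end record, and spanned-archive marker.
ZIP_MAGICS = (b"PK\x03\x04", b"PK\x05\x06", b"PK\x07\x08")
# Windows bitmap files start with the ASCII letters "BM".
BMP_MAGIC = b"BM"


def sniff_header(header: bytes) -> str:
    """Return 'zip', 'bmp', or 'unknown' based on the first bytes only."""
    if header.startswith(ZIP_MAGICS):
        return "zip"
    if header.startswith(BMP_MAGIC):
        return "bmp"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first four bytes of a local file and classify them."""
    with open(path, "rb") as f:
        return sniff_header(f.read(4))
```

Calling sniff_file on a local copy of either file above would show whether its header is a zip or a BMP signature. A header check says nothing about whether the rest of the file is intact, so it complements the -z extraction test rather than replacing it.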

Tilman

On 19.07.2023 19:19, Tim Allison wrote:
Results are here:
https://corpora.tika.apache.org/base/reports/tika-2.8.1-pre-rc1.tgz

This is on a new set of ~1.3 million files, including fewer truncated PDFs.

I've only had a chance to look quickly. No showstoppers leapt out at me.
There are some expected differences, and a couple of surprises.  I'm going
to dig a bit tomorrow and then start the release process unless anyone
finds anything concerning or has a blocker.

Thank you, all!

Best,

      Tim

On Thu, Jul 13, 2023 at 7:00 PM Tim Allison <[email protected]> wrote:

All,
    I think we’re at a good place for a minor version release? Should I
start the regression tests tomorrow for a potential release next week or the
week after?
    Any blockers or things we should try to get in?

    Thank you!

     Best,

        Tim

On Thu, Jul 13, 2023 at 5:20 PM Nicolò Mendola (Jira) <[email protected]>
wrote:

     [ https://issues.apache.org/jira/browse/TIKA-4064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17742957#comment-17742957 ]

Nicolò Mendola edited comment on TIKA-4064 at 7/13/23 9:19 PM:
---------------------------------------------------------------

Just out of interest, is there an ETA for release 2.8.1 to be published?

Best regards


was (Author: JIRAUSER296595):
Just out of interest, is there an eta for release 2.8.1 to be published?



Best regards

Update to 2.8.1
---------------

                 Key: TIKA-4064
                 URL: https://issues.apache.org/jira/browse/TIKA-4064
             Project: Tika
          Issue Type: Task
          Components: build
    Affects Versions: 2.8.0
            Reporter: Tilman Hausherr
            Priority: Minor
             Fix For: 2.8.1


The latest maven versions plugin finds much more outdated stuff than
the previous one, so I'll do a few updates.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

