Thank you. Checking out your repo now.
On Thu, May 11, 2023 at 10:19 AM Julien Nioche <[email protected]> wrote:
>
> Thanks Tim,
>
> I am testing 2.8.0 with StormCrawler
>
> Apart from a lot of warning about missing classes like
> Caused by: java.lang.ClassNotFoundException:
> org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream
> I am also getting a failed test when trying to extract text from an embedded
> document.
>
> I can't see anything related in the release notes apart maybe from
>
> * Improve extraction of embedded file names in .docx (TIKA-3968).
>
> I've created a branch for it in SC ->
> https://github.com/DigitalPebble/storm-crawler/tree/tika2.8
> in case anyone has the time and inclination to try to reproduce the issue.
>
> I'll see if I can find the source of the problem
>
> Julien
>
>
> On Tue, 9 May 2023 at 17:40, Tim Allison <[email protected]> wrote:
>>
>> A candidate for the Tika 2.8.0 release is available at:
>> https://dist.apache.org/repos/dist/dev/tika/2.8.0
>>
>> The release candidate is a zip archive of the sources in:
>> https://github.com/apache/tika/tree/2.8.0-rc1/
>>
>> The SHA-512 checksum of the archive is
>> 6b514a45b87013c566e57af2b6a526bce0b3bf02a1dabefe998068aa49672ec4a7ec2ecfa538a84aca719607f339a44341caeaab1ca313fc1c161154ec095bbb.
>>
>> In addition, a staged maven repository is available here:
>> https://repository.apache.org/content/repositories/orgapachetika-1093/org/apache/tika
>>
>> Please vote on releasing this package as Apache Tika 2.8.0.
>> The vote is open for the next 72 hours and passes if a majority of at
>> least three +1 Tika PMC votes are cast.
>>
>> [ ] +1 Release this package as Apache Tika 2.8.0
>> [ ] -1 Do not release this package because...
>>
>> Here's my +1.
>>
>> Best,
>>
>> Tim
>
>
>
> --
>
> Open Source Solutions for Text Engineering
>
> http://www.digitalpebble.com
> http://digitalpebble.blogspot.com/
> #digitalpebble
