What does your mvn dependency:tree say? :-)

The only thing that needs to be cleaned is the locally installed StormCrawler (SC).
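
Several commons-compress versions sitting side by side in ~/.m2 are usually harmless, since plugins and transitive dependencies each download their own. What counts is the version that gets resolved for the project that builds the uber jar. A minimal sketch for checking that (the helper name compress_versions is made up for illustration, and it assumes the usual group:artifact:type:version:scope line format without a classifier):

```shell
# Hypothetical helper: reads `mvn dependency:tree` output on stdin and prints
# each distinct commons-compress version found in it.
# Assumes lines like "... org.apache.commons:commons-compress:jar:<version>:<scope>".
compress_versions() {
  grep -o 'org\.apache\.commons:commons-compress:jar:[^:]*' \
    | awk -F: '{ print $4 }' \
    | sort -u
}

# Intended use, run from the project that builds the uber jar:
#   mvn dependency:tree -Dincludes=org.apache.commons:commons-compress | compress_versions
```

If this prints a version other than the one you expect, the unfiltered tree output should show which dependency drags it in.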



On 11 September 2025 16:48:53 CEST, Markos Volikas
<[email protected]> wrote:
>Yes..
>
>I'm building from source using: 
>https://dist.apache.org/repos/dist/dev/stormcrawler/stormcrawler-3.5.0-RC2/ 
>(tar.gz)
>
>I completely removed 
>/home/markos/.m2/repository/org/apache/commons/commons-compress and then ran 
>mvn clean install, and it seems that multiple versions are still being pulled in.
>
>Before this I had also removed my .m2/ completely to make sure all 
>dependencies were downloaded again, and they were. I have attached the build log.
>
>markos@nombat:~/.m2/repository/org/apache/commons/commons-compress$ ll
>total 28
>drwxrwxr-x  7 markos markos 4096 Sep 11 17:42 ./
>drwxrwxr-x 12 markos markos 4096 Sep 11 17:42 ../
>drwxrwxr-x  2 markos markos 4096 Sep 11 17:42 1.20/
>drwxrwxr-x  2 markos markos 4096 Sep 11 17:42 1.26.1/
>drwxrwxr-x  2 markos markos 4096 Sep 11 17:42 1.26.2/
>drwxrwxr-x  2 markos markos 4096 Sep 11 17:42 1.27.1/
>drwxrwxr-x  2 markos markos 4096 Sep 11 17:42 1.28.0/
>
>Markos
>
>On 9/11/25 16:55, Richard Zowalla wrote:
>> Did you clean your local Maven repo before building the uber jar?
>> 
>> Can you check your commons-compress version?
>> 
>> Regards
>> Richard
>> 
>> On 11 September 2025 15:38:38 CEST, Markos Volikas
>> <[email protected]> wrote:
>>> Hi all,
>>> 
>>> I'm afraid I'm still getting:
>>> 
>>> 16:25:13.829 [Thread-46-parse-executor[6, 6]] INFO  o.a.s.b.JSoupParserBolt 
>>> - Parsing : starting https://apache.org/
>>> 16:25:13.848 [Thread-46-parse-executor[6, 6]] ERROR o.a.s.b.JSoupParserBolt 
>>> - Exception while guessing mimetype on https://apache.org/: 
>>> org.apache.commons.compress.archivers.ArchiveException: No Archiver found 
>>> for the stream signature
>>> 
>>> I'm running in local mode with Storm 2.8.2 running on Ubuntu 24.04 (openjdk 
>>> 17.0.16 2025-07-15). The database is Solr running in Docker although this 
>>> should be irrelevant. Maybe I'm doing something wrong? I have attached the 
>>> config I'm using in case you have any ideas. Sorry for the delay, but I 
>>> just found time to look into this again :-(
>>> 
>>> Markos
>>> 
>>> On 9/8/25 20:46, Richard Zowalla wrote:
>>>> Hi folks,
>>>> 
>>>> I have posted a 2nd release candidate for the Apache StormCrawler 3.5.0 
>>>> release and it is ready for testing. The regression with Tika / Compress 
>>>> was fixed.
>>>> 
>>>> Apache StormCrawler 3.5.0 decouples Selenium from the core module, 
>>>> improving modularity and reducing unnecessary dependencies.
>>>> The release also introduces an advanced metadata filtering system that 
>>>> supports complex logical operations like key=>val OR (key2=>val2 AND 
>>>> key3=>val3).
>>>> Additionally, multiple dependencies were upgraded, core tests improved, 
>>>> and deprecated code cleaned up, enhancing overall stability and 
>>>> maintainability.
>>>> 
>>>> Thank you to everyone who contributed to this release, including all of 
>>>> our users and the people who submitted bug reports,
>>>> contributed code or documentation enhancements.
>>>> 
>>>> The release was made using the Apache StormCrawler release process, 
>>>> documented here:
>>>> https://github.com/apache/stormcrawler/blob/main/RELEASING.md
>>>> 
>>>> Source:
>>>> 
>>>> https://dist.apache.org/repos/dist/dev/stormcrawler/stormcrawler-3.5.0-RC2
>>>> 
>>>> Tag:
>>>> 
>>>> https://github.com/apache/stormcrawler/releases/tag/stormcrawler-3.5.0
>>>> 
>>>> Commit Hash:
>>>> 
>>>> 1947ad4c56ff5c5c90e093900a163e0ac3144bb6
>>>> 
>>>> Maven Repo:
>>>> 
>>>> https://repository.apache.org/content/repositories/orgapachestormcrawler-1011
>>>> 
>>>> <repositories>
>>>> <repository>
>>>> <id>stormcrawler-3.5.0-rc2</id>
>>>> <name>Testing StormCrawler 3.5.0 release candidate 2</name>
>>>> <url>
>>>> https://repository.apache.org/content/repositories/orgapachestormcrawler-1011
>>>> </url>
>>>> </repository>
>>>> </repositories>
>>>> 
>>>> Release notes:
>>>> 
>>>> https://github.com/apache/stormcrawler/releases/tag/stormcrawler-3.5.0
>>>> 
>>>> Reminder: The up-to-date KEYS file for signature verification can be
>>>> found here: https://downloads.apache.org/stormcrawler/KEYS
>>>> 
>>>> Please vote on releasing these packages as Apache StormCrawler 3.5.0.
>>>> The vote is open for at least the next 72 hours.
>>>> 
>>>> Only votes from the StormCrawler PMC are binding, but everyone is welcome 
>>>> to check the release candidate and vote.
>>>> The vote passes if at least three binding +1 votes are cast.
>>>> 
>>>> Please VOTE
>>>> 
>>>> [+1] go ship it
>>>> [+0] meh, don't care
>>>> [-1] stop, there is a ${showstopper}
>>>> 
>>>> Thanks!
>>>> Richard
