maciejpuzianowski opened a new pull request, #849:
URL: https://github.com/apache/nutch/pull/849
When running Apache Nutch 1.20 on a distributed Hadoop cluster with the
language-identifier plugin enabled, a class loader conflict occurs during the
parse process. This results in the following error:
2025-02-24 08:58:59,152 INFO mapreduce.Job: Task Id :
attempt_1740061418437_0135_m_000001_0, Status : FAILED
Error: loader constraint violation: when resolving method
'org.slf4j.ILoggerFactory org.slf4j.impl.StaticLoggerBinder.getLoggerFactory()'
the class loader org.apache.nutch.plugin.PluginClassLoader @4c5228e7 of the
current class, org/slf4j/LoggerFactory, and the class loader 'app' for the
method's defining class, org/slf4j/impl/StaticLoggerBinder, have different
Class objects for the type org/slf4j/ILoggerFactory used in the signature
(org.slf4j.LoggerFactory is in unnamed module of loader
org.apache.nutch.plugin.PluginClassLoader @4c5228e7, parent loader 'app';
org.slf4j.impl.StaticLoggerBinder is in unnamed module of loader 'app')
I have managed to resolve this issue by modifying the following files:
ivy.xml ->
```
<dependency org="org.apache.tika" name="tika-langdetect-optimaize" rev="2.9.0" conf="*->default">
  <!-- exclusions of dependencies provided in Nutch core (ivy/ivy.xml) -->
  <exclude org="org.apache.tika" name="tika-core" />
  <exclude org="com.google.guava" name="guava" />
  <exclude org="org.slf4j" name="slf4j-api" />
</dependency>
```
and plugin.xml ->
```
<library name="annotations-12.0.jar"/>
<library name="checker-qual-3.33.0.jar"/>
<library name="error_prone_annotations-2.18.0.jar"/>
<library name="failureaccess-1.0.1.jar"/>
<library name="j2objc-annotations-2.8.jar"/>
<library name="jsonic-1.2.11.jar"/>
<library name="jsr305-3.0.2.jar"/>
<library name="language-detector-0.6.jar"/>
<library name="listenablefuture-9999.0-empty-to-avoid-conflict-with-guava.jar"/>
<library name="tika-langdetect-optimaize-2.9.0.jar"/>
```
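For anyone who wants to confirm where the SLF4J classes come from before and after the change, a small diagnostic along the following lines may help. It is only a sketch: `LoaderCheck`, `describe`, and `sameLoader` are hypothetical names that are not part of Nutch, and the output is only meaningful if the code runs from inside the plugin (under `PluginClassLoader`) rather than as a standalone program. The default class names are taken from the error message above.

```java
import java.security.CodeSource;

/** Hypothetical diagnostic helper, not part of Nutch. */
public class LoaderCheck {

    /** Describes which class loader defined the class and where it was loaded from. */
    public static String describe(Class<?> type) {
        ClassLoader loader = type.getClassLoader();
        // A null loader means the JVM bootstrap loader defined the class.
        String loaderName = (loader == null) ? "bootstrap" : loader.toString();
        CodeSource source = type.getProtectionDomain().getCodeSource();
        String location = (source == null || source.getLocation() == null)
                ? "unknown"
                : source.getLocation().toString();
        return type.getName() + " <- " + loaderName + " (" + location + ")";
    }

    /** Returns true when both classes were defined by the same class loader. */
    public static boolean sameLoader(Class<?> a, Class<?> b) {
        return a.getClassLoader() == b.getClassLoader();
    }

    public static void main(String[] args) {
        // Class names from the error message; pass others as arguments if needed.
        String[] names = (args.length > 0) ? args : new String[] {
            "org.slf4j.LoggerFactory",
            "org.slf4j.ILoggerFactory",
            "org.slf4j.impl.StaticLoggerBinder"
        };
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        for (String name : names) {
            try {
                // Load without initializing so no static initializers run.
                System.out.println(describe(Class.forName(name, false, context)));
            } catch (ClassNotFoundException | LinkageError e) {
                System.out.println(name + " could not be loaded: " + e);
            }
        }
    }
}
```

If `org.slf4j.LoggerFactory` and `org.slf4j.ILoggerFactory` report different loaders, a second copy of `slf4j-api` is still visible to the plugin.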