sebastian-nagel commented on PR #776: URL: https://github.com/apache/nutch/pull/776#issuecomment-1725795918
> Can we exclude commons-io from hadoop and then add it as a dependency in the main ivy.xml?

When running in distributed or pseudo-distributed mode, commons-io 2.8.0 is first in the classpath, regardless of which commons-io version is contained in the Nutch job jar. Hadoop methods used to write data or for communication may rely on that specific commons-io version, and changing the Hadoop classpath is challenging, since even more may break.

Btw, I've just rediscovered that using Tika in (pseudo-)distributed mode has been broken since the upgrade to Tika 2.3.0, see [NUTCH-2937](https://issues.apache.org/jira/browse/NUTCH-2937), although it doesn't seem to have affected the MIME detector.

> localhost: ssh: connect to host localhost port 22: Connection refused
> Any ideas what I'm doing wrong?

You need to [set up passphraseless ssh](https://hadoop.apache.org/docs/stable/hadoop-project-dist/hadoop-common/SingleCluster.html#Setup_passphraseless_ssh).
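
To see which commons-io actually wins on a given classpath, a small diagnostic along these lines might help. This is only a sketch, not part of the PR: the class name `WhichJar` and its helper methods are made up for illustration, and `org.apache.commons.io.IOUtils` is used merely as a representative commons-io class. It prints the jar (or directory) a class was loaded from, using the standard `ProtectionDomain`/`CodeSource` API.

```java
import java.security.CodeSource;

/** Hypothetical helper: reports where a class on the current classpath was loaded from. */
public class WhichJar {

    /** Returns the code source location of a class, or a note if the JVM does not expose one. */
    static String locationOf(Class<?> cls) {
        CodeSource src = cls.getProtectionDomain().getCodeSource();
        if (src == null || src.getLocation() == null) {
            // Typically the case for classes loaded by the bootstrap class loader.
            return "(bootstrap or unknown)";
        }
        return src.getLocation().toString();
    }

    /** Resolves a class by name and describes its origin and manifest version, if any. */
    static String describe(String className) {
        try {
            Class<?> cls = Class.forName(className);
            Package pkg = cls.getPackage();
            // The implementation version comes from the jar manifest and may be null.
            String version = (pkg == null) ? null : pkg.getImplementationVersion();
            return className + " -> " + locationOf(cls) + " (version: " + version + ")";
        } catch (ClassNotFoundException e) {
            return className + " -> not on classpath";
        }
    }

    public static void main(String[] args) {
        // Default to a commons-io class; any fully qualified class name can be passed instead.
        String name = args.length > 0 ? args[0] : "org.apache.commons.io.IOUtils";
        System.out.println(describe(name));
    }
}
```

Assuming the `hadoop` command is on the PATH, running the compiled class with something like `java -cp "$(hadoop classpath):." WhichJar` should show the commons-io jar that Hadoop itself puts first. Inside a running task the result may differ, so calling `describe` from job code would be the more reliable check.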