Sebastian Nagel created NUTCH-3006:
--------------------------------------
Summary: Downgrade Tika dependency to 2.2.1 (core and parse-tika)
Key: NUTCH-3006
URL: https://issues.apache.org/jira/browse/NUTCH-3006
Project: Nutch
Issue Type: Bug
Affects Versions: 1.20
Reporter: Sebastian Nagel
Fix For: 1.20
Tika 2.3.0 and upwards depend on commons-io 2.11.0 (or even higher), which is
not available when Nutch is used on Hadoop. Only Hadoop 3.4.0 is expected to
ship with commons-io 2.11.0 (HADOOP-18301); all currently released versions
provide commons-io 2.8.0. Because Hadoop-required dependencies are enforced in
(pseudo)distributed mode, using Tika may cause issues; see NUTCH-2937 and
NUTCH-2959.
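
The version mismatch can be illustrated with a minimal sketch. It is not Nutch or Hadoop code: the class and method names are hypothetical, and only the two version numbers (2.11.0 required, 2.8.0 provided) come from this issue. It compares dotted versions numerically, since a plain string comparison would wrongly rank "2.8.0" above "2.11.0".

```java
import java.util.Arrays;

// Hypothetical helper, for illustration only; not part of Nutch, Tika or Hadoop.
final class CommonsIoVersionCheck {

    /** Compares dotted numeric versions such as "2.8.0" and "2.11.0". */
    static int compareVersions(String a, String b) {
        int[] x = parse(a);
        int[] y = parse(b);
        int n = Math.max(x.length, y.length);
        for (int i = 0; i < n; i++) {
            // Missing trailing components are treated as 0 ("2.8" equals "2.8.0").
            int xi = i < x.length ? x[i] : 0;
            int yi = i < y.length ? y[i] : 0;
            if (xi != yi) {
                return Integer.compare(xi, yi);
            }
        }
        return 0;
    }

    /** Splits "2.11.0" into {2, 11, 0}; assumes purely numeric components. */
    private static int[] parse(String version) {
        return Arrays.stream(version.split("\\."))
                .mapToInt(Integer::parseInt)
                .toArray();
    }

    /** True if the provided version is at least the required one. */
    static boolean satisfies(String provided, String required) {
        return compareVersions(provided, required) >= 0;
    }

    public static void main(String[] args) {
        String required = "2.11.0"; // needed by Tika 2.3.0 and upwards (from this issue)
        String provided = "2.8.0";  // shipped by currently released Hadoop versions (from this issue)
        System.out.println("commons-io " + provided + " satisfies " + required + ": "
                + satisfies(provided, required));
    }
}
```

With the versions named in this issue the check fails, which is the situation the downgrade is meant to avoid.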
[~lewismc] suggested in the discussion of [GitHub PR
#776|https://github.com/apache/nutch/pull/776] to downgrade to Tika 2.2.1 to
resolve these issues for now, until Hadoop 3.4.0 becomes available.