[ 
https://issues.apache.org/jira/browse/NUTCH-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17770059#comment-17770059
 ] 

Tim Allison commented on NUTCH-3006:
------------------------------------

An alternative approach would be for Tika to revert its use of 
CloseShieldInputStream.wrap(), which I think was the only conflict?!  Should I 
check with the Tika community about that?

The notion of downgrading Tika to a December 2021 release unsettles me, and I 
have no idea how far out Hadoop 3.4.0 is.

WDYT?

> Downgrade Tika dependency to 2.2.1 (core and parse-tika)
> --------------------------------------------------------
>
>                 Key: NUTCH-3006
>                 URL: https://issues.apache.org/jira/browse/NUTCH-3006
>             Project: Nutch
>          Issue Type: Bug
>    Affects Versions: 1.20
>            Reporter: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.20
>
>
> Tika 2.3.0 and upwards depend on commons-io 2.11.0 (or even higher), which 
> is not available when Nutch is used on Hadoop. Only Hadoop 3.4.0 is expected 
> to ship with commons-io 2.11.0 (HADOOP-18301); all currently released 
> versions provide commons-io 2.8.0. Because Hadoop-required dependencies are 
> enforced in (pseudo)distributed mode, using Tika may cause issues, see 
> NUTCH-2937 and NUTCH-2959.
> [~lewismc] suggested in the discussion of [GitHub PR 
> #776|https://github.com/apache/nutch/pull/776] to downgrade to Tika 2.2.1 to 
> resolve these issues for now and until Hadoop 3.4.0 becomes available.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
