[ 
https://issues.apache.org/jira/browse/NUTCH-2727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16900962#comment-16900962
 ] 

Sebastian Nagel commented on NUTCH-2727:
----------------------------------------

Hi [~markus17], yes, I would too, provided we can guarantee a certain level of 
backward compatibility. An upgrade to Hadoop 3.x may force API changes or 
dependency upgrades that make it impossible to run the Nutch job file on a 
Hadoop 2.x cluster. The Hadoop version is often not easy to change, because 
the cluster is shared with legacy applications and/or the cluster deployment is 
fixed and bound to a Hadoop distribution (Cloudera, Hortonworks, MapR, EMR, 
Azure, etc.). I want to avoid users having to downgrade to get Nutch to run on 
their cluster. I can confirm that the opposite (running Nutch built with Hadoop 
2.7.4 on Hadoop 3.x) works out of the box. Do you build Nutch with Hadoop 3.2.0, 
or do you just run the Nutch job file on a Hadoop 3.2.0 cluster?

> Upgrade Hadoop dependencies to 2.9.2
> ------------------------------------
>
>                 Key: NUTCH-2727
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2727
>             Project: Nutch
>          Issue Type: Improvement
>    Affects Versions: 1.15
>            Reporter: Sebastian Nagel
>            Priority: Major
>             Fix For: 1.16
>
>
> The latest upgrade of the Hadoop dependency dates back to Dec 2017 
> (NUTCH-2354). We might upgrade to the latest version of Hadoop 2.x (2.9.2).
> Note: Nutch 1.15 (or master) built with Hadoop 2.7.4 runs seamlessly on 
> Hadoop 3.x. This should also be the case for 2.9.2 (to be tested), so we 
> could still hold off on the final upgrade to Hadoop 3.x to ensure 
> backward compatibility.



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
