To add to Sebastian's reply: Nutch also runs very well on Hadoop 3.3.x. In
fact, I have never had a Hadoop version that could not run Nutch out of the
box and without issues.

On Mon, 13 Jun 2022 at 11:54, Sebastian Nagel
<wastl.na...@googlemail.com.invalid> wrote:

> Hi Michael,
>
> Nutch (1.18, and trunk/master) should work together with more recent Hadoop
> versions.
>
> At Common Crawl we use a modified Nutch version based on the recent trunk
> running on Hadoop 3.2.2 (soon 3.2.3) and Java 11, even on a mixed Hadoop
> cluster with x64 and arm64 AWS EC2 instances.
>
> But I'm sure there are more possible combinations.
>
> One important note: in trunk/master there is a yet unsolved regression
> caused by the newly introduced plugin-based URL stream handlers, see
> NUTCH-2936 and NUTCH-2949. Unless these are resolved, you need to undo
> these commits in order to run Nutch (built from trunk/master) in
> distributed mode.
>
> Best,
> Sebastian
>
> On 6/13/22 01:37, Michael Coffey wrote:
> > Do current 1.x versions of Nutch (1.18, and trunk/master) work with
> > versions of Hadoop greater than 3.1.3? I ask because Hadoop 3.1.3 is from
> > October 2019, and there are many newer versions available. For example,
> > 3.1.4 came out in 2020, and there are 3.2.x and 3.3.x versions that came
> > out this year.
> >
> > I don’t care about newer features in Hadoop, I just have general
> > concerns about stability and security. I am working on reviving an old
> > project and would like to put together the best possible infrastructure
> > for the future.
> >
> >
>
