Hello Mike,

> would it pay off for me to put a hadoop cluster on top of the 3 servers.

Yes, for all the reasons Hadoop exists. It can be tedious to set up the first time, and there are many components. But at least you have three servers, which ZooKeeper more or less requires, and you will need ZooKeeper as well. Ideally you would have some additional VMs to run the controlling Hadoop programs and perhaps the Hadoop client nodes on. The workers can run on bare metal.

> 1.) a server would not be integrated directly into the crawl process as a master.

What do you mean? Can you elaborate?

> 2.) can I run multiple crawl jobs on one server?

Yes! Just keep separate Nutch home directories on your Hadoop client nodes, each with its own configuration.

Regards,
Markus

On Sat, 14 Jan 2023 at 18:42, Mike <mz579...@gmail.com> wrote:

> Hi!
>
> I am now crawling the internet in local mode in parallel with up to 10
> instances on 3 computers. would it pay off for me to put a hadoop cluster
> on top of the 3 servers.
>
> 1.) a server would not be integrated directly into the crawl process as a
> master.
> 2.) can I run multiple crawl jobs on one server?
>
> Thanks
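
To illustrate the answer to question 2, here is a minimal sketch of the "one Nutch home directory per crawl job" layout. It only builds the paths and does not start anything. The base directory /opt/nutch-jobs, the job names, and the helper plan_crawl_jobs are hypothetical and chosen for illustration. The one thing assumed from Nutch itself is that each home directory carries its own copy of the bin/crawl script.

```python
from pathlib import Path


def plan_crawl_jobs(base_dir, job_names):
    """Map each crawl job to its own Nutch home directory.

    Hypothetical layout: <base_dir>/<job_name>/ holds one full copy of the
    Nutch runtime, so every job uses only its own configuration.
    """
    base = Path(base_dir)
    plan = {}
    for name in job_names:
        home = base / name
        plan[name] = {
            "home": home,
            # Assumption: each home ships its own bin/crawl script, so
            # launching that copy keeps the job tied to this home.
            "crawl_script": home / "bin" / "crawl",
        }
    return plan


if __name__ == "__main__":
    jobs = plan_crawl_jobs("/opt/nutch-jobs", ["news", "blogs"])
    for name, job in jobs.items():
        print(name, job["crawl_script"].as_posix())
```

Because each job gets its own directory tree, changing the configuration of one job cannot affect another job running on the same client node.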