Hi all. We are using Nutch for open web crawling, one URL at a time, down to depth 5, and we were able to configure a cluster with one master and one Hadoop node. Still, in our case, distributed mode seems to be a lot slower than local mode. Is there any reason not to run Nutch in local mode for crawling in production?
Thanks in advance. Rodrigo