Yes. Using a Hadoop cluster and the fetcher settings that control the number of fetcher threads, you can run as many crawlers as you want in parallel.
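As a minimal sketch (the property names are the standard Nutch 1.x ones, but the values here are illustrative, not recommendations), the thread counts go in conf/nutch-site.xml:

  <property>
    <name>fetcher.threads.fetch</name>
    <value>50</value>
    <description>Number of fetcher threads each fetch task runs.</description>
  </property>
  <property>
    <name>fetcher.threads.per.host</name>
    <value>1</value>
    <description>Maximum threads fetching from the same host at once.</description>
  </property>

On Hadoop, each map task of the fetch job runs fetcher.threads.fetch threads, so total parallelism is roughly (number of fetch tasks) x (threads per task).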

Nutch's crawler does obey robots.txt, and it is polite in that all pages from a given domain are fetched on a single machine, which lets it enforce per-host crawl delays.
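The robots.txt handling keys off the agent name you configure; a minimal sketch, where "MyNutchCrawler" is a hypothetical name you would replace with your own:

  <property>
    <name>http.agent.name</name>
    <value>MyNutchCrawler</value>
    <description>Agent name sent in HTTP requests and matched against robots.txt.</description>
  </property>
  <property>
    <name>http.robots.agents</name>
    <value>MyNutchCrawler,*</value>
    <description>Agent strings checked against robots.txt rules, in order of precedence.</description>
  </property>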

Dennis

On 11/12/2010 05:34 AM, mohammad amin golshani wrote:
hi all,
does nutch have the ability to run on multiple machines?
and a last question: does it support robots.txt?

best regards
