Yes, with Hadoop you can run multiple Nutch jobs in parallel as long as their output directories don't conflict. For example, you can run several updatedb jobs on different crawldbs at the same time, but not two updatedb jobs against the same crawldb writing to the same output directory.
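A sketch of what that looks like from the shell, assuming two fully independent crawls (the crawl/siteA and crawl/siteB paths and the segment timestamps here are hypothetical; adjust them to your own layout):

```shell
# Two updatedb jobs run concurrently because each one reads and writes
# its own crawldb -- no shared output directory, so no conflict.
bin/nutch updatedb crawl/siteA/crawldb crawl/siteA/segments/20101113120000 &
bin/nutch updatedb crawl/siteB/crawldb crawl/siteB/segments/20101113120000 &
wait  # block until both background updatedb jobs have finished
```

Running the same two commands against a single shared crawldb would be the conflicting case described above.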

Dennis

On 11/13/2010 10:20 AM, Jitendra wrote:
Hi,

I wanted to add one more question on this: can Nutch run multiple jobs in
parallel on the same machine?
I have changed Nutch to use a different crawldb and URL directory for
each job.

Thanks

On Fri, Nov 12, 2010 at 5:22 PM, Birger Lie [via Lucene] wrote:
Nutch can be distributed across several machines (as many as you like),
and it does support robots.txt.


- Birger

On Nov 12, 2010, at 12:34 PM, mohammad amin golshani wrote:

Does Nutch have the ability to run on multiple machines?

