Hi All,

I was wondering if someone could point me in the right direction for carrying 
out a distributed crawl.  Basically I want to split a crawl over a few machines. 
Is there a way of just 'fetching' the pages using multiple machines and then 
merging the results onto a single machine? Can I then run the Nutch indexing 
process on that single machine?

Thanks
Karen
