You can generate multiple fetchlists using the -numFetchers option, copy
each segment to a different machine and run the fetcher there, copy the
fetched segments back, and then run updatedb over all the segments.
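A rough sketch of that workflow, assuming Nutch's 0.7-era command-line tools (generate/fetch/updatedb); the db path, segments path, and host names below are hypothetical placeholders, and you'd adapt the copy step to however your machines share files:

```shell
#!/bin/sh
# Hypothetical paths and hosts -- adjust for your setup.
DB=crawl/db
SEGMENTS=crawl/segments
HOSTS="node1 node2 node3"

# 1. Generate one fetchlist (segment) per fetcher machine.
bin/nutch generate $DB $SEGMENTS -numFetchers 3

# 2. Ship each segment to a different machine and fetch it there.
i=1
for host in $HOSTS; do
  seg=$(ls -d $SEGMENTS/* | sed -n "${i}p")
  scp -r "$seg" "$host:$seg"
  ssh "$host" "cd nutch && bin/nutch fetch $seg"
  i=$((i + 1))
done

# 3. Copy the fetched segments back to one machine...
for host in $HOSTS; do
  scp -r "$host:$SEGMENTS/"* "$SEGMENTS/"
done

# 4. ...and fold every segment into the web db.
for seg in $SEGMENTS/*; do
  bin/nutch updatedb $DB "$seg"
done
```

After updatedb, indexing can proceed on that single machine as usual, since all the fetched data is now local.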

Andy

On 7/6/05, Karen Church <[EMAIL PROTECTED]> wrote:
> Hi All,
> 
> I was wondering if someone could point me in the right direction for carrying
> out a distributed crawl.  Basically I want to split a crawl over a few
> machines. Is there a way of just 'fetching' the pages using multiple machines
> and then merging the results onto a single machine? Can I then run the Nutch
> indexing process on that single machine?
> 
> Thanks
> Karen
> 
>


_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general