You can emit multiple fetchlists using the -numFetchers option, copy each segment to a different machine to fetch, copy the fetched segments back, and run updatedb on all of them.
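The steps above can be sketched as a command sequence. This assumes the Nutch 0.7-era command layout (generate/fetch/updatedb against a `db` directory and a `segments` directory); the segment timestamp, host name `fetcher1`, and remote path are illustrative assumptions, not from the original post.

```shell
# Sketch only: paths, host names, and the segment timestamp are made up.

# 1. Generate multiple fetchlists, one segment per fetcher machine:
bin/nutch generate db segments -numFetchers 3

# 2. Copy each segment to a different machine and run the fetch there
#    (repeat for each segment/machine pair):
scp -r segments/20050706120000 fetcher1:/nutch/segments/
ssh fetcher1 'cd /nutch && bin/nutch fetch segments/20050706120000'

# 3. Copy the fetched segment back to the main machine:
scp -r fetcher1:/nutch/segments/20050706120000 segments/

# 4. Update the web database with every fetched segment:
for seg in segments/*; do
  bin/nutch updatedb db "$seg"
done
```

After updatedb has run over all the segments, indexing can proceed on the single machine as usual.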
Andy

On 7/6/05, Karen Church <[EMAIL PROTECTED]> wrote:
> Hi All,
>
> I was wondering if someone could point me in the right direction for carrying
> out a distributed crawl. Basically I want to split a crawl over a few
> machines. Is there a way of just 'fetching' the pages using multiple machines
> and then merging the results onto a single machine? Can I then run the Nutch
> indexing process over that single machine?
>
> Thanks
> Karen
