On Fri, Feb 27, 2009 at 5:14 PM, Andrzej Bialecki <[email protected]> wrote:
> Michael Chan wrote:
>
>> Hi,
>>
>> I'm trying to generate multiple segments so that I can run several
>> fetching tasks on a *single* machine. This is just to reduce the effort
>> needed to refetch after a crash. Is the -numFetchers option still
>> available in 0.9? When I use -numFetchers 4, it seems to be ignored and
>> the generator generates one partition. Has it been deprecated? If so, is
>> there an alternative?
>
> The numFetchers option is poorly named - it still works with the current
> code but not in the same way as with Nutch 0.7: now it determines the number
> of fetching tasks, and this happens ONLY when you run in distributed mode
> (on a Hadoop cluster). In local mode it has no effect.
>
> Currently there is no support for generating multiple segments in one go.
> However, if you set generator.update.crawldb to true, you can generate
> multiple segments in multiple runs of Generator, and then fetch / update
> these segments in arbitrary order.

Is it recommended to run several fetchers using these segments on a single
machine at once?

Thanks.

Michael

> --
> Best regards,
> Andrzej Bialecki <><
> ___. ___ ___ ___ _ _ __________________________________
> [__ || __|__/|__||\/| Information Retrieval, Semantic Web
> ___|||__|| \| || | Embedded Unix, System Integration
> http://www.sigram.com Contact: info at sigram dot com
