So before version 1.0, did Nutch run on a single fetch thread only, or was
it multithreaded but limited to a single process?

Andrzej Bialecki wrote:
> 
> Michael Chan wrote:
>> On Fri, Feb 27, 2009 at 5:14 PM, Andrzej Bialecki <[email protected]> wrote:
>> 
>>> Michael Chan wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to generate multiple segments so that I can run several fetching
>>>> tasks on a *single* machine. This is just to reduce the effort needed to
>>>> refetch after a crash. Is the -numFetchers option still available in 0.9?
>>>> When I use -numFetchers 4, it seems to be ignored and the generator
>>>> generates one partition. Has it been deprecated? If so, is there an
>>>> alternative?
>>>>
>>> The numFetchers option is poorly named - it still works with the current
>>> code but not in the same way as with Nutch 0.7: now it determines the number
>>> of fetching tasks, and this happens ONLY when you run in distributed mode
>>> (on a Hadoop cluster). In local mode it has no effect.
>>>
>>> Currently there is no support for generating multiple segments in one go.
>>> However, if you set generator.update.crawldb to true, you can generate
>>> multiple segments in multiple runs of Generator, and then fetch / update
>>> these segments in arbitrary order.
>> 
>> 
>> Is it recommended to run several fetchers using these segments on a single
>> machine at once? Thanks.
> 
> It's not recommended - if you run everything on a single machine it's
> better to increase the number of threads. If your machine can take the
> load you could try to run multiple fetchers at once, but that consumes
> more resources than one fetcher using more threads. Usually the load
> (in terms of CPU, disk IO and network traffic) is too high for a single
> machine, which is why it's better to set up a cluster.
> 
> 
> -- 
> Best regards,
> Andrzej Bialecki     <><
>   ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
> 
> 
> 
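
For reference, the single-machine advice quoted above comes down to two
settings. What follows is only a minimal sketch of a nutch-site.xml override,
assuming the usual Hadoop-style property layout. generator.update.crawldb is
the property named in the thread; fetcher.threads.fetch is assumed here to be
the property that controls the fetcher thread count, and the value 50 is
purely illustrative, not a recommendation.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Named in the thread: lets successive Generator runs produce
       different segments without an updatedb in between. -->
  <property>
    <name>generator.update.crawldb</name>
    <value>true</value>
  </property>

  <!-- Assumed property name; illustrative value. Raising the thread
       count is the suggested alternative to running several fetchers
       on one machine. -->
  <property>
    <name>fetcher.threads.fetch</name>
    <value>50</value>
  </property>
</configuration>
```

A thread count that suits one machine depends on its CPU, disk and network
capacity, so any value should be checked against the actual load.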

-- 
View this message in context: 
http://www.nabble.com/The-numFetchers-option-tp22246349p22406295.html
Sent from the Nutch - User mailing list archive at Nabble.com.
