Thanks Doğacan,

I set -numFetchers 3, but the fetch still runs on only one host at a time,
not on all three at once.
This is what I ran:

-bash-3.00$ bin/nutch generate crawl/crawldb crawl/segments -numFetchers 3
Generator: Selecting best-scoring urls due for fetch.
Generator: starting
Generator: segment: crawl/segments/20080918173443
Generator: filtering: true
Generator: Partitioning selected urls by host, for politeness.
Generator: done.
-bash-3.00$ bin/nutch fetch crawl/segments/20080918173443
Fetcher: starting
Fetcher: segment: crawl/segments/20080918173443

> Date: Thu, 18 Sep 2008 18:34:26 +0300
> From: [EMAIL PROTECTED]
> To: [email protected]
> Subject: Re: running fetches in hadoop
> 
> Hi,
> 
> On Thu, Sep 18, 2008 at 5:23 PM, Edward Quick <[EMAIL PROTECTED]> wrote:
> >
> > I have 3 hosts in a hadoop cluster and noticed that the fetch only runs 
> > from one host at a time.
> > Is that right or should the fetch run from all 3 hosts at the same time?
> >
> 
> Try running generate like this:
> 
> bin/nutch generate <other options> -numFetchers 3
> 
> > Thanks,
> >
> > Ed.
> >
> 
> 
> 
> -- 
> Doğacan Güney
