2008/9/18 Edward Quick <[EMAIL PROTECTED]>:
>
> Thanks Doğacan,
>
> I set numFetchers but only see the fetch being done from one host at one
> time, not all at the same time.
> This is what I ran:
>
> -bash-3.00$ bin/nutch generate crawl/crawldb crawl/segments -numFetchers 3
> Generator: Selecting best-scoring urls due for fetch.
> Generator: starting
> Generator: segment: crawl/segments/20080918173443
> Generator: filtering: true
> Generator: Partitioning selected urls by host, for politeness.
> Generator: done.
> -bash-3.00$ bin/nutch fetch crawl/segments/20080918173443
> Fetcher: starting
> Fetcher: segment: crawl/segments/20080918173443
>
Hmm, how many parts are under crawl/segments/20080918173443/crawl_generate?

>
>> Date: Thu, 18 Sep 2008 18:34:26 +0300
>> From: [EMAIL PROTECTED]
>> To: [email protected]
>> Subject: Re: running fetches in hadoop
>>
>> Hi,
>>
>> On Thu, Sep 18, 2008 at 5:23 PM, Edward Quick <[EMAIL PROTECTED]> wrote:
>> >
>> > I have 3 hosts in a hadoop cluster and noticed that the fetch only runs
>> > from one host at a time.
>> > Is that right or should the fetch run from all 3 hosts at the same time?
>> >
>>
>> Try running generate like this:
>>
>> bin/nutch generate <other options> -numFetchers 3
>>
>> > Thanks,
>> >
>> > Ed.
>>
>> --
>> Doğacan Güney

--
Doğacan Güney
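
To check that question from the command line, the listing below is a sketch, not taken from the thread; it assumes the crawl directory lives in HDFS and uses the Hadoop 0.x dfs shell, and the segment name should be adjusted to the actual run:

# Sketch: list the fetch lists that the generate step wrote for this segment.
# Each part-NNNNN file is one fetch list, and (as the -numFetchers discussion
# above suggests) the fetch job works on one fetch list per map task, so a
# single part file means the whole fetch stays on one node.
bin/hadoop dfs -ls crawl/segments/20080918173443/crawl_generate

# Same idea as a quick count of the part files.
bin/hadoop dfs -ls crawl/segments/20080918173443/crawl_generate | grep -c part-

If the count comes back as 1 even though generate was run with -numFetchers 3, the segment would need to be regenerated so that several fetch lists exist before the fetch can be spread across the cluster.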
