2008/9/18 Edward Quick <[EMAIL PROTECTED]>:
>
> Thanks Doğacan,
>
> I set numFetchers, but the fetch still runs on only one host at a time,
> not on all three hosts at once.
> This is what I ran:
>
> -bash-3.00$ bin/nutch generate crawl/crawldb crawl/segments -numFetchers 3
> Generator: Selecting best-scoring urls due for fetch.
> Generator: starting
> Generator: segment: crawl/segments/20080918173443
> Generator: filtering: true
> Generator: Partitioning selected urls by host, for politeness.
> Generator: done.
> -bash-3.00$ bin/nutch fetch crawl/segments/20080918173443
> Fetcher: starting
> Fetcher: segment: crawl/segments/20080918173443
>

Hmm, how many parts are under crawl/segments/20080918173443/crawl_generate?
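
A rough way to count them, assuming the segment directory is reachable on the
local filesystem. The helper name count_parts is only illustrative and is not
part of Nutch. If -numFetchers 3 took effect, I would expect it to report 3.

```shell
# Hypothetical helper (not a Nutch command): count the part-* entries
# directly under a local directory such as <segment>/crawl_generate.
count_parts() {
  dir="$1"
  n=0
  for p in "$dir"/part-*; do
    # When nothing matches, the glob stays literal, so check existence.
    if [ -e "$p" ]; then
      n=$((n + 1))
    fi
  done
  echo "$n"
}

# Usage, with the segment path from this thread:
#   count_parts crawl/segments/20080918173443/crawl_generate
#
# If the segment lives on HDFS instead, listing the directory with
#   bin/hadoop fs -ls crawl/segments/20080918173443/crawl_generate
# should show the same part-* entries.
```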

>
>
>
>> Date: Thu, 18 Sep 2008 18:34:26 +0300
>> From: [EMAIL PROTECTED]
>> To: [EMAIL PROTECTED]
>> Subject: Re: running fetches in hadoop
>>
>> Hi,
>>
>> On Thu, Sep 18, 2008 at 5:23 PM, Edward Quick <[EMAIL PROTECTED]> wrote:
>> >
>> > I have 3 hosts in a hadoop cluster and noticed that the fetch only runs 
>> > from one host at a time.
>> > Is that right or should the fetch run from all 3 hosts at the same time?
>> >
>>
>> Try running generate like this:
>>
>> bin/nutch generate <other options> -numFetchers 3
>>
>> > Thanks,
>> >
>> > Ed.
>> >
>> > _________________________________________________________________
>> > Discover Bird's Eye View now with Multimap from Live Search
>> > http://clk.atdmt.com/UKM/go/111354026/direct/01/
>>
>>
>>
>> --
>> Doğacan Güney
>



-- 
Doğacan Güney
