Thanks, I'll give that a shot!
Jesse

int GetRandomNumber()
{
   return 4; // Chosen by fair roll of dice
                // Guaranteed to be random
} // xkcd.com



On Thu, Oct 29, 2009 at 5:53 AM, Andrzej Bialecki <a...@getopt.org> wrote:

> Jesse Hires wrote:
>
>> I have a setup with two datanodes and one namenode. One of my datanodes is
>> slower than the other, causing the fetch to run significantly longer on it.
>> Is there a way to balance this out?
>>
>
> Most likely the number of URLs/host is unbalanced, meaning that the
> tasktracker that takes the longest is assigned a lot of URLs from a single
> host.
>
> A workaround for this is to limit the max number of URLs per host (in
> nutch-site.xml) to a more reasonable number, e.g. 100 or 1000, whatever
> works best for you.
>
> --
> Best regards,
> Andrzej Bialecki     <><
>  ___. ___ ___ ___ _ _   __________________________________
> [__ || __|__/|__||\/|  Information Retrieval, Semantic Web
> ___|||__||  \|  ||  |  Embedded Unix, System Integration
> http://www.sigram.com  Contact: info at sigram dot com
>
>
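
For reference, here is a minimal sketch of the per-host limit suggested in the
quoted reply, as it might appear in nutch-site.xml. The property name
generate.max.per.host is an assumption based on Nutch 1.0-era defaults (later
releases use generate.max.count together with generate.count.mode), and the
value 100 is simply one of the example numbers from the reply. Check
nutch-default.xml in your own version for the exact name before relying on it.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumed property name (Nutch 1.0 era); verify in nutch-default.xml. -->
  <property>
    <name>generate.max.per.host</name>
    <!-- Example value only; 100 or 1000 were the numbers suggested. -->
    <value>100</value>
    <description>Maximum number of URLs per host in a single fetchlist,
    so that one large host cannot dominate a single fetch task.</description>
  </property>
</configuration>
```

The limit is applied when a segment is generated, so it only affects
fetchlists created after the change.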
