Matt Zytaruk wrote:
Hi all.
I'm trying to do a full crawl (all the pages in each site) of about 100
sites. Unfortunately I'm getting as many errors as successful fetches,
almost all of them max.delays.exceeded. Is there any way to cut down on
this error? I tried raising the max.delays property in the nutch conf,
and I've also tried using fewer threads (down from 100 to 50), but with
no real improvement. This is with the nutch-0.8-dev version. Any help
would be immensely appreciated.
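For reference, this is the kind of override I mean, in conf/nutch-site.xml
(if memory serves, the full property name in 0.7/0.8 is http.max.delays;
the value of 10 here is just an example, not a recommendation):

    <property>
      <name>http.max.delays</name>
      <value>10</value>
    </property>

As I understand it, each time a fetcher thread finds a host busy it waits
fetcher.server.delay seconds, and after that many waits it gives up on the
page with this error, so raising the value trades crawl speed for fewer
max.delays.exceeded failures.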
I've seen something similar with 0.7.1. Unfortunately, in my case the
trouble seems to be my ISP, and I suspect it's their DNS resolution. I
tried the same crawl on another machine with a different ISP, and it went
through much more smoothly. Like night and day.
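
One quick way to check whether DNS is your bottleneck is to time raw
lookups outside of Nutch. A minimal throwaway sketch (class name, hosts,
and usage are just illustrative; note the JVM caches lookups, so only the
first resolution of each host is meaningful):

    import java.net.InetAddress;
    import java.net.UnknownHostException;

    public class DnsTimer {
      public static void main(String[] args) {
        // Each argument is a hostname; time how long resolution takes.
        for (String host : args) {
          long start = System.currentTimeMillis();
          try {
            InetAddress addr = InetAddress.getByName(host);
            System.out.println(host + " -> " + addr.getHostAddress()
                + " in " + (System.currentTimeMillis() - start) + " ms");
          } catch (UnknownHostException e) {
            System.out.println(host + " FAILED after "
                + (System.currentTimeMillis() - start) + " ms");
          }
        }
      }
    }

Run it against a handful of the sites you're crawling, e.g.

    java DnsTimer www.example.com www.apache.org

and compare the times on both machines. If lookups on the slow ISP take
hundreds of milliseconds or fail outright, that would explain the fetcher
threads piling up behind busy hosts.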
--MDC