Hi all.

I'm trying to do a full crawl (all pages) of about 100 sites. Unfortunately I'm getting about as many errors as successful fetches, almost all of them max.delays.exceeded. Is there any way to cut down on these errors? I've tried raising the max.delays property in the Nutch conf, and I've also tried using fewer threads (went from 100 down to 50), but with no real improvement. This is with the nutch-0.8-dev version. Any help would be immensely appreciated.
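
For reference, here's roughly the override I've been putting in conf/nutch-site.xml. The property names are my best guess for 0.8-dev (I believe the relevant one is http.max.delays from the lib-http plugin, though some builds call it fetcher.max.delays), so please check nutch-default.xml in your build for the exact names and defaults:

<!-- conf/nutch-site.xml overrides; property names assumed from the
     0.8-dev nutch-default.xml, verify against your build -->
<configuration>

<property>
  <name>http.max.delays</name>
  <value>100</value>
  <description>Number of times a fetcher thread will wait on a busy
  host before giving up with a max.delays error. Raised well above
  the small default.</description>
</property>

<property>
  <name>fetcher.server.delay</name>
  <value>5.0</value>
  <description>Seconds between successive requests to the same
  host.</description>
</property>

<property>
  <name>fetcher.threads.fetch</name>
  <value>50</value>
  <description>Total fetcher threads; this is what I lowered from
  100 to 50 in my runs.</description>
</property>

</configuration>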

-Matt Z
