By any chance, are you crawling many pages stored on a single server or a small number of servers? If so, take a look at:
http://www.mail-archive.com/nutch-developers%40lists.sourceforge.net/msg04414.html
http://www.mail-archive.com/nutch-developers%40lists.sourceforge.net/msg04427.html

On 7/27/05, Christophe Noel <[EMAIL PROTECTED]> wrote:
> Hello,
>
> When I'm fetching, I really have too many Http Timeout with default
> nutch parameters.
>
> Does anyone have tips to improve that point ?
>
> Thanks very much.
>
> Christophe Noël.
> www.cetic.be
>
> =====
>
> org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
>     at org.apache.nutch.protocol.httpclient.Http.blockAddr(Http.java:133)
>     at org.apache.nutch.protocol.httpclient.Http.getProtocolOutput(Http.java:201)
>     at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:135)
> org.apache.nutch.protocol.RetryLater: Exceeded http.max.delays: retry later.
>     at org.apache.nutch.protocol.httpclient.Http.blockAddr(Http.java:133)
>     at org.apache.nutch.protocol.httpclient.Http.getProtocolOutput(Http.java:201)
>     at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:135)

_______________________________________________
Nutch-developers mailing list
https://lists.sourceforge.net/lists/listinfo/nutch-developers
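
To illustrate why crawling a single server can produce the "Exceeded http.max.delays" error quoted above, here is a rough sketch in Java. It is a simplified model and not Nutch's actual code: the class HostDelayModel and the method countRetryLater are hypothetical names, and the model assumes that all fetcher threads want the same host at the same moment, that the host serves one thread at a time, and that each waiting thread re-checks once per server delay and gives up after http.max.delays waits.

```java
public class HostDelayModel {

    /**
     * Hypothetical, simplified model (an assumption, not Nutch source):
     * counts how many threads would give up with "retry later" when they
     * all queue for one host and each may wait at most maxDelays times.
     */
    public static int countRetryLater(int threadsOnSameHost, int maxDelays) {
        int gaveUp = 0;
        for (int position = 0; position < threadsOnSameHost; position++) {
            // A thread at queue position p must wait p times before
            // the host becomes free for it.
            int delaysNeeded = position;
            if (delaysNeeded > maxDelays) {
                gaveUp++;
            }
        }
        return gaveUp;
    }

    public static void main(String[] args) {
        // 10 threads on one host, at most 3 waits each
        System.out.println(countRetryLater(10, 3));   // prints 6
        // Same threads with a much larger wait budget
        System.out.println(countRetryLater(10, 100)); // prints 0
    }
}
```

Under these assumptions, the number of failures depends on how many threads pile onto one host relative to the allowed number of waits, which is why a crawl concentrated on one or a few servers tends to hit this error. Real fetches also take time, so the actual behaviour is likely worse than this model suggests.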
