Hi Ric,

The timing of the requests seems reasonable: more than 5 seconds between each GET. Is that too much for your machines? Does the bot respect robots.txt? My guess is that the links are being parsed badly, most likely because they are malformed. Could you identify where the links come from?
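If it helps answer the robots.txt question, here is a minimal sketch of how the logged requests could be checked against a robots.txt file, using Python's standard `urllib.robotparser`. The robots.txt content, the user-agent string "Nutch", and the request paths are all made-up placeholders, not taken from your logs, so please substitute your own.

```python
from urllib.robotparser import RobotFileParser


def _parser_for(robots_txt):
    """Build a parser from robots.txt text without any network access."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def disallowed_paths(robots_txt, user_agent, paths):
    """Return the requested paths that robots_txt forbids for user_agent."""
    parser = _parser_for(robots_txt)
    return [p for p in paths if not parser.can_fetch(user_agent, p)]


def declared_crawl_delay(robots_txt, user_agent):
    """Return the Crawl-delay (seconds) that applies to user_agent, or None."""
    return _parser_for(robots_txt).crawl_delay(user_agent)


# Hypothetical robots.txt and request paths, for illustration only.
SAMPLE_ROBOTS = """\
User-agent: *
Crawl-delay: 5
Disallow: /private/
"""
SAMPLE_PATHS = ["/index.aspx", "/private/report.aspx"]

if __name__ == "__main__":
    print(disallowed_paths(SAMPLE_ROBOTS, "Nutch", SAMPLE_PATHS))
    print(declared_crawl_delay(SAMPLE_ROBOTS, "Nutch"))
```

Any path this reports as disallowed but which still shows up in the server log would suggest the bot is ignoring robots.txt. Comparing the declared delay with the gaps between log timestamps would cover the timing question.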
Cheers,

2010/12/13 Doğacan Güney <[email protected]>

> Hi,
>
> Thank you for the email. Can you provide some more information? For example,
> how many requests does the bot make per second, does it respect robots.txt,
> etc?
>
> On Mon, Dec 13, 2010 at 11:28, Chrislip, Ric <[email protected]> wrote:
> > For several days now a Nutch robot from IP 174.36.195.29 has been hitting
> > our run-time Web servers. I noticed because our event logs are showing many
> > ASP.NET warnings about "illegal characters in path".
> >
> > Your Web page at http://nutch.apache.org/bot.htm says that you would "like
> > to hear about any bad behavior."
> >
> > I have attached today's log entries from that IP address on one of our
> > servers.
> >
> > Ric Chrislip
> > Senior Programmer/Analyst, E-mail Administrator
> > Clark Hall 111
> > Hartwick College
> > Oneonta, New York, USA
> > 607-431-4189
>
> --
> Doğacan Güney

