Hi - do not use multiple threads per host, and do not fetch records from a host more often than once every 2 seconds; 5 seconds or more is better. Also, do not select more than 500 records per host for each generation cycle. These guidelines keep you safe almost all the time. Faster is possible, though.
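
For reference, the guidelines above could be expressed roughly as the following overrides in conf/nutch-site.xml. This is only a sketch: the property names are the ones I would expect in a Nutch 1.x nutch-default.xml, so check them against the nutch-default.xml shipped with your version, and treat the values as starting points rather than required settings.

```xml
<!-- conf/nutch-site.xml: politeness overrides (illustrative values) -->
<configuration>
  <!-- one fetcher thread per queue, with queues keyed by host -->
  <property>
    <name>fetcher.threads.per.queue</name>
    <value>1</value>
  </property>
  <property>
    <name>fetcher.queue.mode</name>
    <value>byHost</value>
  </property>
  <!-- seconds to wait between successive requests to the same host -->
  <property>
    <name>fetcher.server.delay</name>
    <value>5.0</value>
  </property>
  <!-- cap the number of URLs selected per host in each generate cycle -->
  <property>
    <name>generate.count.mode</name>
    <value>host</value>
  </property>
  <property>
    <name>generate.max.count</name>
    <value>500</value>
  </property>
</configuration>
```

With settings along these lines, a single host should see at most one request every 5 seconds and at most 500 fetches per cycle.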
M.

-----Original message-----
From: BlackIce<[email protected]>
Sent: Wednesday 18th November 2015 20:51
To: [email protected]
Subject: Complaint from a crawled website!

Hi Group,

I just received a complaint from my ISP stating that my "server" was attacking someone's firewall. My guess is that I had Nutch crawling too aggressively. And my question is: what are "best practices" in order to avoid such problems?

