Hi - do not use more than one fetcher thread per host, and do not fetch 
records from the same host more often than once every 2 seconds; 5 seconds or 
more is better. Also, do not select more than 500 records per host in each 
generate cycle. These guidelines keep you safe almost all of the time. Faster 
crawling is possible, though.
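
As a rough sketch, assuming a Nutch 1.x release whose property names match 
those in nutch-default.xml, the limits above could be set in 
conf/nutch-site.xml roughly as follows. The values are only the ones suggested 
here, and property names have changed between releases, so check the 
nutch-default.xml shipped with your version before relying on them.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Assumed property name: at most one fetcher thread per host queue -->
  <property>
    <name>fetcher.threads.per.queue</name>
    <value>1</value>
  </property>
  <!-- Assumed property name: seconds to wait between requests to the same server -->
  <property>
    <name>fetcher.server.delay</name>
    <value>5.0</value>
  </property>
  <!-- Assumed property names: count per host, and cap the records
       selected per host in each generate cycle -->
  <property>
    <name>generate.count.mode</name>
    <value>host</value>
  </property>
  <property>
    <name>generate.max.count</name>
    <value>500</value>
  </property>
</configuration>
```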

M.

-----Original message-----
From: BlackIce<[email protected]>
Sent: Wednesday 18th November 2015 20:51
To: [email protected]
Subject: Complaint from a crawled website!

Hi Group,

I just received a complaint from my ISP stating that my "server" was attacking 
someone's firewall. My guess is that I had Nutch crawling too aggressively. My 
question is: what are the best practices for avoiding such problems?

