Levy, Alan wrote:
I have a server that gets about 1M hits per day. Over the past week,
this has exploded and the server is using about 80% of the CPU. We
figure that someone is using a web crawler, since when we analyze the
Tomcat logs, there are thousands of hits from one IP address (every day
it's a different IP address).



Is there an open source or commercial product that will stop this?


Hi Alan,
There have already been many very good answers to your problem (mostly
related to robots.txt).

But I would also like to point out that Linux has a very sophisticated
tool for controlling IP traffic, and I'm 99.9% sure that it is already
on your system right now: iproute and tc.

On RHEL it's in the package iproute, and on SLES it's in iproute2.
Sophisticated also means complex, so I am not suggesting it's as easy to
set up as robots.txt, but it is a more general and comprehensive solution
to your problem and perhaps to other, very similar ones.

Both rpm packages supply doc files and examples (run rpm -qil packagename
to list them). There is also a wiki at
http://www.linux-foundation.org/en/Net:Iproute2
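
As a rough illustration of the kind of thing tc can do, here is a minimal
sketch that caps the bandwidth used for replies to a single client address,
using an HTB qdisc and a u32 filter. The interface name (eth0), the address
(203.0.113.45, a documentation-only address), the rates, and the DEV, BAD_IP
and APPLY variables are all assumptions made for illustration, not anything
taken from Alan's setup. It shapes outbound traffic only, so it slows a
crawler down rather than blocking it, and by default the script only prints
the commands it would run.

```shell
#!/bin/sh
# Sketch: throttle responses to one abusive client with tc (HTB + u32 filter).
# Assumed values (hypothetical, not from the original post): interface eth0,
# crawler address 203.0.113.45, and the rates used below.
DEV="${DEV:-eth0}"
BAD_IP="${BAD_IP:-203.0.113.45}"

# Print each command by default; set APPLY=1 (as root) to actually run it.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "$@"
    fi
}

throttle_client() {
    # Root HTB qdisc; unclassified traffic falls into class 1:10.
    run tc qdisc add dev "$DEV" root handle 1: htb default 10
    # Default class: generous rate for normal visitors.
    run tc class add dev "$DEV" parent 1: classid 1:10 htb rate 100mbit
    # Slow class: hard cap for the suspected crawler.
    run tc class add dev "$DEV" parent 1: classid 1:20 htb rate 256kbit ceil 256kbit
    # Send packets destined for the crawler's address to the slow class.
    run tc filter add dev "$DEV" parent 1: protocol ip prio 1 u32 \
        match ip dst "$BAD_IP/32" flowid 1:20
}

throttle_client
```

Since the offending address changes every day, BAD_IP would have to be set
each time (for example BAD_IP=198.51.100.7 APPLY=1 sh throttle.sh, where
throttle.sh is just a placeholder file name), and the old rules cleared
first with tc qdisc del dev eth0 root.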

Mark

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
