I am using IBM HTTP server and it works.

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Levy, Alan
Sent: Wednesday, January 16, 2008 9:05 AM
To: [email protected]
Subject: Re: web crawling problem
Isn't a robots.txt only for apache (which I do not use)?

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Robert Flynn
Sent: Wednesday, January 16, 2008 8:56 AM
To: [email protected]
Subject: Re: web crawling problem

You need to put a robots.txt file in the server path with the following:

#
# keep all robots out of entire site
User-agent: *
Disallow: /

-----Original Message-----
From: Linux on 390 Port [mailto:[EMAIL PROTECTED] On Behalf Of Levy, Alan
Sent: Wednesday, January 16, 2008 8:24 AM
To: [email protected]
Subject: web crawling problem

I have a server that gets about 1M hits per day. Over the past week, this has exploded and the server is using about 80% of the cpu. We figure that someone is using a webcrawler since when we analyze the tomcat logs, there are thousands of hits from one ip address (every day it's a different ip address).

Is there an open source or commercial product that will stop this?

Tia

----------------------------------------------------------------------
For LINUX-390 subscribe / signoff / archive access instructions,
send email to [EMAIL PROTECTED] with the message: INFO LINUX-390 or visit
http://www.marist.edu/htbin/wlvindex?LINUX-390
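
On the question of whether robots.txt is tied to Apache: it is not. It is a plain text file that cooperating crawlers request from the site root (/robots.txt), so any web server that can serve a static file at that path can use it, which is consistent with the report in this thread that it works with IBM HTTP server. It is advisory only, so a crawler that ignores it will not be stopped by it. The following is a minimal sketch, using Python's standard urllib.robotparser module, of how one might check that the two rules quoted in the thread disallow every path for every user agent. The bot name "ExampleBot", the host www.example.com, and the helper name is_allowed are placeholders made up for illustration and do not come from the thread.

```python
from urllib.robotparser import RobotFileParser

# Rules quoted in the thread: keep all robots out of the entire site.
ROBOTS_TXT = """\
#
# keep all robots out of entire site
User-agent: *
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt permits user_agent to fetch url.

    Hypothetical helper for illustration; it parses the rules from a
    string instead of fetching /robots.txt over the network.
    """
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # "ExampleBot" and www.example.com are placeholders, not real values.
    for path in ("/", "/index.html", "/app/search?q=x"):
        url = "http://www.example.com" + path
        print(path, "allowed" if is_allowed("ExampleBot", url) else "disallowed")
```

Because the check only tells you what a well-behaved crawler would do, it is a way to validate the file's contents, not a way to block the traffic described in the original post.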
