The official way to do this is a robots.txt file: http://www.robotstxt.org/
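As a rough sketch of what that could look like, here is a small robots.txt checked with Python's standard urllib.robotparser. The paths, the 10 second delay and the "ExampleBot" name are made up for illustration, not taken from your site. Crawl-delay is a non-standard extension that some crawlers honour and others ignore, so treat it as a polite request rather than a guarantee.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; the paths and the delay are
# illustrative assumptions, not rules from any real site.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Disallow: /admin/
Disallow: /search.cfm
"""


def load_rules(text):
    """Parse robots.txt text into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# A well-behaved crawler would ask these questions before each request.
print(rules.can_fetch("ExampleBot", "http://www.example.com/index.cfm"))
print(rules.can_fetch("ExampleBot", "http://www.example.com/admin/users.cfm"))
print(rules.crawl_delay("ExampleBot"))
```

The file itself goes in the web root of each site, so it is served as /robots.txt.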
Only the major search engines honour it, though; the less popular spiders will ignore it and do as they like. For those, you can do some web content filtering or use a web application firewall to control their activity on your site.

On Mon, May 7, 2012 at 11:42 PM, Richard Steele <[email protected]> wrote:
>
> Today we had Google + 3 or 4 other spiders hammering our multi-instance
> server at the same time. Is there a way to control these bots to prevent
> them from submitting request after request? How do most high traffic
> servers handle this? Thanks!

