> Today we had Google + 3 or 4 other spiders hammering our multi-instance
> server at the same time. Is there a way to control these bots to prevent
> them from submitting request after request? How do most high traffic
> servers handle this?
> Thanks!

The other answers you've already received are on point, so I won't
repeat them. In addition, it can be important to ensure that you don't
create a brand-new session for each page request. Many crawlers
disregard cookies, so every request they make looks like a new visitor
and can start yet another session that sits in memory until it times out.
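
One way to limit that session buildup is to give probable crawlers a very
short session lifetime. The following is only a minimal sketch in Python,
not something from this thread: the User-Agent pattern, both timeout values,
and the function names are assumptions for illustration, and real crawler
User-Agent strings vary. In a ColdFusion application the same decision would
typically feed whatever session timeout setting your application uses.

```python
import re

# Hypothetical pattern; tune it against the User-Agent strings in your own logs.
CRAWLER_PATTERN = re.compile(r"bot|crawl|spider|slurp", re.IGNORECASE)

NORMAL_TIMEOUT_SECONDS = 20 * 60  # assumed lifetime for regular visitors
CRAWLER_TIMEOUT_SECONDS = 5       # assumed short lifetime for cookie-less crawlers


def is_probable_crawler(user_agent):
    """Return True when the User-Agent header looks like a crawler."""
    return bool(user_agent) and CRAWLER_PATTERN.search(user_agent) is not None


def session_timeout_for(user_agent):
    """Pick a session lifetime in seconds based on the User-Agent header."""
    if is_probable_crawler(user_agent):
        return CRAWLER_TIMEOUT_SECONDS
    return NORMAL_TIMEOUT_SECONDS
```

With this approach a crawler still gets a session on each request, but each
one expires within seconds instead of lingering for the full timeout.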

Also, high-traffic servers typically handle this sort of thing via
caching, which can be done many different ways, with different levels
of aggressiveness. For example, generating static HTML as Matt
mentioned is a pretty aggressive (and potentially very effective)
caching mechanism. High-volume sites often use third-party CDNs to
offload some of this traffic as well.
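
As a rough illustration of a less aggressive form of caching than static
HTML, here is a minimal Python sketch of an in-memory page cache with a
time-to-live. The class name, the `render` callback, and the TTL value are
all hypothetical; a production setup would more likely rely on the caching
features of your application server, a reverse proxy, or a CDN.

```python
import time


class PageCache:
    """Cache rendered pages in memory for a fixed number of seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the cache can be tested
        self._entries = {}      # url -> (expires_at, html)

    def get_or_render(self, url, render):
        """Return cached HTML for url, calling render(url) only when stale."""
        now = self.clock()
        entry = self._entries.get(url)
        if entry is not None and entry[0] > now:
            return entry[1]     # still fresh: skip the expensive render
        html = render(url)
        self._entries[url] = (now + self.ttl, html)
        return html
```

With a cache like this in front of page generation, several spiders
requesting the same URL within the TTL trigger only one real render.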

Dave Watts, CTO, Fig Leaf Software
http://www.figleaf.com/
http://training.figleaf.com/

Fig Leaf Software is a Veteran-Owned Small Business (VOSB) on
GSA Schedule, and provides the highest caliber vendor-authorized
instruction at our training centers, online, or onsite.

Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:351043