Yes, you'll have to use the user agent and partial matches. Of course,
there are thousands of possible bots and spiders out there, so you'll
have a fun time building the list and then updating it. Since you're
doing your own, I'm guessing you have access to the HTTP log files?
If so, you can run any number of free-to-inexpensive HTTP log file
analysis programs on those logs. Personally, I use WebLog Expert Pro.
It's cheap, fast, and customizable.

Otherwise, here is a website with all the spiders known to date:
http://www.robotstxt.org/db.html
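As a rough sketch of the partial-match approach, here's the idea in Python (the same logic translates directly to a CFML function using cgi.user_agent). The substring list below is just a handful of common crawler tokens for illustration; a real list would be seeded from a database like the one above and updated regularly.

```python
# Hypothetical, minimal bot detector using case-insensitive partial
# matches on the User-Agent string. The token list is illustrative
# only; build yours from a maintained spider database.
BOT_SUBSTRINGS = [
    "googlebot", "slurp", "msnbot", "crawler", "spider", "bot/",
]

def is_bot(user_agent: str) -> bool:
    """Return True if any known bot token appears in the user agent."""
    ua = user_agent.lower()
    return any(token in ua for token in BOT_SUBSTRINGS)

# Example checks:
print(is_bot("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"))  # True
print(is_bot("Mozilla/5.0 (Windows NT 5.1) Firefox/3.0"))  # False
```

Partial matching beats exact matching here because crawlers append version numbers and URLs to their user-agent strings, so an exact-string list goes stale almost immediately.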


Wil Genovese

One man with courage makes a majority.
-Andrew Jackson

A fine is a tax for doing wrong. A tax is a fine for doing well.

On Jan 15, 2009, at 5:39 PM, Jim McAtee wrote:

> We keep our own page view stats in a database and want to avoid counting
> page views by visiting spyders.  What's a good method for recognizing
> spyders without throwing away valid visitor page views?
>
> Something using cgi.user_agent, no doubt, but how can we keep a fairly
> comprehensive list up to date, and do we try to do exact string matches,
> partial matches, or what?


Archive: 
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:318037