On 12/15/99 at 12:32PM Boris Goldowsky wrote:

>I've been asked to report page-view statistics for our web site
>which eliminate page views from search-engine spiders and other robots.
>
>I've tried to do some of this by coming up with a list of User-Agent
>strings that look like spiders, but it seems like a hit-or-miss sort
>of approach.

There's nothing to distinguish an HTTP request made by a person from 
one made by a robot, except the IP address it comes from and the 
User-Agent string. And, almost by definition, you can't have a 
definitive list of these. (For example, most search-engine software 
allows the agent string to be customized.)

But all well-behaved spiders request /robots.txt before anything else 
on your server, so if you run a report on just that file, you should 
get a quick rundown of the User-Agents or IP addresses to exclude. 
(Obviously, a person can request that file manually, but that doesn't 
happen very often.)

>Bng

Aengus
