>>We keep our own page view stats in a database and want to avoid counting
>>page views by visiting spiders. What's a good method for recognizing
>>spiders without throwing away valid visitor page views?
First, you should take into account that there are "good bots", like
Google, Yahoo, etc.
The best way to detect them is by their IP addresses; most of them
publish their IP ranges on their sites.
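
As a rough illustration of the IP range check, here is a minimal sketch in Python (chosen only for the example; the same logic ports to any language). The bot name, the `KNOWN_BOT_RANGES` table and the `identify_good_bot` function are made up for this sketch, and the ranges are documentation placeholders, not real crawler ranges. You would fill in whatever each search engine actually publishes.

```python
import ipaddress

# Placeholder ranges (RFC 5737 documentation blocks), NOT real crawler ranges.
# Replace them with the ranges each search engine actually publishes.
KNOWN_BOT_RANGES = {
    "examplebot": ["192.0.2.0/24", "198.51.100.0/25"],
}

# Parse the CIDR strings once, at load time.
_NETWORKS = [
    (name, ipaddress.ip_network(cidr))
    for name, cidrs in KNOWN_BOT_RANGES.items()
    for cidr in cidrs
]


def identify_good_bot(remote_addr):
    """Return the bot name if remote_addr falls in a known range, else None."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Not a parseable IP address: treat it as "not a known bot".
        return None
    for name, network in _NETWORKS:
        if addr in network:
            return name
    return None
```

Before inserting a page view row, you would call `identify_good_bot()` with the request's remote address and skip the insert (or flag the row) when it returns a name.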
Then there are bad bots, accounting for about half of all bot traffic.
Those are much more difficult to detect, since they disguise themselves
as normal browsers used by humans.
There are techniques to detect them, and possibly ban them, but they take
quite a bit of coding:
- Keep track of the average time between HTTP requests from each client.
If it stays below a few seconds, it's probably a bot.
- Bots generally don't request images, scripts, or CSS files.
- Bad bots generally neither read nor comply with robots.txt directives, so
you can use this to set a robot trap:
put a link in a hidden div to a trap URL that is disallowed in
robots.txt. Bad bots will jump on it.
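
The three heuristics above could be combined along these lines. This is only a sketch in Python under several assumptions of mine: the 3-second threshold, the minimum of 5 page requests before judging a client, the `/bot-trap/` path (which would also need a `Disallow: /bot-trap/` line in robots.txt), the list of asset suffixes, and all class and function names are hypothetical. Intervals are measured between page requests only, since a real browser fetches images and CSS in quick bursts.

```python
from collections import defaultdict

MIN_AVG_INTERVAL = 3.0    # seconds; assumed value for "a few seconds"
MIN_PAGES = 5             # don't judge a client on fewer page requests
TRAP_PATH = "/bot-trap/"  # hypothetical trap URL, disallowed in robots.txt
ASSET_SUFFIXES = (".css", ".js", ".png", ".jpg", ".gif")


class ClientStats:
    def __init__(self):
        self.first_page = None   # timestamp of first page request
        self.last_page = None    # timestamp of latest page request
        self.pages = 0
        self.assets = 0
        self.hit_trap = False


class BotDetector:
    def __init__(self):
        self.clients = defaultdict(ClientStats)

    def record(self, client_id, path, timestamp):
        """Log one request; client_id could be an IP or a session key."""
        stats = self.clients[client_id]
        if path.startswith(TRAP_PATH):
            stats.hit_trap = True
        elif path.lower().endswith(ASSET_SUFFIXES):
            stats.assets += 1
        else:
            if stats.first_page is None:
                stats.first_page = timestamp
            stats.last_page = timestamp
            stats.pages += 1

    def looks_like_bot(self, client_id):
        stats = self.clients.get(client_id)
        if stats is None:
            return False
        if stats.hit_trap:
            return True          # followed a link forbidden by robots.txt
        if stats.pages < MIN_PAGES:
            return False         # not enough data yet
        avg = (stats.last_page - stats.first_page) / (stats.pages - 1)
        if avg < MIN_AVG_INTERVAL:
            return True          # requesting pages faster than a human reads
        return stats.assets == 0  # many pages, never any images/scripts/CSS
```

Treat the result as a hint rather than a verdict: a visitor with a warm browser cache may request few assets, so you may prefer to flag the page views instead of discarding them outright.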
Archive:
http://www.houseoffusion.com/groups/cf-talk/message.cfm/messageid:318042