>>I can tell you with absolute certainty that Google obeys robots.txt.

I'm pretty sure they do.
But we all know that an HTTP request sometimes gets lost somewhere in 
cyberspace.
If for any reason the robot does not receive the file, it will probably 
act as if there is none.
A single miss is enough to send the robot into the trap and get it 
classified as "big bad Bot".
That is what I don't like about this method.
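
To illustrate that failure mode, here is a minimal Python sketch using the 
standard urllib.robotparser module. It assumes the trap works by disallowing 
a path in robots.txt; the path /trap/register.cfm, the bot name "ExampleBot" 
and the helper may_fetch are hypothetical, chosen only for the example.

```python
from urllib.robotparser import RobotFileParser

TRAP_URL = "/trap/register.cfm"  # hypothetical trap path, not from the original post


def may_fetch(robots_txt_lines, user_agent, url):
    """Return True if a robot holding these robots.txt lines may fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt_lines)
    return parser.can_fetch(user_agent, url)


# What a well-behaved robot sees when the file arrives normally.
received = ["User-agent: *", "Disallow: /trap/"]
# What it has to work with when the request for robots.txt was lost.
lost = []

print(may_fetch(received, "ExampleBot", TRAP_URL))  # trap is off limits
print(may_fetch(lost, "ExampleBot", TRAP_URL))      # no rules, so the trap looks allowed
```

With the file in hand the robot stays out of the trap; with no rules at all 
it considers the trap URL fair game, which is how a polite robot could end 
up flagged.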

I've just implemented another one. It detects any robot, good or bad, by 
putting a link on a 1-pixel image that no human would click. The link 
points to a template that registers any new robot.
The template also sends me a mail with a link, so I can go to the admin 
tools and decide whether it is a bad robot or not.
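
The registration step described above could look roughly like the following 
Python sketch. It is not the actual template: the class RobotRegistry, the 
Status values and the in-memory storage are all assumptions, and the 
notification list merely stands in for the mail that gets sent.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    PENDING = "pending"  # seen in the trap, not yet reviewed by the admin
    GOOD = "good"
    BAD = "bad"


@dataclass
class RobotRegistry:
    # Hypothetical in-memory store keyed by User-Agent; a real site would use a database.
    robots: dict = field(default_factory=dict)
    notifications: list = field(default_factory=list)

    def register_hit(self, user_agent: str, ip: str) -> bool:
        """Record a request for the trap URL. Returns True if the robot is new."""
        if user_agent in self.robots:
            return False
        self.robots[user_agent] = Status.PENDING
        # Stand-in for the mail that links to the admin tools.
        self.notifications.append(f"New robot {user_agent!r} from {ip}, please review")
        return True

    def classify(self, user_agent: str, status: Status) -> None:
        """Admin decision: mark a registered robot as good or bad."""
        self.robots[user_agent] = status


registry = RobotRegistry()
registry.register_hit("disco/Nutch-0.9", "192.0.2.1")
registry.classify("disco/Nutch-0.9", Status.BAD)
```

Only the first hit from a given User-Agent triggers a notification, so a 
robot that keeps coming back does not flood the mailbox.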

Now on my sites, no robot, not even a good one, will ever see any image. 
Who needs to have images downloaded by robots?
Furthermore, bad robots are simply banned from the site.
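
As a rough Python sketch of that policy: the function name response_status, 
the list of image extensions and the choice of 403/404 status codes are 
assumptions for illustration, not details taken from the actual site.

```python
import os

# Hypothetical list of extensions treated as images.
IMAGE_EXTENSIONS = {".gif", ".jpg", ".jpeg", ".png"}


def response_status(robot_status, path):
    """Pick an HTTP status for a request.

    robot_status is None for a human visitor, otherwise "good" or "bad".
    """
    if robot_status is None:
        return 200  # humans get everything
    if robot_status == "bad":
        return 403  # bad robots are banned from the whole site
    extension = os.path.splitext(path)[1].lower()
    if extension in IMAGE_EXTENSIONS:
        return 404  # good robots still never see an image
    return 200


print(response_status("good", "/images/logo.gif"))
print(response_status("bad", "/index.cfm"))
```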

I will just keep a dozen or so good robots, like Google, MSN, Yahoo, etc.
Who needs to be scanned by oddities like "disco/Nutch-0.9 (experimental 
crawler; [EMAIL PROTECTED])"
or "otbqrmupgnsbprxxwiqr6cw6xiwxqkfqc66cuu" anyway?

Any email crawler or image fetcher will fall into the trap the first time, 
and it will be its last time.
Once I have the dozen good bots I want, I'll even set the "bad 
Bot" flag by default, gnarh, gnarh, gnarh!

The robots.txt is not a bad idea, except that only well-behaved robots 
respect it, and they are not
exactly the ones you really need to control.

-- 
_______________________________________
REUSE CODE! Use custom tags;
See http://www.contentbox.com/claude/customtags/tagstore.cfm
(Please send any spam to this address: [EMAIL PROTECTED])
Thanks.


Archive: 
http://www.houseoffusion.com/groups/CF-Talk/message.cfm/messageid:292191