Re: URL scanning by bots

André Warnier Wed, 01 May 2013 05:23:03 -0700

Dirk-Willem van Gulik wrote:

On 1 mei 2013, at 13:31, Graham Leggett <minf...@sharp.fm> wrote:

The evidence was just explained - a bot that does not get an answer quick 
enough gives up and looks elsewhere.
The key words are "looks elsewhere".



For what it is worth - I've been experimenting with this (up till about 6 
months ago) on a machine of mine. Having the 200, 403, 404, 500 etc. determined 
by an entirely unscientific 'modulo' of the IP address. Both on the main URL as 
well as on a few PHP/Plesk hole URLs. And have ignored/behaved normally for any 
source IP that has (ever) fetched robots.txt from the same IP masked by the 
first 20 bits.

That showed that bots indeed slow down/do not come back so soon if you give 
them a 403 or similar - but I saw no differences as to which non-200 you give 
them (not tried slow reply or no reply). Do note though that I was focusing on 
naughty non-robots.txt-fetching bots.
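
As a rough illustration of the experiment described above, here is a minimal Python sketch. It is not the code that was actually run: the class and function names, the exact status list, and the choice of taking the modulo over the whole 32-bit address are all assumptions made for the example, and it handles IPv4 only.

```python
import ipaddress

# Hypothetical status pool; the original post only says "200, 403, 404, 500 etc".
STATUSES = (200, 403, 404, 500)


def mask20(ip: str) -> int:
    """Return the IPv4 address as an integer with only its first 20 bits kept."""
    return int(ipaddress.IPv4Address(ip)) & 0xFFFFF000


class ScanExperiment:
    """Assumed bookkeeping: remember which /20 blocks ever fetched robots.txt."""

    def __init__(self):
        self.polite_blocks = set()

    def saw_robots_txt(self, ip: str) -> None:
        # Any client in this /20 block is treated normally from now on.
        self.polite_blocks.add(mask20(ip))

    def status_for(self, ip: str):
        """Return a forced status code, or None to serve the request normally."""
        if mask20(ip) in self.polite_blocks:
            return None
        # The "entirely unscientific modulo": pick a status from the address.
        return STATUSES[int(ipaddress.IPv4Address(ip)) % len(STATUSES)]
```

Each source address always maps to the same status, so its later behaviour can be compared per status group in the logs; a client whose /20 block has fetched robots.txt gets None and is served as usual.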

For what it's worth also, thank you.

This kind of response really helps, even if/when it would contradict the proposal that I am trying to push. It helps because it provides some *evidence* which I am having difficulties collecting by myself, and which would allow us to *really* judge the proposal on its merits, not just on unsubstantiated opinions.

At another level, I would add this: if implementing my proposal turns out to have no effect, or a very small effect, on the Internet at large, but effectively helps the server where it is active to avoid some of these scans, then I believe that, considering the ease and very low cost of implementing this proposal, it would still be worth the trouble.
