Dirk-Willem van Gulik wrote:
On 1 May 2013, at 13:31, Graham Leggett <minf...@sharp.fm> wrote:
The evidence was just explained - a bot that does not get an answer quick
enough gives up and looks elsewhere.
The key words are "looks elsewhere".
For what it is worth - I experimented with this (up until about 6 months ago)
on a machine of mine, with the response status (200, 403, 404, 500, etc.)
determined by an entirely unscientific 'modulo' of the IP address, both on the
main URL and on a few PHP/Plesk hole URLs. I ignored (i.e. behaved normally
for) any source IP whose first 20 bits matched an address that had (ever)
fetched robots.txt.
That showed that bots do indeed slow down / do not come back so soon if you
give them a 403 or similar - but I saw no difference as to which non-200 status
you give them (I did not try a slow reply or no reply). Do note though that I
was focusing on naughty bots that do not fetch robots.txt.
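
For readers who want to picture the setup described above, here is a minimal Python sketch of that kind of rule. It is not the code that was actually run: the names `STATUSES`, `seen_networks`, `network_key` and `choose_status` are hypothetical, IPv4 addresses are assumed, and "modulo of the IP address" is read as the address taken as an integer modulo the number of candidate statuses.

```python
import ipaddress

# Hypothetical list of statuses to hand out; the post only says "200, 403, 404, 500 etc".
STATUSES = (200, 403, 404, 500)

# /20 networks from which robots.txt has ever been fetched (assumed in-memory store).
seen_networks = set()


def network_key(ip, prefix_len=20):
    """Return the network obtained by masking the address to its first 20 bits."""
    return ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)


def choose_status(ip, path):
    """Return a forced status code, or None to handle the request normally."""
    key = network_key(ip)
    if path == "/robots.txt":
        # Remember well-behaved networks and serve robots.txt as usual.
        seen_networks.add(key)
        return None
    if key in seen_networks:
        # Some address in this /20 has fetched robots.txt: behave normally.
        return None
    # Otherwise pick a status by an unscientific modulo of the address.
    return STATUSES[int(ipaddress.ip_address(ip)) % len(STATUSES)]
```

A server hook would call `choose_status(client_ip, request_path)` and short-circuit the response whenever the result is not `None`. Slow replies or dropped connections, which were not tried in the experiment, are not modelled here.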
For what it's worth also, thank you.
This kind of response really helps, even if/when it would contradict the proposal that I
am trying to push. It helps because it provides some *evidence* which I am having
difficulties collecting by myself, and which would allow us to *really* judge the proposal on
its merits, not just on unsubstantiated opinions.
At another level, I would add this: if implementing my proposal turns out to have no
effect, or a very small effect on the Internet at large, but effectively helps the server
where it is active to avoid some of these scans, then I believe that considering the ease
and very low cost of implementing this proposal, it would still be worth the trouble.