On 05/01/2013 03:22 PM, André Warnier wrote:
Dirk-Willem van Gulik wrote:
On 1 May 2013, at 13:31, Graham Leggett <[email protected]> wrote:
The evidence was just explained - a bot that does not get an answer quickly enough gives up and looks elsewhere.
The key words are "looks elsewhere".
For what it is worth - I've been experimenting with this (up until about 6 months ago) on a machine of mine, having the 200, 403, 404, 500 etc. determined by an entirely unscientific 'modulo' of the IP address, both on the main URL as well as on a few PHP/plesk hole URLs, and behaving normally for any source IP that has (ever) fetched robots.txt from the same IP masked by the first 20 bits.
That showed that bots indeed slow down / do not come back so soon if you give them a 403 or similar - but I saw no difference as to which non-200 you give them (I did not try a slow reply or no reply). Do note, though, that I was focusing on naughty bots that do not fetch robots.txt.
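(For illustration only, a minimal Python sketch of the scheme described above - the per-IP 'modulo' status and the /20 robots.txt whitelist. The names ROBOTS_FETCHERS, record_request() and choose_status() are hypothetical, not the code actually used.)

import ipaddress

# Hypothetical sketch, not the original setup.
STATUSES = [200, 403, 404, 500]
ROBOTS_FETCHERS = set()   # /20 networks that have ever fetched robots.txt

def network_20(ip: str):
    """Mask a source address down to its first 20 bits."""
    return ipaddress.ip_network(f"{ip}/20", strict=False)

def record_request(ip: str, path: str) -> None:
    """Remember networks that behave and fetch robots.txt."""
    if path == "/robots.txt":
        ROBOTS_FETCHERS.add(network_20(ip))

def choose_status(ip: str) -> int:
    """Serve well-behaved networks normally; vary the status for the rest."""
    if network_20(ip) in ROBOTS_FETCHERS:
        return 200
    # Entirely unscientific modulo of the integer form of the address.
    return STATUSES[int(ipaddress.ip_address(ip)) % len(STATUSES)]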
For what it's worth also, thank you.
This kind of response really helps, even if/when it contradicts the proposal that I am trying to push. It helps because it provides some *evidence* which I am having difficulty collecting myself, and which would allow one to *really* judge the proposal on its merits, not just on unsubstantiated opinions.
At another level, I would add this: if implementing my proposal turns out to have no effect, or only a very small effect, on the Internet at large, but effectively helps the server where it is active to avoid some of these scans, then I believe that, considering the ease and very low cost of implementing it, it would still be worth the trouble.
If the majority of web servers start slowing down the bots, this will simply prompt the bot authors to make their bots stick with each IP for longer. Once something becomes the standard, they can very easily adapt to it.