> I have for the first time encountered the problem that some braindead
> web robot (ExtractorPro) attempted to download all of the site and
> appended some random URL segment at the end of an embedded perl page. I
> use the suffix .phtml for these pages, and the URL looked like
> <http://mysite//page.phtml/randomotherurl>. The innocent embperl page
> delivered some content with relative URLs, and the robot continued to
> fetch the same page with various URL suffixes, causing a loop and doing
> the equivalent of an Apache bench remotely.
>
> What is the best way to stop these kinds of mishaps? And what the heck
> is this ExtractorPro thing?
>

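One way to stop this particular mishap at the source is to refuse any request
that carries extra path info after the .phtml page, so the robot gets a 404
instead of the same page again. A rough, untested sketch for mod_perl 1.x (the
module name My::NoPathInfo and the FilesMatch pattern are just examples, not
something that exists):

    package My::NoPathInfo;
    # Access handler: reject URLs like /page.phtml/randomotherurl
    # that carry trailing path info after the real page.
    use strict;
    use Apache::Constants qw(OK NOT_FOUND);

    sub handler {
        my $r = shift;
        # path_info() holds everything after the matched file name
        return NOT_FOUND if $r->path_info;
        return OK;
    }
    1;

    # httpd.conf
    <FilesMatch "\.phtml$">
        PerlAccessHandler My::NoPathInfo
    </FilesMatch>
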
Maybe Apache::SpeedLimit is helpful. It limits the number of pages one client
can fetch in a given period of time. There are other Apache modules for
blocking robots as well; have a look at the Apache module list.
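If you only want to shut out this one robot, plain Apache (mod_setenvif plus
mod_access, no extra Perl) can match its User-Agent and deny it. This assumes
the robot really announces itself as ExtractorPro in its User-Agent header:

    # httpd.conf: tag requests from the robot and refuse them
    BrowserMatchNoCase "ExtractorPro" bad_robot
    <Location />
        Order Allow,Deny
        Allow from all
        Deny from env=bad_robot
    </Location>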

Gerald

-------------------------------------------------------------
Gerald Richter    ecos electronic communication services gmbh
Internet connectivity * Web servers / design / databases * Consulting

Post:       Tulpenstrasse 5         D-55276 Dienheim b. Mainz
E-Mail:     [EMAIL PROTECTED]         Voice:    +49 6133 925151
WWW:        http://www.ecos.de      Fax:      +49 6133 925152
-------------------------------------------------------------
