Randal,

This is a nice package, but I have some questions:

>       my $host = $r->get_remote_host;
>       return DECLINED if $host =~ /\.(holdit|stonehenge)\.com$/;

You have host name lookups turned on? That's not very performance friendly.
And you've just published how to get around your throttler: all I need to
do is change my reverse DNS to some name in one of your domains.
You can't trust external DNS information.

Also, isn't all this file i/o for every request also bad for performance?

Anyway it's a nice method that I think I'll adapt to a problem I have, which is
to detect and stop dictionnary attacks on a password protected site. In that
case the throttling needs only be applied when authentication has failed and
thus won't affect performance for normal requests.

Thanks for the code,
--
Eric


> Nope, I could find nothing to distinguish "evil spider" mode from
> normal browsing mode, other than the rapidity of the download
> requests.
> 
> So, I wrote my own throttling routines, unsatisfied with the others
> that I found...
> 
>     package Stonehenge::Throttle;
>     use strict;
> 
>     ## usage: PerlAccessHandler Stonehenge::Throttle;
> 
>     my $HISTORYDIR = "/home/merlyn/lib/Apache/Throttle";
> 
>     my $WINDOW = 90;          # seconds of interest
>     my $SLOWBYTES = $WINDOW * 2000;   # bytes before we sleep
>     my $SLEEP = 1;                    # sleep time
>     my $DECLINEBYTES = $WINDOW * 3000; # bytes before we 408 error
> 
>     use vars qw($VERSION);
>     $VERSION = (qw$Revision: 1.4 $ )[-1];
> 
>     use Apache::Constants qw(OK DECLINED);
>     use Apache::File;
>     use Apache::Log;
> 
>     use Stonehenge::Reload;
> 
>     sub handler {
>       goto &handler if Stonehenge::Reload->reload_me;
> 
>       my $r = shift;
>       return DECLINED unless $r->is_initial_req;
>       my $log = $r->server->log;
> 
>       my $host = $r->get_remote_host;
>       return DECLINED if $host =~ /\.(holdit|stonehenge)\.com$/;
>
> 
>       my $historyfile = "$HISTORYDIR/$host"; # closure var
> 
>       $r->register_cleanup
>       (sub {
>          my $fh = Apache::File->new;
>          open $fh, ">>$historyfile" or return DECLINED;
> 
>          my $time = time;
>          my $bytes = $r->bytes_sent;
>          syswrite $fh, pack "LL", $time, $bytes;
>          close $fh;
> 
>          return OK;
>        });
> 
>       {
>       my $startwindow = time - $WINDOW;
>       my $totalbytes = 0;
>       my $fh = Apache::File->new;
>       open $fh, $historyfile or return DECLINED;
>       while ((read $fh, my $buf, 8) > 0) {
>         my ($time, $bytes) = unpack "LL", $buf;
>         next if $time < $startwindow;
>         $totalbytes += $bytes;
>       }
>       if ($totalbytes > $DECLINEBYTES) {
>         $log->notice("$host got $totalbytes in $WINDOW secs, sending 503");
>         $r->header_out("Retry-After", $WINDOW);
>         return 503;           # Service Unavailable
>       } elsif ($totalbytes > $SLOWBYTES) {
>         $log->notice("$host got $totalbytes in $WINDOW secs, sleeping for $SLEEP");
>         sleep $SLEEP;
>         return DECLINED;
>       } else {
>         ## $log->notice("$host got $totalbytes in $WINDOW secs"); # DEBUG
>         return DECLINED;
>       }
>       }
>       return DECLINED;
>     }
>     1;
> 
> This has to be aided by a cron script run every 20 minutes or so
> that looks like this:
> 
>     #!/usr/bin/perl -w
>     use strict;
> 
>     # $Id: throttle-cleaner,v 1.1 1999/10/28 19:44:09 merlyn Exp $
> 
>     my $DIR = "/home/merlyn/lib/Apache/Throttle";
>     my $SECS = 360;                   # more than Stonehenge::Throttle $WINDOW
> 
>     chdir $DIR or die "Cannot chdir $DIR: $!";
>     opendir DOT, "." or die "Cannot opendir .: $!";
>     my $when = time - $SECS;
>     while (my $name = readdir DOT) {
>       next unless -f $name;
>       next if (stat($name))[8] > $when;
>       ## warn "unlinking $name\n";
>       unlink $name;
>     }
> 
> So now I have a bytes-served-in-window throttler on my website that
> prevents anyone from sucking down more than 3k/sec sustained over 90
> seconds from any specific IP.
> 
> It triggered five times overnight.  But my ISP neighbors are now
> happy.
> 
> I should clean up Stonehenge::Throttle and submit it.  Notice, no file
> locking!  That was an interesting fallout of the design.
> 
> -- 
> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
> <[EMAIL PROTECTED]> <URL:http://www.stonehenge.com/merlyn/>
> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
> See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
> 

Reply via email to