Judging from today's logs, it seems I'm still getting loads of what look
to be rather pointless connections from the relevant IP address.  Using
ngrep, I see Debian Apt-Cacher-NG/0.5.1 as the user agent in the
requests, although I've not bothered to eliminate the possibility that
you have other clients involved (so you should eliminate that
possibility before assuming that it's definitely apt-cacher-ng doing this).

Choosing one of the busier seconds of the day so far, we get:

grep $IP /var/log/nginx/ftphost.access.log | sed -ne \
  '\#04/Oct/2010:15:52:03#s/.*GET \([^ ]*\) .*$/\1/p' | sort | uniq -c | sort -nr
     23 /debian-volatile/dists/lenny/volatile/main/binary-i386/Packages.bz2
     22 /debian/dists/lenny/contrib/source/Sources.bz2
     22 /debian-volatile/dists/lenny/volatile/contrib/binary-i386/Packages.bz2
     19 /debian/dists/lenny/non-free/source/Sources.bz2
     18 /debian-volatile/dists/lenny/volatile/non-free/binary-i386/Packages.bz2
     17 /debian/dists/lenny/main/source/Sources.bz2
     15 /debian/dists/lenny/contrib/binary-i386/Packages.bz2
     12 /debian-volatile/dists/lenny/volatile/Release.gpg
     11 /debian/dists/lenny/non-free/binary-i386/Packages.bz2
     11 /debian/dists/lenny/Release
...

So, that's up to 23 repeated requests for the same file in one second.
That strikes me as something that a cache should not be doing.

It should not simply pass on a DoS, even if that is the reason for the
behaviour we're seeing here.

It strikes me that if the program has determined that the file is
unchanged, it could remember that for a few seconds at least.  Also, if
it's in the process of asking about a file, it could avoid asking again,
I'd think.
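
To make that suggestion a bit more concrete, here is a minimal sketch in
Python.  It is not apt-cacher-ng's actual code; the class name
RevalidationGate, the check_upstream callable and the 5 second default
are all invented for illustration.  It remembers the answer for a URL
for a few seconds, and while one upstream check is pending, other
callers wait for it rather than sending their own.

```python
import threading
import time


class RevalidationGate:
    """Limit how often a cache asks upstream about the same URL (sketch)."""

    def __init__(self, check_upstream, ttl=5.0, clock=time.monotonic):
        # check_upstream is a hypothetical callable: url -> result
        self._check_upstream = check_upstream
        self._ttl = ttl
        self._clock = clock
        self._lock = threading.Lock()
        self._fresh = {}      # url -> (result, expiry time)
        self._in_flight = {}  # url -> Event, set when the pending check ends

    def check(self, url):
        while True:
            with self._lock:
                entry = self._fresh.get(url)
                if entry is not None and entry[1] > self._clock():
                    return entry[0]  # asked recently: reuse the answer
                pending = self._in_flight.get(url)
                leader = pending is None
                if leader:
                    pending = threading.Event()
                    self._in_flight[url] = pending
            if not leader:
                # Someone else is already asking; wait, then look again.
                pending.wait()
                continue
            try:
                result = self._check_upstream(url)
                with self._lock:
                    self._fresh[url] = (result, self._clock() + self._ttl)
                return result
            finally:
                with self._lock:
                    del self._in_flight[url]
                pending.set()
```

With something like this in front of the upstream check, 23 clients
asking about the same Packages.bz2 in one second would result in one
request to the mirror, not 23.  If the upstream check fails, the
exception goes to the caller that made it and the next waiter tries
again, which may or may not be the behaviour one wants.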

Here is a section of the list of the number of times per second that that
IP address asked for /debian/dists/lenny/contrib/source/Sources.bz2, with
dates:

      2 04/Oct/2010:17:48:44 +0100
      3 04/Oct/2010:17:48:45 +0100
      1 04/Oct/2010:17:48:53 +0100
      5 04/Oct/2010:17:49:03 +0100
      5 04/Oct/2010:17:56:37 +0100
      2 04/Oct/2010:17:56:41 +0100
      1 04/Oct/2010:17:56:42 +0100
      1 04/Oct/2010:17:56:43 +0100
      1 04/Oct/2010:17:56:44 +0100
      1 04/Oct/2010:17:56:45 +0100
      1 04/Oct/2010:17:56:46 +0100
      1 04/Oct/2010:17:56:48 +0100
      5 04/Oct/2010:17:56:49 +0100
      1 04/Oct/2010:17:56:50 +0100
      5 04/Oct/2010:17:56:51 +0100
      7 04/Oct/2010:17:56:52 +0100
      7 04/Oct/2010:17:56:53 +0100
      6 04/Oct/2010:17:56:54 +0100
      3 04/Oct/2010:17:56:55 +0100
      5 04/Oct/2010:17:56:57 +0100
      4 04/Oct/2010:17:56:58 +0100
      1 04/Oct/2010:17:56:59 +0100
      7 04/Oct/2010:17:57:00 +0100
      4 04/Oct/2010:17:57:01 +0100
     15 04/Oct/2010:17:57:02 +0100
      5 04/Oct/2010:17:57:03 +0100
      4 04/Oct/2010:17:57:04 +0100
      8 04/Oct/2010:17:57:05 +0100
     11 04/Oct/2010:17:57:07 +0100
      9 04/Oct/2010:17:57:08 +0100
      2 04/Oct/2010:17:57:09 +0100
      5 04/Oct/2010:17:57:11 +0100
      5 04/Oct/2010:17:57:35 +0100

So that's asking me for the same URL up to 15 times a second, nearly
every second for about 30 seconds -- this is not clever.
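
For anyone wanting to produce a per-second list like the one above from
their own logs, a rough sketch in Python follows.  It assumes nginx's
default "combined" log format; the function name and the regular
expression are mine, not taken from any existing tool.

```python
import re
from collections import Counter

# Assumed line shape (nginx "combined" format):
#   IP - user [04/Oct/2010:17:57:02 +0100] "GET /path HTTP/1.1" status bytes ...
LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "GET (\S+) [^"]*"')


def per_second_counts(lines, ip, path):
    """Count GET requests per log timestamp for one client IP and one URL."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group(1) == ip and m.group(3) == path:
            counts[m.group(2)] += 1  # timestamps have one-second resolution
    return counts
```

Calling per_second_counts(open(logfile), ip, path) and printing the
items gives one count per second in log order, where logfile, ip and
path are whatever applies on your own server.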

Cheers, Phil.

P.S.  Keith, looking back through the logs, it's clear that this has
been going on for at least a month, and probably longer.  It's only
since the server was upgraded recently that it's quick enough for the
traffic spike to be as sharp as it is, which piqued my curiosity enough
to rummage through the logs, so I wasn't trying to imply that something
had recently given rise to this situation, it's just that it's only
recently been as obvious in the graphs.
-- 
|)|  Philip Hands [+44 (0)20 8530 9560]    http://www.hands.com/
|-|  HANDS.COM Ltd.                    http://www.uk.debian.org/
|(|  10 Onslow Gardens, South Woodford, London  E18 1NE  ENGLAND
