Judging from today's logs, I'm still getting loads of what look to be
rather pointless connections from the relevant IP address. Using ngrep,
I see Debian Apt-Cacher-NG/0.5.1 as the user agent in the requests,
although I've not bothered to eliminate the possibility that you have
other clients involved (so you should rule that out before assuming
that it's definitely apt-cacher-ng doing this).
Choosing one of the busier seconds of the day so far, we get:
grep "$IP" /var/log/nginx/ftphost.access.log |
  sed -ne '\#04/Oct/2010:15:52:03#s/.*GET \([^ ]*\) .*$/\1/p' |
  sort | uniq -c | sort -nr
23 /debian-volatile/dists/lenny/volatile/main/binary-i386/Packages.bz2
22 /debian/dists/lenny/contrib/source/Sources.bz2
22 /debian-volatile/dists/lenny/volatile/contrib/binary-i386/Packages.bz2
19 /debian/dists/lenny/non-free/source/Sources.bz2
18 /debian-volatile/dists/lenny/volatile/non-free/binary-i386/Packages.bz2
17 /debian/dists/lenny/main/source/Sources.bz2
15 /debian/dists/lenny/contrib/binary-i386/Packages.bz2
12 /debian-volatile/dists/lenny/volatile/Release.gpg
11 /debian/dists/lenny/non-free/binary-i386/Packages.bz2
11 /debian/dists/lenny/Release
...
So, that's up to 23 repeated requests for the same file in one second.
That strikes me as something that a cache should not be doing.
It should not simply pass on a DoS, even if that is the reason for the
behaviour we're seeing here.
It strikes me that if the program has determined that the file is
unchanged, it could remember that for a few seconds at least. Also, if
it's in the process of asking about a file, it could avoid asking again,
I'd think.
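To illustrate the first of those ideas, here is a rough sketch in shell.
It only shows the logic and is not a claim about how apt-cacher-ng is
structured internally. The names needs_revalidation, TTL and STATE_DIR
are made up for the example, and the 5-second window is an arbitrary
choice. Because the timestamp is recorded before the upstream request
is made, repeat requests arriving while the first is still in flight
are suppressed too.

```shell
# Hypothetical sketch: only revalidate a URL upstream if it has not
# already been checked within the last TTL seconds.
TTL=5                                   # arbitrary window, in seconds
STATE_DIR=${STATE_DIR:-$(mktemp -d)}    # one timestamp file per URL

# needs_revalidation URL NOW
#   NOW is the current time in epoch seconds (e.g. from `date +%s`).
#   Returns 0 if the upstream mirror should be asked, 1 if a recent
#   check can be reused.
needs_revalidation() {
    url=$1
    now=$2
    # Derive a file name from the URL; cksum is good enough for a
    # sketch, although collisions are possible.
    stamp="$STATE_DIR/$(printf '%s' "$url" | cksum | cut -d' ' -f1)"
    if [ -f "$stamp" ] && [ $(( now - $(cat "$stamp") )) -lt "$TTL" ]; then
        return 1    # checked recently; do not ask again
    fi
    # Record the time before asking, so concurrent callers back off.
    printf '%s\n' "$now" > "$stamp"
    return 0
}
```

A caller would then do something like
needs_revalidation "$url" "$(date +%s)" && ask the mirror, and otherwise
serve the cached copy.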
Here is part of the list showing how many times per second that IP
address asked for /debian/dists/lenny/contrib/source/Sources.bz2, with
timestamps:
2 04/Oct/2010:17:48:44 +0100
3 04/Oct/2010:17:48:45 +0100
1 04/Oct/2010:17:48:53 +0100
5 04/Oct/2010:17:49:03 +0100
5 04/Oct/2010:17:56:37 +0100
2 04/Oct/2010:17:56:41 +0100
1 04/Oct/2010:17:56:42 +0100
1 04/Oct/2010:17:56:43 +0100
1 04/Oct/2010:17:56:44 +0100
1 04/Oct/2010:17:56:45 +0100
1 04/Oct/2010:17:56:46 +0100
1 04/Oct/2010:17:56:48 +0100
5 04/Oct/2010:17:56:49 +0100
1 04/Oct/2010:17:56:50 +0100
5 04/Oct/2010:17:56:51 +0100
7 04/Oct/2010:17:56:52 +0100
7 04/Oct/2010:17:56:53 +0100
6 04/Oct/2010:17:56:54 +0100
3 04/Oct/2010:17:56:55 +0100
5 04/Oct/2010:17:56:57 +0100
4 04/Oct/2010:17:56:58 +0100
1 04/Oct/2010:17:56:59 +0100
7 04/Oct/2010:17:57:00 +0100
4 04/Oct/2010:17:57:01 +0100
15 04/Oct/2010:17:57:02 +0100
5 04/Oct/2010:17:57:03 +0100
4 04/Oct/2010:17:57:04 +0100
8 04/Oct/2010:17:57:05 +0100
11 04/Oct/2010:17:57:07 +0100
9 04/Oct/2010:17:57:08 +0100
2 04/Oct/2010:17:57:09 +0100
5 04/Oct/2010:17:57:11 +0100
5 04/Oct/2010:17:57:35 +0100
So that's asking me for the same URL up to 15 times a second, every
second for about 30 seconds -- this is not clever.
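For anyone wanting to check their own logs, a per-second count like the
one above can be produced roughly as follows. This is a sketch, not
necessarily the exact command used for the list above. It assumes
nginx's "combined" log format (client address in field 1, timestamp in
fields 4 and 5, request path in field 7) and a log that is in time
order. The function name per_second_counts is just illustrative.

```shell
# Count requests per second for one client IP and one URL path.
# Reads an access log on stdin; assumes the nginx "combined" format,
# e.g.  1.2.3.4 - - [04/Oct/2010:17:48:44 +0100] "GET /path HTTP/1.1" ...
per_second_counts() {
    ip=$1
    path=$2
    awk -v ip="$ip" -v path="$path" '
        $1 == ip && $7 == path {
            # $4 is "[04/Oct/2010:17:48:44" and $5 is "+0100]";
            # strip the surrounding brackets.
            print substr($4, 2), substr($5, 1, length($5) - 1)
        }' | uniq -c    # adjacent identical timestamps collapse to a count
}
```

Usage would be along the lines of:
per_second_counts "$IP" /debian/dists/lenny/contrib/source/Sources.bz2 < /var/log/nginx/ftphost.access.log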
Cheers, Phil.
P.S. Keith, looking back through the logs, it's clear that this has
been going on for at least a month, and probably longer. It's only
since the server was upgraded recently that it's quick enough for the
traffic spike to be as sharp as it is, which piqued my curiosity enough
to rummage through the logs. So I wasn't trying to imply that something
had recently given rise to this situation; it's just that it's only
recently been as obvious in the graphs.
--
|)| Philip Hands [+44 (0)20 8530 9560] http://www.hands.com/
|-| HANDS.COM Ltd. http://www.uk.debian.org/
|(| 10 Onslow Gardens, South Woodford, London E18 1NE ENGLAND

