On 12.09.2012 10:54, Saurabh Sheth wrote:
Squid (versions: 3.1 and 2.6) has an object in its cache and responds
to individual requests to this object just fine (TCP_HIT:NONE). From
the access.log ->
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 41136 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 24752 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 28848 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 41136 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 24752 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 45232 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 28848 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 49328 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 49328 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 32944 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 37040 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:41:55 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 37040 TCP_HIT:NONE
However, when I make a huge number of concurrent requests for the
same object, Squid fails to load the object from disk fast enough
and returns a TCP_SWAPFAIL_MISS ->
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 53424 TCP_HIT:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 37031 TCP_SWAPFAIL_MISS:DIRECT
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 28839 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 32935 TCP_MISS:NONE
All subsequent requests hit the origin server directly, causing a huge
load on the origin server (TCP_MISS:NONE) ->
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 32935 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 28839 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 37031 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 37031 TCP_MISS:NONE
10.192.x.x - - [11/Sep/2012:15:42:23 -0700] "GET
http://originserver/data/object HTTP/1.1" 200 32935 TCP_MISS:NONE
This is undesirable in the production setup, since such a huge number
of requests hitting the origin server directly has the effect of a
DOS attack on the origin server. This has brought down our origin
server more than once now.
Well, when you think about it, this is a DOS on Squid as well. The
backend server is only facing the overflow which Squid can't handle fast
enough. So any attacker trying this has to pass *two* DOS thresholds:
first the Squid one, then the backend on top. There is always another
idiot infected PC, so DOS resolution is not about *solving* the traffic
problem, but about raising the bar and reducing the impact/damage when
it happens.
I am looking for any help or pointers on how I can deal effectively
with such a huge number of concurrent requests to Squid for the same
object; any help is highly appreciated. I am already considering
the option of rate limiting using iptables. However, if there is an
effective way to deal with this in the Squid configuration itself, I
would love to understand it.
You were a bit vague about which specific release versions of Squid you
have. 2.6 should have the collapsed forwarding feature, which acts as a
great DOS barrier. It has not been ported to squid-3 yet, but
efficiencies have been improved in the cache handling, so you could try
the latest 2.7 or 3.2 releases and see if this raises the bar high
enough for you.
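For reference, on the Squid 2.x line collapsed forwarding is switched on
with a single squid.conf directive. A minimal sketch, assuming a 2.6 or
2.7 build where the collapsed_forwarding directive is available (it is
off by default, and the squid-3 releases mentioned above do not have it):

```
# squid.conf fragment (Squid 2.6 / 2.7 only; an illustration, not a tested config)
# Merge concurrent requests for the same URL into a single fetch from the
# origin, so a burst of misses results in one backend request instead of many.
collapsed_forwarding on
```

After changing this, reload the configuration (for example with
"squid -k reconfigure") and watch access.log under load to see whether
the flood of TCP_MISS entries for the same URL goes away.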
Amos