On 02/06/12 08:03, Sebastian Moeller wrote:
From my totally unscientific testing I am quite convinced that even
16MB of /tmp in use will send the router into a reboot spiral when the traffic
goes over the 5GHz radio to the WAN port. However, if I use one of the wired
ports instead, I get plenty of the following (not always from hostapd):
Jun 1 23:41:08 nacktmulle kern.warn kernel: [185428.417968] hostapd: page
allocation failure: order:0, mode:0x4020
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] Call Trace:
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<802850a4>]
dump_stack+0x8/0x34
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800b4548>]
warn_alloc_failed+0xe8/0x10c
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800b684c>]
__alloc_pages_nodemask+0x5a0/0x600
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800da070>]
new_slab+0xa8/0x280
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<80286b18>]
__slab_alloc.isra.60.constprop.63+0x25c/0x2fc
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<800dba48>]
__kmalloc_track_caller+0x88/0x140
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e0854>]
__alloc_skb+0x80/0x140
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e0930>]
dev_alloc_skb+0x1c/0x48
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801d0c74>]
ag71xx_poll+0x430/0x65c
Jun 1 23:41:08 nacktmulle kern.alert kernel: [185428.417968] [<801e8c10>]
net_rx_action+0x88/0x1c8
Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] hostapd: page
allocation failure: order:0, mode:0x4020
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Call Trace:
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<802850a4>]
dump_stack+0x8/0x34
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800b4548>]
warn_alloc_failed+0xe8/0x10c
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800b684c>]
__alloc_pages_nodemask+0x5a0/0x600
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800da070>]
new_slab+0xa8/0x280
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<80286b18>]
__slab_alloc.isra.60.constprop.63+0x25c/0x2fc
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<800dba48>]
__kmalloc_track_caller+0x88/0x140
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801e0854>]
__alloc_skb+0x80/0x140
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801e0930>]
dev_alloc_skb+0x1c/0x48
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] [<801d0c74>]
ag71xx_poll+0x430/0x65c
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375]
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Mem-Info:
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal per-cpu:
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] CPU 0: hi:
18, btch: 3 usd: 18
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] active_anon:3826
inactive_anon:63 isolated_anon:0
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] active_file:683
inactive_file:561 isolated_file:0
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] unevictable:0
dirty:0 writeback:0 unstable:0
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] free:96
slab_reclaimable:408 slab_unreclaimable:7706
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] mapped:501
shmem:109 pagetables:142 bounce:0
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal free:384kB
min:1016kB low:1268kB high:1524kB active_anon:15304kB inactive_anon:252kB
active_file:2732kB inactive_file:2244kB unevictable:0kB isolated(anon):0kB
isolated(file):0kB present:65024kB mlocked:0k
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] lowmem_reserve[]:
0 0
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Normal: 42*4kB
15*8kB 0*16kB 1*32kB 1*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB
= 384kB
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 1353 total
pagecache pages
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 0 pages in swap
cache
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Swap cache stats:
add 0, delete 0, find 0/0
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Free swap = 0kB
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] Total swap = 0kB
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 16384 pages RAM
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 965 pages reserved
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 1399 pages shared
Jun 1 23:41:09 nacktmulle kern.alert kernel: [185429.484375] 14306 pages
non-shared
Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] SLUB: Unable to
allocate memory on node -1 (gfp=0x20)
Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] cache:
kmalloc-2048, object size: 2048, buffer size: 2048, default order: 2, min
order: 0
Jun 1 23:41:09 nacktmulle kern.warn kernel: [185429.484375] node 0: slabs:
0, objs: 0, free: 0
But the box seems to survive this… Heck, it even survives my test case with
16000 KB of /tmp in use. Under that amount of memory pressure named and ntpd get
killed, but the router does not automatically reboot; it just stays up and
running, albeit somewhat useless without named.
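As a cross-check of the Mem-Info figures in the log above, here is a small
Python sketch that redoes the arithmetic. It is not part of the original
exchange; the numbers are copied from the log, and the 4 kB page size is an
assumption that is consistent with "free:96" pages being reported as 384kB.

```python
# Sanity-check the Mem-Info block from the log (values copied from the log).
PAGE_KB = 4  # assumption: 4 kB pages, consistent with free:96 -> 384kB

# "Normal:" buddy-allocator line as (number of free blocks, block size in kB)
BUDDY_FREE_LISTS = [
    (42, 4), (15, 8), (0, 16), (1, 32), (1, 64), (0, 128),
    (0, 256), (0, 512), (0, 1024), (0, 2048), (0, 4096),
]

FREE_PAGES = 96             # "free:96"
MIN_WATERMARK_KB = 1016     # "min:1016kB"
TOTAL_RAM_PAGES = 16384     # "16384 pages RAM"
SLAB_UNRECLAIMABLE = 7706   # "slab_unreclaimable:7706" (pages)


def buddy_free_kb(free_lists):
    """Sum the free memory described by the buddy free lists, in kB."""
    return sum(count * size_kb for count, size_kb in free_lists)


free_kb = buddy_free_kb(BUDDY_FREE_LISTS)
total_ram_kb = TOTAL_RAM_PAGES * PAGE_KB
slab_unreclaimable_kb = SLAB_UNRECLAIMABLE * PAGE_KB
below_min_watermark = free_kb < MIN_WATERMARK_KB

print(f"free: {free_kb} kB, min watermark: {MIN_WATERMARK_KB} kB")
print(f"below min watermark: {below_min_watermark}")
print(f"unreclaimable slab: {slab_unreclaimable_kb} kB of {total_ram_kb} kB RAM")
```

The free lists add up to the 384 kB the kernel reports, which is well below
the 1016 kB min watermark. Unreclaimable slab accounts for about 30 MB of the
64 MB of RAM.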
Yes - that stack trace is because the ag71xx driver can't allocate the
memory for a skb structure. Unlike the wireless driver though, the
ag71xx_poll function simply returns immediately with ENOMEM. I had no
real success in tracing what the equivalent is in ath9k.
I noticed a possible issue in ath9k_rx_tasklet: if
bf->bf_mpdu == NULL (bf being an Atheros-specific buffer type) you could
potentially get an infinite loop. I can't tell, though, whether that can ever
occur in reality. I *think* it uses a list of skb structures
preallocated at init-time for incoming frames, but I'm still trying to
interpret that part of the code. (The exact behaviour is
hardware-dependent.)
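To illustrate the kind of hazard meant here, the following is a toy Python
model, not the ath9k code. All names in it are hypothetical. It only assumes
a receive loop that looks at the buffer at the head of a queue and skips it
when its payload is missing (None stands in for bf->bf_mpdu == NULL). If the
skip does not also take the buffer off the queue, the loop sees the same
buffer forever. An iteration cap is added so the sketch terminates.

```python
from collections import deque


def drain(queue, advance_on_skip, max_iterations=1000):
    """Toy receive loop (hypothetical, for illustration only).

    Returns (processed_items, terminated). 'terminated' is False when the
    iteration cap was hit, i.e. the real loop would have spun forever.
    """
    processed = []
    iterations = 0
    while queue:
        iterations += 1
        if iterations > max_iterations:
            return processed, False  # cap reached: loop made no progress
        buf = queue[0]  # peek at the head without removing it
        if buf is None:  # stands in for bf->bf_mpdu == NULL
            if advance_on_skip:
                queue.popleft()  # drop the bad buffer, then move on
            continue  # without the popleft above, the head never changes
        processed.append(queue.popleft())
    return processed, True


stuck = drain(deque(["frame-a", None, "frame-b"]), advance_on_skip=False)
fixed = drain(deque(["frame-a", None, "frame-b"]), advance_on_skip=True)
print("skip without advancing:", stuck)
print("skip and advance:      ", fixed)
```

Whether the real driver can ever reach that state depends on how its buffers
are set up, which this model says nothing about.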
The way I interpret my latest test results is that the "assumed leak"
should be restricted to the wireless driver. Does that sound right to you? Also, with
cerowrt 3.3.6-2 even 16MB seems to be too much for /tmp. I will see what happens if I add
some swap space to the router; I hope it will then be quite happy with 31MB /tmp and actual
usage of that space :). Since Dave only recommends full tftp reflashes, maybe the update
scenario is not such a big issue for cerowrt?
I'll leave that to Dave to say - I was assuming that the firmware would
be stored in memory first and then flashed. (There's always tftp at
boot time as an alternative flashing method.)
--
Robert Bradley
_______________________________________________
Cerowrt-devel mailing list
[email protected]
https://lists.bufferbloat.net/listinfo/cerowrt-devel