Hi ASPj,

My name is Y.B, and I have run into a problem similar to the one you
describe below:

> However, I ran into another bug.
> The network itself was fine, but as soon as I started to generate
> traffic with lots of pings, certain nodes that had 8 to 12 active
> neighbors, ran out of kernel memory!
> slabinfo told me that all the memory had been eaten by kmalloc calls.
> kmemleak found some suspected leaks from skb_copy in
> ieee80211_rx_handlers.
> So I enabled some mac80211 debugging, mounted debugfs and discovered,
> that the queues were filling up:
> mndii-04:/sys/kernel/debug/ieee80211/phy3# cat queues
> 00: 0x00000001/48992
> 01: 0x00000000/0
> 02: 0x00000001/0
> 03: 0x00000000/0
> mndii-04:/sys/kernel/debug/ieee80211/phy3# grep "" statistics/rx*
> statistics/rx_expand_skb_head:0
> statistics/rx_expand_skb_head2:0
> statistics/rx_handlers_drop:42279
> statistics/rx_handlers_drop_defrag:0
> statistics/rx_handlers_drop_nullfunc:0
> statistics/rx_handlers_drop_passive_scan:0
> statistics/rx_handlers_drop_short:0
> statistics/rx_handlers_fragments:0
> statistics/rx_handlers_queued:293706
>
> Adding some printk lines to rx.c it seemed that most of those packets
> came from ieee80211_rx_h_action, hitting the "queue:" label there.
>
> Fun thing is, as soon as the first few nodes in crowded areas (therefor
> with many neighbors) crashed due to memory problems, the other nodes,
> having less neighbors then, slowly started emptying their queues.
My test nodes are ARM boards with an S3C2440, and the test kernel is
2.6.35.4.

I advise you to use Wireshark or tcpdump to see which packets are on the
air while the memory is being eaten. I found a burst of PREQs on the air
whenever free memory was decreasing. These PREQs are retransmitted many
times over, and that is what causes the memory problem. I think the PREQ
processing is the culprit.

The code below belongs to the function hwmp_route_info_get() in
mesh_hwmp.c (the code is from kernel 3.0). I put % in front of the
important lines.

//////
      if (memcmp(orig_addr, sdata->vif.addr, ETH_ALEN) == 0) {
              /* This MP is the originator, we are not interested in this
               * frame, except for updating transmitter's path info.
               */
              process = false;
              fresh_info = false;
      } else {
              mpath = mesh_path_lookup(orig_addr, sdata);
              if (mpath) {
                      spin_lock_bh(&mpath->state_lock);
                      if (mpath->flags & MESH_PATH_FIXED)
                              fresh_info = false;
%                     else if ((mpath->flags & MESH_PATH_ACTIVE) &&
%                         (mpath->flags & MESH_PATH_SN_VALID)) {
                              if (SN_GT(mpath->sn, orig_sn) ||
                                  (mpath->sn == orig_sn &&
                                   new_metric >= mpath->metric)) {
                                      process = false;
                                      fresh_info = false;
                              }
                      }
              }
///

In the lines marked %, (mpath->flags & MESH_PATH_SN_VALID) must be true
before process and fresh_info can be set to false. If the flag is not
set, an old or duplicate PREQ is never rejected here. I think this is the
reason the PREQs are retransmitted many times, until the TTL reaches 0 or
the nodes crash. I see no need to test (mpath->flags & MESH_PATH_SN_VALID)
here, so I removed that test and the memory problem disappeared.

Pay close attention to the function hwmp_route_info_get(). With the help
of Wireshark, I hope you can solve your problem soon.

Cheers,
Y.B
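
P.S. To make the effect of the two % lines easier to see, here is a rough
user-space sketch of just that decision. It is not kernel code. The struct
path_model, the struct decision, the function classify_preq() and its
require_sn_valid switch are names I made up for illustration, the flag
values are placeholders, and SN_GT is a simplified stand-in for the kernel
macro. I also assume that process and fresh_info both start out true
before this check runs.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder flag values for illustration only; the real definitions
 * live in the mac80211 mesh headers. */
#define MESH_PATH_ACTIVE   (1u << 0)
#define MESH_PATH_SN_VALID (1u << 2)
#define MESH_PATH_FIXED    (1u << 3)

/* Simplified stand-in: true when sequence number x is newer than y. */
#define SN_GT(x, y) ((int32_t)((x) - (y)) > 0)

/* Hypothetical, cut-down model of a mesh path entry. */
struct path_model {
	uint32_t flags;
	uint32_t sn;
	uint32_t metric;
};

/* Hypothetical holder for the two booleans the kernel code sets. */
struct decision {
	bool process;
	bool fresh_info;
};

/* Models only the "else" branch of the excerpt (the frame was not
 * originated by this node). require_sn_valid = true keeps the test on
 * MESH_PATH_SN_VALID, false drops it as in my change. */
struct decision classify_preq(const struct path_model *mpath,
			      uint32_t orig_sn, uint32_t new_metric,
			      bool require_sn_valid)
{
	/* Assumption: both values default to true before the check. */
	struct decision d = { true, true };

	if (!mpath)
		return d;

	if (mpath->flags & MESH_PATH_FIXED) {
		d.fresh_info = false;
	} else if ((mpath->flags & MESH_PATH_ACTIVE) &&
		   (!require_sn_valid ||
		    (mpath->flags & MESH_PATH_SN_VALID))) {
		/* Known path is newer, or equally new and no worse. */
		if (SN_GT(mpath->sn, orig_sn) ||
		    (mpath->sn == orig_sn &&
		     new_metric >= mpath->metric)) {
			d.process = false;
			d.fresh_info = false;
		}
	}
	return d;
}
```

In this model, an active path without MESH_PATH_SN_VALID that receives a
PREQ with the same sequence number and the same metric is still processed
when require_sn_valid is true, and is dropped when it is false.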
_______________________________________________
Devel mailing list
Devel@lists.open80211s.org
http://open80211s.com/mailman/listinfo/devel