Hi there,

> Thanks for the great bug report.  This looks more like a contention
> problem than a leak.  I would say frames are queued for deferred
> processing but the workqueue (ieee80211_iface_work()) doesn't get the
> opportunity to run and dequeue them.  Is it easy for you to try to
> confirm this hypothesis?

It is definitely NOT a leak. kmemleak just pointed me in the right direction:
the memory was being eaten by skbufs.
As soon as several nodes go down, the remaining ones not only empty their
queues but also free the associated memory, so kmemleak goes back to 0 MB for
the kmalloc stuff.

The hack proposed by 游波 does indeed fix that problem:
--- a/net/mac80211/mesh_hwmp.c
+++ b/net/mac80211/mesh_hwmp.c
@@ -387,8 +387,7 @@ static u32 hwmp_route_info_get(struct ieee80211_sub_if_data *sdata,
                        spin_lock_bh(&mpath->state_lock);
                        if (mpath->flags & MESH_PATH_FIXED)
                                fresh_info = false;
-                       else if ((mpath->flags & MESH_PATH_ACTIVE) &&
-                           (mpath->flags & MESH_PATH_SN_VALID)) {
+                       else if (mpath->flags & MESH_PATH_ACTIVE) {
                                if (SN_GT(mpath->sn, orig_sn) ||
                                    (mpath->sn == orig_sn &&
                                     new_metric >= mpath->metric)) {
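
For anyone following along, here is a minimal standalone sketch of how I read
the changed condition. It is not the kernel code: struct path_sketch, the flag
values and the require_sn_valid switch are made up for illustration, and it
assumes fresh_info starts out true, which the hunk above does not show. sn_gt()
is a wrap-around-safe sequence number comparison in the spirit of SN_GT.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values only, NOT the kernel's definitions. */
#define MESH_PATH_ACTIVE   0x01u
#define MESH_PATH_SN_VALID 0x02u
#define MESH_PATH_FIXED    0x04u

/* Hypothetical stand-in for the fields of struct mesh_path used here. */
struct path_sketch {
	uint32_t flags;
	uint32_t sn;
	uint32_t metric;
};

/* True if sequence number x is newer than y, tolerating 32-bit wrap-around. */
static bool sn_gt(uint32_t x, uint32_t y)
{
	return (int32_t)(y - x) < 0;
}

/*
 * Returns whether newly received route info should be treated as fresh.
 * require_sn_valid = true  models the condition before the patch,
 * require_sn_valid = false models the condition after the patch.
 * Assumption: info counts as fresh unless one of the checks rejects it.
 */
static bool fresh_info(const struct path_sketch *p, uint32_t orig_sn,
		       uint32_t new_metric, bool require_sn_valid)
{
	uint32_t need = MESH_PATH_ACTIVE;

	if (require_sn_valid)
		need |= MESH_PATH_SN_VALID;

	if (p->flags & MESH_PATH_FIXED)
		return false;	/* fixed paths are never updated */

	if ((p->flags & need) != need)
		return true;	/* staleness check skipped entirely */

	/* Stored path is newer, or equally new with a metric that is no worse. */
	if (sn_gt(p->sn, orig_sn) ||
	    (p->sn == orig_sn && new_metric >= p->metric))
		return false;

	return true;
}
```

With an active path that does not have MESH_PATH_SN_VALID set, the first
variant never reaches the sequence number comparison, while the second one
does and can reject older info.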

I am currently running your git version plus the above patch, and for the last
hour it has been working FINE! Workqueues are empty, and no huge amounts of
memory are allocated. Every node is pinging every other node, so all possible
routes are established.

I'll give an SMP build a try in the next few days to see if the locking issues
I had are also gone.

Thanks!
ASPj

PS: If you want me to test some new code, just drop me a line. I can keep
stuff running for days on our testbed, and I can also run automated iperf,
ping, sniffing and info tasks (regularly dumping neighbors, routes, dBm,
bitrates, etc).

Here are some nice pictures:
Our network running the open11s git in idle state, values are signal strength 
in dBm: http://aspj.aircrack-ng.org/o11s/idle.png
The same network after the memory consumption bug ate a few nodes (notice the 
greatly reduced link count!): http://aspj.aircrack-ng.org/o11s/crashed.png
And finally after the patch, this time showing the routes used (an arrow's
color is the route's destination, an arrow's target is the next hop):
http://aspj.aircrack-ng.org/o11s/patched_routes.png
[mndii = "Mesh Node II", nodes 18, 19 and 20 are NOT outside of the building 
but one floor above the others]
_______________________________________________
Devel mailing list
Devel@lists.open80211s.org
http://open80211s.com/mailman/listinfo/devel
