Hi, ASPJ

>The hack proposed by ?? does indeed fix that problem:
>--- a/net/mac80211/mesh_hwmp.c
>+++ b/net/mac80211/mesh_hwmp.c
>@@ -387,8 +387,7 @@ static u32 hwmp_route_info_get(struct ieee80211_sub_if_data *sdata,
>                        spin_lock_bh(&mpath->state_lock);
>                        if (mpath->flags & MESH_PATH_FIXED)
>                                fresh_info = false;
>-                       else if ((mpath->flags & MESH_PATH_ACTIVE) &&
>-                           (mpath->flags & MESH_PATH_SN_VALID)) {
>+                       else if (mpath->flags & MESH_PATH_ACTIVE) {
>                                if (SN_GT(mpath->sn, orig_sn) ||
>                                    (mpath->sn == orig_sn &&
>                                     new_metric >= mpath->metric)) {
>
>I am currently running your git version plus the above patch, and for the last 
>hour, it just works FINE! Workqueues are empty, and no huge amounts of memory 
>allocated. Every node is pinging every other node, so all possible routes are 
>established.

Glad to see that this hack works, but I am not sure it is the best way to
solve your problem.
I hope you can test the changed code for longer to see whether it is stable.
The (mpath->flags & MESH_PATH_SN_VALID) check itself may not be the source of
the problem.
The implementation problem is as follows:
(1) Node N first receives a PREQ whose originator is STA A, and N's route to A
is updated with MESH_PATH_SN_VALID=1.
(2) Node N then receives a PREQ whose transmitter is STA A (now STA A is NOT
the originator!), and N's route to A is updated with MESH_PATH_SN_VALID=0.
If steps (1) and (2) alternate, a PREQ can be retransmitted many times, until
the nodes involved run out of memory or the TTL of the PREQ reaches 0.
That is how the problem you saw arises.
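
To make steps (1) and (2) above concrete, here is a small user-space sketch of
the freshness test from the diff. It is only a toy model, not the kernel code:
struct toy_path, toy_is_fresh() and the require_sn_valid switch are names I
made up for illustration, the flag values are arbitrary, and SN_GT is modelled
on the macro used in mesh_hwmp.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative flag values only; the real ones live in the mac80211 headers. */
#define MESH_PATH_ACTIVE   0x1u
#define MESH_PATH_SN_VALID 0x2u
#define MESH_PATH_FIXED    0x4u

/* Wrap-around aware "x is newer than y", modelled on SN_GT in mesh_hwmp.c. */
#define SN_GT(x, y) ((int32_t)((y) - (x)) < 0)

/* Hypothetical, stripped-down stand-in for struct mesh_path. */
struct toy_path {
	uint32_t flags;
	uint32_t sn;
	uint32_t metric;
};

/*
 * Returns true if a received PREQ would be treated as fresh information
 * (and therefore processed and forwarded again).
 * require_sn_valid = true  models the check before the patch,
 * require_sn_valid = false models the check after the patch.
 */
static bool toy_is_fresh(const struct toy_path *p, uint32_t orig_sn,
			 uint32_t new_metric, bool require_sn_valid)
{
	bool compare;

	if (p->flags & MESH_PATH_FIXED)
		return false;

	/* Only compare SN and metric when the stored path qualifies. */
	compare = (p->flags & MESH_PATH_ACTIVE) &&
		  (!require_sn_valid || (p->flags & MESH_PATH_SN_VALID));

	if (compare &&
	    (SN_GT(p->sn, orig_sn) ||
	     (p->sn == orig_sn && new_metric >= p->metric)))
		return false;	/* stale or worse copy: drop it */

	return true;		/* treated as fresh: forwarded again */
}
```

In this model, take a path to A that is ACTIVE with sn=5 and metric=10, whose
SN_VALID bit has just been cleared by step (2). A duplicate PREQ from A with
orig_sn=5 and a worse metric of 20 then passes the unpatched check
(require_sn_valid = true) and is forwarded again, while the patched check
(require_sn_valid = false) drops it. That would match what ASPJ observed, but
it is only my reading of the code.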
 
>PS: If you want me to test some new code, just drop a line, I can keep stuff 
>running for days on our testbed, I can also perform automated iperf, ping, 
>sniffing and info tasks (regularly dumping neighbors, routes, dBm, bitrates, 
>etc).
>
>Here are some nice pictures:
>Our network running the open11s git in idle state, values are signal strength 
>in dBm: http://aspj.aircrack-ng.org/o11s/idle.png
>The same network after the memory consumption bug ate a few nodes (notice the 
>greatly reduced link count!): http://aspj.aircrack-ng.org/o11s/crashed.png
>And finally after the patch, this time showing the routes used (arrow's color 
>is the routes destination, arrow's target is next hop):
>http://aspj.aircrack-ng.org/o11s/patched_routes.png
>[mndii = "Mesh Node II", nodes 18, 19 and 20 are NOT outside of the building 
>but one floor above the others]

You are doing very nice testing work!
I have also done some tests with BATMAN and 802.11s, but one thing bothers me
a lot: why is BATMAN so much better than 802.11s?
When the originator and the target are several hops apart (6 or 7 hops in my
tests), BATMAN performs much better than 802.11s.
The only reason I can think of is that in 802.11s the routing management
frames (PREQ, PREP, PERR, RANN) may be lost over the air, whereas such loss
does not affect BATMAN.
I see you have tested the performance of both 802.11s and BATMAN.
Would you mind sharing some of your test results, and your analysis of BATMAN
compared with 802.11s, with me?
 
Thanks in advance!
You Bo

_______________________________________________
Devel mailing list
Devel@lists.open80211s.org
http://open80211s.com/mailman/listinfo/devel
