Re: [B.A.T.M.A.N.] ap_isolation issue
Hello Andre,

On Thu, Jan 26, 2012 at 04:46:44 -0500, Andre Courchesne wrote:
> Ok, did a bit of tcpdump and my test was the following:
> [cut]

Thank you for reporting this issue and sending us the dumps. Actually, it is very hard to link the AP isolation mechanism to this problem.

First of all, I would like to make a simple test. Could you please dump the packets received on T003 and check whether the ARP request (the first one that receives no reply) reaches the node (T003)? In particular, I would suggest using wireshark (it can parse batman packets) and sniffing at the same time on both the physical interface used by the mesh (I'd say wlan0) and bat0. Then tell us whether you see the ARP request on both interfaces, on wlan0 only, or on neither of them.

Another question: why are you using the bridge loop avoidance? If possible, I would like you to disable every optional feature you have, in order to have the cleanest testbed possible. I know that you already tried to disable it without effect, but it is better to perform the test without any other noise.

Cheers,

-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto Che Guevara
Re: [B.A.T.M.A.N.] ap_isolation issue
Hi Antonio,

Thanks for the reply. I will attempt these tests today and provide you with as much feedback as possible.

We are using loop avoidance because in some (if not all) of the installations we will be doing, there will be multiple APs wired to the same network to provide redundancy. And if we move to LoopAvoidance-II, if I understand correctly, it should also provide bandwidth balancing, correct?

On 2012-01-27, at 8:36 AM, Antonio Quartulli wrote:
> [cut]
Re: [B.A.T.M.A.N.] ap_isolation issue
On Fri, Jan 27, 2012 at 09:30:05AM -0500, Andre Courchesne wrote:
> Hi Antonio,
>
> Thanks for the reply. I will attempt these tests today and provide you
> as much feedback as possible.

Thank you.

> We are using loop avoidance because in some (if not all) installations
> we will be doing there will be multiple AP wired to the same network to
> provide redundancy. And if we move to LoopAvoidance-II if I understand
> correctly it should also provide bandwidth balancing, correct?

It depends on what you mean. With LoopAvoidance-II, incoming traffic will enter the LAN through the best node (as determined by the TQ), while the current implementation, IIRC, provides only one fixed entry point.

But please, don't mix topics :)

Cheers,

> On 2012-01-27, at 8:36 AM, Antonio Quartulli wrote:
> > [cut]

-- 
Antonio Quartulli

..each of us alone is worth nothing..
Ernesto Che Guevara
[B.A.T.M.A.N.] [PATCH] batman-adv: encourage batman to take shorter routes by changing the default hop penalty
Signed-off-by: Marek Lindner <lindner_ma...@yahoo.de>
---
 soft-interface.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/soft-interface.c b/soft-interface.c
index 2ffdc74..7548762 100644
--- a/soft-interface.c
+++ b/soft-interface.c
@@ -836,7 +836,7 @@ struct net_device *softif_create(const char *name)
 	atomic_set(&bat_priv->gw_sel_class, 20);
 	atomic_set(&bat_priv->gw_bandwidth, 41);
 	atomic_set(&bat_priv->orig_interval, 1000);
-	atomic_set(&bat_priv->hop_penalty, 10);
+	atomic_set(&bat_priv->hop_penalty, 30);
 	atomic_set(&bat_priv->log_level, 0);
 	atomic_set(&bat_priv->fragmentation, 1);
 	atomic_set(&bat_priv->bcast_queue_left, BCAST_QUEUE_LEN);
-- 
1.7.5.4
Re: [B.A.T.M.A.N.] [PATCH] batman-adv: encourage batman to take shorter routes by changing the default hop penalty
On Fri, Jan 27, 2012 at 11:11:55PM +0800, Marek Lindner wrote:
> Signed-off-by: Marek Lindner <lindner_ma...@yahoo.de>
> ---
>  soft-interface.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
>
> diff --git a/soft-interface.c b/soft-interface.c
> index 2ffdc74..7548762 100644
> --- a/soft-interface.c
> +++ b/soft-interface.c
> @@ -836,7 +836,7 @@ struct net_device *softif_create(const char *name)
>  	atomic_set(&bat_priv->gw_sel_class, 20);
>  	atomic_set(&bat_priv->gw_bandwidth, 41);
>  	atomic_set(&bat_priv->orig_interval, 1000);
> -	atomic_set(&bat_priv->hop_penalty, 10);
> +	atomic_set(&bat_priv->hop_penalty, 30);

Hi Marek,

Do you have any performance analysis to show this is really helpful and not harmful?

I've seen indoor results where I had to reduce the hop penalty, otherwise BATMAN was taking a short path which worked badly. By reducing the hop penalty, and so encouraging it to take more hops, I got usable routes.

I see a danger here that this could break working networks, so maybe it needs justification?

Thanks,
	Andrew
Re: [B.A.T.M.A.N.] [PATCH] batman-adv: encourage batman to take shorter routes by changing the default hop penalty
Hi Andrew,

> Do you have any performance analysis to show this is really helpful
> and not harmful?
>
> I've seen indoor results where i had to reduce the hop penalty,
> otherwise BATMAN was taking a short path which worked badly. By
> reducing the hop penalty, so encouraging it to take more hops, i got
> usable routes.
>
> I see the danger here this could break working networks, so maybe it
> needs justification?

As a matter of fact, I do believe it is helpful. In various networks (more than a dozen) I have seen that batman would largely favor multi-hop routes, thus reducing the overall throughput. By setting the hop penalty to a higher value I regained some of that performance. The networks are still up and running - I can show them to you if you are interested.

So, you had to reduce the default value of 10 to something even smaller? A hop penalty of 10 results in a penalty of 4% per hop, roughly the equivalent of 2 lost packets (62/64). That does not sound like very much to me. Can you explain your test setup a little more?

Nevertheless, this patch was intended to get a discussion going. The main problem I have been seeing in the last weeks is that OGM broadcasts have a hard time estimating the link quality / throughput on 11n devices. I'll also try to hack a proof of concept for an RSSI influence on the routing and see if that has a better effect.

Regards,
Marek
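The 4% figure above can be checked with a few lines of arithmetic. The following is only a sketch in Python, not the kernel code: it assumes the usual batman-adv scaling, where TQ is an integer between 0 and 255 and each forwarding hop multiplies it by (255 - hop_penalty) / 255. The function names are made up for this illustration.

```python
TQ_MAX_VALUE = 255  # assumption: TQ is expressed on a 0..255 scale


def apply_hop_penalty(tq: int, hop_penalty: int) -> int:
    """Scale a TQ value down by one hop's penalty, using integer arithmetic."""
    return tq * (TQ_MAX_VALUE - hop_penalty) // TQ_MAX_VALUE


def penalty_percent(hop_penalty: int) -> float:
    """Fraction of TQ removed by one hop, as a percentage."""
    return 100.0 * hop_penalty / TQ_MAX_VALUE


def lost_packet_percent(received: int, window: int) -> float:
    """Loss percentage when `received` out of `window` packets arrive."""
    return 100.0 * (window - received) / window


if __name__ == "__main__":
    for penalty in (10, 30):  # old and proposed defaults from the patch
        print(
            f"hop_penalty={penalty}: "
            f"{penalty_percent(penalty):.1f}% per hop, "
            f"a perfect TQ of 255 becomes {apply_hop_penalty(255, penalty)}"
        )
    print(f"2 lost packets out of 64: {lost_packet_percent(62, 64):.1f}%")
```

Under these assumptions, a penalty of 10 removes a little under 4% per hop, which is in the same range as losing 2 packets out of 64, while the proposed value of 30 removes close to 12% per hop.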
Re: [B.A.T.M.A.N.] [PATCH] batman-adv: encourage batman to take shorter routes by changing the default hop penalty
Hi all,

2012/1/27 Marek Lindner <lindner_ma...@yahoo.de>:
> Hi Andrew,
>
>> Do you have any performance analysis to show this is really helpful
>> and not harmful?
>>
>> I've seen indoor results where i had to reduce the hop penalty,
>> otherwise BATMAN was taking a short path which worked badly. By
>> reducing the hop penalty, so encouraging it to take more hops, i got
>> usable routes.
>>
>> I see the danger here this could break working networks, so maybe it
>> needs justification?

I have experienced the same situation in some tests, and I agree with Andrew when he says that some form of justification is necessary.

> as a matter of fact I do believe it is helpful. In various networks
> (more than a dozen) I have seen that batman would largely favor
> multi-hop routes, thus reducing the overall throughput. By setting it
> to a higher value I regained some of its performance. The networks are
> still up running - I can show them to you if you are interested.
>
> So, you had to reduce the default value of 10 to something even
> smaller ? A hop penalty of 10 results in a penalty of 4% per hop. A
> rough equivalent of 2 lost packets (62/64). Does not sound very much
> to me. Can you explain your test setup a little more ?
>
> Nevertheless, this patch was intended to get a discussion going. The
> main problem I have been seeing in the last weeks is that OGM
> broadcasts have a hard time estimating the link quality / throughput
> on 11n devices. I'll also try to hack a proof of concept for an rssi
> influence on the routing and see if that has a better effect.

The problems with TQ emerge as device data rates increase, because TQ does not distinguish between fast and slow links, especially in mixed b/g/n networks. We all know that broadcast losses say almost nothing about link speed or load. The only way to improve the TQ metric is a cross-layer implementation, as I have already tried (considering only bandwidth) in my tests.

Obviously this means breaking universal compatibility with network interfaces, although in my opinion the use of mac80211 and cfg80211 can limit this problem.

> Regards,
> Marek

Regards,
Daniele
Re: [B.A.T.M.A.N.] [PATCH] batman-adv: encourage batman to take shorter routes by changing the default hop penalty
> So, you had to reduce the default value of 10 to something even
> smaller ? A hop penalty of 10 results in a penalty of 4% per hop. A
> rough equivalent of 2 lost packets (62/64). Does not sound very much
> to me. Can you explain your test setup a little more ?

These observations come from a research project carried out together with Hochschule Luzern. There is some flyer-like documentation at:

www.hslu.ch/t-spawn-project-description_en.pdf

It is a deployable indoor network. The tests I made were with a mesh of 6 nodes, deployed in a chain. The deployment is intelligent, and made independently of BATMAN. It uses packet probing at the lowest coding rate to ensure there is always a link to two nodes upstream in the chain. So you walk along with 5 nodes in your hand. When the algorithm determines that the link upstream to two nodes has reached a threshold, it tells you to deploy the next mesh node. We kept doing this, along the corridor, down the steps, along another corridor, through a fire door, etc., until we were out of nodes.

Then iperf was used to measure the traffic from one end of the chain to the other. With the default hop penalty we got poor performance. With the traceroute facility of batctl, we could see it was route flipping between 3 hops and 4 hops. When it used 3 hops, the packet loss was too high and we got poor bandwidth. Then it went up to 4 hops, the packet loss was lower, so we got more bandwidth. This was repeatable, with each deployment we made.

Then we tried with a lower hop penalty. I think it was 5, but I don't remember. BATMAN then used 5 hops and there was no route flipping. We also got the best iperf bandwidth from end to end of the chain.

The fact that BATMAN was route flipping with a hop penalty of 10 suggests to me that the links had similar TQ. So OGMs are getting through at the lowest coding rate, but data packets are having trouble, maybe because they are full MTU, or because the wifi driver is using the wrong coding rate. I suspect the TQ measurements as determined by OGMs are more optimistic than what actual data packets experience. Linus has played with different NDP packet sizes, and I think he ended up with big packets in order to give more realistic TQ measurements.

Unfortunately, this project is now finished. I do have access to the hardware, but no time allocated to play with it :-(

> Nevertheless, this patch was intended to get a discussion going.

Well, I'm happy to take part in the discussion. I've no idea if our use case is typical or an edge case, so comments and results from other people's networks would be useful.

If this change is to help 11n, maybe some more intelligence would be better: ask the wireless stack whether the interface is a/b/g or n, and from that determine what hop penalty should be used?

	Andrew
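To make the trade-off in this thread concrete, here is a toy model in Python of how the hop penalty can tip the choice between a short path over weaker links and a long path over better links. It is a deliberate simplification, not the batman-adv algorithm: path quality is taken as the product of per-link qualities, and the penalty is applied once per forwarding node. The link qualities (0.80 and 0.89) are invented for illustration and are not measurements from the deployment described above.

```python
TQ_MAX_VALUE = 255  # assumption: penalty is expressed on the 0..255 TQ scale


def path_tq(link_qualities, hop_penalty):
    """End-to-end quality (0.0..1.0) of a path in this simplified model.

    Each link multiplies the quality by its own delivery ratio, and each
    forwarding node (one fewer than the number of links) applies the penalty.
    """
    scale = (TQ_MAX_VALUE - hop_penalty) / TQ_MAX_VALUE
    quality = 1.0
    for link in link_qualities:
        quality *= link
    return quality * scale ** (len(link_qualities) - 1)


def preferred_path(paths, hop_penalty):
    """Name of the path with the highest modelled end-to-end quality."""
    return max(paths, key=lambda name: path_tq(paths[name], hop_penalty))


# Hypothetical chain: three long, lossy links versus five short, cleaner ones.
PATHS = {
    "3 hops": [0.80] * 3,
    "5 hops": [0.89] * 5,
}

if __name__ == "__main__":
    for penalty in (5, 10, 30):
        scores = {name: round(path_tq(links, penalty), 3)
                  for name, links in PATHS.items()}
        print(f"hop_penalty={penalty}: {scores} -> {preferred_path(PATHS, penalty)}")
```

With these made-up numbers the two paths score almost the same at a penalty of 10, the longer path is preferred at 5, and the shorter path is preferred at 30. That is at least consistent with both reports in this thread: a low penalty settling on more hops, and a higher penalty steering away from multi-hop routes.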