On Sat, May 12, 2018 at 02:57:23AM +0800, Marek Lindner wrote:
> Whenever a new VLAN is created on top of batman virtual interfaces
> the batman-adv kernel module creates internal structures to track
> the status of said VLAN. Amongst other things, the MAC address of
> the VLAN interface itself has to be stored.
> 
> Without this change a VLAN and its infrastructure could be created
> while the interface MAC address is not stored without triggering
> any error, thus creating issues in other parts of the code.
> 
> Prevent the VLAN from being created if the MAC address can not
> be stored.
> 
> Fixes: 952cebb57518 ("batman-adv: add per VLAN interface attribute framework")
> 
> Signed-off-by: Marek Lindner <[email protected]>

I tested this patch but so far could not spot any issues either in
dmesg or logread.

I've added these patches to a branch for Gluon here:

https://github.com/T-X/gluon/tree/tt-vlan-patched

And used these images (warning, they have my SSH public key added):

https://metameute.de/~tux/Freifunk/firmware/ffh-tt-patched/

I've tested with an isolated two-node setup for now.


I started playing with restarting the network multiple times:

~~~~~
root@freifunk-b0487ae7f31e:~# rm /tmp/vlan-test.log; trap '' SIGPIPE;
for i in `seq 1 30`; do
  echo "Starting network restart $i" >> /tmp/vlan-test.log;
  /etc/init.d/network restart; sleep 5;
  if batctl tl | grep " 0 \["; then
    echo "BROKEN - aborting" >> /tmp/vlan-test.log;
    batctl tl >> /tmp/vlan-test.log;
    sleep 3; echo "waiting..." >> /tmp/vlan-test.log;
    batctl tl >> /tmp/vlan-test.log;
    break;
  fi;
done; echo "finished" >> /tmp/vlan-test.log
~~~~~

And the result is the following - which looks odd?

~~~~~
root@freifunk-b0487ae7f31e:~# cat /tmp/vlan-test.log 
Starting network restart 1
Starting network restart 2
Starting network restart 3
BROKEN - aborting
[B.A.T.M.A.N. adv 2018.1, MainIF/MAC: primary0/66:c6:34:9d:58:43 (bat0/b0:48:7a:e7:f3:1e BATMAN_IV), TTVN: 1]
Client             VID Flags    Last seen (CRC       )
9a:86:17:9c:5f:4f   -1 [.P.X..]   0.000   (0x0ce60e81)
b0:48:7a:e7:f3:1e    0 [.PN...]   0.000   (0x00000000)
b0:48:7a:e7:f3:1e   -1 [.PN...]   0.000   (0x0ce60e81)
waiting...
[B.A.T.M.A.N. adv 2018.1, MainIF/MAC: primary0/66:c6:34:9d:58:43 (bat0/b0:48:7a:e7:f3:1e BATMAN_IV), TTVN: 2]
Client             VID Flags    Last seen (CRC       )
b0:48:7a:e7:f3:1e    0 [.P....]   0.000   (0xc4c7d9cf)
b0:48:7a:e7:f3:1e   -1 [.P....]   0.000   (0x62afdc24)
finished
~~~~~

However, this oddity seems to be temporary. The local TT now looks
just fine, without my having rebooted the node:

~~~~~
root@freifunk-b0487ae7f31e:~# batctl tl
[B.A.T.M.A.N. adv 2018.1, MainIF/MAC: primary0/66:c6:34:9d:58:43 (bat0/b0:48:7a:e7:f3:1e BATMAN_IV), TTVN: 4]
Client             VID Flags    Last seen (CRC       )
33:33:ff:40:f8:dc   -1 [.P....]   0.000   (0xd118c666)
b0:48:7a:e7:f3:1e    0 [.P....]   0.000   (0xc4c7d9cf)
33:33:00:00:00:02   -1 [.P....]   0.000   (0xd118c666)
33:33:ff:00:00:01   -1 [.P....]   0.000   (0xd118c666)
33:33:00:02:10:01   -1 [.P....]   0.000   (0xd118c666)
01:00:5e:00:00:01   -1 [.P....]   0.000   (0xd118c666)
b0:48:7a:e7:f3:1e   -1 [.P....]   0.000   (0xd118c666)
33:33:ff:e7:f3:1e   -1 [.P....]   0.000   (0xd118c666)
33:33:00:00:00:01   -1 [.P....]   0.000   (0xd118c666)
~~~~~

Or is it expected that a TT VLAN entry with an "N" flag will have
the CRC set to 0x00000000?
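
To catch this state more specifically than the `grep " 0 \["` above
does, something like the following could be used. It is only a minimal
sketch: the helper name tt_zero_crc is made up for illustration, and it
assumes the column layout of the 2018.1 `batctl tl` output quoted above,
with the CRC as the last field of each line.

```shell
# Hypothetical helper (not part of batctl): read `batctl tl` output on
# stdin and print "MAC VID" for every entry whose CRC is 0x00000000.
# Assumes the CRC is the last field, as in the 2018.1 output above.
tt_zero_crc() {
    awk '/\(0x00000000\)[ \t]*$/ { print $1, $2 }'
}
```

It would be called as `batctl tl | tt_zero_crc`; empty output would mean
that no local TT entry currently has a zero CRC.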

I also noticed that the VLAN 0 is added to bat0 by 8021q right
after bat0 gets created and activated:

~~~~~
Sun Feb 25 14:20:28 2018 kern.info kernel: [ 7852.985327] batman_adv: bat0: Adding interface: primary0
Sun Feb 25 14:20:28 2018 kern.info kernel: [ 7852.990712] batman_adv: bat0: Interface activated: primary0
Sun Feb 25 14:20:28 2018 kern.info kernel: [ 7853.025080] 8021q: adding VLAN 0 to HW filter on device bat0
Sun Feb 25 14:20:28 2018 daemon.notice netifd: Interface 'bat0' is enabled
Sun Feb 25 14:20:28 2018 kern.info kernel: [ 7853.038815] device bat0 entered promiscuous mode
Sun Feb 25 14:20:28 2018 kern.info kernel: [ 7853.043649] br-client: port 3(bat0) entered forwarding state
Sun Feb 25 14:20:28 2018 kern.info kernel: [ 7853.049388] br-client: port 3(bat0) entered forwarding state
Sun Feb 25 14:20:28 2018 daemon.notice netifd: Network device 'bat0' link is up
Sun Feb 25 14:20:28 2018 daemon.notice netifd: Interface 'bat0' has link connectivity
Sun Feb 25 14:20:28 2018 daemon.notice netifd: Interface 'bat0' is setting up now
Sun Feb 25 14:20:28 2018 daemon.notice netifd: Interface 'bat0' is now up
~~~~~

Which looks like it might have the potential for a race condition?
Also the "HW filter" remark by 8021q seems a bit odd as this is a
virtual interface, isn't it?

Regards, Linus
