Re: WDS stopped working in 21.02, looking for bug in netifd, how to patch?

2021-09-27 Thread Daniel Haid
There is a race condition between hostapd and netifd. Now that the bug is found, I could try to write a patch. But I do not know what the correct behaviour should be. Should netifd not add wlan0.sta1 to the bridge at all? If so, what is the best way to implement it? Or should hostapd be

Re: WDS stopped working in 21.02, looking for bug in netifd, BUG FOUND!

2021-09-23 Thread Daniel Haid
Hi everyone, I think I finally located the problem! There is a race condition between hostapd and netifd. In hostapd, src/drivers/driver_nl80211.c, look at the function i802_set_wds_sta. There are calls to 1) nl80211_create_iface and 2) linux_br_add_if. Now call 1) seems to trigger netifd

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-23 Thread Bastian Bittorf
On Thu, Sep 23, 2021 at 03:17:15PM +0200, Daniel Haid wrote: > Is there any way to dump a detailed state of the wlan driver in the kernel? > Or the state of netifd? Should I enable some debug options? at least you can try to debug with 2 terminals and running: iw event ip monitor bye, Bastian
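
The two-terminal suggestion above can also be folded into one log so that events from both tools are easier to order. The following is only a sketch: `tag_lines` is a hypothetical helper (not part of OpenWrt, iw or iproute2), and the log path in the usage note is an arbitrary choice.

```shell
#!/bin/sh
# tag_lines (hypothetical helper): prefix every line read from stdin with
# the current Unix time and a short tag, so output from several tools can
# be appended to one file and still be told apart.
tag_lines() {
    tag=$1
    while IFS= read -r line; do
        printf '%s %s %s\n' "$(date +%s)" "$tag" "$line"
    done
}
```

Assumed usage on the router, with each tool running in the background: `iw event | tag_lines iw >>/tmp/events.log &` and `ip monitor | tag_lines ip >>/tmp/events.log &`. Timestamps have one-second resolution, so events within the same second keep only the order in which they reached the file.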

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-23 Thread Daniel Haid
Another update: If I issue the following commands: 1. /etc/init.d/network restart 2. ip addr 3. ip addr 4. ip addr Then, in a "bad case", if the timing is right, 2. shows that the interface wlan0.sta1 is DOWN, 3. shows that it is UP and 4. shows that it is DOWN again. Then it stays DOWN. This
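
Running `ip addr` by hand several times, as described above, depends on lucky timing. A small polling loop is one way to record the DOWN/UP/DOWN sequence more reliably. This is a minimal sketch, not something posted in the thread: `watch_operstate` and the `SYSFS_NET` override are hypothetical names, and the loop reads the kernel's `/sys/class/net/<ifname>/operstate` file instead of parsing `ip addr` output.

```shell
#!/bin/sh
# watch_operstate (hypothetical helper): print the operstate of an
# interface COUNT times, DELAY seconds apart.
# SYSFS_NET is an assumed override so the loop can be tried without the
# real interface; it defaults to the kernel's sysfs directory.
# The body runs in a subshell so its variables do not leak to the caller.
watch_operstate() (
    ifname=$1
    count=${2:-10}
    delay=${3:-1}
    base=${SYSFS_NET:-/sys/class/net}
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ -r "$base/$ifname/operstate" ]; then
            state=$(cat "$base/$ifname/operstate")
        else
            # The interface does not exist (yet, or any more).
            state=missing
        fi
        echo "$i $ifname $state"
        i=$((i + 1))
        if [ "$i" -lt "$count" ]; then
            sleep "$delay"
        fi
    done
)
```

Assumed usage: `/etc/init.d/network restart; watch_operstate wlan0.sta1 20 1`. A one-second interval may still miss a short UP phase, so the delay may need to be reduced if the local `sleep` accepts fractional values.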

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-22 Thread Daniel Haid
can you please add this function on top of /lib/netifd/wireless/mac80211.sh Unfortunately, /tmp/foo is identical after good and bad boot, see below. There are three ways to trigger the bug (randomly, yesterday I thought the chance was about 50%, but today it felt much lower, about 5-10%): 1.

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-22 Thread Bastian Bittorf
On Wed, Sep 22, 2021 at 06:12:13PM +0200, Daniel Haid wrote: > Another update: can you please add this function on top of /lib/netifd/wireless/mac80211.sh #!/bin/sh iw() { local rc; command iw "$@"; rc=$?; echo "rc:$rc | iw $*" >>/tmp/foo; test $rc -eq 0 || command iw "$@" 2>>/tmp/foo; return

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-22 Thread Daniel Haid
Another update: I put some logging code into the function interface_add_link. On every reboot the function interface_add_link is sometimes called for the device wlan0.sta1 and sometimes not. What I have seen is the following: When it is not called, the connection works. When it is called,

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-21 Thread Daniel Haid
Small update: Preventing the call to mdev->hotplug_ops->add (and replacing it with return 0) inside the function interface_add_link whenever it is called from interface_handle_link and the string name contains the substring ".sta" seems to "fix" the bug. What kind of hotplug_ops are called

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-21 Thread Daniel Haid
By the way, maybe I should add that both devices are GL.iNet GL-AR150. Also, the configs are only minimally different from the defaults. The only option that could be a bit unusual is having 802.11r enabled. And indeed, after disabling 802.11r, the bug occurs much less often. In fact,

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-21 Thread Daniel Haid
Can you please send me the config that you're using? I'd like to try to reproduce it myself. Find attached the config dumps of the AP and the client. They have been created with 21.02, but after flashing the snapshot on the AP I restored exactly this config (and the bug was still there).

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-21 Thread Felix Fietkau
On 2021-09-20 22:56, Daniel Haid wrote: > Felix, I took the last openwrt snapshot and compiled netifd from master > with your patch applied and installed it. > > Result: > After boot wlan0.sta1 was DOWN. > After "/etc/init.d/network restart" it was UP and the connection worked! > After another

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-20 Thread Daniel Haid
Felix, I took the last openwrt snapshot and compiled netifd from master with your patch applied and installed it. Result: After boot wlan0.sta1 was DOWN. After "/etc/init.d/network restart" it was UP and the connection worked! After another "/etc/init.d/network restart" it was DOWN again. After

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-20 Thread Daniel Haid
Please test if applying this change to netifd fixes the issue. I am currently building the toolchain for the current snapshot, so I can test on the current snapshot. So far I have only been able to test the patch on 21.02. Since the patch does not apply cleanly I tried two versions of the

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-20 Thread Felix Fietkau
On 2021-09-20 16:46, Daniel Haid wrote: > I have continued investigating. > > After all, it seems that the interface being down is just a symptom. > > I summarize my current findings: > > With the 21.02 netifd version, there seems to be a bug concerning WDS. > The bug has the following effect:

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-20 Thread Daniel Haid
I have continued investigating. After all, it seems that the interface being down is just a symptom. I summarize my current findings: With the 21.02 netifd version, there seems to be a bug concerning WDS. The bug has the following effect: I have openwrt 21.02 running on one system running

Re: WDS stopped working in 21.02, looking for bug in netifd

2021-09-20 Thread Daniel Haid
I have investigated a bit more. Even without the "fix", after each reboot there seems to be about a 50% chance of WDS working. To reliably reproduce the bug, it is necessary to do /etc/init.d/network restart with the WDS client connected. Now what I noticed is that using the netifd