On 26/06/2025 15:48, Alexey Klimov wrote:
Hi all,

After a long time of testing it seems the problem narrows down to qrb2210 rb1
and qrb4210 rb2 boards.

After booting, the board connects to the wifi network and after ~5-10
minutes it loses the connection (nothing in dmesg). A simple ping of another
machine on the local network doesn't work. After, I guess, around 5000
seconds the GROUP_KEY_HANDSHAKE_TIMEOUT message is printed:

[ 5064.093748] wlan0: deauthenticated from 8c:58:72:d4:d1:8d (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
[ 5067.083790] wlan0: authenticate with 8c:58:72:d4:d1:8d (local address=82:95:77:b1:05:a5)
[ 5067.091971] wlan0: send auth to 8c:58:72:d4:d1:8d (try 1/3)
[ 5067.100192] wlan0: authenticated
[ 5067.104734] wlan0: associate with 8c:58:72:d4:d1:8d (try 1/3)
[ 5067.113230] wlan0: RX AssocResp from 8c:58:72:d4:d1:8d (capab=0x11 status=0 aid=2)
[ 5067.193624] wlan0: associated

and after that the wireless connection works for ~5-10 minutes and then the
cycle repeats. A longer log with firmware versions, ids, etc. is at the end of
this email [1]. Simply bringing wlan0 down and up again fixes things for a few
minutes.

iw wlan0 link reports the following when wireless network is working:

root@rb1:~# iw wlan0 link
Connected to 8c:58:72:d4:d1:8d (on wlan0)
         SSID: void
         freq: 5300
         RX: 45802 bytes (424 packets)
         TX: 71260 bytes (125 packets)
         signal: -66 dBm
         rx bitrate: 433.3 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 1

         bss flags:      short-slot-time
         dtim period:    1
         beacon int:     100

and this when wireless connection doesn't work:

Connected to 8c:58:72:d4:d1:8d (on wlan0)
         SSID: void
         freq: 5300
         RX: 850615 bytes (9623 packets)
         TX: 20372 bytes (247 packets)
         signal: -61 dBm
         rx bitrate: 6.0 MBit/s

         bss flags:      short-slot-time
         dtim period:    1
         beacon int:     100

This was tested with three different routers and different wifi networks.
Other devices here do not exhibit this behaviour.
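
The two `iw wlan0 link` outputs above differ mainly in the reported rx
bitrate, so one way to timestamp the moment the link degrades is to poll that
command and parse the result. Below is a minimal sketch in Python, assuming
the output has already been captured as text. The helper names
`parse_iw_link` and `rx_bitrate_mbps` are hypothetical, and treating a drop to
6.0 MBit/s as a sign of the stall is an assumption based only on these two
samples.

```python
import re

def parse_iw_link(text):
    """Turn `iw <dev> link` output into a dict of field name -> value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        # Skip blank lines and the "Connected to <bssid>" header line.
        if sep and not key.startswith("Connected to"):
            fields[key.strip()] = value.strip()
    return fields

def rx_bitrate_mbps(fields):
    """Return the rx bitrate in MBit/s as a float, or None if absent."""
    m = re.match(r"([\d.]+) MBit/s", fields.get("rx bitrate", ""))
    return float(m.group(1)) if m else None

# Samples copied from the two outputs in this email.
WORKING = """Connected to 8c:58:72:d4:d1:8d (on wlan0)
         SSID: void
         freq: 5300
         signal: -66 dBm
         rx bitrate: 433.3 MBit/s VHT-MCS 9 80MHz short GI VHT-NSS 1
"""
STALLED = """Connected to 8c:58:72:d4:d1:8d (on wlan0)
         SSID: void
         freq: 5300
         signal: -61 dBm
         rx bitrate: 6.0 MBit/s
"""

for name, sample in (("working", WORKING), ("stalled", STALLED)):
    f = parse_iw_link(sample)
    print(name, f["signal"], rx_bitrate_mbps(f))
```

Run in a loop against live `iw wlan0 link` output, this could log when the
rate changes, which might narrow down how long after association the stall
starts.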

Any hints on how to debug this? Any debug switches I can toggle to debug this?
I am happy to provide more info or test changes/patches if any.

Thanks in advance.
Best regards,
Alexey

[1]:

[    7.758934] ath10k_snoc c800000.wifi: qmi chip_id 0x120 chip_family 0x4007 board_id 0xff soc_id 0x40670000
[    7.769740] ath10k_snoc c800000.wifi: qmi fw_version 0x337703a3 fw_build_timestamp 2023-10-14 01:26 fw_build_id QC_IMAGE_VERSION_STRING=WLAN.HL.3.3.7.c2-00931-QCAHLSWMTPLZ-1
[   11.086123] ath10k_snoc c800000.wifi: wcn3990 hw1.0 target 0x00000008 chip_id 0x00000000 sub 0000:0000
[   11.095622] ath10k_snoc c800000.wifi: kconfig debug 0 debugfs 0 tracing 0 dfs 0 testmode 0
[   11.103998] ath10k_snoc c800000.wifi: firmware ver  api 5 features wowlan,mgmt-tx-by-reference,non-bmi,single-chan-info-per-channel crc32 a79c5b24
[   11.144810] ath10k_snoc c800000.wifi: htt-ver 3.128 wmi-op 4 htt-op 3 cal file max-sta 32 raw 0 hwcrypto 1
[   11.230894] ath10k_snoc c800000.wifi: invalid MAC address; choosing random
[   11.238128] ath: EEPROM regdomain: 0x0
[   11.242060] ath: EEPROM indicates default country code should be used
[   11.248582] ath: doing EEPROM country->regdmn map search
[   11.253950] ath: country maps to regdmn code: 0x3a
[   11.258805] ath: Country alpha2 being used: US
[   11.263466] ath: Regpair used: 0x3a
[   15.355756] wlan0: authenticate with 8c:58:72:d4:d1:8d (local address=82:95:77:b1:05:a5)
[   15.363942] wlan0: send auth to 8c:58:72:d4:d1:8d (try 1/3)
[   15.372142] wlan0: authenticated
[   15.377928] wlan0: associate with 8c:58:72:d4:d1:8d (try 1/3)
[   15.386338] wlan0: RX AssocResp from 8c:58:72:d4:d1:8d (capab=0x11 status=0 aid=2)
[   15.466514] wlan0: associated
[   23.167251] systemd-journald[195]: Oldest entry in /var/log/journal/ec3e0078e5e0499bac67949f3edf3fcf/system.journal is older than the configured file retention duration (1month), suggesting rotation.
[   23.185186] systemd-journald[195]: /var/log/journal/ec3e0078e5e0499bac67949f3edf3fcf/system.journal: Journal header limits reached or header out-of-date, rotating.
[   31.750177] l5: disabling
[   31.753382] l11: disabling
[   31.756385] l16: disabling
[ 5064.093748] wlan0: deauthenticated from 8c:58:72:d4:d1:8d (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)

So.

I wonder what state the GTK offload is in here.

        WMI_GTK_OFFLOAD_CMDID = WMI_CMD_GRP(WMI_GRP_GTK_OFL),

drivers/net/wireless/ath/ath10k/wmi-tlv.c: cfg->gtk_offload_max_vdev = __cpu_to_le32(2);

Try toggling that offload off or on and see what happens.

[ 5067.083790] wlan0: authenticate with 8c:58:72:d4:d1:8d (local address=82:95:77:b1:05:a5)
[ 5067.091971] wlan0: send auth to 8c:58:72:d4:d1:8d (try 1/3)
[ 5067.100192] wlan0: authenticated
[ 5067.104734] wlan0: associate with 8c:58:72:d4:d1:8d (try 1/3)
[ 5067.113230] wlan0: RX AssocResp from 8c:58:72:d4:d1:8d (capab=0x11 status=0 aid=2)
[ 5067.193624] wlan0: associated
[10437.346541] wlan0: deauthenticated from 8c:58:72:d4:d1:8d (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
[10440.340111] wlan0: authenticate with 8c:58:72:d4:d1:8d (local address=82:95:77:b1:05:a5)
[10440.348408] wlan0: send auth to 8c:58:72:d4:d1:8d (try 1/3)
[10440.356698] wlan0: authenticated
[10440.361077] wlan0: associate with 8c:58:72:d4:d1:8d (try 1/3)
[10440.369516] wlan0: RX AssocResp from 8c:58:72:d4:d1:8d (capab=0x11 status=0 aid=2)
[10440.446661] wlan0: associated
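
It may help to know how long each association lasts before the AP sends the
GROUP_KEY_HANDSHAKE_TIMEOUT deauth. Below is a minimal sketch in Python that
pairs each "associated" line with the next "deauthenticated" line and reports
the gap. The function name `session_lengths` is hypothetical, the interface
name wlan0 is hard-coded to match this log, and the sample contains only
timestamps copied from the log above.

```python
import re

# Matches e.g. "[ 5067.193624] wlan0: associated" but not "associate with".
EVENT_RE = re.compile(
    r"^\[\s*(\d+\.\d+)\]\s+wlan0: (associated|deauthenticated)\b"
)

def session_lengths(dmesg_text):
    """Seconds from each 'associated' to the following 'deauthenticated'."""
    lengths = []
    assoc_time = None
    for line in dmesg_text.splitlines():
        m = EVENT_RE.match(line)
        if not m:
            continue
        t, event = float(m.group(1)), m.group(2)
        if event == "associated":
            assoc_time = t
        elif assoc_time is not None:
            lengths.append(t - assoc_time)
            assoc_time = None
    return lengths

LOG = """\
[   15.466514] wlan0: associated
[ 5064.093748] wlan0: deauthenticated from 8c:58:72:d4:d1:8d (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
[ 5067.193624] wlan0: associated
[10437.346541] wlan0: deauthenticated from 8c:58:72:d4:d1:8d (Reason: 16=GROUP_KEY_HANDSHAKE_TIMEOUT)
"""

for seconds in session_lengths(LOG):
    print(f"{seconds:.1f} s ({seconds / 60:.1f} min)")
```

For the two cycles in this log the gaps come out at roughly 5049 s and 5370 s.
Comparing those against the group rekey interval configured on the AP, if it
is known, might show whether the deauth lines up with a GTK rekey.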

You can put another device on your WiFi network into monitor mode and sniff what is taking place.

I've used Kali Linux on an RPi for this purpose in the past and it was very easy to do.

https://cyberlab.pacific.edu/resources/lab-network-wireless-sniffing

Another thing to try is to run this same test on an open, unencrypted link.

If we really suspect firmware here, let's try switching off firmware offload features one by one, starting with GTK offload.

---
bod
