On Tue, May 11, 2021 at 10:28:21AM +0200, Stefan Sperling wrote:
> On Sun, May 09, 2021 at 08:28:14PM +0000, Mikolaj Kucharski wrote:
> > ..and in case timestamps may give a bit more clue, here is example from
> > one of the accesspoints:
>
> Yes, this is insightful:
>
> > # grep -F ieee80211_encap /var/log/messages
> > 2021-05-09T11:35:31.155Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 4
>
> At this point the node was in COLLECT state.
>
> This happens if the node deliberately disassociates or if the AP doesn't
> see activity from the node for a while. Which should be a common case for
> phones that walk in and out of range.
>
> > 2021-05-09T11:36:07.957Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 4
> > 2021-05-09T11:36:07.964Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 4
> > 2021-05-09T19:14:34.117Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 4
> > 2021-05-09T19:14:34.567Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 4
> > 2021-05-09T19:14:37.467Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 4
> > 2021-05-09T19:14:44.217Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 4
>
> My guess is that all of these frames were broadcasts which get sent to
> all nodes. It doesn't look like a data stream that targets this particular
> node since the packets are far apart from each other.
>
> > 2021-05-09T19:14:49.267Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 2
>
> The phone came back, and is in state AUTH.
> Not sure why it stays in this state for a while. Maybe it has decided to
> try a different AP after all and never made it back into ASSOC state this
> time around?
Unlikely, at this location there is only one access point which is
configured on that mobile phone. Tha OnePlus mobile doesn't have
other choice than that pce-0041 with athn(4).
I have machine with unused iwn(4) in that location and I started
recently below tcpdumploop script (don't have tcpdumps during
above logs, as I wrote tcpdumploop after above testing):
---8<---
#!/bin/sh
#
# $OpenBSD$
daemon="/bin/sh /opt/bin/tcpdumploop"
daemon_logger="daemon.info"
. /etc/rc.d/rc.subr
rc_reload=NO
rc_bg=YES
rc_stop() {
local parent=$(pgrep -T "${daemon_rtable}" -xf "${pexp}")
if [ -z "${parent}" ]
then
# daemon not running?
return 1
fi
pkill -T "${daemon_rtable}" -P "${parent}"
kill "${parent}"
}
rc_cmd $1
--->8---
---8<---
#!/bin/sh
set -e
umask 077
if='iwn0'
chan='6'
count='1000000'
dir='/home/tcpdumps'
ifconfig $if mediaopt monitor
ifconfig $if chan $chan
ifconfig $if up
while :
do
find "$dir" -maxdepth 1 -type f -iname '*.pcap' -mtime +30 -ls -delete
|| :
ts=`date -u '+%Y%m%d-%H%M%S'`
rnd=`openssl rand -hex 8`
pcap="$if-$ts-$rnd.pcap"
echo "saving $count packets on $if interface to $dir/$pcap"
rm -vf "$dir/$pcap"
tcpdump -c $count -n -i $if -y IEEE802_11_RADIO -s 4096 -w "$dir/$pcap"
done
--->8---
Maybe it will be useful in the future? I never used tcpdump(8) with that
iwn(4), so not sure how reliable it works:
pc1# dmesg | grep -e iwn
iwn0 at pci2 dev 0 function 0 "Intel WiFi Link 5100" rev 0x00: msi, MIMO 1T2R,
MoW, address 00:24:d6:92:e9:f0
pc1# ifconfig iwn0
iwn0: flags=8943<UP,BROADCAST,RUNNING,PROMISC,SIMPLEX,MULTICAST> mtu 1500
lladdr 00:24:d6:92:e9:f0
index 2 priority 4 llprio 3
groups: wlan
media: IEEE802.11 autoselect monitor
status: active
ieee80211: nwid ""
I also didn't even look at those tcpdumps, so not sure what's inside
them:
pc1# ls -lh /home/tcpdumps/
total 7300448
-rw------- 1 root wheel 589M May 10 04:49
iwn0-20210510-042148-5d168b68375a1877.pcap
-rw------- 1 root wheel 424M May 10 12:08
iwn0-20210510-044948-3755e8e9d53154b6.pcap
-rw------- 1 root wheel 348M May 10 16:54
iwn0-20210510-120844-f15d72c75800b384.pcap
-rw------- 1 root wheel 278M May 10 20:55
iwn0-20210510-165442-71b19d49c5c85f67.pcap
-rw------- 1 root wheel 486M May 11 08:18
iwn0-20210510-205507-8d0cac31a70203be.pcap
-rw------- 1 root wheel 255M May 11 18:05
iwn0-20210511-081820-f54c1262501bc25b.pcap
-rw------- 1 root wheel 546M May 11 18:24
iwn0-20210511-180553-5a6b939813426388.pcap
-rw------- 1 root wheel 639M May 11 19:37
iwn0-20210511-182414-5232129b4cc89335.pcap
> > 2021-05-09T19:14:50.317Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 2
> > 2021-05-09T19:14:51.317Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 2
> > 2021-05-09T19:14:53.317Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 2
> > 2021-05-09T19:14:57.318Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 2
>
> Perhaps the phone finally associated here.
>
> > 2021-05-09T19:16:06.170Z pce-0041 /bsd: ieee80211_encap: data frame for
> > node c0:ee:fb:33:f0:11 in state 4
>
> And after a couple of minutes the AP has given up on the phone again.
>
> We don't have clear confirmation yet that this is indeed triggering the
> sw crypto panic but I believe it could. In any case, sending data frames
> to non-associated nodes is wrong and is clearly occurring here.
> I will commit the fix. Thank you for testing!
--
Regards,
Mikolaj