Re: athn0 works in 6.6, fails in 6.7

2020-07-13 Thread Stefan Sperling
On Sun, Jul 12, 2020 at 08:25:43AM -0500, Tim Chase wrote:
> That said, I haven't yet applied your patch below (the second patch
> you sent me) since everything was working with the first one.  Would
> you still like me to apply that patch and test too?  Or are you
> satisfied with the patch you sent to tech@ ?

I expect that either patch in isolation would fix it. I'll commit both.

Generally, 'ic' represents the wireless interface and 'ni' is used when
referring to a peer on the wifi network. So this second diff is valid,
even though combined with the other patch it boils down to a style fix.

Thank you for your patience in getting this resolved :)

> > diff 21633c8848e72769b1658114d9c706c177040a2a /usr/src
> > blob - 70dbaf422bd5ddc4567ef61322718cb15e7453ce
> > file + sys/dev/ic/ar5008.c
> > --- sys/dev/ic/ar5008.c
> > +++ sys/dev/ic/ar5008.c
> > @@ -1005,7 +1005,7 @@ ar5008_rx_process(struct athn_softc *sc, struct mbuf_l
> > (ni->ni_flags & IEEE80211_NODE_RXPROT) &&
> > (ni->ni_rsncipher == IEEE80211_CIPHER_CCMP ||
> > (IEEE80211_IS_MULTICAST(wh->i_addr1) &&
> > -   ic->ic_rsngroupcipher == IEEE80211_CIPHER_CCMP))) {
> > +   ni->ni_rsngroupcipher == IEEE80211_CIPHER_CCMP))) {
> > if (ar5008_ccmp_decap(sc, m, ni) != 0) {
> > ifp->if_ierrors++;
> > ieee80211_release_node(ic, ni);
> 
> 
> 



Re: athn0 works in 6.6, fails in 6.7

2020-07-11 Thread Stefan Sperling
On Sat, Jul 11, 2020 at 10:22:10AM -0500, Tim Chase wrote:
> On 2020-07-11 10:34, Stefan Sperling wrote:
> > On Fri, Jul 10, 2020 at 10:20:07AM -0500, Tim Chase wrote:
> > > On 2020-07-10 09:58, Stefan Sperling wrote:  
> > > > Does it work if you eliminate DHCP and assign static IPs?  
> > > 
> > > It indeed allowed me to talk to other devices on the network
> > > over wifi.  So that's at least some form of progress from where
> > > the physical layer wasn't even giving a carrier originally.  
> > 
> > With static IP working, it's quite likely that looking at the wifi
> > layer won't lead us anywhere. This is looking more and more like a
> > DHCP problem, rather than a wifi problem. I'm at a loss as to where
> > we could look next.
> 
> Since it seems like we hit a dead-end, I went ahead and took your other
> suggestion, switching my router so it's AES only, no TKIP and it's
> back to working.  So something in that TKIP confusion that you
> identified seems to be dropping the DHCP reply on the floor.

Thanks for confirming that CCMP-only (i.e. WPA2-only) works as expected.
What is a bit puzzling is that a CCMP+TKIP AP I used for trying to
reproduce the issue seemed to work as expected for me.

I have now found out that ic_rsngroupcipher wasn't updated and always
remained set to CCMP (see the other patch I sent to tech@ earlier today).
Turns out athn is relying on that value to be correct.
If the AP uses TKIP for group-addressed frames, then the check shown in
the patch below might result in a false-positive for TKIP frames.
Does this patch make the device work against your AP in both modes?

Something else we could still try is to only enable offloading if the AP
uses CCMP as a groupcipher. We could completely fall back to software crypto
in all the other cases. But let's try this simple patch first, since this
keeps pairwise CCMP accelerated in hardware.

diff 21633c8848e72769b1658114d9c706c177040a2a /usr/src
blob - 70dbaf422bd5ddc4567ef61322718cb15e7453ce
file + sys/dev/ic/ar5008.c
--- sys/dev/ic/ar5008.c
+++ sys/dev/ic/ar5008.c
@@ -1005,7 +1005,7 @@ ar5008_rx_process(struct athn_softc *sc, struct mbuf_l
(ni->ni_flags & IEEE80211_NODE_RXPROT) &&
(ni->ni_rsncipher == IEEE80211_CIPHER_CCMP ||
(IEEE80211_IS_MULTICAST(wh->i_addr1) &&
-   ic->ic_rsngroupcipher == IEEE80211_CIPHER_CCMP))) {
+   ni->ni_rsngroupcipher == IEEE80211_CIPHER_CCMP))) {
if (ar5008_ccmp_decap(sc, m, ni) != 0) {
ifp->if_ierrors++;
ieee80211_release_node(ic, ni);
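
To make the false positive concrete, here is a minimal sketch in C of only the part of the condition that is visible in the diff context above. It is a model, not the driver code: the enum, the function name sketch_wants_ccmp_decap and its parameters are made up for illustration, and any terms of the real condition that lie outside the diff context are left out. The patch changes only where the group cipher value comes from: the interface-wide ic_rsngroupcipher before, the per-peer ni_rsngroupcipher after.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the net80211 cipher constants. */
enum sk_cipher { SK_CIPHER_TKIP, SK_CIPHER_CCMP };

/*
 * Model of the visible part of the check in ar5008_rx_process().
 *   rxprot    - peer has IEEE80211_NODE_RXPROT set
 *   pairwise  - models ni->ni_rsncipher
 *   multicast - models IEEE80211_IS_MULTICAST(wh->i_addr1)
 *   group     - group cipher value the check consults:
 *               ic->ic_rsngroupcipher before the patch,
 *               ni->ni_rsngroupcipher after it
 * Returns true when the frame would be handed to ar5008_ccmp_decap().
 */
bool
sketch_wants_ccmp_decap(bool rxprot, enum sk_cipher pairwise,
    bool multicast, enum sk_cipher group)
{
    return rxprot &&
        (pairwise == SK_CIPHER_CCMP ||
        (multicast && group == SK_CIPHER_CCMP));
}
```

In this model, take a multicast frame from a peer whose pairwise cipher is not CCMP. With a stale interface-wide value of CCMP, sketch_wants_ccmp_decap(true, SK_CIPHER_TKIP, true, SK_CIPHER_CCMP) returns true, which is the false positive. Passing the peer's actual group cipher, SK_CIPHER_TKIP, returns false.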



Re: athn0 works in 6.6, fails in 6.7

2020-07-09 Thread Stefan Sperling
On Wed, Jul 08, 2020 at 06:37:31PM -0500, Tim Chase wrote:
> > Are you able to run tcpdump on the AP itself or on the network
> > behind the AP? Do you see DHCP requests from the athn client
> > arriving there? 
> 
> The wireless router (also serving DHCP/DNS) is ISP-provided hardware.
> I put Wireshark on my wife's Debian machine and ran it with a filter
> for
> 
>   ether src {MAC} || ether dst {MAC}
> 
> for the MAC of my Mini10 in question and was pleasantly surprised
> to get signs of DHCP chatter on the network (I was fully expecting
> a lack of link to mean I'd get nothing).  I've attached the pcap.gz
> which hopefully gives you useful stuff to hammer at.

What this shows is that DHCP to the broadcast address arrives on
the LAN. The next thing we need to know is why your client doesn't
seem to be receiving any DHCP responses.
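
For sorting through such a capture, a small C helper can label DHCP packets by direction. This is only a sketch: the names are made up for illustration, and it relies on nothing but the well-known DHCP UDP ports (67 for the server, 68 for the client), given in host byte order.

```c
#include <stdint.h>

/* Direction of a UDP packet as far as DHCP is concerned. */
enum dhcp_dir {
    DHCP_NONE,              /* not DHCP by port numbers */
    DHCP_CLIENT_TO_SERVER,  /* e.g. DISCOVER, REQUEST */
    DHCP_SERVER_TO_CLIENT   /* e.g. OFFER, ACK */
};

/* Classify by UDP source and destination port (host byte order). */
enum dhcp_dir
dhcp_direction(uint16_t sport, uint16_t dport)
{
    if (sport == 68 && dport == 67)
        return DHCP_CLIENT_TO_SERVER;
    if (sport == 67 && dport == 68)
        return DHCP_SERVER_TO_CLIENT;
    return DHCP_NONE;
}
```

A LAN-side capture that shows both directions for the client's MAC would suggest the server is answering and the replies get lost on the wireless leg.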

> Please let me know if I can scrounge any other useful information for
> you.  Thanks again!

Can you provide another pcap showing the same situation from the
athn client's perspective?

  tcpdump -n -i athn0 -y IEEE802_11_RADIO -s 4096 -w /tmp/athn.pcap



Re: athn0 works in 6.6, fails in 6.7

2020-07-07 Thread Mikolaj Kucharski
Hi all,

On Tue, Jul 07, 2020 at 07:21:10PM +0200, Stefan Sperling wrote:
> That is looking good. The AP is supposed to rotate the group key hourly.
> Since you are able to receive group key updates it follows that the
> pairwise WPA2 crypto for regular data traffic is now working, too.
> Group key updates are encrypted with the pairwise key and sent to
> each client individually. Back when your athn client didn't see group
> key updates it implied that pairwise crypto wasn't working.
> 
> Are CCMP decryption error counters in netstat -W athn0 close to zero now?
> If so, then I have successfully reproduced and fixed one bug. But some other
> problem remains which we need to diagnose next.
> 
> Are you able to run tcpdump on the AP itself or on the network behind the AP?
> Do you see DHCP requests from the athn client arriving there?
> 

Joining the thread as I'm also facing the same issue on my APU which has
athn(4) configured as a client. That client connects to athn(4) on
OpenBSD running in hostap mode.

This morning I upgraded (both hostap and client) to:

OpenBSD 6.7-current (GENERIC.MP) #333: Mon Jul  6 15:01:05 MDT 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

and I haven't seen the problem so far. It happened for me rarely. The first
occurrence started in the middle of the night, so I didn't notice it
until morning, when I noticed that one of my PC Engines boards was not
reachable. I didn't actually investigate it much.

Please CC me in this thread as I'm not subscribed to the list. I will be
traveling during second half of this week, but when I'm at home, I'm
happy to help investigate this problem. Below are two dmesgs, from the client
and the hostap.


This is dmesg from client:

OpenBSD 6.7-current (GENERIC.MP) #333: Mon Jul  6 15:01:05 MDT 2020
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 4261076992 (4063MB)
avail mem = 4116893696 (3926MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xdffb7020 (7 entries)
bios0: vendor coreboot version "4.0.7" date 02/28/2017
bios0: PC Engines APU2
acpi0 at bios0: ACPI 4.0
acpi0: sleep states S0 S1 S2 S3 S4 S5
acpi0: tables DSDT FACP SSDT APIC HEST SSDT SSDT HPET
acpi0: wakeup devices PWRB(S4) PBR4(S4) PBR5(S4) PBR6(S4) PBR7(S4) PBR8(S4) UOH1(S3) UOH3(S3) UOH5(S3) XHC0(S4)
acpitimer0 at acpi0: 3579545 Hz, 32 bits
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: AMD GX-412TC SOC, 998.27 MHz, 16-30-01
cpu0: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
cpu0: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 16-way L2 cache
cpu0: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu0: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu0: TSC skew=0 observed drift=0
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: AMD GX-412TC SOC, 998.17 MHz, 16-30-01
cpu1: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
cpu1: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 16-way L2 cache
cpu1: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu1: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu1: TSC skew=31 observed drift=0
cpu1: smt 0, core 1, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: AMD GX-412TC SOC, 998.14 MHz, 16-30-01
cpu2: FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,CX16,SSE4.1,SSE4.2,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TOPEXT,DBKP,PERFTSC,PCTRL3,ITSC,BMI1,XSAVEOPT
cpu2: 32KB 64b/line 2-way I-cache, 32KB 64b/line 8-way D-cache, 2MB 64b/line 16-way L2 cache
cpu2: ITLB 32 4KB entries fully associative, 8 4MB entries fully associative
cpu2: DTLB 40 4KB entries fully associative, 8 4MB entries fully associative
cpu2: TSC skew=-50 observed drift=0
cpu2: smt 0, core 2, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: AMD GX-412TC SOC, 998.24 MHz, 16-30-01
cpu3: 

Re: athn0 works in 6.6, fails in 6.7

2020-07-07 Thread Tim Chase
On 2020-07-07 08:50, Stefan Sperling wrote:
> The diff has since been committed. Could you try a snapshot just
> to see if that works?

I pulled down the latest snap (#290 Jul 6 15:31:39 according to
dmesg), rebooted into it and grabbed a shell, then issued

  # ifconfig athn0 debug nwid "$MYSSID" wpakey "$MYKEY" up

and got largely the same output as before, eventually hitting

  athn: associated with {MAC} ssid "{MYSSID}" channel 3 start 1Mb short preamble short slot time

and getting through all 4/4 of the 4-way handshake, and doing 1/2
and 2/2 of the group key handshake, stopping at the

  athn0: sending msg 2/2 of the group key handshake to {MAC}

This is further than 6.7 was getting (6.6 got this far).  However,
the output stops there.  An ifconfig says (hand transcribed)

athn0: flags=8847 mtu 1500
  lladdr {mymac}
  llprio 3
  groups: wlan
  media: IEEE802.11 autoselect (DS1 mode 11g)
  status: active
  ieee80211: nwid {myssid} chan 3 bssid {routermac} -43dBm wpakey wpaprotos wpa2 wpaakms psk wpaciphers ccmp wpagroupcipher ccmp

About hourly(?) since issuing those, my console has given me another
pair of

  athn0: received msg 1/2 of the group key handshake from {router MAC}
  athn0: sending msg 2/2 of the group key handshake to {router MAC}

if that matters, though it seems successful.

On 6.6, after getting that successful 2/2-group-key-handshake,
issuing `dhclient athn0` would get me a connection but trying it on
this snap gives me the same as 6.7

 # dhclient athn0
 athn0: no lease... sleeping

If you need further information, I can do my best to provide it.

> Lacking any better ideas I may try to install i386 on an APU and
> see if I can still reproduce it then...

My offer to rebuild with debugging patches holds as well if needed.

Thanks again!

-tim






Re: athn0 works in 6.6, fails in 6.7

2020-07-07 Thread Stefan Sperling
On Tue, Jul 07, 2020 at 11:31:08AM -0500, Tim Chase wrote:
> On 2020-07-07 08:50, Stefan Sperling wrote:
> > The diff has since been committed. Could you try a snapshot just
> > to see if that works?
> 
> I pulled down the latest snap (#290 Jul 6 15:31:39 according to
> dmesg), rebooted into it and grabbed a shell, then issued
> 
>   # ifconfig athn0 debug nwid "$MYSSID" wpakey "$MYKEY" up
> 
> and got largely the same output as before, eventually hitting
> 
>   athn: associated with {MAC} ssid "{MYSSID}" channel 3 start 1Mb short preamble short slot time
> 
> and getting through all 4/4 of the 4-way handshake, and doing 1/2
> and 2/2 of the group key handshake, stopping at the
> 
>   athn0: sending msg 2/2 of the group key handshake to {MAC}
> 
> This is further than 6.7 was getting (6.6 got this far).  However,
> the output stops there.  An ifconfig says (hand transcribed)
> 
> athn0: flags=8847 mtu 1500
>   lladdr {mymac}
>   llprio 3
>   groups: wlan
>   media: IEEE802.11 autoselect (DS1 mode 11g)
>   status: active
>   ieee80211: nwid {myssid} chan 3 bssid {routermac} -43dBm wpakey wpaprotos wpa2 wpaakms psk wpaciphers ccmp wpagroupcipher ccmp
> 
> About hourly(?) since issuing those, my console has given me another
> pair of
> 
>   athn0: received msg 1/2 of the group key handshake from {router MAC}
>   athn0: sending msg 2/2 of the group key handshake to {router MAC}
> 
> if that matters, though it seems successful.

That is looking good. The AP is supposed to rotate the group key hourly.
Since you are able to receive group key updates it follows that the
pairwise WPA2 crypto for regular data traffic is now working, too.
Group key updates are encrypted with the pairwise key and sent to
each client individually. Back when your athn client didn't see group
key updates it implied that pairwise crypto wasn't working.

Are CCMP decryption error counters in netstat -W athn0 close to zero now?
If so, then I have successfully reproduced and fixed one bug. But some other
problem remains which we need to diagnose next.

Are you able to run tcpdump on the AP itself or on the network behind the AP?
Do you see DHCP requests from the athn client arriving there?



Re: athn0 works in 6.6, fails in 6.7

2020-07-07 Thread Stefan Sperling
On Mon, Jul 06, 2020 at 03:27:37PM -0500, Tim Chase wrote:
> On 2020-07-03 20:33, Stefan Sperling wrote:
> > On Wed, Jul 01, 2020 at 06:14:50PM -0500, Tim Chase wrote:
> > > Just wanted to check back in if there's anything else I can get
> > > you to help diagnose this.  
> > 
> > Please try this patch. It fixes the issue for me.
> 
> Sorry it has taken so long to get back to you.  Building a kernel
> took ~12hr (finished 2020-07-03 as seen in the dmesg at the bottom
> of this reply) and the rest of the base system took another 2 days.
> 
> I pulled the latest CVS sources, applied the patch, built, and
> rebooted into the new kernel but unfortunately get the same
> non-working results as before.  In case it matters, this is an
> AR9281 which I imagine should be pretty close to the AR9280 that
> you tested.
> 

The diff has since been committed. Could you try a snapshot just
to see if that works?

Lacking any better ideas I may try to install i386 on an APU and
see if I can still reproduce it then...



Re: athn0 works in 6.6, fails in 6.7

2020-06-13 Thread Matej Nanut
> > On 2020-06-12 09:19, Stefan Sperling wrote:
> > > Can you please boot into 6.7, let it fail to connect, and then get
> > > the output of the following command and show it to me?
> > >
> > > netstat -W athn0

Hello, I seem to have the same issue running latest -current from
ftp2.eu.openbsd.org:
$ uname -a
OpenBSD asus 6.7 GENERIC.MP#268 amd64

I executed the following two commands on a fresh boot:
$ doas ifconfig athn0 debug nwid  wpakey  up
$ doas dhclient athn0
and waited for "... sleeping".

My (greater than 0) counters from "netstat -W athn0" are:
14 input packets with mismatched channel
2 input eapol-key packets
1 active scan started
43 ccmp decryption errors
1 HT negotiation failure because peer does not support MCS 0-7
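
Counters like these are easiest to compare when they are pulled out of saved output by name. Here is a minimal sketch in C, assuming each line has the shape shown above (a decimal count, one space, then the description, optionally indented and without trailing blanks). counter_value is a made-up helper name, not part of netstat.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up one counter in saved "netstat -W" style text.
 * Returns the count from the line whose description equals `label`,
 * or -1 if no such line exists.
 */
long
counter_value(const char *text, const char *label)
{
    size_t lablen = strlen(label);
    const char *line = text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        const char *stop = eol ? eol : line + strlen(line);
        const char *p = line;
        char *end;

        /* Skip indentation, then require the line to start with a digit. */
        while (p < stop && (*p == ' ' || *p == '\t'))
            p++;
        if (p < stop && isdigit((unsigned char)*p)) {
            long n = strtol(p, &end, 10);

            /* The rest of the line after one space must match the label. */
            if (end < stop && *end == ' ' &&
                (size_t)(stop - (end + 1)) == lablen &&
                strncmp(end + 1, label, lablen) == 0)
                return n;
        }
        line = eol ? eol + 1 : stop;
    }
    return -1;
}
```

Calling it on output saved right after boot and again after one failed connection attempt, then subtracting the two results, gives the per-attempt increase of a counter such as "ccmp decryption errors".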

And here's an excerpt from "dmesg" after scans:
athn0: SCAN -> AUTH
athn0: sending auth to 00:23:69:ea:49:3d on channel 7 mode 11g
athn0: AUTH -> ASSOC
athn0: sending assoc_req to 00:23:69:ea:49:3d on channel 7 mode 11g
athn0: ASSOC -> RUN
athn0: associated with 00:23:69:ea:49:3d ssid "" channel 7 start 1Mb long preamble short slot time
athn0: missed beacon threshold set to 30 beacons, beacon interval is 100 TU
athn0: received msg 1/4 of the 4-way handshake from 00:23:69:ea:49:3d
athn0: sending msg 2/4 of the 4-way handshake to 00:23:69:ea:49:3d
athn0: received msg 3/4 of the 4-way handshake from 00:23:69:ea:49:3d
athn0: sending msg 4/4 of the 4-way handshake to 00:23:69:ea:49:3d

I hope this is in any way useful.
Matej



Re: athn0 works in 6.6, fails in 6.7

2020-06-13 Thread Stefan Sperling
On Fri, Jun 12, 2020 at 07:13:39AM -0500, Tim Chase wrote:
> On 2020-06-12 09:19, Stefan Sperling wrote:
> > Can you please boot into 6.7, let it fail to connect, and then get
> > the output of the following command and show it to me?
> > 
> > netstat -W athn0
> 
> The machine currently has a bit less than 13hr of uptime to put this
> output in perspective of frequency in case that matters.

It would help me to see what these counters look like after boot + 1 failed
connection attempt. What I'll be doing is going through the code and checking where
relevant counters get incremented. Then maybe, just maybe, I will be able
to deduce where your problem is coming from.

I'm not going to try to do that based on counters which have been updating
for 13 hours.



Re: athn0 works in 6.6, fails in 6.7

2020-06-12 Thread Stefan Sperling
On Thu, Jun 11, 2020 at 06:08:31PM -0500, Tim Chase wrote:
> and it works fine there.  The big distinction is that after 
> 
>   sending msg 4/4 of the 4-way handshake
> 
> my `ifconfig athn0 debug` output is giving me these two lines in the
> 6.6 bsd.rd:
> 
>   received msg 1/2 of the group key handshake from [MAC]
>   sending msg 2/2 of the group key handshake to [same MAC]
> 
> that never happen in the 6.7 (both -RELEASE and -CURRENT snap) output.

Can you please boot into 6.7, let it fail to connect, and then get the
output of the following command and show it to me?

netstat -W athn0