Re: [External] : pf_state_key_unref: panic: kernel diagnostic assertion "refcnt != ~0" failed: file "/usr/src/sys/kern/kern_synch.c", line 826

2021-05-06 Thread Alexandr Nedvedicky
Hello Olivier and Sebastien,

I took a look at old version of pf_state_key_link_reverse(),
before my commit [1] changed it. The answer was there:

7368 void
7369 pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev)
7370 {
7371         /* Note that sk and skrev may be equal, then we refcount twice. */
7372         KASSERT(sk != skrev);
7373         KASSERT(sk->reverse == NULL);
7374         KASSERT(skrev->reverse == NULL);
7375         sk->reverse = pf_state_key_ref(skrev);
7376         skrev->reverse = pf_state_key_ref(sk);
7377 }

The comment at line 7371 says that sk and skrev may be equal.

The current pf_state_key_link_reverse(), which is in the tree
right now, behaves differently from the old function when sk and skrev are
identical:

7368 void
7369 pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev)
7370 {
7371         struct pf_state_key *old_reverse;
7372 
7373         old_reverse = atomic_cas_ptr(&sk->reverse, NULL, skrev);
7374         if (old_reverse != NULL)
7375                 KASSERT(old_reverse == skrev);
7376         else
7377                 pf_state_key_ref(skrev);
7378 
7379         old_reverse = atomic_cas_ptr(&skrev->reverse, NULL, sk);
7380         if (old_reverse != NULL)
7381                 KASSERT(old_reverse == sk);
7382         else
7383                 pf_state_key_ref(sk);
7384 }

Note that if both keys are identical, we grab just a single
reference. The story goes like this:

we assume sk == skrev

sk->reverse == NULL when we do  atomic_cas() at line 7373
the cas sets sk->reverse to skrev, which is equal to sk

the old_reverse is NULL when we reach 7374, therefore we
grab reference to skrev

at line 7379 we do cas_ptr on skrev->reverse.
because sk == skrev, the skrev->reverse is not NULL.

the old_reverse is not NULL when we reach 7380,
old_reverse equals to skrev (and sk)

old_reverse == skrev == sk

in this case we do KASSERT(old_reverse == sk) at line 7381,
the assertion holds so we just return from the function
without getting extra reference. this is wrong.
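
To make the walkthrough above concrete, here is a minimal user-space sketch
in C. It is a toy model, not the kernel code: struct toy_key, toy_key_ref()
and the toy_link_*() helpers are made-up names, C11
atomic_compare_exchange_strong() stands in for the kernel's atomic_cas_ptr(),
and the KASSERTs are left out. It only counts how many references each
variant takes when a key is linked to itself.

```c
#include <stdatomic.h>
#include <stddef.h>

/* Toy stand-in for struct pf_state_key: a refcount and the reverse link. */
struct toy_key {
	unsigned int refcnt;
	_Atomic(struct toy_key *) reverse;
};

/* Hypothetical helper: take one reference and hand the key back. */
struct toy_key *
toy_key_ref(struct toy_key *k)
{
	k->refcnt++;
	return k;
}

/* Plain variant: two unconditional references, even when sk == skrev. */
void
toy_link_plain(struct toy_key *sk, struct toy_key *skrev)
{
	atomic_store(&sk->reverse, toy_key_ref(skrev));
	atomic_store(&skrev->reverse, toy_key_ref(sk));
}

/* CAS variant: a reference is taken only when the CAS replaced NULL. */
void
toy_link_cas(struct toy_key *sk, struct toy_key *skrev)
{
	struct toy_key *expected;

	expected = NULL;
	if (atomic_compare_exchange_strong(&sk->reverse, &expected, skrev))
		toy_key_ref(skrev);

	/* When sk == skrev the link is already set, so this CAS fails. */
	expected = NULL;
	if (atomic_compare_exchange_strong(&skrev->reverse, &expected, sk))
		toy_key_ref(sk);
}

/* Link a key to itself and report how many references were taken. */
unsigned int
toy_self_link_refs(int use_cas)
{
	struct toy_key k = { 0, NULL };

	if (use_cas)
		toy_link_cas(&k, &k);
	else
		toy_link_plain(&k, &k);
	return k.refcnt;
}
```

Under these assumptions toy_self_link_refs(0) yields 2 and
toy_self_link_refs(1) yields 1, which matches the story above: the CAS
variant ends up one reference short when sk == skrev.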

I'd like to ask you for yet another brave test. just to verify
the story I dream of above really happens. The plan is to put
yet another KASSERT() to pf_state_key_link_reverse():

KASSERT(sk != skrev);

I expect the KASSERT will fire sooner or later. I'm not sure
which packet could trigger such a condition (sk == skrev); I
suspect it could be some kind of multicast/broadcast packet.
Hence I'd like to ask you to give the KASSERT() above a try.

diff below is against current tree. It adds desired assert to
your pf_state_key_link_reverse().

thank you very much for your help.

regards
sashan

[1] https://marc.info/?l=openbsd-cvs&m=161951631726837&w=2

8<---8<---8<--8<
diff --git a/sys/net/pf.c b/sys/net/pf.c
index 23eebf4a274..298568dec28 100644
--- a/sys/net/pf.c
+++ b/sys/net/pf.c
@@ -7368,19 +7368,12 @@ pf_inp_unlink(struct inpcb *inp)
 void
 pf_state_key_link_reverse(struct pf_state_key *sk, struct pf_state_key *skrev)
 {
-	struct pf_state_key *old_reverse;
-
-	old_reverse = atomic_cas_ptr(&sk->reverse, NULL, skrev);
-	if (old_reverse != NULL)
-		KASSERT(old_reverse == skrev);
-	else
-		pf_state_key_ref(skrev);
-
-	old_reverse = atomic_cas_ptr(&skrev->reverse, NULL, sk);
-	if (old_reverse != NULL)
-		KASSERT(old_reverse == sk);
-	else
-		pf_state_key_ref(sk);
+	/* Note that sk and skrev may be equal, then we refcount twice. */
+	KASSERT(sk != skrev);
+	KASSERT(sk->reverse == NULL);
+	KASSERT(skrev->reverse == NULL);
+	sk->reverse = pf_state_key_ref(skrev);
+	skrev->reverse = pf_state_key_ref(sk);
 }
 
 #if NPFLOG > 0



Re: Lenovo Thinkstation Intel Xeon - Fails to boot

2021-05-06 Thread jeanfrancois

Hi,

Thank you, two persons including you pointed that out, I'll have to test 
on site, should have been today but postponed.


I'll report once tested, thanks.

Jean-François


Le 03/05/2021 à 09:07, Janne Johansson a écrit :

Den mån 3 maj 2021 kl 08:36 skrev jeanfrancois :


AMD released their 64 bit extensions with the FX series back in 2005(?),
which Intel later adopted.
Intel had their own 64 bit architecture with itanium, which could run 32
bit i386 Software, but that was, iirc, emulation.

https://ark.intel.com/content/www/us/en/ark/products/123547/intel-xeon-silver-4110-processor-11m-cache-2-10-ghz.html

says the Xeon Silver 4110 supports 64bit mode (without being an
Itanium), so please do try the amd64 version of OpenBSD.






Re: [External] : pf_state_key_unref: panic: kernel diagnostic assertion "refcnt != ~0" failed: file "/usr/src/sys/kern/kern_synch.c", line 826

2021-05-06 Thread Olivier Cherrier
On Thu, May 06, 2021 at 04:50:31PM +0200, alexandr.nedvedi...@oracle.com wrote:
> thank you for your help with this. I have not heard back from Sebastien 
> yet.
> one more question:
>   are you building your bsd kernel with DIAGNOSTIC option enabled?

It is the GENERIC kernel. So with DIAGNOSTIC:
$ grep DIAG /sys/conf/GENERIC  
option  DIAGNOSTIC  # internal consistency checks
$


Thanks,
Best.

-- 
Olivier Cherrier
Phone: +352691570680
mailto:o...@symacx.com



Re: 6.9amd64 'dhcpd re0' won't listen/reply DHCP requests on Realtek 8168 RTL8168H/8111H

2021-05-06 Thread Martin
Small update:

The laptop has two integrated adapters:
re0 is: RTL8168EP/8111EP - dhcpd re0 -> dhcp server works and give IPs to 
clients;
re1 is: RTL8168H/8111H - dhcpd re1 -> dhcpd server does not reply to client's 
requests.

Martin

‐‐‐ Original Message ‐‐‐
On Thursday, May 6, 2021 1:33 PM, Martin  wrote:

> Hi list,
>
> Facing an issue with dhcpd and re(4) driver.
>
> Changed Lenovo X230 with em(4) driver Intel based LAN adapter to a laptop 
> with re(4) based on Realtek 8168 RTL8168H/8111H which works as gateway.
>
> Previous configuration 'dhcpd em0' works fine for years, but 'dhcpd re0' 
> don't reply with leases. Clients never receive their leases in reply to DHCP 
> requests.
>
> No changes have been performed except em0 -> re0.
>
> dhcpd works with athn(4) adapters, but no re(4).
>
> Martin




Re: 6.9amd64 athn(4) AR7010+AR9280 & AR9271 USB WiFi don't work in AP mode (frames corruption)

2021-05-06 Thread Martin
Great, I'll wait for AP-mode power management updates in the USB driver. I can
test if necessary.

Do you know of any workaround to make the phone not care about AP power
management?

Sadly, I'm going to connect the phone to the OpenBSD box by cable then.

Martin

‐‐‐ Original Message ‐‐‐
On Thursday, May 6, 2021 2:33 PM, Stefan Sperling  wrote:

> On Thu, May 06, 2021 at 01:17:48PM +, Martin wrote:
>
> > After some testing the issue is quite more complicated than expected.
> >
> > 1.  Using USB dongles as AP allow connections from any device with 
> > relatively 'old' WiFi modules installed (clients with Intel, Atheros). It 
> > works smoothly with AP based on AR9280+AR7010 (on both 2.4G and 5G bands) 
> > and with AP based on AR9271 except Android clients.
> >
> > 2.  But both USB dongles in AP mode work very poor with Android devices 
> > with Broadcom WiFi driver integrated into Android 10 OS. Android's dmesg 
> > shows that wlan0 uses bcmdhd kernel driver. I think its code is here: 
> > https://android.googlesource.com/kernel/common.git/+/bcmdhd-3.10
> >
> >
> > 2.1. The same Android 10 based device with the same bcmdhd kernel driver 
> > connects successfully to PCIe AR9280 card in AP mode and works absolutely 
> > without any issues.
> > 2.2. Moreover, any of USB dongle with Android 10 client connected triggers 
> > PF 's rule 'max-src-rate 100/1' rule which never triggered with PCIe AR9280 
> > card in AP mode. Commenting this rule out can't help much, ICMP packets 
> > disappearing between devices, but some ICMP goes between them. No any 
> > possibility of data transfers because USB AP and Android 10 client because 
> > of lots of missed packets. Distance between USB AP and Android client is 
> > not more than 1M.
> > The question is why Android client works fine with PCIe version AR9280 and 
> > don't work with USB version of AR9280 / AR9271.
> > I've tested all modes 11a (5G) 11n, 11g, and 11b (2.4G). The USB APs 
> > behavior is the same.
>
> Ah, yes, that make sense and rings a bell:
>
> The issue is that the USB part of athn does not support AP-side power
> management (i.e. buffering of frames while clients are sleeping), while
> the PCI version of the driver does support this. This explains the
> packet loss issues you mentioned in your first post.
>
> Phones really do not like access points that don't support power management.
> I'm afraid you'll have to find another AP for this purpose until perhaps
> power management support gets implemented for the USB part of athn some day.
>
> Granted, this makes athn on USB rather useless as an access point with
> clients that require power management support.
> I'll add a warning to the man page.




Re: AMD Ryzen based Asus ZENBOOK 14 UM433DA-PURE4 14" panic at first boot post install

2021-05-06 Thread Peter N. M. Hansteen
Hi Mark,

On Thu, May 06, 2021 at 04:30:00PM +0200, Mark Kettenis wrote:
> > 
> > Are there other tests I could usefully perform for this one?
> 
> Not sure.  The BIOS on this machine is kinda broken.  It advertises
> itself as "hardware reduced" ACPI (something usually seen on arm64
> machines) and claims to support S4 and S5 (hibernate and poweroff) but
> doesn't actually implement the registers to do so.  So the machine
> doesn't actually power off.
> 
> Does the BIOS screen on this machine provide a way to enable S3 mode?
> If so it might be worth trying.

It doesn't, unfortunately. But I was thinking I would contact ASUS anyway
to see if an updated BIOS is available. So I stole most of that to input
in their support site's form.

Now I'll be looking forward to their response :)

All the best,
Peter

-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
http://bsdly.blogspot.com/ http://www.bsdly.net/ http://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



Re: AMD Ryzen based Asus ZENBOOK 14 UM433DA-PURE4 14" panic at first boot post install

2021-05-06 Thread Peter N. M. Hansteen
Hi Mark,

Are there other tests I could usefully perform for this one?

All the best,
Peter


-- 
Peter N. M. Hansteen, member of the first RFC 1149 implementation team
http://bsdly.blogspot.com/ http://www.bsdly.net/ http://www.nuug.no/
"Remember to set the evil bit on all malicious network traffic"
delilah spamd[29949]: 85.152.224.147: disconnected after 42673 seconds.



6.9amd64 'dhcpd re0' won't listen/reply DHCP requests on Realtek 8168 RTL8168H/8111H

2021-05-06 Thread Martin
Hi list,

Facing an issue with dhcpd and re(4) driver.

Changed Lenovo X230 with em(4) driver Intel based LAN adapter to a laptop with 
re(4) based on Realtek 8168 RTL8168H/8111H which works as gateway.

Previous configuration 'dhcpd em0' works fine for years, but 'dhcpd re0' don't 
reply with leases. Clients never receive their leases in reply to DHCP requests.

No changes have been performed except em0 -> re0.

dhcpd works with athn(4) adapters, but no re(4).

Martin




Re: [External] : pf_state_key_unref: panic: kernel diagnostic assertion "refcnt != ~0" failed: file "/usr/src/sys/kern/kern_synch.c", line 826

2021-05-06 Thread Sebastien Marie
On Thu, May 06, 2021 at 06:10:39PM +0200, Alexandr Nedvedicky wrote:
> Hello,
> 
> 
> > > to be honest I have no idea what could be causing problems on those two 
> > > fairly
> > > distinct machines. The strange thing is that pf_test() currently does not 
> > > run in
> > > parallel. I don't quite understand why reverting my earlier change helps 
> > > here.
> > 
> > it could be two differents ways to trigger a bug somewhere else that
> > your commit expose.
> > 
> > the panic doesn't trigger in the same way on both machines:
> > - Olivier's machine seems to trigger it quickly (after some minutes)
> > - mine relatively slowly (~ once a day)
> 
> Olivier's machine acts as AP, so it forwards packets between interfaces.
> 
> If I remember correctly your machine is laptop/workstation, which
> does not forward traffic. 

it doesn't forward, but it acts as a bridge: 2 physical network cards
grouped in a bridge(4) with only a little traffic (a network printer is on
the other side; the bridge(4) is here because I had a spare network card
and no physical switch to put here).

> the function, which we change back and forth here is
> pf_state_key_link_reverse(), which is being called from pf_find_state() 
> here:
> 
> 1085 
> 1086         if (sk == NULL) {
> 1087                 if ((sk = RB_FIND(pf_state_tree, &pf_statetbl,
> 1088                     (struct pf_state_key *)key)) == NULL)
> 1089                         return (PF_DROP);
> 1090                 if (pd->dir == PF_OUT && pkt_sk &&
> 1091                     pf_compare_state_keys(pkt_sk, sk, pd->kif, pd->dir) == 0)
> 1092                         pf_state_key_link_reverse(sk, pkt_sk);
> 1093                 else if (pd->dir == PF_OUT && pd->m->m_pkthdr.pf.inp &&
> 1094                     !pd->m->m_pkthdr.pf.inp->inp_pf_sk && !sk->inp)
> 1095                         pf_state_key_link_inpcb(sk, pd->m->m_pkthdr.pf.inp);
> 1096         }
> 1097 
> 
> the story in human words goes as follows:
> 
>   sk == NULL -> no matching state key was attached to packet, Thus we
>   have to search state key in state tree using RB_FIND()
> 
>   if we could find state key for packet in table, then we will try
>   to set up a 'shortcut', which can save us RB_FIND() later.
> 
>   1090 - 1092
>   the shortcut can be set up for outbound packet only (pd->dir PF_OUT),
>   which is also being forwarded (pkt_sk != NULL, indicates we are seeing
>   the packet for the second time pkt_sk holds state key for inbound
>   direction).  pf_compare_state_keys() is sanity check, it leaves a
>   debug message on system console on failure.
> 
>   So if it is outbound forwarded packet, we've seen earlier, we
>   set up a reverse link to save one RB_FIND() operation on next
>   forwarded packet, which matches the same state.
> 
>   1093 - 1095
>   creates similar shortcut for local bound packets. We put reference 
>   to state key into PCB linked to socket. This will save us RB_FIND()
>   operation for next local outbound, which matches the same state.
> 
> 
> given the bug seems to be triggered/uncovered by pf_state_key_link_reverse()
> is there any chance your laptop/workstation occasionally forwards packets?
> like doing NAT for vmd/qemu virtual machine?

I don't know bridge(4) internals, but it could make sense that it is
using such functions.

> if it is not the case then the question is how does it come we run
> pf_state_key_link_reverse()? which same as why pkt_sk is not NULL at line 
> 1090.
> 
> > 
> > I could try to run with your commit and see if I could trigger it more
> > easily or found some elements influencing it. I could try with GENERIC
> > for example to see if I still trigger the same assert() or if it is
> > more like Olivier.
> 
> I need to think of how to further debug the thing.
> 
> > 
> > my LAN was several hosts with the same kernel and only this machine
> > trigger the panic, so it shouldn't be strictly linked to the
> > environment.
> > 

Thanks
-- 
Sebastien Marie



Re: 6.9amd64 athn(4) AR7010+AR9280 & AR9271 USB WiFi don't work in AP mode (frames corruption)

2021-05-06 Thread Martin
After some testing the issue is quite more complicated than expected.

1. Using USB dongles as AP allow connections from any device with relatively 
'old' WiFi modules installed (clients with Intel, Atheros). It works smoothly 
with AP based on AR9280+AR7010 (on both 2.4G and 5G bands) and with AP based on 
AR9271 except Android clients.

2. But both USB dongles in AP mode work very poor with Android devices with 
Broadcom WiFi driver integrated into Android 10 OS. Android's dmesg shows that 
wlan0 uses bcmdhd kernel driver. I think its code is here: 
https://android.googlesource.com/kernel/common.git/+/bcmdhd-3.10

2.1. The same Android 10 based device with the same bcmdhd kernel driver 
connects successfully to PCIe AR9280 card in AP mode and works absolutely 
without any issues.

2.2. Moreover, any of USB dongle with Android 10 client connected triggers PF 
's rule 'max-src-rate 100/1' rule which never triggered with PCIe AR9280 card 
in AP mode. Commenting this rule out can't help much, ICMP packets disappearing 
between devices, but some ICMP goes between them. No any possibility of data 
transfers because USB AP and Android 10 client because of lots of missed 
packets. Distance between USB AP and Android client is not more than 1M.

The question is why Android client works fine with PCIe version AR9280 and 
don't work with USB version of AR9280 / AR9271.

I've tested all modes 11a (5G) 11n, 11g, and 11b (2.4G). The USB APs behavior 
is the same.

Martin


‐‐‐ Original Message ‐‐‐
On Wednesday, May 5, 2021 7:39 PM, Stefan Sperling  wrote:

> On Wed, May 05, 2021 at 06:14:52PM +, Martin wrote:
>
> > Hi list,
> > I use AR9280 PCIe card in AP mode for about two years in Lenovo x230 laptop 
> > (whitelist removed). Works perfectly.
> > After moving to a modern laptop there is no ability to install PCIe card 
> > into it.
> > I tried to use one of USB dongles I have based on:
> >
> > 1.  AR9280+AR7010
> > 2.  AR9271
> >
> > After attaching AR9271 USB dongle and restarting it by '/etc/netstart 
> > athn0' It broadcasts BSSID, and client can get IP address from it. But no 
> > data flow between USB AP and any client device connected. Most frames are 
> > dropped or corrupted on the 'air' level even USB AP and client on the same 
> > table.
> > ICMP between AP and client goes with huge delays in both ways, most 
> > packages drop. Delay is about ~3.5ms but sometimes 337.3ms and more.
> > The same issue with both USB dongles on x230 laptop (original PCIe was 
> > removed and nothing changed in configuration of PF and /etc/hostname.athn0).
> > I tried three computers with OpenBSD 6.9amd64 GENERIC installed. The 
> > behaviour is the same with USB AR9280+AR7010 and AR9271 dongles.
> > Once I return back PCIe version of AR9280 card all works like a charm.
> > It looks like a bug in athn(4) driver related USB based devices.
> > Martin
>
> What you are seeing does not match what I see with this device
> plugged into an APU2 board:
>
> athn1 at uhub0 port 4 configuration 1 interface 0 "ATHEROS UB91C" rev 
> 2.00/1.08 addr 2
> athn1: AR9271 rev 1 (1T1R), ROM rev 15, address 00:c0:ca:xx:xx:xx
>
> tcpbench sending to an iwm client associated to athn1:
>
> $ tcpbench 192.168.1.2
> elapsed_ms bytes mbps bwidth
> 1000 2195168 17.561 100.00%
> Conn: 1 Mbps: 17.561 Peak Mbps: 17.561 Avg Mbps: 17.561
> 2011 2264672 17.938 100.00%
> Conn: 1 Mbps: 17.938 Peak Mbps: 17.938 Avg Mbps: 17.938
> 3012 2292184 18.337 100.00%
> Conn: 1 Mbps: 18.337 Peak Mbps: 18.337 Avg Mbps: 18.337
> 4014 2274808 18.162 100.00%
> Conn: 1 Mbps: 18.162 Peak Mbps: 18.337 Avg Mbps: 18.162
> 5016 2231368 17.815 100.00%
> Conn: 1 Mbps: 17.815 Peak Mbps: 18.337 Avg Mbps: 17.815
> 6017 2228472 17.828 100.00%
> Conn: 1 Mbps: 17.828 Peak Mbps: 18.337 Avg Mbps: 17.828
> 7017 2200960 17.608 100.00%
> Conn: 1 Mbps: 17.608 Peak Mbps: 18.337 Avg Mbps: 17.608
> 8023 2293632 18.240 100.00%
> Conn: 1 Mbps: 18.240 Peak Mbps: 18.337 Avg Mbps: 18.240
> 9023 2274808 18.198 100.00%
> Conn: 1 Mbps: 18.198 Peak Mbps: 18.337 Avg Mbps: 18.198
> ^C
> --- 192.168.1.2 tcpbench statistics ---
> 21104600 bytes sent over 9.393 seconds
> bandwidth min/avg/max/std-dev = 17.561/17.965/18.337/0.267 Mbps
>
> I'd suggest you try your adapters on some different machines that
> use different USB host controllers. There are known issues with
> these devices on some controllers which I believe relate to (lack of?)
> USB power management in the USB stack, though I don't know for certain.
> In some known cases the firmware wouldn't even boot.
>
> Cheers,
> Stefan




Re: [External] : pf_state_key_unref: panic: kernel diagnostic assertion "refcnt != ~0" failed: file "/usr/src/sys/kern/kern_synch.c", line 826

2021-05-06 Thread Alexandr Nedvedicky
Hello,


> > to be honest I have no idea what could be causing problems on those two 
> > fairly
> > distinct machines. The strange thing is that pf_test() currently does not 
> > run in
> > parallel. I don't quite understand why reverting my earlier change helps 
> > here.
> 
> it could be two differents ways to trigger a bug somewhere else that
> your commit expose.
> 
> the panic doesn't trigger in the same way on both machines:
> - Olivier's machine seems to trigger it quickly (after some minutes)
> - mine relatively slowly (~ once a day)

Olivier's machine acts as AP, so it forwards packets between interfaces.

If I remember correctly your machine is laptop/workstation, which
does not forward traffic. 

the function, which we change back and forth here is
pf_state_key_link_reverse(), which is being called from pf_find_state() 
here:

1085 
1086         if (sk == NULL) {
1087                 if ((sk = RB_FIND(pf_state_tree, &pf_statetbl,
1088                     (struct pf_state_key *)key)) == NULL)
1089                         return (PF_DROP);
1090                 if (pd->dir == PF_OUT && pkt_sk &&
1091                     pf_compare_state_keys(pkt_sk, sk, pd->kif, pd->dir) == 0)
1092                         pf_state_key_link_reverse(sk, pkt_sk);
1093                 else if (pd->dir == PF_OUT && pd->m->m_pkthdr.pf.inp &&
1094                     !pd->m->m_pkthdr.pf.inp->inp_pf_sk && !sk->inp)
1095                         pf_state_key_link_inpcb(sk, pd->m->m_pkthdr.pf.inp);
1096         }
1097 

the story in human words goes as follows:

sk == NULL -> no matching state key was attached to packet, Thus we
have to search state key in state tree using RB_FIND()

if we could find state key for packet in table, then we will try
to set up a 'shortcut', which can save us RB_FIND() later.

1090 - 1092
the shortcut can be set up for outbound packet only (pd->dir PF_OUT),
which is also being forwarded (pkt_sk != NULL, indicates we are seeing
the packet for the second time pkt_sk holds state key for inbound
direction).  pf_compare_state_keys() is sanity check, it leaves a
debug message on system console on failure.

So if it is outbound forwarded packet, we've seen earlier, we
set up a reverse link to save one RB_FIND() operation on next
forwarded packet, which matches the same state.

1093 - 1095
creates similar shortcut for local bound packets. We put reference 
to state key into PCB linked to socket. This will save us RB_FIND()
operation for next local outbound, which matches the same state.
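
The forwarding shortcut described for lines 1090 - 1092 can be sketched as a
toy model in user-space C. Everything here is hypothetical (struct toy_sk,
toy_find(), toy_find_out()): a counted linear scan stands in for RB_FIND(),
reference counting is ignored, and the way the shortcut is consumed on the
next packet (following pkt_sk->reverse instead of searching) is an assumption,
since that part of pf_find_state() is not shown in this mail.

```c
#include <stddef.h>

#define TOY_NKEYS 4

struct toy_sk {
	int id;			/* what the lookup searches for */
	struct toy_sk *reverse;	/* shortcut to the paired key, or NULL */
};

struct toy_table {
	struct toy_sk keys[TOY_NKEYS];
	unsigned int searches;	/* number of full lookups performed */
};

/* Stand-in for RB_FIND(): a linear scan that counts how often it runs. */
struct toy_sk *
toy_find(struct toy_table *t, int id)
{
	size_t i;

	t->searches++;
	for (i = 0; i < TOY_NKEYS; i++)
		if (t->keys[i].id == id)
			return &t->keys[i];
	return NULL;
}

/*
 * Outbound lookup for a forwarded packet. pkt_sk is the key matched on
 * the inbound side, or NULL if the packet was not seen inbound.
 */
struct toy_sk *
toy_find_out(struct toy_table *t, struct toy_sk *pkt_sk, int out_id)
{
	struct toy_sk *sk;

	/* Assumed fast path: a previously linked key saves the search. */
	if (pkt_sk != NULL && pkt_sk->reverse != NULL)
		return pkt_sk->reverse;

	sk = toy_find(t, out_id);
	if (sk != NULL && pkt_sk != NULL) {
		/* Set up the shortcut in both directions for next time. */
		sk->reverse = pkt_sk;
		pkt_sk->reverse = sk;
	}
	return sk;
}

/* Forward two packets of the same flow and report the search count. */
unsigned int
toy_demo_searches(void)
{
	struct toy_table t = {
		{ { 1, NULL }, { 2, NULL }, { 3, NULL }, { 4, NULL } }, 0
	};
	struct toy_sk *in = &t.keys[0];	/* pretend inbound matched id 1 */

	toy_find_out(&t, in, 3);	/* first packet: full search */
	toy_find_out(&t, in, 3);	/* second packet: shortcut only */
	return t.searches;
}
```

In this model toy_demo_searches() returns 1: only the first outbound packet
pays for a search. Note that nothing in toy_find_out() prevents sk and pkt_sk
from being the same key.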


given the bug seems to be triggered/uncovered by pf_state_key_link_reverse()
is there any chance your laptop/workstation occasionally forwards packets?
like doing NAT for vmd/qemu virtual machine?

if that is not the case, then the question is how we come to run
pf_state_key_link_reverse() at all, which is the same as asking why pkt_sk is
not NULL at line 1090.


> 
> I could try to run with your commit and see if I could trigger it more
> easily or found some elements influencing it. I could try with GENERIC
> for example to see if I still trigger the same assert() or if it is
> more like Olivier.

I need to think of how to further debug the thing.

> 
> my LAN was several hosts with the same kernel and only this machine
> trigger the panic, so it shouldn't be strictly linked to the
> environment.
> 

thanks a lot for your help (and patience)

regards
sashan



Re: [External] : pf_state_key_unref: panic: kernel diagnostic assertion "refcnt != ~0" failed: file "/usr/src/sys/kern/kern_synch.c", line 826

2021-05-06 Thread Sebastien Marie
On Thu, May 06, 2021 at 04:50:31PM +0200, Alexandr Nedvedicky wrote:
> Hello Olivier,
> > 
> > This morning, I've rebuild a new --current kernel and got some panics after
> > some minutes with PF enabled.
> > Then I've applied your patch and it is stable so far.
> > 
> 
> thank you for your help with this. I have not heard back from Sebastien 
> yet.

my machine is stable with the commit reverted: up 1 day, 8:20.

> one more question:
>   are you building your bsd kernel with DIAGNOSTIC option enabled?
>   I think you don't, because your crash matches uvm fault due to
>   use-after-free.
> 
> Sebastien hit the problem earlier by KASSERT().
> 
> just to summarize there are two boxes so far, which choked up with my commit 
> [1].
> both boxes are quite different. yours runs bsd kernel on single core CPU:
> 
> cpu0: Geode(TM) Integrated Processor by AMD PCS ("AuthenticAMD" 586-class) 
> 499 MHz, \
>   05-0a-02
> 
> Sebastien runs bsd.mp on two CPU cores:
> cpu0: Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz, 2660.30 MHz, 06-0f-0b
> 
> I'm not able to trigger crash on my HW. Which is notebook running bsd.mp on:
> 
> cpu0: Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz, 1496.74 MHz, 06-45-01
> 
> the other box, is APU router running bsd.mp on 4 cores:
> cpu0: AMD GX-412TC SOC, 998.26 MHz, 16-30-01
> 
> 
> to be honest I have no idea what could be causing problems on those two fairly
> distinct machines. The strange thing is that pf_test() currently does not run 
> in
> parallel. I don't quite understand why reverting my earlier change helps here.

it could be two different ways to trigger a bug somewhere else that
your commit exposes.

the panic doesn't trigger in the same way on both machines:
- Olivier's machine seems to trigger it quickly (after some minutes)
- mine relatively slowly (~ once a day)

I could try to run with your commit and see if I could trigger it more
easily or found some elements influencing it. I could try with GENERIC
for example to see if I still trigger the same assert() or if it is
more like Olivier.

my LAN was several hosts with the same kernel and only this machine
trigger the panic, so it shouldn't be strictly linked to the
environment.

Thanks.
-- 
Sebastien Marie



Re: [External] : pf_state_key_unref: panic: kernel diagnostic assertion "refcnt != ~0" failed: file "/usr/src/sys/kern/kern_synch.c", line 826

2021-05-06 Thread Alexandr Nedvedicky
Hello Olivier,



> > in your case we've missed the assert and are dying on uvm fault.
> > 
> > you both seem to be using rdr-to. your pf seems to use also divert-to rule.
> > I suspect something is going wrong when we deal with traffic, which matches
> > rdr-to rule.
> > 
> > 
> > would you be so kind and try diff below on your AP box. The diff removes
> > my change to pf_state_key_link_reverse(). Which is a primary suspect
> > at the moment.
> > 
> > I'm not able to trigger the panic on my notebook, nor on my
> > home router.
> 
> This morning, I've rebuild a new --current kernel and got some panics after
> some minutes with PF enabled.
> Then I've applied your patch and it is stable so far.
> 

thank you for your help with this. I have not heard back from Sebastien yet.
one more question:
are you building your bsd kernel with DIAGNOSTIC option enabled?
I think you don't, because your crash matches uvm fault due to
use-after-free.

Sebastien hit the problem earlier by KASSERT().

just to summarize there are two boxes so far, which choked up with my commit 
[1].
both boxes are quite different. yours runs bsd kernel on single core CPU:

cpu0: Geode(TM) Integrated Processor by AMD PCS ("AuthenticAMD" 586-class) 499 
MHz, \
05-0a-02

Sebastien runs bsd.mp on two CPU cores:
cpu0: Intel(R) Core(TM)2 Duo CPU E6750 @ 2.66GHz, 2660.30 MHz, 06-0f-0b 


   


I'm not able to trigger crash on my HW. Which is notebook running bsd.mp on:

cpu0: Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz, 1496.74 MHz, 06-45-01

the other box, is APU router running bsd.mp on 4 cores:
cpu0: AMD GX-412TC SOC, 998.26 MHz, 16-30-01


to be honest I have no idea what could be causing problems on those two fairly
distinct machines. The strange thing is that pf_test() currently does not run in
parallel. I don't quite understand why reverting my earlier change helps here.

sorry for inconveniences
regards
sashan

[1] https://marc.info/?l=openbsd-cvs&m=161951631726837&w=2



Re: 6.9amd64 athn(4) AR7010+AR9280 & AR9271 USB WiFi don't work in AP mode (frames corruption)

2021-05-06 Thread Stefan Sperling
On Thu, May 06, 2021 at 01:17:48PM +, Martin wrote:
> After some testing the issue is quite more complicated than expected.
> 
> 1. Using USB dongles as AP allow connections from any device with relatively 
> 'old' WiFi modules installed (clients with Intel, Atheros). It works smoothly 
> with AP based on AR9280+AR7010 (on both 2.4G and 5G bands) and with AP based 
> on AR9271 except Android clients.
> 
> 2. But both USB dongles in AP mode work very poor with Android devices with 
> Broadcom WiFi driver integrated into Android 10 OS. Android's dmesg shows 
> that wlan0 uses bcmdhd kernel driver. I think its code is here: 
> https://android.googlesource.com/kernel/common.git/+/bcmdhd-3.10
> 
> 2.1. The same Android 10 based device with the same bcmdhd kernel driver 
> connects successfully to PCIe AR9280 card in AP mode and works absolutely 
> without any issues.
> 
> 2.2. Moreover, any of USB dongle with Android 10 client connected triggers 
> PF 's rule 'max-src-rate 100/1' rule which never triggered with PCIe AR9280 
> card in AP mode. Commenting this rule out can't help much, ICMP packets 
> disappearing between devices, but some ICMP goes between them. No any 
> possibility of data transfers because USB AP and Android 10 client because of 
> lots of missed packets. Distance between USB AP and Android client is not 
> more than 1M.
> 
> The question is why Android client works fine with PCIe version AR9280 and 
> don't work with USB version of AR9280 / AR9271.
> 
> I've tested all modes 11a (5G) 11n, 11g, and 11b (2.4G). The USB APs behavior 
> is the same.

Ah, yes, that make sense and rings a bell:

The issue is that the USB part of athn does not support AP-side power
management (i.e. buffering of frames while clients are sleeping), while
the PCI version of the driver does support this. This explains the
packet loss issues you mentioned in your first post.
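
As a rough illustration of what AP-side buffering buys a sleeping client,
here is a toy model in C. It is not athn(4) or net80211 code: struct
toy_client, the queue size and all function names are made up, and real power
management involves beacons and client polling that are not modelled here.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_PSQ_MAX 8

struct toy_client {
	bool asleep;		/* client announced power-save mode */
	int psq[TOY_PSQ_MAX];	/* frames held back while asleep */
	size_t psq_len;
	unsigned int delivered;
	unsigned int lost;
};

/* AP with power management: hold frames for a sleeping client. */
void
toy_send_buffered(struct toy_client *c, int frame)
{
	if (!c->asleep) {
		c->delivered++;
		return;
	}
	if (c->psq_len < TOY_PSQ_MAX)
		c->psq[c->psq_len++] = frame;
	else
		c->lost++;	/* queue full, frame dropped */
}

/* AP without it: transmit anyway; a sleeping radio misses the frame. */
void
toy_send_unbuffered(struct toy_client *c, int frame)
{
	(void)frame;
	if (c->asleep)
		c->lost++;
	else
		c->delivered++;
}

/* Client wakes up: everything that was held for it gets delivered. */
void
toy_wake(struct toy_client *c)
{
	c->asleep = false;
	c->delivered += (unsigned int)c->psq_len;
	c->psq_len = 0;
}

/* Send five frames to a sleeping client, wake it, report the losses. */
unsigned int
toy_lost_frames(bool ap_buffers)
{
	struct toy_client c = { .asleep = true };
	int f;

	for (f = 0; f < 5; f++) {
		if (ap_buffers)
			toy_send_buffered(&c, f);
		else
			toy_send_unbuffered(&c, f);
	}
	toy_wake(&c);
	return c.lost;
}
```

In this model toy_lost_frames(true) is 0 and toy_lost_frames(false) is 5,
which is the kind of loss a client that sleeps often would see against an AP
that does not buffer.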

Phones really do not like access points that don't support power management.
I'm afraid you'll have to find another AP for this purpose until perhaps
power management support gets implemented for the USB part of athn some day.

Granted, this makes athn on USB rather useless as an access point with
clients that require power management support.
I'll add a warning to the man page.



Re: AMD Ryzen based Asus ZENBOOK 14 UM433DA-PURE4 14" panic at first boot post install

2021-05-06 Thread Mark Kettenis
> Date: Thu, 6 May 2021 16:17:51 +0200
> From: "Peter N. M. Hansteen" 
> 
> Hi Mark,
> 
> Are there other tests I could usefully perform for this one?

Not sure.  The BIOS on this machine is kinda broken.  It advertises
itself as "hardware reduced" ACPI (something usually seen on arm64
machines) and claims to support S4 and S5 (hibernate and poweroff) but
doesn't actually implement the registers to do so.  So the machine
doesn't actually power off.

Does the BIOS screen on this machine provide a way to enable S3 mode?
If so it might be worth trying.



Re: [External] : pf_state_key_unref: panic: kernel diagnostic assertion "refcnt != ~0" failed: file "/usr/src/sys/kern/kern_synch.c", line 826

2021-05-06 Thread Olivier Cherrier


Hello Alexandr,

On Wed, May 05, 2021 at 05:53:25PM +0200, alexandr.nedvedi...@oracle.com wrote:
> I've seen your report here
> 
> https://marc.info/?l=openbsd-bugs&m=161968896108810
> 
> your crash is slightly different. Sebastien is lucky enough
> to trip crash in assert, when state key is dereferenced.
 
OK.

> in your case we've missed the assert and are dying on uvm fault.
> 
> you both seem to be using rdr-to. your pf seems to use also divert-to rule.
> I suspect something is going wrong when we deal with traffic, which matches
> rdr-to rule.
> 
> 
> would you be so kind and try diff below on your AP box. The diff removes
> my change to pf_state_key_link_reverse(). Which is a primary suspect
> at the moment.
> 
> I'm not able to trigger the panic on my notebook, nor on my
> home router.

This morning, I've rebuilt a new --current kernel and got some panics after
some minutes with PF enabled.
Then I've applied your patch and it is stable so far.

Thanks,
Best.

-- 
Olivier Cherrier
Phone: +352691570680
mailto:o...@symacx.com



Re: intel(4): edp_panel_vdd_on calls task_del(9) with NULL taskq

2021-05-06 Thread Scott Cheloha
On Thu, May 06, 2021 at 02:36:21PM +0200, Mark Kettenis wrote:
> > Date: Thu, 6 May 2021 21:59:12 +1000
> > From: Jonathan Gray 
> > 
> > On Wed, May 05, 2021 at 04:47:50PM -0500, Scott Cheloha wrote:
> > > 
> > > [...]
> > > 
> > > On a hunch I added additional parameter checks to task_add(9) and
> > > task_del(9) and caught intel(4) doing something strange.
> > > 
> > > [...]
> > > 
> > > And here is the panic on my machine.  I had to reconstruct it from
> > > OCR, the machine has no serial port, sorry if there are typos.
> > 
> > boot crash can be helpful for such machines

Oh nice, thanks.

> > > [...]
> > > 
> > > db_enter() at db_enter+Oxa
> > > panic(81db24fb) at panic+0x12f
> > > task del(0,810633e0) at task_del+Oxa8
> > > edp_panel vdd_on(81063128) at edp_panel_vdd_on+0x6a
> > > intel_dp_aux_xfer(81063128, 82512a20,4, 
> > > 82512400,2,0) at intel_dp_aux_xfer+0x18b
> > > intel_dp_aux_transfer(810631e8, 82512a88) at 
> > > intel_dp_aux_transfer+0x183
> > > drm_dp_dpcd_access(810631e8,9,0,8106313a, 1) at 
> > > drm_dp_dpcd_access+Oxa9
> > > drm_dp_dpcd_read(810631e8,0,8106313a, f) at 
> > > drm_dp_dpcd_read+0x61
> > > intel_dp_read_dpcd(81063128) at intel_dp_read_dpcd+0x45
> > > intel_dp_init_connector(81063000, 81064000) at 
> > > intel_dp_init_connector+0x988
> > > intel_ddi_init(80272000,0) at intel_ddi_init+0x454
> > > intel_modeset_init(80272000) at intel_modeset_init+0x1c9f
> > > i915_driver_probe(80272000, 82052f98) at 
> > > i915_driver_probe+0x7df
> > > inteldrm_attachhook(80272000) at inteldrm_attachhook+0x46
> > > end trace frame: Ox82512700, count: 0
> > > 
> > > From the backtrace, I gather the following:
> > > 
> > > edp_panel_vdd_on() calls clear_delayed_work() which is just a macro
> > 
> > cancel_delayed_work()

whoops

> > > that calls task_del().  And for whatever reason the taskq passed to
> > > task_del() is NULL.  Maybe there is a missing INIT_DELAYED_WORK() call
> > > somewhere prior to this point?
> > 
> > the call to that is in
> > INIT_DELAYED_WORK(&intel_dp->panel_vdd_work, edp_panel_vdd_work);
> > but tq doesn't get set until work is scheduled as there are interfaces
> > to pick a tq when scheduling work.
> > 
> > So perhaps you want something like this to catch the cases where work is
> > cancelled before it is scheduled.
> 
> Yeah, I think that makes sense.
> 
> ok kettenis@

This patch fixes the panic, running with it now.

ok cheloha@

Caveat: my confidence in my understanding of these interfaces is
pretty low.

-Scott



Re: intel(4): edp_panel_vdd_on calls task_del(9) with NULL taskq

2021-05-06 Thread Mark Kettenis
> Date: Thu, 6 May 2021 21:59:12 +1000
> From: Jonathan Gray 
> 
> On Wed, May 05, 2021 at 04:47:50PM -0500, Scott Cheloha wrote:
> > Hi,
> > 
> > On a hunch I added additional parameter checks to task_add(9) and
> > task_del(9) and caught intel(4) doing something strange.
> > 
> > The patch is straightforward: check that the taskq pointer tq is not
> > NULL.  In the current code we return early if a flag is set or cleared
> > in the task w, in which case we don't catch bogus taskq inputs, which
> > is why the machine boots fine without the extra checks.
> > 
> > The patch:
> > 
> > Index: kern_task.c
> > ===
> > RCS file: /cvs/src/sys/kern/kern_task.c,v
> > retrieving revision 1.31
> > diff -u -p -r1.31 kern_task.c
> > --- kern_task.c 1 Aug 2020 08:40:20 -   1.31
> > +++ kern_task.c 5 May 2021 21:29:08 -
> > @@ -354,6 +354,9 @@ task_add(struct taskq *tq, struct task *
> >  {
> > int rv = 0;
> >  
> > +   if (tq == NULL)
> > +   panic("%s: NULL taskq", __func__);
> > +
> > if (ISSET(w->t_flags, TASK_ONQUEUE))
> > return (0);
> >  
> > @@ -378,6 +381,9 @@ int
> >  task_del(struct taskq *tq, struct task *w)
> >  {
> > int rv = 0;
> > +
> > +   if (tq == NULL)
> > +   panic("%s: NULL taskq", __func__);
> >  
> > if (!ISSET(w->t_flags, TASK_ONQUEUE))
> > return (0);
> > 
> > And here is the panic on my machine.  I had to reconstruct it from
> > OCR, the machine has no serial port, sorry if there are typos.
> 
> boot crash can be helpful for such machines
> 
> > 
> > panic: task_del: NULL taskq
> > Stopped at db_enter+0xa: popq %rbp
> > TID  PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
> >  513524  448830 Ox14000 0x2003  update
> >  352928  824020 0x14000 0x2002  cleaner
> >  382195  660350 Ox14000 0x2001  reaper
> > ...
> > db_enter() at db_enter+Oxa
> > panic(81db24fb) at panic+0x12f
> > task del(0,810633e0) at task_del+Oxa8
> > edp_panel vdd_on(81063128) at edp_panel_vdd_on+0x6a
> > intel_dp_aux_xfer(81063128, 82512a20,4, 
> > 82512400,2,0) at intel_dp_aux_xfer+0x18b
> > intel_dp_aux_transfer(810631e8, 82512a88) at 
> > intel_dp_aux_transfer+0x183
> > drm_dp_dpcd_access(810631e8,9,0,8106313a, 1) at 
> > drm_dp_dpcd_access+Oxa9
> > drm_dp_dpcd_read(810631e8,0,8106313a, f) at 
> > drm_dp_dpcd_read+0x61
> > intel_dp_read_dpcd(81063128) at intel_dp_read_dpcd+0x45
> > intel_dp_init_connector(81063000, 81064000) at 
> > intel_dp_init_connector+0x988
> > intel_ddi_init(80272000,0) at intel_ddi_init+0x454
> > intel_modeset_init(80272000) at intel_modeset_init+0x1c9f
> > i915_driver_probe(80272000, 82052f98) at 
> > i915_driver_probe+0x7df
> > inteldrm_attachhook(80272000) at inteldrm_attachhook+0x46
> > end trace frame: Ox82512700, count: 0
> > 
> > From the backtrace, I gather the following:
> > 
> > edp_panel_vdd_on() calls clear_delayed_work() which is just a macro
> 
> cancel_delayed_work()
> 
> > that calls task_del().  And for whatever reason the taskq passed to
> > task_del() is NULL.  Maybe there is a missing INIT_DELAYED_WORK() call
> > somewhere prior to this point?
> 
> the call to that is in
> INIT_DELAYED_WORK(&intel_dp->panel_vdd_work, edp_panel_vdd_work);
> but tq doesn't get set until work is scheduled as there are interfaces
> to pick a tq when scheduling work.
> 
> So perhaps you want something like this to catch the cases where work is
> cancelled before it is scheduled.

Yeah, I think that makes sense.

ok kettenis@

> Index: sys/dev/pci/drm/include/linux/workqueue.h
> ===
> RCS file: /cvs/src/sys/dev/pci/drm/include/linux/workqueue.h,v
> retrieving revision 1.4
> diff -u -p -r1.4 workqueue.h
> --- sys/dev/pci/drm/include/linux/workqueue.h 14 Feb 2021 03:42:55 -  
> 1.4
> +++ sys/dev/pci/drm/include/linux/workqueue.h 6 May 2021 11:51:14 -
> @@ -95,7 +95,8 @@ queue_work(struct workqueue_struct *wq, 
>  static inline void
>  cancel_work_sync(struct work_struct *work)
>  {
> - task_del(work->tq, &work->task);
> + if (work->tq != NULL)
> + task_del(work->tq, &work->task);
>  }
>  
>  #define work_pending(work)   task_pending(&(work)->task)
> @@ -169,6 +170,8 @@ mod_delayed_work(struct workqueue_struct
>  static inline bool
>  cancel_delayed_work(struct delayed_work *dwork)
>  {
> + if (dwork->tq == NULL)
> + return false;
>   if (timeout_del(&dwork->to))
>   return true;
>   return task_del(dwork->tq, &dwork->work.task);
> @@ -177,6 +180,8 @@ cancel_delayed_work(struct delayed_work 
>  static inline bool
>  cancel_delayed_work_sync(struct delayed_work *dwork)
>  {
> + if (dwork->tq == NULL)
> + return false;
>   if 

Re: intel(4): edp_panel_vdd_on calls task_del(9) with NULL taskq

2021-05-06 Thread Jonathan Gray
On Wed, May 05, 2021 at 04:47:50PM -0500, Scott Cheloha wrote:
> Hi,
> 
> On a hunch I added additional parameter checks to task_add(9) and
> task_del(9) and caught intel(4) doing something strange.
> 
> The patch is straightforward: check that the taskq pointer tq is not
> NULL.  In the current code we return early if a flag is set or cleared
> in the task w, in which case we don't catch bogus taskq inputs, which
> is why the machine boots fine without the extra checks.
> 
> The patch:
> 
> Index: kern_task.c
> ===
> RCS file: /cvs/src/sys/kern/kern_task.c,v
> retrieving revision 1.31
> diff -u -p -r1.31 kern_task.c
> --- kern_task.c   1 Aug 2020 08:40:20 -   1.31
> +++ kern_task.c   5 May 2021 21:29:08 -
> @@ -354,6 +354,9 @@ task_add(struct taskq *tq, struct task *
>  {
>   int rv = 0;
>  
> + if (tq == NULL)
> + panic("%s: NULL taskq", __func__);
> +
>   if (ISSET(w->t_flags, TASK_ONQUEUE))
>   return (0);
>  
> @@ -378,6 +381,9 @@ int
>  task_del(struct taskq *tq, struct task *w)
>  {
>   int rv = 0;
> +
> + if (tq == NULL)
> + panic("%s: NULL taskq", __func__);
>  
>   if (!ISSET(w->t_flags, TASK_ONQUEUE))
>   return (0);
> 
> And here is the panic on my machine.  I had to reconstruct it from
> OCR, the machine has no serial port, sorry if there are typos.

boot crash can be helpful for such machines

> 
> panic: task_del: NULL taskq
> Stopped at db_enter+0xa: popq %rbp
> TID  PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
>  513524  448830 Ox14000 0x2003  update
>  352928  824020 0x14000 0x2002  cleaner
>  382195  660350 Ox14000 0x2001  reaper
> ...
> db_enter() at db_enter+Oxa
> panic(81db24fb) at panic+0x12f
> task del(0,810633e0) at task_del+Oxa8
> edp_panel vdd_on(81063128) at edp_panel_vdd_on+0x6a
> intel_dp_aux_xfer(81063128, 82512a20,4, 82512400,2,0) 
> at intel_dp_aux_xfer+0x18b
> intel_dp_aux_transfer(810631e8, 82512a88) at 
> intel_dp_aux_transfer+0x183
> drm_dp_dpcd_access(810631e8,9,0,8106313a, 1) at 
> drm_dp_dpcd_access+Oxa9
> drm_dp_dpcd_read(810631e8,0,8106313a, f) at 
> drm_dp_dpcd_read+0x61
> intel_dp_read_dpcd(81063128) at intel_dp_read_dpcd+0x45
> intel_dp_init_connector(81063000, 81064000) at 
> intel_dp_init_connector+0x988
> intel_ddi_init(80272000,0) at intel_ddi_init+0x454
> intel_modeset_init(80272000) at intel_modeset_init+0x1c9f
> i915_driver_probe(80272000, 82052f98) at 
> i915_driver_probe+0x7df
> inteldrm_attachhook(80272000) at inteldrm_attachhook+0x46
> end trace frame: Ox82512700, count: 0
> 
> From the backtrace, I gather the following:
> 
> edp_panel_vdd_on() calls clear_delayed_work() which is just a macro

cancel_delayed_work()

> that calls task_del().  And for whatever reason the taskq passed to
> task_del() is NULL.  Maybe there is a missing INIT_DELAYED_WORK() call
> somewhere prior to this point?

the call to that is in
INIT_DELAYED_WORK(&intel_dp->panel_vdd_work, edp_panel_vdd_work);
but tq doesn't get set until work is scheduled as there are interfaces
to pick a tq when scheduling work.

So perhaps you want something like this to catch the cases where work is
cancelled before it is scheduled.

Index: sys/dev/pci/drm/include/linux/workqueue.h
===
RCS file: /cvs/src/sys/dev/pci/drm/include/linux/workqueue.h,v
retrieving revision 1.4
diff -u -p -r1.4 workqueue.h
--- sys/dev/pci/drm/include/linux/workqueue.h   14 Feb 2021 03:42:55 -  
1.4
+++ sys/dev/pci/drm/include/linux/workqueue.h   6 May 2021 11:51:14 -
@@ -95,7 +95,8 @@ queue_work(struct workqueue_struct *wq, 
 static inline void
 cancel_work_sync(struct work_struct *work)
 {
-   task_del(work->tq, &work->task);
+   if (work->tq != NULL)
+   task_del(work->tq, &work->task);
 }
 
 #define work_pending(work) task_pending(&(work)->task)
@@ -169,6 +170,8 @@ mod_delayed_work(struct workqueue_struct
 static inline bool
 cancel_delayed_work(struct delayed_work *dwork)
 {
+   if (dwork->tq == NULL)
+   return false;
if (timeout_del(&dwork->to))
return true;
return task_del(dwork->tq, &dwork->work.task);
@@ -177,6 +180,8 @@ cancel_delayed_work(struct delayed_work 
 static inline bool
 cancel_delayed_work_sync(struct delayed_work *dwork)
 {
+   if (dwork->tq == NULL)
+   return false;
if (timeout_del(&dwork->to))
return true;
return task_del(dwork->tq, &dwork->work.task);
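
The failure mode discussed in this thread can be sketched in userland C. This is only an illustration under stated assumptions: every `mock_` name below is hypothetical and stands in for the kernel types and functions; the timeout handling in cancel_delayed_work() is left out. It models a work item whose tq stays NULL until it is scheduled, a task_del() that checks tq before its early return, and a cancel path with and without the guard from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct mock_taskq { int id; };
struct mock_task { unsigned int t_flags; };
#define MOCK_TASK_ONQUEUE 0x1

struct mock_delayed_work {
	struct mock_taskq *tq;	/* NULL until the work is first scheduled */
	struct mock_task task;
};

/* Counts calls that the stricter task_del() would turn into a panic. */
static int null_taskq_hits;

/* Models task_del() with the NULL check placed before the early return. */
static int
mock_task_del(struct mock_taskq *tq, struct mock_task *w)
{
	if (tq == NULL) {
		null_taskq_hits++;	/* the patch calls panic() here */
		return 0;
	}
	if (!(w->t_flags & MOCK_TASK_ONQUEUE))
		return 0;
	w->t_flags &= ~MOCK_TASK_ONQUEUE;
	return 1;
}

/* Initialisation leaves tq unset, as described for INIT_DELAYED_WORK(). */
static void
mock_init_delayed_work(struct mock_delayed_work *dw)
{
	dw->tq = NULL;
	dw->task.t_flags = 0;
}

/* Scheduling is the point where a taskq gets picked. */
static void
mock_schedule(struct mock_delayed_work *dw, struct mock_taskq *tq)
{
	dw->tq = tq;
	dw->task.t_flags |= MOCK_TASK_ONQUEUE;
}

/* Cancel without the guard: hands a NULL tq to task_del(). */
static bool
mock_cancel_unguarded(struct mock_delayed_work *dw)
{
	return mock_task_del(dw->tq, &dw->task) != 0;
}

/* Cancel with the guard: never scheduled means nothing to cancel. */
static bool
mock_cancel_guarded(struct mock_delayed_work *dw)
{
	if (dw->tq == NULL)
		return false;
	return mock_task_del(dw->tq, &dw->task) != 0;
}
```

Calling mock_cancel_unguarded() on a freshly initialised item bumps null_taskq_hits, which is the case the added panic caught. mock_cancel_guarded() returns false without reaching mock_task_del() at all, and still cancels normally once mock_schedule() has run.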



iwm: Fatal firmware error (could not add sta (error 35))

2021-05-06 Thread Matthias Schmidt
>Synopsis:  Fatal firmware error with iwm
>Environment:
System  : OpenBSD 6.9
Details : OpenBSD 6.9-current (GENERIC.MP) #4: Wed May  5 11:06:38 MDT 2021
 dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP

Architecture: OpenBSD.amd64
Machine : amd64
>Description:

I noticed an iwm "fatal firmware error" I've never seen before.  It has been
happening for a few days, and I assume it is related to the recent wifi
changes that so awesomely speed up the wifi.

The error usually appears multiple times and sometimes the wifi recovers,
sometimes I have to take the interface down and up again for recovery.

Here's an excerpt from my messages:

2021-05-03T15:41:07.069Z sigma /bsd: iwm0: fatal firmware error
2021-05-03T15:41:08.069Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-03T17:25:05.355Z sigma /bsd: iwm0: fatal firmware error
2021-05-03T17:25:06.405Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-03T17:25:23.905Z sigma /bsd: iwm0: fatal firmware error
2021-05-03T17:25:24.905Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-03T17:25:43.655Z sigma /bsd: iwm0: fatal firmware error
2021-05-03T17:25:44.655Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-03T17:49:02.712Z sigma /bsd: iwm0: fatal firmware error
2021-05-03T17:49:28.858Z sigma /bsd: iwm0: fatal firmware error
2021-05-03T17:49:29.852Z sigma /bsd: iwm0: could not add sta (error 35)

2021-05-04T07:21:41.018Z sigma /bsd: iwm0: fatal firmware error
2021-05-04T07:21:42.018Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-04T07:41:50.175Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-04T07:45:28.374Z sigma /bsd: iwm0: fatal firmware error
2021-05-04T07:45:29.424Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-04T08:58:25.572Z sigma /bsd: iwm0: fatal firmware error
2021-05-04T08:58:26.572Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-04T09:02:04.491Z sigma /bsd: iwm0: fatal firmware error
2021-05-04T09:02:05.491Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-04T09:22:39.061Z sigma /bsd: iwm0: fatal firmware error
2021-05-04T09:22:40.061Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-04T09:31:18.270Z sigma /bsd: iwm0: fatal firmware error
2021-05-04T10:01:42.366Z sigma /bsd: iwm0: fatal firmware error
2021-05-04T10:01:43.366Z sigma /bsd: iwm0: could not add sta (error 35)

2021-05-05T12:27:16.883Z sigma /bsd: iwm0: fatal firmware error
2021-05-05T12:27:17.893Z sigma /bsd: iwm0: failed to update MAC
2021-05-05T12:28:56.195Z sigma /bsd: iwm0: fatal firmware error
2021-05-05T12:28:57.195Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-05T17:08:30.547Z sigma /bsd: iwm0: fatal firmware error
2021-05-05T17:08:31.597Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-05T19:47:36.736Z sigma /bsd: iwm0: fatal firmware error
2021-05-05T19:47:37.731Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-05T19:47:52.666Z sigma /bsd: iwm0: fatal firmware error
2021-05-05T19:47:53.661Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-05T19:48:08.636Z sigma /bsd: iwm0: fatal firmware error
2021-05-05T19:48:09.631Z sigma /bsd: iwm0: could not add sta (error 35)
2021-05-05T19:48:27.243Z sigma /bsd: iwm0: fatal firmware error
2021-05-05T20:21:20.768Z sigma /bsd: iwm0: fatal firmware error
2021-05-05T20:21:21.818Z sigma /bsd: iwm0: could not add sta (error 35)

2021-05-06T06:53:21.933Z sigma /bsd: iwm0: dumping device error log
2021-05-06T06:53:21.934Z sigma /bsd: iwm0: Start Error Log Dump:
2021-05-06T06:53:21.934Z sigma /bsd: iwm0: Status: 0x39, count: 6
2021-05-06T06:53:21.934Z sigma /bsd: iwm0: 0x1043 | ADVANCED_SYSASSERT  

2021-05-06T06:53:21.934Z sigma /bsd: iwm0: 00A00283 | trm_hw_status0
2021-05-06T06:53:21.935Z sigma /bsd: iwm0:  | trm_hw_status1
2021-05-06T06:53:21.935Z sigma /bsd: iwm0: 00023FDC | branchlink2
2021-05-06T06:53:21.935Z sigma /bsd: iwm0: 0003915A | interruptlink1
2021-05-06T06:53:21.935Z sigma /bsd: iwm0:  | interruptlink2
2021-05-06T06:53:21.936Z sigma /bsd: iwm0: 27C4 | data1
2021-05-06T06:53:21.936Z sigma /bsd: iwm0: 2818 | data2
2021-05-06T06:53:21.936Z sigma /bsd: iwm0: DEADBEEF | data3
2021-05-06T06:53:21.936Z sigma /bsd: iwm0: 8440C4DD | beacon time
2021-05-06T06:53:21.937Z sigma /bsd: iwm0: 86CDBB2A | tsf low
2021-05-06T06:53:21.937Z sigma /bsd: iwm0: 06F6 | tsf hi
2021-05-06T06:53:21.937Z sigma /bsd: iwm0: 036F | time gp1
2021-05-06T06:53:21.937Z sigma /bsd: iwm0: 1082349D | time gp2
2021-05-06T06:53:21.938Z sigma /bsd: iwm0: 0001 | uCode revision type
2021-05-06T06:53:21.938Z sigma /bsd: iwm0: 0022 | uCode version major
2021-05-06T06:53:21.938Z sigma /bsd: iwm0:  | uCode version minor
2021-05-06T06:53:21.938Z sigma /bsd: iwm0: 0230 | hw version
2021-05-06T06:53:21.939Z sigma /bsd: iwm0: 18089000 | board version
2021-05-06T06:53:21.939Z sigma /bsd: iwm0: 0524001C | hcmd
2021-05-06T06:53:21.939Z 

Bug in ksh

2021-05-06 Thread Ivan Bityutskiy
  # 'Let' is executing 2nd part of && operator
  # after 1st part evaluates to false:
$ foo=2; (( foo > 2 && foo-- )); print $foo
1

  # This is working as expected:
$ foo=2;(( foo > 2 && --foo ));print $foo
2
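
For comparison, here is the same pair of expressions in C, on the assumption that ksh arithmetic is meant to follow C's short-circuit rules for &&. The helper names are made up for this sketch. In C the right-hand operand is not evaluated when the left-hand side is false, so both forms leave foo untouched when it starts at 2.

```c
#include <assert.h>

/* Evaluates (foo > 2 && foo--) and returns the resulting foo. */
static int
after_post_decrement(int foo)
{
	(void)(foo > 2 && foo--);
	return foo;
}

/* Evaluates (foo > 2 && --foo) and returns the resulting foo. */
static int
after_pre_decrement(int foo)
{
	(void)(foo > 2 && --foo);
	return foo;
}
```

With foo=2, both helpers return 2, which matches the second ksh transcript but not the first. With foo=3 the left-hand side is true, so both decrement and return 2.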