Re: panic: unix: lock not held

2024-05-02 Thread Alexander Bluhm
On Fri, May 03, 2024 at 12:04:02AM +0300, Vitaliy Makkoveev wrote: > On Thu, May 02, 2024 at 10:06:45PM +0200, kir...@korins.ky wrote: > > >Synopsis: panic: unix: lock not held > > >Category: kernel > > >Environment: > > System : OpenBSD 7.5 > > Details : OpenBSD 7.5-current

Re: lock order reversal in soreceive and NFS

2024-04-30 Thread Alexander Bluhm
On Tue, Apr 30, 2024 at 05:26:15PM +0300, Vitaliy Makkoveev wrote: > On Tue, Apr 30, 2024 at 04:06:29PM +0200, Mark Kettenis wrote: > > > Date: Tue, 30 Apr 2024 16:18:31 +0300 > > > From: Vitaliy Makkoveev > > > > > > On Tue, Apr 30, 2024 at 11:08:13AM +0200, Martin Pieuchot wrote: > > > > > >

Re: lock order reversal in soreceive and NFS

2024-04-30 Thread Alexander Bluhm
On Tue, Apr 30, 2024 at 11:08:13AM +0200, Martin Pieuchot wrote: > > With the patch, the nfsnode-vmmaplk reversal looks like this: > > So the issue here is due to NFS entering the network stack after the > VFS. Alexander, Vitaly are we far from a NET_LOCK()-free sosend()? > Is something we

Re: [PATCH 2/2] Handle short writes in cp(1)

2024-04-26 Thread Alexander Bluhm
On Thu, Apr 25, 2024 at 09:05:53PM +0200, Piotr Durlej wrote: > Handle short writes in cp(1) OK bluhm@ > --- > bin/cp/utils.c | 7 ++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/bin/cp/utils.c b/bin/cp/utils.c > index 347081151f2..40265fce7f7 100644 > ---
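For readers unfamiliar with the problem named in the subject: write(2) may accept fewer bytes than requested, so a copy loop has to retry with the remainder. The following is only a minimal sketch of that general technique, not the code from the patch in bin/cp/utils.c. The helper name write_all and the choice to report a zero-byte write as EIO are assumptions made for illustration.

```c
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the cp(1) patch: call write(2)
 * repeatedly until all len bytes of buf are written.
 * Returns 0 on success, -1 on error with errno set.
 */
static int
write_all(int fd, const char *buf, size_t len)
{
	size_t off = 0;

	while (off < len) {
		ssize_t n = write(fd, buf + off, len - off);

		if (n == -1) {
			if (errno == EINTR)
				continue;	/* interrupted, retry */
			return -1;
		}
		if (n == 0) {
			/* Assumption: treat no progress as an I/O error. */
			errno = EIO;
			return -1;
		}
		off += (size_t)n;	/* short write: advance and loop */
	}
	return 0;
}
```

A caller would replace a single write(fd, buf, len) with write_all(fd, buf, len) and check for -1.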

lock order reversal in soreceive and NFS

2024-04-22 Thread Alexander Bluhm
Hi, I see a witness lock order reversal warning with soreceive. It happens during NFS regress tests. In /var/log/messages is more context from regress. Apr 22 03:18:08 ot29 /bsd: uid 0 on /mnt/regress-ffs/fstest_49fd035b8230791792326afb0604868b: out of inodes Apr 22 03:18:21 ot29

Re: potential unfixed CVE in usr.bin/compress/zopen.c

2024-04-03 Thread Alexander Bluhm
On Wed, Apr 03, 2024 at 03:35:07PM +, Lu ChenHao wrote: > As CVE-2011-2895 said, the > LZW decompressor is vulnerable to an infinite loop or a heap-based buffer > overflow. As a mitigation, freebsd has added checks in >

Re: dwqe ifconfig down panic

2024-03-29 Thread Alexander Bluhm
On Thu, Mar 28, 2024 at 11:06:13PM +0100, Stefan Sperling wrote: > On Wed, Mar 27, 2024 at 02:08:27PM +0100, Stefan Sperling wrote: > > On Tue, Mar 26, 2024 at 11:05:49PM +0100, Patrick Wildt wrote: > > > On Fri, Mar 01, 2024 at 12:00:29AM +0100, Alexander Bluhm

ntpd NULL deref

2024-03-19 Thread Alexander Bluhm
Hi, ntpd crashed on my laptop. cstr->addr is NULL. According to accounting it was running for a while. ntpd[43355] - _ntp __ 0.06 secs Thu Mar 14 10:57 (41:41:32.00) ntpd[81566] -F root __ 0.28 secs Thu Mar 14 10:57 (41:39:28.00) ntpd[5567] -DXT_ntp __

Re: ICMP6 Type2 with MTU=PrevMTU Packet Flood in specific cornercase scenarios on OpenBSD7.4

2024-03-07 Thread Alexander Bluhm
Hi, Thanks for the detailed bug report. Note that I have also written some scapy script to test path MTU discovery. /usr/src/regress/sys/netinet/pmtu/tcp_connect.py and tcp_connect6.py Sometimes these tests fail, so PMTU may have bugs. Or my tests are just unreliable. How does the route look

protection fault in amap_wipeout

2024-03-01 Thread Alexander Bluhm
Hi, An OpenBSD 7.4 machine on KVM running postgres and pagedaemon crashed in amap_wipeout(). bluhm kernel: protection fault trap, code=0 Stopped at amap_wipeout+0x76: movq %rcx,0x28(%rax) ddb{3}> show panic the kernel did not panic ddb{3}> trace amap_wipeout(fd8015b154d0) at

dwqe ifconfig down panic

2024-02-29 Thread Alexander Bluhm
Hi, When doing flood ping transmit from a machine and simultaneously ifconfig down/up in a loop, dwqe(4) interface driver crashes. dwqe_down() contains an interrupt barrier, but somehow it does not work. Immediately after Xspllower() a transmit interrupt is processed. bluhm kernel: protection

Re: TSO em(4) problem

2024-02-01 Thread Alexander Bluhm
On Tue, Jan 30, 2024 at 02:32:24PM +0100, Hrvoje Popovski wrote: > yes, and forwarding only without pf. > I'm sending traffic from host connected to vlan/ix0 and forward through > em5 to other host. > I'm sending 1Gbps of traffic with cisco t-rex I cannot reproduce. ix0 at pci6 dev 0 function 0

Re: TSO em(4) problem

2024-01-30 Thread Alexander Bluhm
On Tue, Jan 30, 2024 at 12:07:08PM +0100, Hrvoje Popovski wrote: > On 30.1.2024. 9:27, Hrvoje Popovski wrote: > > I will prepare one box for this kind of traffic and will contact you and > > marcus > > > >> In theory when going through vlan interface it should remove > >> M_VLANTAG. But

Re: TSO em(4) problem

2024-01-29 Thread Alexander Bluhm
On Sun, Jan 28, 2024 at 07:46:29PM +0100, Marcus Glocker wrote: > Anyway, the TSO support just has been backed out. Thanks again for all > your testing! I am still interested to get em with TSO working if possible. Most use cases work fine. If there is a bug in our driver, we may fix it. If

Re: TSO em(4) problem

2024-01-29 Thread Alexander Bluhm
On Sat, Jan 27, 2024 at 08:08:35AM +0100, Hrvoje Popovski wrote: > On 26.1.2024. 22:47, Alexander Bluhm wrote: > > On Fri, Jan 26, 2024 at 11:41:49AM +0100, Hrvoje Popovski wrote: > >> I've managed to reproduce TSO em problem on another setup, unfortunately > >> production

Re: TSO em(4) problem

2024-01-26 Thread Alexander Bluhm
On Fri, Jan 26, 2024 at 11:41:49AM +0100, Hrvoje Popovski wrote: > I've managed to reproduce TSO em problem on another setup, unfortunately > production. What helped debugging a similar issue with ixl(4) and TSO was to remove all TSO specific code from the driver. Then only this part remains from

Re: OpenBSD 7.4/amd64 on APU4D4 - kernel panic

2024-01-17 Thread Alexander Bluhm
On Wed, Jan 17, 2024 at 11:46:36AM +0100, Radek wrote: > ddb{0}> show panic > *cpu0: kernel diagnostic assertion "pkt->pkt_m != NULL" failed: file > "/usr/src/ > sys/dev/pci/if_em.c", line 2580 > OpenBSD 7.4 (GENERIC.MP) #0: Fri Jan 12 09:31:37 CET 2024 >

Re: kernel panic: ip_output no HDR

2024-01-15 Thread Alexander Bluhm
On Mon, Jan 15, 2024 at 01:42:55PM -0300, K R wrote: > >Synopsis: kernel panic: ip_output no HDR > >Category: kernel amd64 > >Environment: > System : OpenBSD 7.4 > Details : OpenBSD 7.4-stable (GENERIC.MP) #0: Mon Dec 11 > 19:17:55 UTC 2023 > >

Re: vmm guest crash in vio

2024-01-09 Thread Alexander Bluhm
On Tue, Jan 09, 2024 at 07:49:16PM +0100, Stefan Fritsch wrote: > @bluhm: Does the attached patch fix the panic? Yes. My test does not crash the patched guest anymore. bluhm > The fdt part is completely untested, testers welcome. > > diff --git a/sys/dev/fdt/virtio_mmio.c

Re: bnxt panic - HWRM_RING_ALLOC command returned RESOURCE_ALLOC_ERROR error.

2024-01-09 Thread Alexander Bluhm
On Tue, Jan 09, 2024 at 12:04:17PM +1000, Jonathan Matthew wrote: > On Wed, Jan 03, 2024 at 10:14:12AM +0100, Hrvoje Popovski wrote: > > On 3.1.2024. 7:51, Jonathan Matthew wrote: > > > On Wed, Jan 03, 2024 at 01:50:06AM +0100, Alexander Bluhm wrote: > > >> On Wed, Ja

vmm guest crash in vio

2024-01-08 Thread Alexander Bluhm
Hi, When running a guest in vmm and doing ifconfig operations on vio interface, I can crash the guest. I run these loops in the guest: while doas ifconfig vio1 inet 10.188.234.74/24; do :; done while doas ifconfig vio1 -inet; do :; done while doas ifconfig vio1 down; do :; done And from host I

Re: bnxt panic - HWRM_RING_ALLOC command returned RESOURCE_ALLOC_ERROR error.

2024-01-03 Thread Alexander Bluhm
On Wed, Jan 03, 2024 at 04:51:39PM +1000, Jonathan Matthew wrote: > On Wed, Jan 03, 2024 at 01:50:06AM +0100, Alexander Bluhm wrote: > > On Wed, Jan 03, 2024 at 12:26:26AM +0100, Hrvoje Popovski wrote: > > > While testing kettenis@ ipl diff from tech@ and doing iperf3 to bn

Re: bnxt panic - HWRM_RING_ALLOC command returned RESOURCE_ALLOC_ERROR error.

2024-01-02 Thread Alexander Bluhm
On Wed, Jan 03, 2024 at 12:26:26AM +0100, Hrvoje Popovski wrote: > While testing kettenis@ ipl diff from tech@ and doing iperf3 to bnxt > interface and ifconfig bnxt0 down/up at the same time I can trigger > panic. Panic can be triggered without kettenis@ diff... It is easy to reproduce.

Re: rw_enter: netlock locking against myself, 7.4+errata007, hyper-v hvn((4)

2024-01-02 Thread Alexander Bluhm
On Tue, Jan 02, 2024 at 09:11:07PM +0300, Vitaliy Makkoveev wrote: > ifq_task_mtx initialized with IPL_NET priority, so this sequence > started this. > > THR1 mtx_enter(>ifq_task_mtx) > THR2 splnet() /* hv_wait(), just before hv_intr() */ > THR1 mtx_leave(>ifq_task_mtx) > THR1 `-> Xspllower() >

Re: rw_enter: netlock locking against myself, 7.4+errata007, hyper-v hvn((4)

2024-01-02 Thread Alexander Bluhm
On Tue, Jan 02, 2024 at 08:29:50PM +0300, Vitaliy Makkoveev wrote: > > On 2 Jan 2024, at 20:16, Alexander Bluhm wrote: > > > > On Tue, Jan 02, 2024 at 03:45:10PM +, Stuart Henderson wrote: > >> panic: rw_enter: netlock locking against myself > >> Sto

Re: rw_enter: netlock locking against myself, 7.4+errata007, hyper-v hvn((4)

2024-01-02 Thread Alexander Bluhm
On Tue, Jan 02, 2024 at 03:45:10PM +, Stuart Henderson wrote: > panic: rw_enter: netlock locking against myself > Stopped atdb_enter+0x14: popq%rbp > TIDPIDUID PRFLAGS PFLAGS CPU COMMAND > *388963 11506755 0x2 00 snmpbulkwalk > 320140

ntpd crash in constraint_msg_close log_sockaddr

2023-12-18 Thread Alexander Bluhm
Hi, for some days or weeks I see crashes of ntpd in accounting log on my laptop. Program terminated with signal SIGSEGV, Segmentation fault. #0 log_sockaddr (sa=0x8) at /usr/src/usr.sbin/ntpd/util.c:159 159 if (getnameinfo(sa, SA_LEN(sa), buf, sizeof(buf), NULL, 0, (gdb) bt #0

Re: kernel diagnostic assertion "st->timeout == PFTM_UNLINKED" failed: file "

2023-12-11 Thread Alexander Bluhm
On Mon, Dec 11, 2023 at 10:58:22AM +0100, Alexandr Nedvedicky wrote: > dlg@ and I are basically trying to remove all NET_LOCK() operations from > pf(4), because we don't want pf(4) to be playing with global NET_LOCK(). > all callers to pf(4) should either obtain NET_LOCK() in case they

Re: kernel diagnostic assertion "st->timeout == PFTM_UNLINKED" failed: file "

2023-12-11 Thread Alexander Bluhm
On Mon, Dec 11, 2023 at 02:47:49PM +0300, Vitaliy Makkoveev wrote: > Hi, > > > On 11 Dec 2023, at 12:58, Alexandr Nedvedicky wrote: > > > >on the other hand if there is a way to implement pflowif_list as > > lock-less > >(or move it out of NET_LOCK() scope), then this is a preferred

Re: kernel diagnostic assertion "st->timeout == PFTM_UNLINKED" failed: file "

2023-12-08 Thread Alexander Bluhm
On Sat, Dec 09, 2023 at 02:07:06AM +0300, Vitaliy Makkoveev wrote: > > > SLIST_ENTRY(pflow_softc) sc_next; > > > > This list is protected by net lock. Can you add an [N] here? > > > > This is not true. The netlock is not taken while export_pflow() called > from pf_purge_states(). I privately

Re: kernel diagnostic assertion "st->timeout == PFTM_UNLINKED" failed: file "

2023-12-08 Thread Alexander Bluhm
On Thu, Dec 07, 2023 at 06:14:30PM +0300, Vitaliy Makkoveev wrote: > Here is the diff. It introduces `sc_mtx' mutex(9) to protect most of the > pflow_softc structure. The `send_nam', `sc_flowsrc' and `sc_flowdst' are > protected by `sc_lock' rwlock(9). `sc_tmpl_ipfix' is immutable. > > Also, the

Re: system panics now & then

2023-12-06 Thread Alexander Bluhm
On Wed, Dec 06, 2023 at 10:17:26AM +0100, Claudio Jeker wrote: > On Wed, Dec 06, 2023 at 12:57:57AM +0100, Alexander Bluhm wrote: > > On Wed, Dec 06, 2023 at 01:39:40AM +0300, Vitaliy Makkoveev wrote: > > > > Diff makes sense in any case. > > > > > &

Re: system panics now & then

2023-12-05 Thread Alexander Bluhm
On Wed, Dec 06, 2023 at 01:39:40AM +0300, Vitaliy Makkoveev wrote: > > Diff makes sense in any case. > > > > Just checked, socket6_send() is identical to socket_send() and needs > to be reworked in the same way. New diff for v4 and v6. The other callers seem to be correct. I will run this

Re: system panics now & then

2023-12-05 Thread Alexander Bluhm
On Tue, Dec 05, 2023 at 08:22:52PM +0100, Jo Geraerts wrote: > maybe it's a good idea to just change 1 thing Yes, only change 1 thing. I just wrote down all my ideas. > > It could be a race or a single packet that crashes the machine. Found a race when we insert the IGMP packet into the socket

Re: system panics now & then

2023-12-05 Thread Alexander Bluhm
On Tue, Dec 05, 2023 at 04:22:47PM +0100, Jo Geraerts wrote: > *92547 388831 1 0 7 0mrouted Cool, you are running a multicast router. Unfortunately this code path is not well tested. > *cpu0: receive 1: so 0xfd80259ea760, so_type 3, sb_cc 40 When the

arm64 panic: malloc: out of space in kmem_map

2023-11-09 Thread Alexander Bluhm
Hi, During make build my arm64 machine with 32 CPUs crashed. bluhm ddb{24}> x/s version version:OpenBSD 7.4-current (GENERIC.MP) #16: Fri Nov 3 21:38:55 MDT 2023\012 dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP\012 ddb{24}> show panic cpu0: kernel

Re: ixl driver - MAC address "reset"?

2023-10-12 Thread Alexander Bluhm
On Fri, Oct 06, 2023 at 12:08:41PM -0700, Joao Pedras wrote: > Another odd thing is that I do have another card but with its MACs > starting in 3c:ec:ef:b4:40. If I understand you correctly, this is the broken card. > ixl0 at pci6 dev 0 function 0 "Intel X710 SFP+" rev 0x02: port 0, FW >

Re: PXE install of OpenBSD 7.3 (amd64) fails on Protectli VP4650 & VP2420, with 'igc' Intel I225-V 2.5Gbps NICs

2023-10-02 Thread Alexander Bluhm
On Sun, Oct 01, 2023 at 06:25:41PM -0700, Ian R. wrote: > The 'bsd' file it's looking for definitely does exist on my pxeboot > server, in the tftpd root directory where it's supposed to be. I've It is supposed to be in an IP address subdirectory. At least that's how my setup looks

Re: pf nat-to doesn't match a crafted packet

2023-09-04 Thread Alexander Bluhm
On Mon, Sep 04, 2023 at 03:58:02PM +0200, Alexandr Nedvedicky wrote: > Hello, > > On Mon, Sep 04, 2023 at 03:28:00PM +0200, Alexander Bluhm wrote: > > On Sun, Sep 03, 2023 at 11:00:56PM +0200, Alexandr Nedvedicky wrote: > > > Hello, > > > > > > On Sun,

Re: pf nat-to doesn't match a crafted packet

2023-09-04 Thread Alexander Bluhm
On Sun, Sep 03, 2023 at 11:00:56PM +0200, Alexandr Nedvedicky wrote: > Hello, > > On Sun, Sep 03, 2023 at 09:26:29PM +0200, Florian Obser wrote: > > FYI, I'm not using sloppy, and I don't have a network with asymmetric > > routing > > at the moment. I only remembered that we used sloppy for a

Re: pf nat-to doesn't match a crafted packet

2023-09-03 Thread Alexander Bluhm
On Sun, Sep 03, 2023 at 06:17:12PM +0200, Florian Obser wrote: > On 2023-09-03 18:13 +02, Alexander Bluhm wrote: > > On Sun, Sep 03, 2023 at 05:59:18PM +0200, Alexandr Nedvedicky wrote: > >> Hello, > >> > >> On Sun, Sep 03, 2023 at 05:10:02PM +0200, Alexande

Re: pf nat-to doesn't match a crafted packet

2023-09-03 Thread Alexander Bluhm
On Sun, Sep 03, 2023 at 05:59:18PM +0200, Alexandr Nedvedicky wrote: > Hello, > > On Sun, Sep 03, 2023 at 05:10:02PM +0200, Alexander Bluhm wrote: > > On Sun, Sep 03, 2023 at 04:12:35AM +0200, Alexandr Nedvedicky wrote: > > > in my opinion is to fix pf_match_rule() functio

Re: pf nat-to doesn't match a crafted packet

2023-09-03 Thread Alexander Bluhm
On Sun, Sep 03, 2023 at 04:12:35AM +0200, Alexandr Nedvedicky wrote: > in my opinion is to fix pf_match_rule() function, so ICMP error message > will no longer match 'keep state' rule. Diff below is for IPv4. I still > need to think of more about IPv6. My gut feeling is it will be very similar.

Re: umb(4): splassert: rtable_getsource: want 2 have 0

2023-08-31 Thread Alexander Bluhm
On Thu, Aug 31, 2023 at 04:25:37PM +0300, Vitaliy Makkoveev wrote: > > NET_UNLOCK() and NET_LOCK_SHARED() just after each other does not > > make much sense. Just keep exclusive netlock for the few lines. > > Agreed. Both the cases perform route sockets walkthrough and message > transmission. No

Re: umb(4): splassert: rtable_getsource: want 2 have 0

2023-08-31 Thread Alexander Bluhm
On Thu, Aug 31, 2023 at 01:05:11PM +0300, Vitaliy Makkoveev wrote: > On Thu, Aug 31, 2023 at 11:26:42AM +0200, Jeremie Courreges-Anglas wrote: > > > > Looks like umb(4) triggers the NET_ASSERT_LOCKED() check in > > rtable_getsource() when the umb(4) interface comes up (here with > > kern.splassert=2

Re: macppc panic: pool_do_get: vp: page empty

2023-07-18 Thread Alexander Bluhm
On Tue, Jul 18, 2023 at 11:59:27PM +0300, Alexander Bluhm wrote: > While booting with > OpenBSD 7.3-current (GENERIC.MP) #153: Sat Jul 15 16:24:01 MDT 2023 > dera...@macppc.openbsd.org:/usr/src/sys/arch/macppc/compile/GENERIC.MP > Machine panicked in rc reordering. With lat

macppc panic: pool_do_get: vp: page empty

2023-07-18 Thread Alexander Bluhm
Hi, While booting with OpenBSD 7.3-current (GENERIC.MP) #153: Sat Jul 15 16:24:01 MDT 2023 dera...@macppc.openbsd.org:/usr/src/sys/arch/macppc/compile/GENERIC.MP Machine panicked in rc reordering. reordering: ld.so libc libcrypto sshdpanic: pool_do_get: vp: page empty Stopped at

Re: kernel diagnostic assertion "!_kernel_lock_held()" failed

2023-07-06 Thread Alexander Bluhm
On Thu, Jul 06, 2023 at 02:14:09PM +, Valdrin MUJA wrote: > I've applied your patch but crashed again. Here it is: > ddb{1}> show panic > *cpu1: kernel diagnostic assertion "refcnt_read(>rt_refcnt) >= 2" failed: > f > ile "/usr/src/sys/net/rtable.c", line 828 This kassert I added seems to be

Re: kernel diagnostic assertion "!_kernel_lock_held()" failed

2023-07-06 Thread Alexander Bluhm
On Wed, Jul 05, 2023 at 12:17:15PM +, Valdrin MUJA wrote: > ddb{3}> show panic > *cpu3: kernel diagnostic assertion "!ISSET(rt->rt_flags, RTF_UP)" failed: > file " > /usr/src/sys/net/route.c", line 496 > > ddb{3}> trace > db_enter() at db_enter+0x10 > panic(82067518) at panic+0xbf >

Re: Syslog does not attempt DNS resolution if it previously failed during startup - proposed patch In

2023-07-03 Thread Alexander Bluhm
On Thu, Jun 29, 2023 at 01:40:17PM +, Robert Larsson wrote: > most every 10 seconds. To do this I've added f_resolvetime to f_forw - > I chose this approach rather than the TCP evtimer because I don't > fully understand the concurrency or blocking aspects of evtimer, and I have changed your

Re: panic: rw_enter: pfioctl_rw locking against myself

2023-06-29 Thread Alexander Bluhm
On Wed, Jun 28, 2023 at 09:41:15PM +0200, Florian Obser wrote: > Yes, good idea, let's ship 7.4 with the leak! The backout replaced a crash with a leak. We don't want to ship 7.4 with a potential kernel crash either. > I'm getting a bit annoyed with unlocking this and rewriting that and then >

Re: ifconfig sbar hang

2023-06-28 Thread Alexander Bluhm
On Wed, Jun 28, 2023 at 11:25:56AM +0200, Mark Kettenis wrote: > > From: Alexander Bluhm > > load: 3.00 cmd: ifconfig 52949 [sbar] 0.01u 0.05s 0% 78k > > ifconfig holds the netlock, I guess this prevents progress. > > What does a WITNESS kernel report? This is hard to s

Re: panic: rw_enter: pfioctl_rw locking against myself

2023-06-28 Thread Alexander Bluhm
} > > NET_LOCK(); > PF_LOCK(); > On Wed, Jun 28, 2023 at 02:38:00PM +0200, Alexander Bluhm wrote: > > Hi, > > > > Since Jun 26 regress tests panic the kernel. > > > > panic: rw_enter: pfioctl_rw locking against myself > >

panic: rw_enter: pfioctl_rw locking against myself

2023-06-28 Thread Alexander Bluhm
Hi, Since Jun 26 regress tests panic the kernel. panic: rw_enter: pfioctl_rw locking against myself Stopped at db_enter+0x14: popq%rbp TIDPIDUID PRFLAGS PFLAGS CPU COMMAND * 19846 58589 0 0x2 01K pfctl 343161 43899 0 0x2

ifconfig sbar hang

2023-06-26 Thread Alexander Bluhm
Hi, I have an ifconfig on ix(4) that hangs in "sbar" wait queue during "starting network" while booting. load: 3.00 cmd: ifconfig 52949 [sbar] 0.01u 0.05s 0% 78k ddb{0}> ps PID TID PPIDUID S FLAGS WAIT COMMAND 52949 250855 50082 0 3 0x3 sbar

Re: dvmrpd start causes kernel panic: assertion failed

2023-06-12 Thread Alexander Bluhm
On Mon, Jun 12, 2023 at 11:56:43PM +0300, Vitaliy Makkoveev wrote: > On Mon, Jun 12, 2023 at 09:04:41PM +0200, Why 42? The lists account. wrote: > > > > On Wed, Jun 07, 2023 at 03:50:29PM +0300, Vitaliy Makkoveev wrote: > > > > ... > > > > Please, share your dvmrpd.conf. > > > > > > > > > >

powerpc64 panic pool_do_get: pted free list modified

2023-06-05 Thread Alexander Bluhm
Hi, During make release my powerpc64 machines panicked. [-- MARK -- Mon Jun 5 17:55:00 2023] ppaanniicc:: p o o l _pda on_ ig ce: t : p t e d f r e e l i s t m o d i f ie d : p o o l _ pd ao _g g ee t : p t edf 0r x e ec 0 0 0 0 0 0

powerpc64 panic: kernel diagnostic assertion "pm == pted->pted_pmap"

2023-05-10 Thread Alexander Bluhm
Hi, During release build my powerpc64 machine crashed. login: [-- MARK -- Wed May 10 14:40:00 2023] panic: kernel diagnostic assertion "pm == pted->pted_pmap" failed: file "/usr/src/sys/arch/powerpc64/powerpc64/pmap.c", line 865 Stopped at panic+0x134:ori r0,r0,0x0 TIDPID

powerpc64 panic: pool_do_get: vmmpepl free list modified

2023-05-05 Thread Alexander Bluhm
Hi, I got this crash while building clang during make release. [-- MARK -- Wed May 3 15:40:00 2023] panic: pool_do_get: vmmpepl free list modified: page 0xc0007ad90590; item addr 0x91a2a032; offset 0x0=0xc000 != 0x50007b7ba542 Stopped at panic+0x134:ori r0,r0,0x0

Re: Intel Ethernet (?Synopsys based) on Filet3 Elkhart Lake unconfigured on recent snapshot

2023-04-28 Thread Alexander Bluhm
> > On 28 Apr 2023, at 06:06, Ted Ri wrote: > > "Intel Elkhart Lake Ethernet" rev 0x11 at pci0 dev 29 function 1 not > > configured > > "Intel Elkhart Lake Ethernet" rev 0x11 at pci0 dev 29 function 2 not > > configured On Fri, Apr 28, 2023 at 07:50:48PM +1000, David Gwynne wrote: > and one of

Re: PF still blocks IGMP multicast control packets

2023-02-24 Thread Alexander Bluhm
On Fri, Feb 24, 2023 at 08:42:29AM +0100, Luca Di Gregorio wrote: > I would implement this logic: > > If the IP Destination Address is 224.0.0.0/4, then the TTL should be 1. > If the IP Destination Address is not 224.0.0.0/4, then no restrictions on > TTL. > > In your code, I would do this
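The rule proposed in the quoted text (TTL must be 1 when the destination is in 224.0.0.0/4, no restriction otherwise) could be sketched as below. This is an illustration of the stated logic only, not the pf code under discussion. The function names are hypothetical, and the address is assumed to be an IPv4 destination in network byte order.

```c
#include <arpa/inet.h>
#include <stdbool.h>
#include <stdint.h>

/* True if addr_net (network byte order) lies in 224.0.0.0/4. */
static bool
is_ipv4_multicast(uint32_t addr_net)
{
	return (ntohl(addr_net) & 0xf0000000U) == 0xe0000000U;
}

/*
 * Hypothetical check following the proposed rule: multicast
 * destinations require TTL 1, any other destination accepts any TTL.
 */
static bool
igmp_ttl_acceptable(uint32_t dst_net, uint8_t ttl)
{
	if (is_ipv4_multicast(dst_net))
		return ttl == 1;
	return true;
}
```

The top four address bits 1110 identify the 224.0.0.0/4 range, which is why the mask is 0xf0000000 and the comparison value is 0xe0000000.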

Re: bbolt can freeze 7.2 from userspace

2023-02-20 Thread Alexander Bluhm
On Mon, Feb 20, 2023 at 09:43:10AM +0100, Martin Pieuchot wrote: > On 20/02/23(Mon) 03:59, Renato Aguiar wrote: > > [...] > > I can't reproduce it anymore with this patch on 7.2-stable :) > > Thanks a lot for testing! Here's a better fix from Chuck Silvers. > That's what I believe we should

sys_pselect assertion "timo || _kernel_lock_held()" failed

2023-02-13 Thread Alexander Bluhm
Hi, Today I saw this panic on my i386 regress machine. panic: kernel diagnostic assertion "timo || _kernel_lock_held()" failed: file "/usr/src/sys/kern/kern_synch.c", line 127 Looks like src/regress/lib/libc/sys/ triggered it. Kernel was built from some 2023-02-13 source checkout. I will

Re: pf.conf bug

2023-02-06 Thread Alexander Bluhm
On Mon, Feb 06, 2023 at 09:37:47PM +0100, Alexandr Nedvedicky wrote: > if we want to allow firewall administrator to specify a match > on icmptype 255 then extending type from uint8_t to uint16_t > is the right change. > > another option is to change logic here to allow matching

Re: kernel panic: diagnostic assertion "inp->inp_laddr.s_addr == INADDR_ANY || inp->inp_lport"

2023-01-12 Thread Alexander Bluhm
On Fri, Dec 16, 2022 at 05:36:04PM +, K R wrote: > ddb> show panic > *cpu0: kernel diagnostic assertion "inp->inp_laddr.s_addr == > INADDR_ANY || inp->inp_lport" failed: file > "/usr/src/sys/netinet/in_pcb.c", line 510 This has been fixed in errata and syspatch.

Re: panic: kernel diagnostic assertion "timo || _kernel_lock_held()" failed

2022-12-06 Thread Alexander Bluhm
On Tue, Dec 06, 2022 at 11:33:06PM +0300, Vitaliy Makkoveev wrote: > On Tue, Dec 06, 2022 at 07:56:13PM +0100, Paul de Weerd wrote: > > I was playing with the USB NIC that's in my (USB-C) monitor. As soon > > as I do traffic over the interface, I get a kernel panic: > > > > panic: kernel

Re: deadlock in ifconfig

2022-11-25 Thread Alexander Bluhm
On Thu, Nov 24, 2022 at 11:23:51AM +1000, David Gwynne wrote: > > we're working toward dropping the need for NET_LOCK before PF_LOCK. could > > we try the diff below as a compromise? > > > > sashan@ and mvs@ have pushed that forward, so this diff should be enough > now. This diff has been

Re: deadlock in ifconfig

2022-11-25 Thread Alexander Bluhm
On Thu, Nov 24, 2022 at 07:09:39PM +0100, Alexandr Nedvedicky wrote: > Hello, > > > On Thu, Nov 24, 2022 at 10:29:37AM -0500, David Hill wrote: > > > > > > > > With this diff against -current - my dmesg is spammed with: > > > > splassert: pfsync_delete_state: want 2 have 0 > > Starting stack

Re: deadlock in ifconfig

2022-11-24 Thread Alexander Bluhm
On Thu, Nov 24, 2022 at 11:23:51AM +1000, David Gwynne wrote: > > we're working toward dropping the need for NET_LOCK before PF_LOCK. could > > we try the diff below as a compromise? > > > > sashan@ and mvs@ have pushed that forward, so this diff should be enough > now. I have tested the

deadlock in ifconfig

2022-11-21 Thread Alexander Bluhm
Hi, Some of my test machines hang while booting userland. starting network -> here it hangs load: 0.02 cmd: ifconfig 81303 [sbar] 0.00u 0.15s 0% 78k ddb shows these two processes. 81303 375320 89140 0 3 0x3 sbar ifconfig 48135 157353 0 0 3 0x14200

Re: macppc panic: vref used where vget required

2022-11-14 Thread Alexander Bluhm
On Wed, Nov 09, 2022 at 04:14:10PM +, Martin Pieuchot wrote: > On 09/09/22(Fri) 14:41, Martin Pieuchot wrote: > > On 09/09/22(Fri) 12:25, Theo Buehler wrote: > > > > Yesterday gnezdo@ fixed a race in uvn_attach() that led to the same > > > > assert. Here's a rebased diff for the bug

Re: performance regression RDTSCP

2022-10-09 Thread Alexander Bluhm
On Sat, Oct 08, 2022 at 08:41:34AM +0200, Robert Nagy wrote: > What is the output of sysctl kern.timecounter? # sysctl kern.timecounter kern.timecounter.tick=1 kern.timecounter.timestepwarnings=0 kern.timecounter.hardware=tsc kern.timecounter.choice=i8254(0) acpihpet0(1000) tsc(2000)

Re: performance regression RDTSCP

2022-10-07 Thread Alexander Bluhm
On Fri, Oct 07, 2022 at 01:10:14PM -0500, Scott Cheloha wrote: > Does this machine have the corresponding libc change that went with the kernel > change? The RDTSCP option has a distinct userspace implementation. If libc > isn't up to date it won't know what to do and it will fall back to the >

performance regression RDTSCP

2022-10-07 Thread Alexander Bluhm
Hi, My monthly UDP benchmarks detect a reduction in iperf3 UDP throughput by 30% between September 22 and 23. http://bluhm.genua.de/perform/results/7.1/2022-10-01T06%3A17%3A03Z/gnuplot/udp.html It is this test: iperf3 -c10.3.45.35 -u -b10G -w1m -t10 Per commit checkout shows the relevant

Re: macppc panic: vref used where vget required

2022-09-06 Thread Alexander Bluhm
On Tue, Sep 06, 2022 at 05:17:56AM +, Miod Vallat wrote: > > On Thu, Sep 01, 2022 at 02:57:27PM +0200, Martin Pieuchot wrote: > > > Yesterday gnezdo@ fixed a race in uvn_attach() that led to the same > > > assert. Here's a rebased diff for the bug discussed in this thread, > > > could you

Re: em0: problems seen initial neighbor solicitation (ipv6)

2022-09-05 Thread Alexander Bluhm
On Mon, Sep 05, 2022 at 03:23:21PM +0200, Sebastien Marie wrote: > 15:06:59.567147 c8:be:19:e2:2c:ed 33:33:ff:fc:bf:56 ip6 86: > fd00:65c6:f26a:5c::1 > ff02::1:fffc:bf56: icmp6: neighbor sol: who has > fd00:65c6:f26a:5c:452e:64ab:3ffc:bf56 Some packets sent by the neighbor solicitation state

Re: rt_ifa_del NULL deref

2022-09-04 Thread Alexander Bluhm
On Sun, Sep 04, 2022 at 05:42:14PM +0300, Vitaliy Makkoveev wrote: > Not for commit, just to collect assertions. Netlock assertions don't > provide panics. Better use NET_ASSERT_LOCKED_EXCLUSIVE() ? But maybe nd6 uses a combination of shared netlock and kernel lock. Is it possible that we

Re: macppc panic: vref used where vget required

2022-09-02 Thread Alexander Bluhm
On Thu, Sep 01, 2022 at 02:57:27PM +0200, Martin Pieuchot wrote: > Yesterday gnezdo@ fixed a race in uvn_attach() that led to the same > assert. Here's a rebased diff for the bug discussed in this thread, > could you try again and let us know? Thanks! Wow! With this diff I finished make

Re: OpenBSD 7.1/amd64 on APU4D4 - system drops into ddb few times a week

2022-08-29 Thread Alexander Bluhm
On Mon, Aug 29, 2022 at 04:42:45AM +0200, Radek wrote: > the same problem occurs on -current. It is not the same problem. Traces are different. But I guess your setup triggers some sort of race. Previous crashes with 7.1 were in route and IPsec, now it is in pf. Unfortunately you missed my pf

Re: rt_ifa_del NULL deref

2022-08-27 Thread Alexander Bluhm
On Sat, Aug 27, 2022 at 03:14:15AM +0300, Vitaliy Makkoveev wrote: > > On 27 Aug 2022, at 00:04, Alexander Bluhm wrote: > > > > Anyone willing to test or ok this? > > > > This fixes weird `ifa' refcounting. I like this. > > Could the ifaref() and ifafree

Re: rt_ifa_del NULL deref

2022-08-26 Thread Alexander Bluhm
Anyone willing to test or ok this? I would like to get it in before kn@ bumps ports due to net/if_var.h changes. On Wed, Aug 24, 2022 at 03:14:35PM +0200, Alexander Bluhm wrote: > On Tue, Aug 23, 2022 at 02:47:11PM +0200, Stefan Sperling wrote: > > ddb{2}> show struct ifaddr 0x

Re: rt_ifa_del NULL deref

2022-08-24 Thread Alexander Bluhm
On Tue, Aug 23, 2022 at 02:47:11PM +0200, Stefan Sperling wrote: > ddb{2}> show struct ifaddr 0x804e9400 > struct ifaddr at 0x804e9400 (64 bytes) {ifa_addr = (struct sockaddr > *)0 > xdeaf0009deafbead, ifa_dstaddr = (struct sockaddr *)0x20ef1a8af1d895de, > ifa_net > mask =

Re: rt_ifa_del NULL deref

2022-08-23 Thread Alexander Bluhm
On Tue, Aug 23, 2022 at 12:23:05PM +0200, Stefan Sperling wrote: > On Tue, Aug 23, 2022 at 11:43:22AM +0200, Alexander Bluhm wrote: > > On Tue, Aug 23, 2022 at 10:15:22AM +0200, Stefan Sperling wrote: > > > I found one of my amd64 systems running -current, built on 12th of

Re: rt_ifa_del NULL deref

2022-08-23 Thread Alexander Bluhm
On Tue, Aug 23, 2022 at 10:15:22AM +0200, Stefan Sperling wrote: > I found one of my amd64 systems running -current, built on 12th of > August, has crashed as follows. Is there any chance that the kernel sources are between these commits? August 12th does not fit exactly, do you remember when you

Re: Information leakage of IP-layer data on LAN

2022-08-22 Thread Alexander Bluhm
On Mon, Aug 22, 2022 at 08:56:32PM +0200, Peter J. Philipp wrote: > On Mon, Aug 22, 2022 at 08:15:13PM +0200, Alexander Bluhm wrote: > > Note that sending an error reply to packets that cannot be processed > > is not uncommon and sometimes required to make the network behave > &g

Re: Information leakage of IP-layer data on LAN

2022-08-22 Thread Alexander Bluhm
On Mon, Aug 22, 2022 at 06:04:17PM +0200, p...@delphinusdns.org wrote: > >Synopsis:IP Information leakage using MAC address > >Category:system > >Environment: > System : OpenBSD 7.1 > Details : OpenBSD 7.1 (GENERIC.MP) #3: Sun May 15 10:27:01 MDT 2022 >

Re: 7.2-beta crash with bridge Interface

2022-08-06 Thread Alexander Bluhm
On Sat, Aug 06, 2022 at 11:33:46PM +, mgra...@brainfat.net wrote: > after creating a bridge interface running an ifconfig command will > crash the system. > c: netlock: lock not held > rw_exit_write(822af5e8) at rw_exit_write+0xae >

Re: macppc panic: vref used where vget required

2022-06-01 Thread Alexander Bluhm
On Tue, May 31, 2022 at 04:40:32PM +0200, Martin Pieuchot wrote: > Any of you got the chance to try this diff? Could you reproduce the > panic with it? With this diff I could build a release. It worked twice. Usually it crashes after a day. This time it finished release after 30 hours. bluhm

Re: macppc panic: vref used where vget required

2022-05-19 Thread Alexander Bluhm
On Tue, May 17, 2022 at 05:43:02PM +0200, Martin Pieuchot wrote: > Andrew, Alexander, could you test this and report back? Panic "vref used where vget required" is still there. As usual it needs a day to reproduce. This time I was running without the vref history diff. bluhm > Index:

Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface

2022-05-18 Thread Alexander Bluhm
On Mon, May 16, 2022 at 05:06:28PM +0200, Claudio Jeker wrote: > > In veb configuration we are holding the netlock and sleep in > > smr_barrier() and refcnt_finalize(). An additional sleep in malloc() > > is fine here. > > Are you sure about this? smr_barrier() on busy systems with many cpus can

Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface

2022-05-13 Thread Alexander Bluhm
On Fri, May 13, 2022 at 05:53:27PM +0200, Alexandr Nedvedicky wrote: > at this point we hold a NET_LOCK(). So basically if there won't > be enough memory we might start sleeping waiting for memory > while we will be holding a NET_LOCK. > > This is something we should try to avoid,

Re: [External] : Re: 7.1-Current crash with NET_TASKQ 4 and veb interface

2022-05-13 Thread Alexander Bluhm
On Fri, May 13, 2022 at 12:19:46PM +1000, David Gwynne wrote: > sorry i'm late to the party. can you try this diff? Thanks for having a look. I added veb(4) to my setup. With this diff, I cannot trigger a crash anymore. OK bluhm@ > this diff replaces the list of ports with an array/map of

Re: 7.1-Current crash with NET_TASKQ 4 and veb interface

2022-05-10 Thread Alexander Bluhm
On Tue, May 10, 2022 at 09:37:12PM +0200, Hrvoje Popovski wrote: > On 9.5.2022. 22:04, Alexander Bluhm wrote: > > Can some veb or smr hacker explain how this is supposed to work? > > > > Sleeping in pf is also not ideal as it is in the hot path and slows > > down pac

Re: 7.1-Current crash with NET_TASKQ 4 and veb interface

2022-05-09 Thread Alexander Bluhm
On Mon, May 09, 2022 at 06:01:07PM +0300, Barbaros Bilek wrote: > I was using veb (veb+vlan+ixl) interfaces quite stable since 6.9. > My system ran as a firewall under OpenBSD 6.9 and 7.0 quite stable. > Also I've used 7.1 for a limited time and there were no crash. > After OpenBSD' NET_TASKQ

Re: macppc panic: vref used where vget required

2022-05-06 Thread Alexander Bluhm
Same with this diff. On Wed, May 04, 2022 at 05:58:14PM +0200, Martin Pieuchot wrote: > Index: nfs/nfs_serv.c > === > RCS file: /cvs/src/sys/nfs/nfs_serv.c,v > retrieving revision 1.120 > diff -u -p -r1.120 nfs_serv.c > ---

Re: macppc panic: vref used where vget required

2022-05-04 Thread Alexander Bluhm
On Wed, May 04, 2022 at 05:58:14PM +0200, Martin Pieuchot wrote: > I don't understand the mechanism around UVM_VNODE_CANPERSIST. I looked > for missing uvm_vnp_uncache() and found the following two. I doubt > those are the one triggering the bug because they are in NFS & softdep. It crashes

Re: macppc panic: vref used where vget required

2022-05-03 Thread Alexander Bluhm
On Mon, May 02, 2022 at 06:53:08AM +0200, Sebastien Marie wrote: > New diff, with new iteration on vnode_history_*() functions. I added a label > in > the record function. I also changed when showing the stacktrace. powerpc has > poor backtrace support, but now it will at least print some infos

Re: macppc panic: vref used where vget required

2022-05-01 Thread Alexander Bluhm
Still panics with the latest uvm fixes. http://bluhm.genua.de/release/results/2022-04-30T21%3A55%3A03Z/bsdcons-ot26.txt [-- MARK -- Sun May 1 14:40:00 2022] uvn_io: start: 0x25452b58, type VREG, use 0, write 0, hold 834, flags (VBIOONFREELIST) tag VT_UFS, ino 469309, on dev 0, 10 flags

Re: ipsp_ids_gc panic after 7.1 upgrade

2022-04-29 Thread Alexander Bluhm
On Thu, Apr 28, 2022 at 12:52:41AM +0300, Vitaliy Makkoveev wrote: > On Thu, Apr 28, 2022 at 12:15:25AM +0300, Vitaliy Makkoveev wrote: > > > On 27 Apr 2022, at 23:24, Kasak wrote: > [ skip ] > > > I'm afraid your patch did not help, it crashed again after three hours > > > > Did it panic

Re: macppc panic: vref used where vget required

2022-04-29 Thread Alexander Bluhm
On Thu, Apr 28, 2022 at 08:47:53PM +0200, Martin Pieuchot wrote: > On 28/04/22(Thu) 16:54, Sebastien Marie wrote: > > On Thu, Apr 28, 2022 at 04:04:41PM +0200, Alexander Bluhm wrote: > > > On Wed, Apr 27, 2022 at 09:16:48AM +0200, Sebastien Marie wrote: > > > > Here a
