Re: macppc panic: vref used where vget required

2022-04-13 Thread Martin Pieuchot
On 12/04/22(Tue) 14:58, Sebastien Marie wrote: > [...] > uvn_io: start: 0x8000ffab9688, type VREG, use 0, write 0, hold 0, flags > (VBIOONFREELIST) > tag VT_UFS, ino 14802284, on dev 4, 30 flags 0x100, effnlink 1, nlink 1 > mode 0100644, owner 858, group 1000, size 345603015 > =>

Re: macppc panic: vref used where vget required

2022-04-19 Thread Martin Pieuchot
On 14/04/22(Thu) 18:29, Alexander Bluhm wrote: > [...] > vn_lock: v_usecount == 0: 0x23e6b910, type VREG, use 0, write 0, hold 0, > flags (VBIOONFREELIST) > tag VT_UFS, ino 703119, on dev 0, 10 flags 0x100, effnlink 1, nlink 1 > mode 0100660, owner 21, group 21, size 13647873 > ==>

Re: macppc panic: vref used where vget required

2022-04-28 Thread Martin Pieuchot
On 28/04/22(Thu) 16:54, Sebastien Marie wrote: > On Thu, Apr 28, 2022 at 04:04:41PM +0200, Alexander Bluhm wrote: > > On Wed, Apr 27, 2022 at 09:16:48AM +0200, Sebastien Marie wrote: > > > Here a new diff (sorry for the delay) which add a new > > > vnode_history_record() > > > point inside uvn_det

Re: macppc panic: vref used where vget required

2022-05-04 Thread Martin Pieuchot
On 04/05/22(Wed) 09:16, Sebastien Marie wrote: > [...] > we don't have any vclean label ("vclean (inactive)" or "vclean (active)"), so > vclean() was not called in this timeframe. So we are narrowing down the issue: 1. A file is opened 2. Then mmaped 3. Some of its pages are swapped to disk 4.

Re: macppc panic: vref used where vget required

2022-05-05 Thread Martin Pieuchot
On 04/05/22(Wed) 18:23, Mark Kettenis wrote: > > Date: Wed, 4 May 2022 17:58:14 +0200 > > From: Martin Pieuchot > > > > On 04/05/22(Wed) 09:16, Sebastien Marie wrote: > > > [...] > > > we don't have any vclean label ("vclean (inactive)"

Re: macppc panic: vref used where vget required

2022-05-05 Thread Martin Pieuchot
On 04/05/22(Wed) 18:30, Alexander Bluhm wrote: > On Wed, May 04, 2022 at 05:58:14PM +0200, Martin Pieuchot wrote: > > I don't understand the mechanism around UVM_VNODE_CANPERSIST. I looked > > for missing uvm_vnp_uncache() and found the following two. I doubt > > those

Re: macppc panic: vref used where vget required

2022-05-17 Thread Martin Pieuchot
On 06/05/22(Fri) 22:16, Alexander Bluhm wrote: > Same with this diff. Thanks for testing. Here's a possible fix. The idea is to call uvm_vnp_terminate() when recycling a vnode. This will flush any pending pages that are still associated with the vnode. Ironically that is what the comment above

Re: macppc panic: vref used where vget required

2022-05-24 Thread Martin Pieuchot
On 19/05/22(Thu) 13:33, Alexander Bluhm wrote: > On Tue, May 17, 2022 at 05:43:02PM +0200, Martin Pieuchot wrote: > > Andrew, Alexander, could you test this and report back? > > Panic "vref used where vget required" is still there. As usual it > needs a day to reprodu

Re: macppc panic: vref used where vget required

2022-05-31 Thread Martin Pieuchot
On 24/05/22(Tue) 14:16, Martin Pieuchot wrote: > On 19/05/22(Thu) 13:33, Alexander Bluhm wrote: > > On Tue, May 17, 2022 at 05:43:02PM +0200, Martin Pieuchot wrote: > > > Andrew, Alexander, could you test this and report back? > > > > Panic "vref used where

Re: macppc panic: vref used where vget required

2022-06-03 Thread Martin Pieuchot
On 02/06/22(Thu) 07:29, Theo de Raadt wrote: > So this basically converts the flag into a proper reference? It completely gets rid of the extra reference. UVM objects related to a vnode are no longer kept alive after uvn_detach() has been called. > If you go back to 4.4BSD, there's another aspec

Re: macppc panic: vref used where vget required

2022-06-03 Thread Martin Pieuchot
On 02/06/22(Thu) 13:54, Sebastien Marie wrote: > On Tue, May 24, 2022 at 02:16:44PM +0200, Martin Pieuchot wrote: > > On 19/05/22(Thu) 13:33, Alexander Bluhm wrote: > > > On Tue, May 17, 2022 at 05:43:02PM +0200, Martin Pieuchot wrote: > > > > Andrew, Alexander, could

Re: kern_event.c:839 assertion failed

2022-06-20 Thread Martin Pieuchot
On 19/06/22(Sun) 11:34, Visa Hankala wrote: > On Fri, Jun 17, 2022 at 04:25:48PM +0300, Mikhail wrote: > > I was debugging tog in lldb and in second tmux window opened another > > bare tog instance, after a second I got this panic: > > > > panic: kernel diagnostic assetion "p->p_kq->kq_refcnt.r_re

Re: kern_event.c:839 assertion failed

2022-06-26 Thread Martin Pieuchot
On 20/06/22(Mon) 14:59, Visa Hankala wrote: > On Mon, Jun 20, 2022 at 01:59:25PM +0200, Martin Pieuchot wrote: > > On 19/06/22(Sun) 11:34, Visa Hankala wrote: > > > On Fri, Jun 17, 2022 at 04:25:48PM +0300, Mikhail wrote: > > > > I was debugging tog in lldb and in sec

Re: System frequently hangs, found commit that probably causes it

2022-06-26 Thread Martin Pieuchot
On 26/06/22(Sun) 20:36, Caspar Schutijser wrote: > A laptop of mine (dmesg below) frequently hangs. After some bisecting > and extensive testing I think I found the commit that causes this: > mpi@'s > "Always acquire the `vmobjlock' before incrementing an object's reference." > commit from 2022-04-

Re: System frequently hangs, found commit that probably causes it

2022-06-27 Thread Martin Pieuchot
On 27/06/22(Mon) 18:04, Caspar Schutijser wrote: > On Sun, Jun 26, 2022 at 10:03:59PM +0200, Martin Pieuchot wrote: > > On 26/06/22(Sun) 20:36, Caspar Schutijser wrote: > > > A laptop of mine (dmesg below) frequently hangs. After some bisecting > > > and extensive

Re: System frequently hangs, found commit that probably causes it

2022-07-06 Thread Martin Pieuchot
On 01/07/22(Fri) 07:13, Sebastien Marie wrote: > On Mon, Jun 27, 2022 at 06:29:55PM +0200, Martin Pieuchot wrote: > > On 27/06/22(Mon) 18:04, Caspar Schutijser wrote: > > > On Sun, Jun 26, 2022 at 10:03:59PM +0200, Martin Pieuchot wrote: > > > > On 26/06/22(Sun)

Re: macppc panic: vref used where vget required

2022-07-11 Thread Martin Pieuchot
On 11/07/22(Mon) 07:50, Theo Buehler wrote: > On Fri, Jun 03, 2022 at 03:02:36PM +0200, Theo Buehler wrote: > > > Please do note that this change can introduce/expose other issues. > > > > It seems that this diff causes occasional hangs when building snapshots > > on my mac M1 mini. This happened

Re: panic: Stopped at kqueue_scan

2019-04-28 Thread Martin Pieuchot
On 26/04/19(Fri) 11:23, Olivier Antoine wrote: > Hi, could it be that the problem is related to pipex? > My npppd.conf rely on the default pipex settings (on) > But net.pipex.enable is 0 since I have no explicit settings in my > /etc/sysctl.conf > Under these conditions, I have no difficulty to cra

Re: panic: Stopped at kqueue_scan

2019-04-28 Thread Martin Pieuchot
On 23/04/19(Tue) 12:16, Olivier Antoine wrote: > >Synopsis:panic: Stopped at kqueue_scan > >Category:kernel i386 > >Environment: > System : OpenBSD 6.5 > Details : OpenBSD 6.5-current (GENERIC.MP) #1368: Sun Apr 21 > 19:50:46 MDT 2019 > > dera...@i386.openbsd.

Re: panic: Stopped at kqueue_scan

2019-04-29 Thread Martin Pieuchot
On 29/04/19(Mon) 17:24, David Gwynne wrote: > On Sun, Apr 28, 2019 at 06:57:02PM -0300, Martin Pieuchot wrote: > > On 23/04/19(Tue) 12:16, Olivier Antoine wrote: > > > >Synopsis:panic: Stopped at kqueue_scan > > > >Category:kernel i386 > > > >

"infinite" gethostbyname(3)

2019-04-29 Thread Martin Pieuchot
When my machine doesn't get a reply from a DNS server, generally the wifi router of the place where I'm drinking coffee, applications sit in gethostbyname(3) "indefinitely". At least too long for me to wait and I end up killing the app. $ cat /etc/resolv.conf # Generated by iwm0 dhclient

Re: panic: Stopped at kqueue_scan

2019-04-30 Thread Martin Pieuchot
On 30/04/19(Tue) 08:06, David Gwynne wrote: > > > > On 30 Apr 2019, at 03:24, Martin Pieuchot wrote: > > > > On 29/04/19(Mon) 17:24, David Gwynne wrote: > >> On Sun, Apr 28, 2019 at 06:57:02PM -0300, Martin Pieuchot wrote: > >>> On 23/04/19(Tue) 1

Re: "infinite" gethostbyname(3)

2019-04-30 Thread Martin Pieuchot
On 30/04/19(Tue) 15:41, Ted Unangst wrote: > Martin Pieuchot wrote: > > When my machine doesn't get a reply from a DNS server, generally the > > wifi router of the place where I'm drinking coffee, applications sit > > in gethostbyname(3) "indefinitely". A

Re: bridge - kernel: protection fault trap

2019-04-30 Thread Martin Pieuchot
On 30/04/19(Tue) 14:45, Hrvoje Popovski wrote: > Hi all, > > if i have bridge with rstp on interfaces and rstp on switch and i want > to disable rstp on openbsd interfaces i'm getting fault trap. I can > reproduce it on 6.4 and on -current. > i can't reproduce it if i don't have rstp on switch. S

Re: panic: kernel diagnostic assertion "_kernel_lock_held()" faled: file "/usr/src/sys/kern/kern_event.c", line 1076

2019-05-04 Thread Martin Pieuchot
On 02/05/19(Thu) 22:33, nay...@openbsd.org wrote: > >Synopsis:kernel crash while browsing with ~20 chrome tabs > >Category:kernel crash > >Environment: > System : OpenBSD 6.5 > Details : OpenBSD 6.5-current (GENERIC.MP) #0: Thu May 2 10:35:18 > MDT 2019 >

Re: bridge - kernel: protection fault trap

2019-05-04 Thread Martin Pieuchot
On 01/05/19(Wed) 09:34, Hrvoje Popovski wrote: > On 30.4.2019. 23:40, Martin Pieuchot wrote: > > On 30/04/19(Tue) 14:45, Hrvoje Popovski wrote: > >> Hi all, > >> > >> if i have bridge with rstp on interfaces and rstp on switch and i want > >> to disable

Re: bridge - panic: free: duplicated free

2019-05-07 Thread Martin Pieuchot
On 07/05/19(Tue) 11:46, Hrvoje Popovski wrote: > Hi all, > > for testing MP network diffs i'm using some intrusive commands in loop > while sending traffic through openbsd box, leave loop to run and after > some time something will happen or not :) > > few days ago by accident i run same loop twi

Re: 6.5 stable SMP (2 processor) Kernel panic when ifconfig is called and there are 2 bridges are running in two different Rdomains

2019-05-23 Thread Martin Pieuchot
On 23/05/19(Thu) 02:29, Tom Smyth wrote: > Hello > Kernel panic when ifconfig is called and 2 bridges are running in two > different rdomains Your trace shows that a context switch occurs while a thread is still holding a mutex. It isn't clear which thread that is. The trace shows that a userla

Re: Panic on 6.5-stable

2019-05-25 Thread Martin Pieuchot
On 25/05/19(Sat) 00:51, Lévai, Dániel wrote: > Hi everyone! > > I had this panic in the recent days. > Attached the ddb output (traces, ps) and the dmesg. > Unfortunately I wasn't around when it happened, I have no clue what went on > with the machine then. > Reading through the traces it has som

Re: Strange issue due to stale udp route cache

2019-05-25 Thread Martin Pieuchot
Hello Yannick, On 15/05/19(Wed) 15:50, Yannick Gravel wrote: > > Synopsis: Strange issue due to stale udp route cache > > Category: kernel > > Environment: > System : OpenBSD 6.5 > Details : OpenBSD 6.5 (GENERIC) #3: Sat Apr 13 14:42:43 MDT 2019 > >

Re: Panic on 6.5-stable

2019-06-09 Thread Martin Pieuchot
On 05/06/19(Wed) 21:40, Alexander Bluhm wrote: > On Sat, May 25, 2019 at 02:44:02PM -0300, Martin Pieuchot wrote: > > It looks like a stack exhaustion. > > Having a non-recursive art_table_walk() might be a solution. > > We see a similar crash on OpenBSD 6.5. Disab

Re: Panic on 6.5-stable

2019-06-11 Thread Martin Pieuchot
On 09/06/19(Sun) 15:41, Martin Pieuchot wrote: > [...] > Another way to prevent stack exhaustion would be to return a reference > to any `rt' that needs to be deleted instead of deleting it in place. Diff below does that by adding a new `prt' argument to rtable_walk().

Re: ifconfig bridge command crashes the machine

2019-06-11 Thread Martin Pieuchot
On 05/06/19(Wed) 13:09, Russell Sutherland wrote: > >Synopsis: machine crashes when issuing the ifconfig bridge0 command > >Category: kernel > >Environment: > System : OpenBSD 6.5 > Details : OpenBSD 6.5-current (GENERIC.MP) #5: Mon Jun 3 > 07:46:49 MDT 2019 >

Re: sparc64: bridge: mtx 0x400191ba5e0: locking against myself

2019-07-08 Thread Martin Pieuchot
On 06/07/19(Sat) 20:17, Klemens Nanni wrote: > Running latest snapshot on a T5240. > > The machine paniced while removing interfaces from protected domains. > Here is the console log showing both the bridge's configuration as well > as the commands used: There's a mtx_leave() missing in bridge_rt

Re: vfs lookup crash during unmount

2019-09-09 Thread Martin Pieuchot
On 05/09/19(Thu) 23:54, Alexander Bluhm wrote: > Hi, > > When doing a forced unmount on a stressed file system, it may crash > here. > > uvm_fault(0xfd85827449a0, 0x50, 0, 1) -> e > kernel: page fault trap, code=0 > Stopped at vfs_lookup+0x5d8: testb $0x1,0x50(%rcx) > ddb{2}> bt

Re: filt_bpfrdetach uvm_fault after vmd vm was shutdown

2019-10-21 Thread Martin Pieuchot
On 19/10/19(Sat) 10:30, Anton Lindqvist wrote: > On Wed, Jul 31, 2019 at 06:11:33PM +0100, Stuart Henderson wrote: > > July 29 amd64 snap. I had just tested something in a vm (not very > > common for me) and did "halt -p" in the guest. Immediately afterwards > > I hit this: > > > > uvm_fault(0xfff

Re: filt_bpfrdetach uvm_fault after vmd vm was shutdown

2019-10-21 Thread Martin Pieuchot
On 21/10/19(Mon) 13:28, Alexandr Nedvedicky wrote: > Hello, > > > > > > > The vnode is not locked in this path > > > either so it won't end up waiting on the ongoing vclean(). > > > > That leads to an interesting question: should we serialize device access >

Re: 6.6 on Hetzner: arpresolve: 172.31.1.1: route contains no arp information

2019-10-24 Thread Martin Pieuchot
On 23/10/19(Wed) 17:55, Lauri Tirkkonen wrote: > [...] > I did some bisecting today: using the github src mirror, I built > bsd.rd's on a fresh 6.5 install (which I had to manually bootstrap > libc.so.95.1 on to build all the necessary revisions), booted them on > a Hetzner VM, and checked whether

i386 pmap_apte_flush() fault in vmm(4)

2019-11-03 Thread Martin Pieuchot
When halting/rebooting an i386 VM on an amd64 host (dmesg attached), the following fault is triggered. The same happens with a self-built -current kernel w/o any suspect UVM diff :o) kernel: protection fault trap, code=0 Stopped at pmap_apte_flush+0x6: movl %eax,%cr3 ddb> tr pmap_apte_f

Re: resume failures/lockups

2023-09-02 Thread Martin Pieuchot
Hello Ross, On 27/08/23(Sun) 15:16, Ross L Richardson wrote: > For the past several weeks (using -current), I've had problems with > resume on an amd64 desktop. It's intermittent (but if anything > becoming increasingly frequent). If you can still reproduce the issue, please try enabling WITNESS

Re: Sparc64 rthreads Instablilty

2023-09-02 Thread Martin Pieuchot
On 13/08/23(Sun) 22:59, Kurt Miller wrote: > I’ve been hunting an intermittent jdk crash on sparc64 for some time now. > Since egdb has not been up to the task, I created a small c program which > reproduces the problem. This partially mimics the jdk startup where a number > of detached threads are

Re: Sparc64 livelock/system freeze w/cpu traces

2023-09-02 Thread Martin Pieuchot
On 28/06/23(Wed) 20:07, Kurt Miller wrote: > On Jun 28, 2023, at 7:16 AM, Martin Pieuchot wrote: > > > > On 28/06/23(Wed) 08:58, Claudio Jeker wrote: > >> > >> I doubt this is a missing wakeup. It is more the system is thrashing and > >> not making p

Re: Sparc64 rthreads Instablilty

2024-02-16 Thread Martin Pieuchot
On 15/02/24(Thu) 20:06, Kurt Miller wrote: > On Feb 15, 2024, at 3:01 PM, Miod Vallat wrote: > > > >> Has been running for the last few hours without any issue. > >> OK claudio@ on that diff. > > > > But it's your diff! I only polished it a bit. > > > > I have also been testing various version

Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-02-20 Thread Martin Pieuchot
On 28/10/21(Thu) 05:45, Visa Hankala wrote: > On Wed, Oct 27, 2021 at 09:02:08PM -0400, Dave Voutila wrote: > > > > Dave Voutila writes: > > > > > Was tinkering on a bt(5) script for trying to debug an issue in vmm(4) > > > when I managed to start hitting a panic "wakeup: p_stat is 2" being > >

Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-02-22 Thread Martin Pieuchot
On 21/02/24(Wed) 13:05, Claudio Jeker wrote: > On Tue, Feb 20, 2024 at 09:34:12PM +0100, Martin Pieuchot wrote: > > On 28/10/21(Thu) 05:45, Visa Hankala wrote: > > > On Wed, Oct 27, 2021 at 09:02:08PM -0400, Dave Voutila wrote: > > > > Dave Voutila writes: > >

Re: panic: kernel diagnostic assertion "p->p_wchan == NULL" failed

2024-02-28 Thread Martin Pieuchot
On 28/02/24(Wed) 12:36, Claudio Jeker wrote: > On Wed, Feb 28, 2024 at 12:26:43PM +0100, Marko Cupać wrote: > > Hi, > > > > thank you for looking into it, and for the advice. > > > > On Wed, 28 Feb 2024 10:13:06 + > > Stuart Henderson wrote: > > > > > Please try to re-type at least the most

Re: panic: kernel diagnostic assertion "p->p_wchan == NULL" failed

2024-02-28 Thread Martin Pieuchot
On 28/02/24(Wed) 16:39, Vitaliy Makkoveev wrote: > On Wed, Feb 28, 2024 at 02:22:31PM +0100, Mark Kettenis wrote: > > > Date: Wed, 28 Feb 2024 16:16:09 +0300 > > > From: Vitaliy Makkoveev > > > > > > On Wed, Feb 28, 2024 at 12:36:26PM +0100, Claudio Jeker wrote: > > > > On Wed, Feb 28, 2024 at 12

Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-03-24 Thread Martin Pieuchot
On 22/02/24(Thu) 17:24, Claudio Jeker wrote: > On Thu, Feb 22, 2024 at 04:16:57PM +0100, Martin Pieuchot wrote: > > On 21/02/24(Wed) 13:05, Claudio Jeker wrote: > > > On Tue, Feb 20, 2024 at 09:34:12PM +0100, Martin Pieuchot wrote: > > > > On 28/10/21(Thu) 05:45, Vi

Re: protection fault in amap_wipeout

2024-03-30 Thread Martin Pieuchot
Hello Alexander, Thanks for the report. On 01/03/24(Fri) 16:39, Alexander Bluhm wrote: > Hi, > > An OpenBSD 7.4 machine on KVM running postgress and pagedaemon > crashed in amap_wipeout(). > > bluhm > > kernel: protection fault trap, code=0 > Stopped at amap_wipeout+0x76: movq%rc

Re: protection fault in amap_wipeout

2024-04-13 Thread Martin Pieuchot
On 30/03/24(Sat) 18:38, Martin Pieuchot wrote: > Hello Alexander, > > Thanks for the report. > > On 01/03/24(Fri) 16:39, Alexander Bluhm wrote: > > Hi, > > > > An OpenBSD 7.4 machine on KVM running postgress and pagedaemon > > crashed in amap_wi

Re: lock order reversal in soreceive and NFS

2024-04-23 Thread Martin Pieuchot
On 22/04/24(Mon) 16:18, Mark Kettenis wrote: > > Date: Mon, 22 Apr 2024 15:39:55 +0200 > > From: Alexander Bluhm > > > > Hi, > > > > I see a witness lock order reversal warning with soreceive. It > > happens during NFS regress tests. In /var/log/messages is more > > context from regress. > >

Re: lock order reversal in soreceive and NFS

2024-04-30 Thread Martin Pieuchot
On 27/04/24(Sat) 13:44, Visa Hankala wrote: > On Tue, Apr 23, 2024 at 02:48:32PM +0200, Martin Pieuchot wrote: > > [...] > > I agree. Now I'd be very grateful if someone could dig into WITNESS to > > figure out why we see such reports. Are these false positive or are we

Re: macppc panic: vref used where vget required

2022-09-01 Thread Martin Pieuchot
On 29/07/22(Fri) 14:22, Theo Buehler wrote: > On Mon, Jul 11, 2022 at 01:05:19PM +0200, Martin Pieuchot wrote: > > On 11/07/22(Mon) 07:50, Theo Buehler wrote: > > > On Fri, Jun 03, 2022 at 03:02:36PM +0200, Theo Buehler wrote: > > > > > Please do note that this

Re: macppc panic: vref used where vget required

2022-09-09 Thread Martin Pieuchot
On 09/09/22(Fri) 12:25, Theo Buehler wrote: > > Yesterday gnezdo@ fixed a race in uvn_attach() that lead to the same > > assert. Here's an rebased diff for the bug discussed in this thread, > > could you try again and let us know? Thanks! > > This seems to be stable now. It's been running for ne

Swap on sdhc(4) and dwmmc(4) is broken

2022-09-10 Thread Martin Pieuchot
On the rockpro64 as well as on the rpi4, if too much swapping occurs, biowait() returns an error (B_ERROR). In both cases it seems to come from sdmmc_complete_xs(). I see the following: sdmmc_complete_xs: write error = 35 sdmmc_complete_xs: read error = 35 c++: B_ERROR after biowait() c++: error 4 f

arm64 (rockpro64) regression

2022-09-18 Thread Martin Pieuchot
The rockpro64 no longer boots in multi-user on -current. It hangs after displaying the following lines: rkiis0 at mainbus0 rkiis1 at mainbus0 The 8/09 snapshot works, the next one from 11/09 doesn't. bsd.rd still boots. Dmesg below. OpenBSD 7.2-beta (GENERIC.MP) #1815: Thu Sep 8 13:20:08 MDT

bse(4) media/link bug

2022-11-07 Thread Martin Pieuchot
On a raspberry pi4, with the following configuration : $ cat /etc/hostname.bse0 dhcp ...and with the cable directly connected to my laptop (amd64 w/ em(4)) I have to force the media type, with the command below, to make it work. # ifconfig bse0 media

Re: bse(4) media/link bug

2022-11-07 Thread Martin Pieuchot
On 07/11/22(Mon) 13:20, Martin Pieuchot wrote: > On a raspberry pi4, with the following configuration : > > $ cat /etc/hostname.bse0 > dhcp > > ...and with the cable directly connected to my laptop (amd64 w/ em(4)) I > have to

Re: macppc panic: vref used where vget required

2022-11-09 Thread Martin Pieuchot
On 09/09/22(Fri) 14:41, Martin Pieuchot wrote: > On 09/09/22(Fri) 12:25, Theo Buehler wrote: > > > Yesterday gnezdo@ fixed a race in uvn_attach() that lead to the same > > > assert. Here's an rebased diff for the bug discussed in this thread, > > > could you

Re: bbolt can freeze 7.2 from userspace

2022-12-18 Thread Martin Pieuchot
On 17/12/22(Sat) 14:15, David Hill wrote: > > > On 10/28/22 03:46, Renato Aguiar wrote: > > Use of bbolt Go library causes 7.2 to freeze. I suspect it is triggering > > some > > sort of deadlock in mmap because threads get stuck at vmmaplk. > > > > I managed to reproduce it consistently in a la

Re: bbolt can freeze 7.2 from userspace

2022-12-21 Thread Martin Pieuchot
On 18/12/22(Sun) 20:55, Martin Pieuchot wrote: > On 17/12/22(Sat) 14:15, David Hill wrote: > > > > > > On 10/28/22 03:46, Renato Aguiar wrote: > > > Use of bbolt Go library causes 7.2 to freeze. I suspect it is triggering > > > some > > > sort

Re: bbolt can freeze 7.2 from userspace

2022-12-21 Thread Martin Pieuchot
On 21/12/22(Wed) 09:20, David Hill wrote: > > > On 12/21/22 07:08, David Hill wrote: > > > > > > On 12/21/22 05:33, Martin Pieuchot wrote: > > > On 18/12/22(Sun) 20:55, Martin Pieuchot wrote: > > > > On 17/12/22(Sat) 14:15, David Hill wrote:

Re: bbolt can freeze 7.2 from userspace

2023-01-20 Thread Martin Pieuchot
Hello David, On 21/12/22(Wed) 11:37, David Hill wrote: > On 12/21/22 11:23, Martin Pieuchot wrote: > > On 21/12/22(Wed) 09:20, David Hill wrote: > > > On 12/21/22 07:08, David Hill wrote: > > > > On 12/21/22 05:33, Martin Pieuchot wrote: > > > > >

Re: bbolt can freeze 7.2 from userspace

2023-01-29 Thread Martin Pieuchot
On 23/01/23(Mon) 22:57, David Hill wrote: > On 1/20/23 09:02, Martin Pieuchot wrote: > > > [...] > > > Ran it 20 times and all completed and passed. I was also able to > > > interrupt > > > it as well. no issues. > > > > > > Excellen

Re: bbolt can freeze 7.2 from userspace

2023-01-29 Thread Martin Pieuchot
On 29/01/23(Sun) 14:36, Mark Kettenis wrote: > > Date: Sun, 29 Jan 2023 12:31:22 +0100 > > From: Martin Pieuchot > > > > On 23/01/23(Mon) 22:57, David Hill wrote: > > > On 1/20/23 09:02, Martin Pieuchot wrote: > > > > > [...] > > > >

Re: bbolt can freeze 7.2 from userspace

2023-02-18 Thread Martin Pieuchot
On 24/01/23(Tue) 04:40, Renato Aguiar wrote: > Hi Martin, > > "David Hill" writes: > > > > > Yes, same result as before. This patch does not seem to help. > > > > I could also reproduce it with patched 'current' :( Here's another possible fix I came up with. The idea is to deliberately allow

Re: Repeated crashes with OpenBSD 7.2 on Raspberry Pi 4 (arm64)

2023-02-19 Thread Martin Pieuchot
Hello Tomas, Thanks for the report. I'm setting up an arm64 machine to try to reproduce the crash. Could you tell me what are the steps required to run the reproducer you quoted below? I read the buildfarm wiki page and I'm not interested in running a periodic cron job... I cloned the git repo

Re: bbolt can freeze 7.2 from userspace

2023-02-20 Thread Martin Pieuchot
On 20/02/23(Mon) 03:59, Renato Aguiar wrote: > [...] > I can't reproduce it anymore with this patch on 7.2-stable :) Thanks a lot for testing! Here's a better fix from Chuck Silvers. That's what I believe we should commit. The idea is to prevent siblings from modifying the vm_map by marking it a

Re: Repeated crashes with OpenBSD 7.2 on Raspberry Pi 4 (arm64)

2023-02-20 Thread Martin Pieuchot
Hello Tomas, On 19/02/23(Sun) 23:43, Tomas Vondra wrote: > [...] > I think it's probably easier to just try PostgreSQL build and tests > directly, without the buildfarm tooling. Ultimately that's what the > buildfarm tooling is doing, except that it tests multiple branches. > > I'd try cloning e

Re: Sparc64 livelock/system freeze w/cpu traces

2023-05-12 Thread Martin Pieuchot
On 09/05/23(Tue) 20:02, Kurt Miller wrote: > While building devel/jdk/1.8 on May 3rd snapshot I noticed the build freezing > and processes getting stuck like ps. After enabling ddb.console I was able to > reproduce the livelock and capture cpu traces. Dmesg at the end. > Let me know if more informa

Re: Sparc64 livelock/system freeze w/cpu traces

2023-05-30 Thread Martin Pieuchot
On 25/05/23(Thu) 16:33, Kurt Miller wrote: > On May 22, 2023, at 2:27 AM, Claudio Jeker wrote: > > I have seen these WITNESS warnings on other systems as well. I doubt this > > is the problem. IIRC this warning is because sys_mount() is doing it wrong > > but it is not really an issue since sys_mo

Re: Sparc64 livelock/system freeze w/cpu traces

2023-06-28 Thread Martin Pieuchot
On 28/06/23(Wed) 08:58, Claudio Jeker wrote: > On Tue, Jun 27, 2023 at 08:18:15PM -0400, Kurt Miller wrote: > > On Jun 27, 2023, at 1:52 PM, Kurt Miller wrote: > > > > > > On Jun 14, 2023, at 12:51 PM, Vitaliy Makkoveev wrote: > > >> > > >

Re: panic: rw_enter: vmmaplk locking agaist myself

2023-06-29 Thread Martin Pieuchot
On 28/06/23(Wed) 15:47, Moritz Buhl wrote: > Dear bugs@, > > with the following snapshot I had two panics on my x270 recently. This is a bug in iwm(4) suggesting a missing SPL protection. > sysctl kern.version > kern.version=OpenBSD 7.3-current (GENERIC.MP) #1256: Thu Jun 22 10:53:02 MDT > 2023

Re: panic: rw_enter: vmmaplk locking agaist myself

2023-06-29 Thread Martin Pieuchot
On 29/06/23(Thu) 11:17, Stefan Sperling wrote: > On Thu, Jun 29, 2023 at 10:59:32AM +0200, Martin Pieuchot wrote: > > On 28/06/23(Wed) 15:47, Moritz Buhl wrote: > > > Dear bugs@, > > > > > > with the following snapshot I had two panics on my x270 recen

Re: WireGuard(?) issues

2024-05-20 Thread Martin Pieuchot
On 19/05/24(Sun) 23:50, Vitaliy Makkoveev wrote: > > > > On 19 May 2024, at 22:05, Anthony J. Bentley wrote: > > > > Vitaliy Makkoveev writes: > >>> On 17 May 2024, at 12:06, Stuart Henderson = > >> wrote: > >>> =20 > >>> There are problems with wg(4) that people with some workloads have = > >

Re: powerpc64/pmap.c trouble report

2024-05-31 Thread Martin Pieuchot
On 30/05/24(Thu) 13:11, Eric Grosse wrote: > And, fairly quickly, another one. The load depends on what's in the Go > team build queue, which is not under my control.To avoid further > spamming the list I won't report any more of these until I can get > something reproducible under my control. Of c

arc4random lock order issue

2024-06-03 Thread Martin Pieuchot
Now that the SCHED_LOCK() is a mutex I see the following WITNESS report on arm64. witness: lock order reversal: 1st 0xff80012486e8 /usr/src/sys/dev/rnd.c:321 (/usr/src/sys/dev/rnd.c:321) 2nd 0xff800120afb0 /usr/src/sys/kern/kern_timeout.c:57 (/usr/src/sys/kern/kern_timeout.c:57) lock o

Re: panic: pool_do_get: mcl2k free list modified

2024-06-17 Thread Martin Pieuchot
On 16/06/24(Sun) 20:37, Daniel Jakots wrote: > On Sat, 15 Jun 2024 18:56:14 +0200, Jan Klemkow wrote: > > > Does ist also happend, if you disable LRO? > > > > try: > > > > ifconfig vio0 -tcplro > > Thanks for the cue, it doesn't happen indeed. This is a/the wg(4) race.

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-19 Thread Martin Pieuchot
On 18/06/24(Tue) 23:34, Dana Koch wrote: > >Synopsis: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels > >Category: kernel > >Environment: > System : OpenBSD 7.5 > Details : OpenBSD 7.5-current (GENERIC.MP) #69: Wed Jun 12 04:43:28 MDT > 2024 > dera...@arm64.openbsd.org:

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-20 Thread Martin Pieuchot
Hello Dana, Thanks again for your report. On 19/06/24(Wed) 09:37, Dana Koch wrote: > On Wed, Jun 19, 2024 at 6:58 AM Martin Pieuchot wrote: > > This is a lock order reversal reported by WITNESS. Thankfully claudio@ > > already committed a fix for this on the 16th. So please,

gdb broken on arm64/MT

2024-06-21 Thread Martin Pieuchot
So I'm trying to see where the remaining sched_yield() are coming from ld(1): $ cd /sys/arch/arm64/compile/GENERIC.MP $ LD="egdb --args ld" make -j32 Then I add a breakpoint on sched_yield & hit run. As soon as the first thread is stopped, I can see the trace as usual, however the process is now

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-23 Thread Martin Pieuchot
Hello Dana, On 20/06/24(Thu) 17:16, Dana Koch wrote: > On Thu, Jun 20, 2024 at 3:33 PM Martin Pieuchot wrote: > > > > Hello Dana, > > > > Thanks again for your report. > > > > On 19/06/24(Wed) 09:37, Dana Koch wrote: > > > On Wed, Jun 19, 2024 at 6

Re: Mac Studio hangs; locking problems on WITNESS/MP_LOCKDEBUG kernels

2024-06-25 Thread Martin Pieuchot
On 24/06/24(Mon) 22:32, Dana Koch wrote: > Dana Koch schrieb am So., 23. Juni 2024, 19:50: > > > > Could you try the diff below? Stuart confirmed it prevents the hang on > > > his machine. > > > > This also seems to be working well for me so far. > > > > Okay, I've got an actual panic now, with

Re: kernel diagnostic assertion "p->p_kq->kq_refcnt.r_refs == 1" failed

2024-06-27 Thread Martin Pieuchot
Thanks, On 27/06/24(Thu) 11:24, kir...@korins.ky wrote: > [...] > > panic: kernel diagnostic assertion "p->p_kq->kq_refcnt.r_refs == 1" failed: > file "/usr/src/sys/kern/kern_event.c", line 894 > Stopped at db_enter+0x14: movq%rbp > TID PID UID PRFLAGSPFLAGS

Re: gdb broken on arm64/MT

2024-07-25 Thread Martin Pieuchot
Thanks a lot for figuring that out. This is awesome! On 24/07/24(Wed) 16:19, Claudio Jeker wrote: > On Fri, Jun 21, 2024 at 01:24:27PM +0200, Martin Pieuchot wrote: > > So I'm trying to see where the remaining sched_yield() are coming from > > ld(1): > > >

Re: gdb broken on arm64/MT

2024-07-25 Thread Martin Pieuchot
On 25/07/24(Thu) 14:51, Claudio Jeker wrote: > On Thu, Jul 25, 2024 at 11:09:44AM +0200, Martin Pieuchot wrote: > [...] > > > Index: kern/kern_synch.c > > > === > > > RCS file: /cvs/src/sys/ker

Re: gdb broken on arm64/MT

2024-07-25 Thread Martin Pieuchot
On 25/07/24(Thu) 17:33, Claudio Jeker wrote: > On Thu, Jul 25, 2024 at 05:15:32PM +0200, Martin Pieuchot wrote: > > On 25/07/24(Thu) 14:51, Claudio Jeker wrote: > > > On Thu, Jul 25, 2024 at 11:09:44AM +0200, Martin Pieuchot wrote: > > > [...] >

Re: panic - ffs_write - AMD64/7.4 to current

2024-08-24 Thread Martin Pieuchot
Hugh, If you can reproduce this easily, please send a new panic with the outputs of: - show uvm - show bcstats - And the traces of all running processes... In the two reports below we only have the trace of pax(1) which is running on CPU2. The two panics are due to corruptions of two different g

Re: panic - mutex? - AMD64/7.4 to current

2024-08-24 Thread Martin Pieuchot
6:leave > x86_ipi_db(8000489eaff0) at x86_ipi_db+0x16 > x86_ipi_handler() at x86_ipi_handler+0x80 > Xresume_lapic_ipi() at Xresume_lapic_ipi+0x27 > acpicpu_idle() at acpicpu_idle+0x11f > sched_idle(8000489eaff0) at sched_idle+0x282 > end trace frame: 0x0, coun

Re: panic - mutex? - AMD64/7.4 to current

2024-08-25 Thread Martin Pieuchot
On 24/08/24(Sat) 13:14, Hugh Graham wrote: > On Sat, Aug 24, 2024 at 09:31:43PM +0200, Martin Pieuchot wrote: > > On 24/08/24(Sat) 12:09, Hugh Graham wrote: > > > The machine that slowly received the ports tree crashed upon reboot, > > > and I have included the traces. I

Re: gdb broken on arm64/MT

2024-09-06 Thread Martin Pieuchot
On 26/07/24(Fri) 08:36, Claudio Jeker wrote: > On Thu, Jul 25, 2024 at 08:20:32PM +0200, Martin Pieuchot wrote: > > On 25/07/24(Thu) 17:33, Claudio Jeker wrote: > > > On Thu, Jul 25, 2024 at 05:15:32PM +0200, Martin Pieuchot wrote: > > > > On 25/07/24(Thu) 14:51, Cla

Re: wg(4) crash

2021-03-20 Thread Martin Pieuchot
On 19/03/21(Fri) 20:15, Stuart Henderson wrote: > Not a great report but I don't have much more to go on, machine had > ddb.panic=0 and ddb hanged while printing the stack trace. Retyped by > hand, may contain typos. Happened a few hours after setting up wg on it. > > uvm_fault(0x82204e38,

firefox vs jitsi: stack exhaustion?

2021-04-08 Thread Martin Pieuchot
firefox often crashes when somebody else connects to the jitsi I'm in. The trace looks like a stack exhaustion, see below. Does this ring a bell? 0 0x in ?? () #1 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) () from /usr/local/lib/firefox/libxul.so.101.0 #2

Re: kernel panic when invoking ddb from another tty than ttyC0

2021-04-16 Thread Martin Pieuchot
On 15/04/21(Thu) 22:35, Jérôme FRGACIC wrote: > >Synopsis:kernel panic when invoking ddb from another tty than ttyC0 > >Category:kernel > >Environment: > System : OpenBSD 6.8 > Details : OpenBSD 6.8 (GENERIC) #97: Sun Oct 4 18:00:46 MDT 2020 > >

Re: i386 panic: pmap_pinit_pd_pae: can't locate PD page

2021-04-21 Thread Martin Pieuchot
On 18/04/21(Sun) 15:22, Alexander Bluhm wrote: > Hi, > > Tonight one of my i386 regress tests machines crashed. It happend > before the tests started when some scripts were copied with scp > onto the machine. This is done once a day for years, I have never > seen this panic before. Note that it

Re: kernel panic when invoking ddb from another tty than ttyC0

2021-04-21 Thread Martin Pieuchot
On 16/04/21(Fri) 16:50, Jérôme Frgacic wrote: > Thanks for your answer. :) > > > Could you set "sysctl kern.splassert=2" in order to get a useful stacktrace > > for this issue? This is probably where some attention is required. > > Sure, here is the new output I get. > > splassert: assertwaito

Re: i386 pagedaemon panic pg->wire_count == 0

2021-04-29 Thread Martin Pieuchot
On 29/04/21(Thu) 12:07, Alexander Bluhm wrote: > On Thu, Apr 29, 2021 at 11:08:30AM +0200, Mark Kettenis wrote: > > > > panic: kernel diagnostic assertion "pg->wire_count == 0" failed: file > > > > "/usr/src/sys/uvm/uvm_page.c", line 1265 > > > > I suspect pmapae.c rev 1.61 causes this issue. Do

Re: i386 pagedaemon panic pg->wire_count == 0

2021-04-29 Thread Martin Pieuchot
On 29/04/21(Thu) 16:59, Alexander Bluhm wrote: > On Thu, Apr 29, 2021 at 04:17:05PM +0200, Martin Pieuchot wrote: > > On 29/04/21(Thu) 12:07, Alexander Bluhm wrote: > > > On Thu, Apr 29, 2021 at 11:08:30AM +0200, Mark Kettenis wrote: > > > > > > panic: kernel

X1 carbon gen2 & flickering screen in X11

2021-06-18 Thread Martin Pieuchot
Default i386 install on an X1 carbon gen2, dmesg below, results in a flickering screen in X11. The experience is comparable to a high refresh on an old CRT screen and makes X11 unusable. Default Xorg.0.log attached. I tried using the "intel" driver instead of the "modesetting" by using a custom

Re: kernel: page fault trap in rw_status

2021-08-05 Thread Martin Pieuchot
Hello Thomas, Thanks a lot for your great bug report, see below for an explanation and a possible fix. On 04/08/21(Wed) 12:18, Thomas L. wrote: > >Synopsis: page fault trap in rw_status > >Category: kernel > >Environment: > System : OpenBSD 6.9 > Details : OpenB
