Re: lock order reversal in soreceive and NFS

2024-04-30 Thread Martin Pieuchot
On 27/04/24(Sat) 13:44, Visa Hankala wrote:
> On Tue, Apr 23, 2024 at 02:48:32PM +0200, Martin Pieuchot wrote:
> > [...]
> > I agree.  Now I'd be very grateful if someone could dig into WITNESS to
> figure out why we see such reports.  Are these false positives or are we
> missing data from the code paths that we think are incorrect?
> 
> WITNESS currently cannot show lock cycles longer than two locks.
> To fix this, WITNESS needs to do a path search in the lock order graph.

Lovely!

> However, there is also something else wrong in WITNESS, possibly
> related to situations where the kernel lock comes between two rwlocks
> in the lock order. I still need to study this more.

I greatly appreciate your dedication in this area.

> Below is a patch that adds the cycle search and printing. The patch
> also tweaks a few prints to show more context.

This is ok mpi@

> With the patch, the nfsnode-vmmaplk reversal looks like this:

So the issue here is due to NFS entering the network stack after the
VFS.  Alexander, Vitaliy, are we far from a NET_LOCK()-free sosend()?
Is that something we should consider?

On the other hand, would it make sense to have a NET_LOCK()-free
sysctl path?

> witness: lock order reversal:
>  1st 0xfd8126deacf8 vmmaplk (&map->lock)
>  2nd 0x800039831948 nfsnode (&np->n_lock)
> lock order [1] vmmaplk (&map->lock) -> [2] nfsnode (&np->n_lock)
> #0  rw_enter+0x6d
> #1  rrw_enter+0x5e
> #2  VOP_LOCK+0x5f
> #3  vn_lock+0xbc
> #4  vn_rdwr+0x83
> #5  vndstrategy+0x2ca
> #6  physio+0x204
> #7  spec_write+0x9e
> #8  VOP_WRITE+0x6e
> #9  vn_write+0x100
> #10 dofilewritev+0x143
> #11 sys_pwrite+0x60
> #12 syscall+0x588
> #13 Xsyscall+0x128
> lock order [2] nfsnode (&np->n_lock) -> [3] netlock (netlock)
> #0  rw_enter_read+0x50
> #1  solock_shared+0x3a
> #2  sosend+0x10c
> #3  nfs_send+0x8d
> #4  nfs_request+0x258
> #5  nfs_getattr+0xcb
> #6  VOP_GETATTR+0x55
> #7  mountnfs+0x37c
> #8  nfs_mount+0x125
> #9  sys_mount+0x343
> #10 syscall+0x561
> #11 Xsyscall+0x128
> lock order [3] netlock (netlock) -> [1] vmmaplk (&map->lock)
> #0  rw_enter_read+0x50
> #1  uvmfault_lookup+0x8a
> #2  uvm_fault_check+0x36
> #3  uvm_fault+0xfb
> #4  kpageflttrap+0x158
> #5  kerntrap+0x94
> #6  alltraps_kern_meltdown+0x7b
> #7  _copyin+0x62
> #8  sysctl_bounded_arr+0x83
> #9  tcp_sysctl+0x546
> #10 sys_sysctl+0x17b
> #11 syscall+0x561
> #12 Xsyscall+0x128
> 
> 
> Index: kern/subr_witness.c
> ===================================================================
> RCS file: src/sys/kern/subr_witness.c,v
> retrieving revision 1.50
> diff -u -p -r1.50 subr_witness.c
> --- kern/subr_witness.c   30 May 2023 08:30:01 -  1.50
> +++ kern/subr_witness.c   27 Apr 2024 13:08:43 -
> @@ -369,6 +369,13 @@ static struct witness_lock_order_data*w
>   struct witness *child);
>  static void  witness_list_lock(struct lock_instance *instance,
>   int (*prnt)(const char *fmt, ...));
> +static void  witness_print_cycle(int(*prnt)(const char *fmt, ...),
> + struct witness *parent, struct witness *child);
> +static void  witness_print_cycle_edge(int(*prnt)(const char *fmt, ...),
> + struct witness *parent, struct witness *child,
> + int step, int last);
> +static int   witness_search(struct witness *w, struct witness *target,
> + struct witness **path, int depth, int *remaining);
>  static void  witness_setflag(struct lock_object *lock, int flag, int set);
>  
>  /*
> @@ -652,8 +659,9 @@ witness_ddb_display_descendants(int(*prn
>  
>   for (i = 0; i < indent; i++)
>   prnt(" ");
> - prnt("%s (type: %s, depth: %d)",
> -  w->w_type->lt_name, w->w_class->lc_name, w->w_ddb_level);
> + prnt("%s (%s) (type: %s, depth: %d)",
> + w->w_subtype, w->w_type->lt_name,
> + w->w_class->lc_name, w->w_ddb_level);
>   if (w->w_displayed) {
>   prnt(" -- (already displayed)\n");
>   return;
> @@ -719,7 +727,8 @@ witness_ddb_display(int(*prnt)(const cha
>   SLIST_FOREACH(w, &w_all, w_list) {
>   if (w->w_acquired)
>   continue;
> - prnt("%s (type: %s, depth: %d)\n", w->w_type->lt_name,
> + prnt("%s (%s) (type: %s, depth: %d)\n",
> + w->w_subtype, w->w_type->lt_name,
>   w->w_class->lc_name, w->w_ddb_level);
>   }
>  }
> @@ -1066,47 +1075,8 @@ witness_checkorder(struct lock_object *l
>

Re: lock order reversal in soreceive and NFS

2024-04-23 Thread Martin Pieuchot
On 22/04/24(Mon) 16:18, Mark Kettenis wrote:
> > Date: Mon, 22 Apr 2024 15:39:55 +0200
> > From: Alexander Bluhm 
> > 
> > Hi,
> > 
> > I see a witness lock order reversal warning with soreceive.  It
> > happens during NFS regress tests.  In /var/log/messages is more
> > context from regress.
> > 
> > Apr 22 03:18:08 ot29 /bsd: uid 0 on 
> > /mnt/regress-ffs/fstest_49fd035b8230791792326afb0604868b: out of inodes
> > Apr 22 03:18:21 ot29 mountd[6781]: Bad exports list line 
> > /mnt/regress-nfs-server
> > Apr 22 03:19:08 ot29 /bsd: witness: lock order reversal:
> > Apr 22 03:19:08 ot29 /bsd:  1st 0xfd85c8ae12a8 vmmaplk (&map->lock)
> > Apr 22 03:19:08 ot29 /bsd:  2nd 0x80004c488c78 nfsnode (&np->n_lock)
> > Apr 22 03:19:08 ot29 /bsd: lock order data w2 -> w1 missing
> > Apr 22 03:19:08 ot29 /bsd: lock order "&map->lock"(rwlock) -> 
> > "&np->n_lock"(rrwlock) first seen at:
> > Apr 22 03:19:08 ot29 /bsd: #0  rw_enter+0x6d
> > Apr 22 03:19:08 ot29 /bsd: #1  rrw_enter+0x5e
> > Apr 22 03:19:08 ot29 /bsd: #2  VOP_LOCK+0x5f
> > Apr 22 03:19:08 ot29 /bsd: #3  vn_lock+0xbc
> > Apr 22 03:19:08 ot29 /bsd: #4  vn_rdwr+0x83
> > Apr 22 03:19:08 ot29 /bsd: #5  vndstrategy+0x2ca
> > Apr 22 03:19:08 ot29 /bsd: #6  physio+0x204
> > Apr 22 03:19:08 ot29 /bsd: #7  spec_write+0x9e
> > Apr 22 03:19:08 ot29 /bsd: #8  VOP_WRITE+0x45
> > Apr 22 03:19:08 ot29 /bsd: #9  vn_write+0x100
> > Apr 22 03:19:08 ot29 /bsd: #10 dofilewritev+0x14e
> > Apr 22 03:19:08 ot29 /bsd: #11 sys_pwrite+0x60
> > Apr 22 03:19:08 ot29 /bsd: #12 syscall+0x588
> > Apr 22 03:19:08 ot29 /bsd: #13 Xsyscall+0x128
> 
> You're not talking about this one isn't it?

This also seems to be in the correct order.  vmmaplk before FS lock.
That's the order of physio(9) and uvm_fault().

> > Apr 22 03:19:08 ot29 /bsd: witness: lock order reversal:
> > Apr 22 03:19:08 ot29 /bsd:  1st 0xfd85c8ae12a8 vmmaplk (&map->lock)
> > Apr 22 03:19:08 ot29 /bsd:  2nd 0x80002ec41860 sbufrcv 
> > (&so->so_rcv.sb_lock)
> > Apr 22 03:19:08 ot29 /bsd: lock order "&so->so_rcv.sb_lock"(rwlock) -> 
> > "&map->lock"(rwlock) first seen at:
> > Apr 22 03:19:08 ot29 /bsd: #0  rw_enter_read+0x50
> > Apr 22 03:19:08 ot29 /bsd: #1  uvmfault_lookup+0x8a
> > Apr 22 03:19:08 ot29 /bsd: #2  uvm_fault_check+0x36
> > Apr 22 03:19:08 ot29 /bsd: #3  uvm_fault+0xfb
> > Apr 22 03:19:08 ot29 /bsd: #4  kpageflttrap+0x158
> > Apr 22 03:19:08 ot29 /bsd: #5  kerntrap+0x94
> > Apr 22 03:19:08 ot29 /bsd: #6  alltraps_kern_meltdown+0x7b
> > Apr 22 03:19:08 ot29 /bsd: #7  copyout+0x57
> > Apr 22 03:19:08 ot29 /bsd: #8  soreceive+0x99a
> > Apr 22 03:19:08 ot29 /bsd: #9  recvit+0x1fd
> > Apr 22 03:19:08 ot29 /bsd: #10 sys_recvfrom+0xa4
> > Apr 22 03:19:08 ot29 /bsd: #11 syscall+0x588
> > Apr 22 03:19:08 ot29 /bsd: #12 Xsyscall+0x128
> > Apr 22 03:19:08 ot29 /bsd: lock order data w1 -> w2 missing
> 
> Unfortunately we don't see the backtrace for the reverse lock order.
> So it is hard to say something sensible.  Without more information I'd
> say that taking "&so->so_rcv.sb_lock" before "&map->lock" is the
> correct lock order.

I agree.  Now I'd be very grateful if someone could dig into WITNESS to
figure out why we see such reports.  Are these false positives or are we
missing data from the code paths that we think are incorrect?



Re: protection fault in amap_wipeout

2024-04-13 Thread Martin Pieuchot
On 30/03/24(Sat) 18:38, Martin Pieuchot wrote:
> Hello Alexander,
> 
> Thanks for the report.
> 
> On 01/03/24(Fri) 16:39, Alexander Bluhm wrote:
> > Hi,
> > 
> > An OpenBSD 7.4 machine on KVM running postgres and pagedaemon
> > crashed in amap_wipeout().
> > 
> > bluhm
> > 
> > kernel: protection fault trap, code=0
> > Stopped at  amap_wipeout+0x76:  movq%rcx,0x28(%rax)
> 
> The problem is an incorrect call to amap_wipeout() in an OOM situation
> inside amap_copy().  At this point the amap being copied/allocated
> is not in the global list.  That's why you see this incorrect
> dereference which corresponds to:
> 
>   amap_list_remove(amap);
> 
> > ddb{3}> show panic
> > the kernel did not panic
> > 
> > ddb{3}> trace
> > amap_wipeout(fd8015b154d0) at amap_wipeout+0x76
> > uvm_fault_check(8000232d6a20,8000232d6a58,8000232d6a80) at uvm_fault_check+0x2ad
> > uvm_fault(fd811d150748,7d42519fb000,0,1) at uvm_fault+0xfb
> > upageflttrap(8000232d6b80,7d42519fb3c0) at upageflttrap+0x65
> > usertrap(8000232d6b80) at usertrap+0x1ee
> > recall_trap() at recall_trap+0x8
> > end of kernel
> > end trace frame: 0x7d42519fb3f0, count: -6
> 
> Diff below should fix it.  I don't know how to test it.
> 
> ok?

Anyone?

> Index: uvm/uvm_amap.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_amap.c,v
> diff -u -p -r1.92 uvm_amap.c
> --- uvm/uvm_amap.c11 Apr 2023 00:45:09 -  1.92
> +++ uvm/uvm_amap.c30 Mar 2024 17:30:10 -
> @@ -662,9 +658,10 @@ amap_copy(struct vm_map *map, struct vm_
>  
>   chunk = amap_chunk_get(amap, lcv, 1, PR_NOWAIT);
>   if (chunk == NULL) {
> - /* amap_wipeout() releases the lock. */
> - amap->am_ref = 0;
> - amap_wipeout(amap);
> + amap_unlock(srcamap);
> + /* Destroy the new amap. */
> + amap->am_ref--;
> + amap_free(amap);
>   return;
>   }
>  
> 



Re: protection fault in amap_wipeout

2024-03-30 Thread Martin Pieuchot
Hello Alexander,

Thanks for the report.

On 01/03/24(Fri) 16:39, Alexander Bluhm wrote:
> Hi,
> 
> An OpenBSD 7.4 machine on KVM running postgres and pagedaemon
> crashed in amap_wipeout().
> 
> bluhm
> 
> kernel: protection fault trap, code=0
> Stopped at  amap_wipeout+0x76:  movq%rcx,0x28(%rax)

The problem is an incorrect call to amap_wipeout() in an OOM situation
inside amap_copy().  At this point the amap being copied/allocated
is not in the global list.  That's why you see this incorrect
dereference which corresponds to:

amap_list_remove(amap);

> ddb{3}> show panic
> the kernel did not panic
> 
> ddb{3}> trace
> amap_wipeout(fd8015b154d0) at amap_wipeout+0x76
> uvm_fault_check(8000232d6a20,8000232d6a58,8000232d6a80) at uvm_fault_check+0x2ad
> uvm_fault(fd811d150748,7d42519fb000,0,1) at uvm_fault+0xfb
> upageflttrap(8000232d6b80,7d42519fb3c0) at upageflttrap+0x65
> usertrap(8000232d6b80) at usertrap+0x1ee
> recall_trap() at recall_trap+0x8
> end of kernel
> end trace frame: 0x7d42519fb3f0, count: -6

Diff below should fix it.  I don't know how to test it.

ok?

Index: uvm/uvm_amap.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_amap.c,v
diff -u -p -r1.92 uvm_amap.c
--- uvm/uvm_amap.c  11 Apr 2023 00:45:09 -  1.92
+++ uvm/uvm_amap.c  30 Mar 2024 17:30:10 -
@@ -662,9 +658,10 @@ amap_copy(struct vm_map *map, struct vm_
 
chunk = amap_chunk_get(amap, lcv, 1, PR_NOWAIT);
if (chunk == NULL) {
-   /* amap_wipeout() releases the lock. */
-   amap->am_ref = 0;
-   amap_wipeout(amap);
+   amap_unlock(srcamap);
+   /* Destroy the new amap. */
+   amap->am_ref--;
+   amap_free(amap);
return;
}
 



Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-03-24 Thread Martin Pieuchot
On 22/02/24(Thu) 17:24, Claudio Jeker wrote:
> On Thu, Feb 22, 2024 at 04:16:57PM +0100, Martin Pieuchot wrote:
> > On 21/02/24(Wed) 13:05, Claudio Jeker wrote:
> > > On Tue, Feb 20, 2024 at 09:34:12PM +0100, Martin Pieuchot wrote:
> > > > On 28/10/21(Thu) 05:45, Visa Hankala wrote:
> > > > > On Wed, Oct 27, 2021 at 09:02:08PM -0400, Dave Voutila wrote:
> > > > > > Dave Voutila  writes:
> > > > > > 
> > > > > > > Was tinkering on a bt(5) script for trying to debug an issue in 
> > > > > > > vmm(4)
> > > > > > > when I managed to start hitting a panic "wakeup: p_stat is 2" 
> > > > > > > being
> > > > > > > triggered by kqueue coming from the softnet kernel task.
> > > > > > >
> > > > > > > I'm running an amd64 kernel built from the tree today (latest CVS 
> > > > > > > commit
> > > > > > > id UynQo1r7kLKA0Q2p) with VMM_DEBUG option set and the defaults 
> > > > > > > from
> > > > > > > GENERIC.MP. Userland is from the latest amd snap.
> > > > > > >
> > > > > > > To reproduce, I'm running a single OpenBSD-current guest under 
> > > > > > > vmd(8)
> > > > > > > which I'm targeting with the following trivial btrace script I was
> > > > > > > working on to use for debugging something in vmm(4):
> > > > > > >
> > > > > > > tracepoint:sched:sleep / pid == $1 && tid == $2 /{
> > > > > > >   printf("pid %d, tid %d slept %d!\n", pid, tid, nsecs);
> > > > > > > }
> > > > > > >
> > > > > > > tracepoint:sched:wakeup / pid == $1 && tid == $2 /{
> > > > > > >   printf("pid %d, tid %d awoke %d!\n", pid, tid, nsecs);
> > > > > > > }
> > > > > > 
> > > > > > Even easier reproduction: if you have 2 machines and can use 
> > > > > > tcpbench(1)
> > > > > > between them, then while tcpbench is running target it with the 
> > > > > > above
> > > > > > btrace script. I've found running the script, killing it with 
> > > > > > ctrl-c,
> > > > > > and re-running it 2-3 times triggers the panic on my laptop.
> > > > > > 
> > > > > > >
> > > > > > > Both times this happened I was trying to sysupgrade the vmd(8) 
> > > > > > > guest
> > > > > > > while running the above btrace script. When I don't run the 
> > > > > > > script,
> > > > > > > there is no panic.
> > > > > > >
> > > > > > > Image of the full backtrace is here: https://imgur.com/a/swW1qoj
> > > > > > >
> > > > > > > Simple transcript of the call stack after the panic() call looks 
> > > > > > > like:
> > > > > > >
> > > > > > > wakeup_n
> > > > > > > kqueue_wakeup
> > > > > > > knote
> > > > > > > selwakekup
> > > > > > > tun_enqueue
> > > > > > > ether_output
> > > > > > > ip_output
> > > > > > > ip_forward
> > > > > > > ip_input_if
> > > > > > > ipv4_input
> > > > > > > ether_input
> > > > > > > if_input_process
> > > > > > >
> > > > > > > The other 3 cpu cores appeared to be in ipi handlers. (Image in 
> > > > > > > that
> > > > > > > imgur link)
> > > > > 
> > > > > I think the problem is recursion within the scheduler. Trace points
> > > > > invoke wakeup() directly, which is unsafe when done from within the
> > > > > run queue routines.
> > > > > 
> > > > > One approach to avoid the recursion is to defer the wakeup() with
> > > > > a soft interrupt. The scheduler runs at an elevated system priority
> > > > > level that blocks soft interrupts. The deferred wakeup() is issued 
> > > > > once
> > > > > the system priority level turns low enough.
> > > > > 
> > > > > Unfortunately, the patch will get broken when someone adds trace points
> > > > > to soft interrupt scheduling...

Re: panic: kernel diagnostic assertion "p->p_wchan == NULL" failed

2024-02-28 Thread Martin Pieuchot
On 28/02/24(Wed) 16:39, Vitaliy Makkoveev wrote:
> On Wed, Feb 28, 2024 at 02:22:31PM +0100, Mark Kettenis wrote:
> > > Date: Wed, 28 Feb 2024 16:16:09 +0300
> > > From: Vitaliy Makkoveev 
> > > 
> > > On Wed, Feb 28, 2024 at 12:36:26PM +0100, Claudio Jeker wrote:
> > > > On Wed, Feb 28, 2024 at 12:26:43PM +0100, Marko Cupać wrote:
> > > > > Hi,
> > > > > 
> > > > > thank you for looking into it, and for the advice.
> > > > > 
> > > > > On Wed, 28 Feb 2024 10:13:06 +
> > > > > Stuart Henderson  wrote:
> > > > > 
> > > > > > Please try to re-type at least the most important bits from a
> > > > > > screenshot so readers can quickly see which subsystems are involved.
> > > > > 
> > > > > Below is manual transcript of whole screenshot, hopefully no typos.
> > > > > 
> > > > > If you have any advice on what should I do if it happens again in 
> > > > > order
> > > > > to get as much info for debuggers as possible, please let me know.
> > > > > 
> > > > > splassert: assertwaitok: want 0 have 4
> > > > > panic: kernel diagnostic assertion "p->p_wchan == NULL" failed: file 
> > > > > "/usr/src/sys/kern/kern_sched.c", line 373
> > > > > Stopped at db_enter+0x14: popq %rbp
> > > > >     TID     PID  UID   PRFLAGS  PFLAGS  CPU  COMMAND
> > > > > 199248  36172  577  0x10   01  openvpn
> > > > > 490874  474460   0x14000   0x2002  wg_handshake
> > > > >  71544   93110   0x14000   0x2003  softnet0
> > > > > db_enter() at db_enter+0x14
> > > > > panic(820a4b9f) at panic+0xc3
> > > > > __assert(82121fcb,8209ae5f,175,82092fbf) at 
> > > > > __assert+0x29
> > > > > sched_chooseproc() at sched_chooseproc+0x26d
> > > > > mi_switch() at mi_switch+0x17f
> > > > > sleep_finish(0,1) at sleep_finish+0x107
> > > > > rw_enter(88003cf0,2) at rw_enter+0x1ad
> > > > > noise_remote_ready(88003bf0) at noise_remote_ready+0x33
> > > > > wg_qstart(fff80a622a8) at wg_qstart+0x18c
> > > > > ifq_serialize(80a622a8,80a62390) at ifq_serialize+0xfd
> > > > > hfsc_deferred(80a62000) at hfsc_deferred+0x68
> > > > > softclock_process_tick_timeout(8115e248,1) at 
> > > > > softclock_process_tick_timeout+0xfb
> > > > > softclock(0) at softclock+0xb8
> > > > > softintr_dispatch(0) at softintr_dispatch+0xeb
> > > > > end trace frame: 0x800020dbc730, count:0
> > > > > 
> > > > 
> > > > WTF! wg(4) is just broken. How the hell should a sleeping rw_lock work
> > > > when called from inside a timeout aka softclock? This is interrupt
> > > > context; code is not allowed to sleep there.
> > > > 
> > > 
> > > Not only wg(4). Depends on interface queue usage, ifq_start() schedules
> > > (*if_qstart)() or calls it, so all the interfaces that use rwlock(9) in
> > > (*if_qstart)() handler are in risk.
> > > 
> > > What about to always schedule (*if_qstart)()?
> > 
> > Why would you want to introduce additional latence?
> > 
> 
> I suppose it is the lesser evil compared to strictly denying rwlocks in
> (*if_qstart)().  Anyway it will be scheduled unless `seq_len' exceeds
> the watermark.

Please no.  This is not going to happen.  wg(4) has to be fixed.  Let's
not change the design of the kernel every time a bug is found.



Re: panic: kernel diagnostic assertion "p->p_wchan == NULL" failed

2024-02-28 Thread Martin Pieuchot
On 28/02/24(Wed) 12:36, Claudio Jeker wrote:
> On Wed, Feb 28, 2024 at 12:26:43PM +0100, Marko Cupać wrote:
> > Hi,
> > 
> > thank you for looking into it, and for the advice.
> > 
> > On Wed, 28 Feb 2024 10:13:06 +
> > Stuart Henderson  wrote:
> > 
> > > Please try to re-type at least the most important bits from a
> > > screenshot so readers can quickly see which subsystems are involved.
> > 
> > Below is manual transcript of whole screenshot, hopefully no typos.
> > 
> > If you have any advice on what should I do if it happens again in order
> > to get as much info for debuggers as possible, please let me know.
> > 
> > splassert: assertwaitok: want 0 have 4
> > panic: kernel diagnostic assertion "p->p_wchan == NULL" failed: file 
> > "/usr/src/sys/kern/kern_sched.c", line 373
> > Stopped at db_enter+0x14: popq %rbp
> >     TID     PID  UID   PRFLAGS  PFLAGS  CPU  COMMAND
> > 199248  36172  577  0x10   01  openvpn
> > 490874  474460   0x14000   0x2002  wg_handshake
> >  71544   93110   0x14000   0x2003  softnet0
> > db_enter() at db_enter+0x14
> > panic(820a4b9f) at panic+0xc3
> > __assert(82121fcb,8209ae5f,175,82092fbf) at 
> > __assert+0x29
> > sched_chooseproc() at sched_chooseproc+0x26d
> > mi_switch() at mi_switch+0x17f
> > sleep_finish(0,1) at sleep_finish+0x107
> > rw_enter(88003cf0,2) at rw_enter+0x1ad
> > noise_remote_ready(88003bf0) at noise_remote_ready+0x33
> > wg_qstart(fff80a622a8) at wg_qstart+0x18c
> > ifq_serialize(80a622a8,80a62390) at ifq_serialize+0xfd
> > hfsc_deferred(80a62000) at hfsc_deferred+0x68
> > softclock_process_tick_timeout(8115e248,1) at 
> > softclock_process_tick_timeout+0xfb
> > softclock(0) at softclock+0xb8
> > softintr_dispatch(0) at softintr_dispatch+0xeb
> > end trace frame: 0x800020dbc730, count:0
> > 
> 
> WTF! wg(4) is just broken. How the hell should a sleeping rw_lock work
> when called from inside a timeout aka softclock? This is interrupt context;
> code is not allowed to sleep there.

It is indeed.  It isn't clear to me why `r_keypair_lock' is a rwlock and
not a mutex.  Does anything sleep in this code?  Why did they pick
sleeping locks in the first place?



Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-02-22 Thread Martin Pieuchot
On 21/02/24(Wed) 13:05, Claudio Jeker wrote:
> On Tue, Feb 20, 2024 at 09:34:12PM +0100, Martin Pieuchot wrote:
> > On 28/10/21(Thu) 05:45, Visa Hankala wrote:
> > > On Wed, Oct 27, 2021 at 09:02:08PM -0400, Dave Voutila wrote:
> > > > Dave Voutila  writes:
> > > > 
> > > > > Was tinkering on a bt(5) script for trying to debug an issue in vmm(4)
> > > > > when I managed to start hitting a panic "wakeup: p_stat is 2" being
> > > > > triggered by kqueue coming from the softnet kernel task.
> > > > >
> > > > > I'm running an amd64 kernel built from the tree today (latest CVS 
> > > > > commit
> > > > > id UynQo1r7kLKA0Q2p) with VMM_DEBUG option set and the defaults from
> > > > > GENERIC.MP. Userland is from the latest amd snap.
> > > > >
> > > > > To reproduce, I'm running a single OpenBSD-current guest under vmd(8)
> > > > > which I'm targeting with the following trivial btrace script I was
> > > > > working on to use for debugging something in vmm(4):
> > > > >
> > > > > tracepoint:sched:sleep / pid == $1 && tid == $2 /{
> > > > >   printf("pid %d, tid %d slept %d!\n", pid, tid, nsecs);
> > > > > }
> > > > >
> > > > > tracepoint:sched:wakeup / pid == $1 && tid == $2 /{
> > > > >   printf("pid %d, tid %d awoke %d!\n", pid, tid, nsecs);
> > > > > }
> > > > 
> > > > Even easier reproduction: if you have 2 machines and can use tcpbench(1)
> > > > between them, then while tcpbench is running target it with the above
> > > > btrace script. I've found running the script, killing it with ctrl-c,
> > > > and re-running it 2-3 times triggers the panic on my laptop.
> > > > 
> > > > >
> > > > > Both times this happened I was trying to sysupgrade the vmd(8) guest
> > > > > while running the above btrace script. When I don't run the script,
> > > > > there is no panic.
> > > > >
> > > > > Image of the full backtrace is here: https://imgur.com/a/swW1qoj
> > > > >
> > > > > Simple transcript of the call stack after the panic() call looks like:
> > > > >
> > > > > wakeup_n
> > > > > kqueue_wakeup
> > > > > knote
> > > > > selwakekup
> > > > > tun_enqueue
> > > > > ether_output
> > > > > ip_output
> > > > > ip_forward
> > > > > ip_input_if
> > > > > ipv4_input
> > > > > ether_input
> > > > > if_input_process
> > > > >
> > > > > The other 3 cpu cores appeared to be in ipi handlers. (Image in that
> > > > > imgur link)
> > > 
> > > I think the problem is recursion within the scheduler. Trace points
> > > invoke wakeup() directly, which is unsafe when done from within the
> > > run queue routines.
> > > 
> > > One approach to avoid the recursion is to defer the wakeup() with
> > > a soft interrupt. The scheduler runs at an elevated system priority
> > > level that blocks soft interrupts. The deferred wakeup() is issued once
> > > the system priority level turns low enough.
> > > 
> > > Unfortunately, the patch will get broken when someone adds trace points
> > > to soft interrupt scheduling...
> > 
> > Thanks for the report.  Sorry for the delay.  I'm now really interested
> > in fixing this bug because I'm heavily using btrace(8) to analyse the
> > scheduler and I hit this panic a couple of times.
> > 
> > The problem is that `pnext' might no longer be on the sleepqueue after a
> > tracepoint inside setrunnable() fired.  Diff below fixes that by making
> > wakeup_n() re-entrant.
> > 
> > I'm not very interested in committing this diff because it relies on a
> > recursive SCHED_LOCK().  Instead I'd prefer to split wakeup_n() in two
> > stages: first unlink the threads then call setrunnable().  This approach
> > will help us untangle the sleepqueue but needs a bit more shuffling,
> > like moving unsleep() out of setrunnable()...
> > 
> > Claudio, Visa do you agree?
>
> [...] 
>
> Ugh, this is too much magic for my little brain.
> I would prefer to unqueue the procs onto a new local list and have a
> special wakeup_proc for them. The code around wakeup and unsleep will ne

Re: panic: "wakeup: p_stat is 2" using btrace(8) & vmd(8)

2024-02-20 Thread Martin Pieuchot
On 28/10/21(Thu) 05:45, Visa Hankala wrote:
> On Wed, Oct 27, 2021 at 09:02:08PM -0400, Dave Voutila wrote:
> > 
> > Dave Voutila  writes:
> > 
> > > Was tinkering on a bt(5) script for trying to debug an issue in vmm(4)
> > > when I managed to start hitting a panic "wakeup: p_stat is 2" being
> > > triggered by kqueue coming from the softnet kernel task.
> > >
> > > I'm running an amd64 kernel built from the tree today (latest CVS commit
> > > id UynQo1r7kLKA0Q2p) with VMM_DEBUG option set and the defaults from
> > > GENERIC.MP. Userland is from the latest amd snap.
> > >
> > > To reproduce, I'm running a single OpenBSD-current guest under vmd(8)
> > > which I'm targeting with the following trivial btrace script I was
> > > working on to use for debugging something in vmm(4):
> > >
> > > tracepoint:sched:sleep / pid == $1 && tid == $2 /{
> > >   printf("pid %d, tid %d slept %d!\n", pid, tid, nsecs);
> > > }
> > >
> > > tracepoint:sched:wakeup / pid == $1 && tid == $2 /{
> > >   printf("pid %d, tid %d awoke %d!\n", pid, tid, nsecs);
> > > }
> > 
> > Even easier reproduction: if you have 2 machines and can use tcpbench(1)
> > between them, then while tcpbench is running target it with the above
> > btrace script. I've found running the script, killing it with ctrl-c,
> > and re-running it 2-3 times triggers the panic on my laptop.
> > 
> > >
> > > Both times this happened I was trying to sysupgrade the vmd(8) guest
> > > while running the above btrace script. When I don't run the script,
> > > there is no panic.
> > >
> > > Image of the full backtrace is here: https://imgur.com/a/swW1qoj
> > >
> > > Simple transcript of the call stack after the panic() call looks like:
> > >
> > > wakeup_n
> > > kqueue_wakeup
> > > knote
> > > selwakekup
> > > tun_enqueue
> > > ether_output
> > > ip_output
> > > ip_forward
> > > ip_input_if
> > > ipv4_input
> > > ether_input
> > > if_input_process
> > >
> > > The other 3 cpu cores appeared to be in ipi handlers. (Image in that
> > > imgur link)
> 
> I think the problem is recursion within the scheduler. Trace points
> invoke wakeup() directly, which is unsafe when done from within the
> run queue routines.
> 
> One approach to avoid the recursion is to defer the wakeup() with
> a soft interrupt. The scheduler runs at an elevated system priority
> level that blocks soft interrupts. The deferred wakeup() is issued once
> the system priority level turns low enough.
> 
> Unfortunately, the patch will get broken when someone adds trace points
> to soft interrupt scheduling...

Thanks for the report.  Sorry for the delay.  I'm now really interested
in fixing this bug because I'm heavily using btrace(8) to analyse the
scheduler and I hit this panic a couple of times.

The problem is that `pnext' might no longer be on the sleepqueue after a
tracepoint inside setrunnable() fired.  Diff below fixes that by making
wakeup_n() re-entrant.

I'm not very interested in committing this diff because it relies on a
recursive SCHED_LOCK().  Instead I'd prefer to split wakeup_n() in two
stages: first unlink the threads then call setrunnable().  This approach
will help us untangle the sleepqueue but needs a bit more shuffling,
like moving unsleep() out of setrunnable()...

Claudio, Visa do you agree?

Index: kern/kern_synch.c
===================================================================
RCS file: /cvs/src/sys/kern/kern_synch.c,v
retrieving revision 1.200
diff -u -p -r1.200 kern_synch.c
--- kern/kern_synch.c   13 Sep 2023 14:25:49 -  1.200
+++ kern/kern_synch.c   20 Feb 2024 19:51:53 -
@@ -484,7 +484,7 @@ wakeup_proc(struct proc *p, const volati
unsleep(p);
 #ifdef DIAGNOSTIC
else
-   panic("wakeup: p_stat is %d", (int)p->p_stat);
+   panic("thread %d p_stat is %d", p->p_tid, p->p_stat);
 #endif
}
 
@@ -532,21 +532,35 @@ void
 wakeup_n(const volatile void *ident, int n)
 {
struct slpque *qp;
-   struct proc *p;
-   struct proc *pnext;
+   struct proc *p, piter;
int s;
 
+   memset(&piter, 0, sizeof(piter));
+   piter.p_stat = SMARKER;
+
SCHED_LOCK(s);
qp = [LOOKUP(ident)];
-   for (p = TAILQ_FIRST(qp); p != NULL && n != 0; p = pnext) {
-   pnext = TAILQ_NEXT(p, p_runq);
+   TAILQ_INSERT_HEAD(qp, &piter, p_runq);
+   while (n != 0) {
+   p = TAILQ_NEXT(&piter, p_runq);
+   if (p == NULL)
+   break;
+
+   /* Move marker forward */
+   TAILQ_REMOVE(qp, &piter, p_runq);
+   TAILQ_INSERT_AFTER(qp, p, &piter, p_runq);
+
+   if (p->p_stat == SMARKER)
+   continue;
+
 #ifdef DIAGNOSTIC
if (p->p_stat != SSLEEP && p->p_stat != SSTOP)
-   panic("wakeup: p_stat is %d", (int)p->p_stat);
+   panic("thread %d p_stat is %d", p->p_tid, p->p_stat);
 #endif
if (wakeup_proc(p, ident, 

Re: Sparc64 rthreads Instablilty

2024-02-16 Thread Martin Pieuchot
On 15/02/24(Thu) 20:06, Kurt Miller wrote:
> On Feb 15, 2024, at 3:01 PM, Miod Vallat  wrote:
> > 
> >> Has been running for the last few hours without any issue.
> >> OK claudio@ on that diff.
> > 
> > But it's your diff! I only polished it a bit.
> > 
> 
> I have also been testing various versions of my test
> program for a few hours as well. It has not reproduced the
> problem. I’ve been using miod’s version of the diff.
> 
> Thank you Claudio for finding the root cause. I’m sure this
> will help more than the jdk on sparc64. Okay kurt@ as well.

Happy to see this fixed, ok mpi@



Re: Sparc64 livelock/system freeze w/cpu traces

2023-09-02 Thread Martin Pieuchot
On 28/06/23(Wed) 20:07, Kurt Miller wrote:
> On Jun 28, 2023, at 7:16 AM, Martin Pieuchot  wrote:
> > 
> > On 28/06/23(Wed) 08:58, Claudio Jeker wrote:
> >> 
> >> I doubt this is a missing wakeup. It is more the system is thrashing and
> >> not making progress. The SIGSTOP causes all threads to park which means
> >> that the thread not busy in its sched_yield() loop will finish its 
> >> operation
> >> and then on SIGCONT progress is possible.
> >> 
> >> I need to recheck your ps output from ddb but I guess one of the threads
> >> is stuck in a different place. That is where we need to look.
> >> It may well be a bad interaction between SCHED_LOCK() and whatever else is
> >> going on.
> > 
> > Or simply a poor userland scheduling based on sched_yield()...
> > 
> > To me it seems there are two bugs in your report:
> > 
> > 1/ a deadlock due to a single rwlock in sysctl(2)
> > 
> > 2/ something unknown in java not making progress and calling
> >  sched_yield() and triggering 1/ 
> > 
> > While 1/ is well understood 2/ isn't.  Why is java not making progress
> > is what should be understood.  Knowing where is the sched_yield() coming
> > from can help.
> 
> Okay. I dug into 2/ and believe I understand what’s happening there.
> The short version is that many threads are calling sched_yield(2) and
> that’s somehow preventing either an mmap or munmap call from completing.

I don't understand how that could happen.  sched_yield() shouldn't
prevent anything from happening.  This syscall doesn't require the
KERNEL_LOCK() nor mmap/munmap(2).

> Java spawns a number of GCTaskThread threads for doing tasks like garbage
> collection. The number of threads depends on the number of physical cpus
> in the machine. My machine has 64; java spawns 43 GCTaskThreads. When there’s
> nothing to do these threads are waiting on a condition variable. When there’s
> work to do it all 43 threads are clamoring to get the work done in a design
> that I find a bit unusual.
> 
> Until all 43 threads have no work to do they all continue to check for more
> work. If there’s none but at least one thread is not done yet it does
> the following; a certain number of hard spins or a sched_yield(2) call
> or a 1ms sleep (via pthread_cond_timedwait(3) configured w/CLOCK_MONOTONIC).
> After either a hard spin, sched_yield or the 1ms sleep it rechecks for more
> work to do. If there is still no work it repeats the above until all 43 threads
> have no work to do. 
> 
> The relevant code is here with the defaults for some vars:
> 
> https://github.com/battleblow/jdk8u/blob/jdk8u372-ga/hotspot/src/share/vm/utilities/taskqueue.cpp#L153
> 
> uintx WorkStealingHardSpins         = 4096   {experimental}
> uintx WorkStealingSleepMillis       = 1      {experimental}
> uintx WorkStealingSpinToYieldRatio  = 10     {experimental}
> uintx WorkStealingYieldsBeforeSleep = 5000   {experimental}
> 
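For reference, the backoff schedule those flags drive can be reduced to a small
decision function.  This is a simplified, hypothetical model with the quoted
defaults baked in, not the actual hotspot code:

```c
/* Simplified model of the GC termination backoff described above:
 * mostly hard spins, a sched_yield(2) every SpinToYieldRatio attempts,
 * and a 1ms sleep once enough yields have happened.  The thresholds
 * mirror the quoted WorkStealing* defaults; the real logic lives in
 * hotspot's taskqueue.cpp and differs in detail. */
enum backoff { HARD_SPIN, YIELD, SLEEP_1MS };

#define SPIN_TO_YIELD_RATIO	10	/* WorkStealingSpinToYieldRatio */
#define YIELDS_BEFORE_SLEEP	5000	/* WorkStealingYieldsBeforeSleep */

enum backoff
next_backoff(unsigned attempt)
{
	if (attempt / SPIN_TO_YIELD_RATIO > YIELDS_BEFORE_SLEEP)
		return SLEEP_1MS;	/* pthread_cond_timedwait(3), 1ms */
	if (attempt % SPIN_TO_YIELD_RATIO == 0)
		return YIELD;		/* sched_yield(2) */
	return HARD_SPIN;		/* burn ~WorkStealingHardSpins cycles */
}
```

With 5000 yields before the first sleep, 43 threads can issue hundreds of
thousands of sched_yield(2) calls while waiting for one straggler.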
> What I see when java is stuck in what I was calling run-away state, is
> that one thread (not necessarily a GCTaskThread) is stuck in mmap or
> munmap called via malloc/free and the set of 43 GCTaskThreads are trying
> to finish up a task.  Based on the feedback from you, Claudio
> and Vitaliy, I’m assuming that the frequent calls to sched_yield are 
> preventing the mmap/munmap call from completing. Since malloc made
> the call, the malloc mutex is held and about 20 other threads block
> waiting on the one thread to complete mmap/munmap. In unmodified
> -current the sched_yield calls are sufficient to prevent the mmap/munmap
> call from ever completing, leading to 1/.

We're making progress here.  The question is now why/where is the thread
stuck in mmap/munmap...

> I’ve attached two debug sessions of a run-away java process. In each I
> have sorted the thread back traces into the following categories:
> 
> Stuck thread
> Run away threads
> Malloc mutex contention threads
> Appear to be normal condition var or sem wait threads
> 
> I’m currently testing reducing the number of sched_yield calls before
> sleeping 1ms from 5000 down to 10. In some limited testing this appears
> to be a work-around for the mmap/munmap starvation problem. I will
> do a long test to confirm this is a sufficient work-around for the problem.
> 

> Stuck thread (on proc or run state not making progress):
> 
> Thread 35 (thread 559567 of process 32495):
> #0  map (d=0xdc4e3717e0, sz=8192, zero_fill=0) at malloc.c:865

This corresponds to a ca

Re: Sparc64 rthreads Instability

2023-09-02 Thread Martin Pieuchot
On 13/08/23(Sun) 22:59, Kurt Miller wrote:
> I’ve been hunting an intermittent jdk crash on sparc64 for some time now.
> Since egdb has not been up to the task, I created a small c program which
> reproduces the problem. This partially mimics the jdk startup where a number
> of detached threads are created. When each thread is created the main thread
> waits for it to start and change state. In my test program I then have the 
> detached thread wait for a condition that will not happen (parked waiting
> on a condition var).
> 
> When the intermittent crash occurs, one of two things happen; a segfault or
> the process has been killed by the kernel. The segfault cores are similar to
> what I see with the jdk crashes. It looks like the stack of the thread
> creating the threads is corrupted. In this case it is the primordial thread.
> In the jdk it is a different thread, but it's the thread that called
> pthread_create that has its stack wiped out.

I have seen similar symptoms on x86 with go & rust when unlocking the
fault handler.  I wonder if grabbing the KERNEL_LOCK() around uvm_fault()
in sparc64/trap.c makes the problem disappear...

> The problem is sparc64 specific. The jdk doesn’t crash like this on arm64,
> amd64 or i386 and the test program has run for over 1 million iterations on
> each of those platforms without an issue. 
> 
>  startup.c 
> #include <stdio.h>
> #include <stdlib.h>
> #include <unistd.h>
> #include <string.h>
> #include <pthread.h>
> #include <err.h>
> 
> typedef struct {
> long t_num;
> pthread_t t_pthread_id;
> 
> /* sync startup */
> long t_state;
> pthread_mutex_t t_mutex;
> pthread_cond_t  t_cond_var;
> } thread_t;
> 
> #define NTHREADS 40
> 
> thread_t threads[NTHREADS];
> 
> void
> init_threads() {
> long t_num;
> for (t_num=0; t_num < NTHREADS; t_num++) {
> threads[t_num].t_num = t_num;
> threads[t_num].t_state = 0;
> if (pthread_mutex_init(&threads[t_num].t_mutex, NULL) != 0)
> err(1, "pthread_mutex_init failed");
> 
> if (pthread_cond_init(&threads[t_num].t_cond_var, NULL) != 0)
> err(1, "pthread_cond_init failed");
> }
> }
> 
> void *
> thread_start(thread_t *thread) {
> pthread_mutex_lock(&thread->t_mutex);
> thread->t_state = 1;
> pthread_cond_broadcast(&thread->t_cond_var);
>  
> while (thread->t_state != 2)
> pthread_cond_wait(&thread->t_cond_var, &thread->t_mutex);
> 
> pthread_mutex_unlock(&thread->t_mutex);
> 
> return(NULL);
> }
> 
> void
> create_thread(thread_t *thread) {
> pthread_attr_t attr;
> if (pthread_attr_init(&attr) != 0)
> err(1, "pthread_attr_init failed");
> 
> if (pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED) != 0)
> err(1, "pthread_attr_setdetachstate failed");
> 
> int ret = pthread_create(&thread->t_pthread_id, &attr, (void* (*)(void*)) 
> thread_start, thread);
> pthread_attr_destroy(&attr);
> if (ret != 0 )
> err(1, "pthread_create failed");
>  
> /* wait for thread startup */
> pthread_mutex_lock(&thread->t_mutex);
> while (thread->t_state == 0) 
> pthread_cond_wait(&thread->t_cond_var, &thread->t_mutex);
> pthread_mutex_unlock(&thread->t_mutex);
> }
> 
> int
> main( int argc, char *argv[] )
> {
> long t_num;
> 
> init_threads();
> 
> /* startup threads */
> for (t_num=0; t_num < NTHREADS; t_num++) {
> create_thread(&threads[t_num]);
> }
> 
> return 0;
> }
> =
> 
> This counts the number of iterations until the test program fails and
> restarts the count.
> 
> oracle$ i=0; while true; do if ! ./startup; then echo $i; i=0; else 
> i=$((i+1)); fi done 
> Killed 
> 1904
> Segmentation fault (core dumped) 
> 12875
> Segmentation fault (core dumped) 
> 2104
> Killed 
> 12189
> Segmentation fault (core dumped) 
> 16616
> Segmentation fault (core dumped) 
> 912
> Segmentation fault (core dumped) 
> 7604
> Segmentation fault (core dumped) 
> 5508
> Segmentation fault (core dumped) 
> 4820
> Segmentation fault (core dumped) 
> 7349
> Segmentation fault (core dumped) 
> 10939
> Segmentation fault (core dumped) 
> 71
> Segmentation fault (core dumped) 
> 3844
> Segmentation fault (core dumped) 
> 977
> Killed 
> 744
> Segmentation fault (core dumped) 
> 7026
> Segmentation fault (core dumped) 
> 94
> Segmentation fault (core dumped) 
> 2517
> Segmentation fault (core dumped) 
> 452
> Segmentation fault (core dumped) 
> 8007
> Segmentation fault (core dumped) 
> 3109
> Killed 
> 1969
> Segmentation fault (core dumped) 
> 9162
> Killed 
> 5705
> Segmentation fault (core dumped) 
> 4990
> Segmentation fault (core dumped) 
> 3972
> Segmentation fault (core dumped) 
> 857
> Segmentation fault (core dumped) 
> 3034
> Segmentation fault (core dumped) 
> 454
> Segmentation fault (core dumped) 
> 8951
> Killed 
> 94
> Segmentation fault (core dumped) 
> 3942
> Segmentation fault (core dumped) 
> 4680
> Killed 
> 1322
> Killed 
> 1164
> Killed 
> 5283
> Segmentation fault (core dumped) 
> 122
> Segmentation fault (core dumped) 
> 3232
> Segmentation fault (core dumped) 

Re: resume failures/lockups

2023-09-02 Thread Martin Pieuchot
Hello Ross,

On 27/08/23(Sun) 15:16, Ross L Richardson wrote:
> For the past several weeks (using -current), I've had problems with
> resume on an amd64 desktop.  It's intermittent (but if anything
> becoming increasingly frequent).

If you can still reproduce the issue, please try enabling WITNESS in
your kernel.  I fear it might be a missing unlock in an error path.

> When seen, the issue goes like this:
> - suspend (from X session) is uneventful
> - on resume, the system goes back to sleep after a few seconds
>   - what happens then varies:
>   - the next attempt at resume may succeed
>   - the back-to-sleep thing may happen again
>   - the system may appear to wake, but with blank screen
> and no response to ssh attempts
>   - the system may show the console
>   - often, the password I was typing to Xlock
> may be inserted at "login:"
>   - trying to switch to X generally shows an
> unresponsive debugger
> 
> I've waited to report it because I wanted evidence from a fairly
> up-to-date system.  This is on -current built from source on Thursday.
> 
> Debug log [Friday] (largely transcribed by hand):
> 
> Login: panic: kernel diagnostic assertion "__mp_lock_held(&kernel_lock, 
> curcpu()) == 0" failed: file "/sys/kern/kern_lock.c" line 63
> Stopped at  db_enter+0x14:   popq   %rbp
> TIDPIDUIDPRFLAGSPFLAGS  CPU  COMMAND
> db_enter() at db_enter+0x14
> panic(820a365c) at panic+0xc3
> __assert(8212052b,820e678d,3f,82157f0d) at 
> __assert+0x29
> _kernel_lock() at _kernel_lock+0x10F
> selwakeup(81c57990) at selwakeup+0x15
> ptsstart (81c46000) at ptsstart+0x7d
> tputchar(73,81c46000) at tputchar+0x88
> kputchar(73,5,0) at kputchar+0x8d
> kprintf() at kprintf+0x408
> printf(8214f759) at printf+0x74
> splassert_fail(0,5,8210f13a) at splassert_fail+0x46
> assertwaitok() at assertwaitok+0x40
> mi_switch() at mi_switch+0x41
> sleep_finish(0,1) at sleep_finish+0x107
> end trace frame: 0x8000234656c0, count: 0
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{0}>
> 
> 
> dmesg at end is from today, following another lockup.
> 
> Ross
> 
> 
> OpenBSD 7.3-current (GENERIC.MP) #180: Thu Aug 24 10:49:00 AEST 2023
> nob...@host.rlr.id.au:/sys/arch/amd64/compile/GENERIC.MP
> real mem = 33636618240 (32078MB)
> avail mem = 32597454848 (31087MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.3 @ 0xe6cc0 (29 entries)
> bios0: vendor American Megatrends International, LLC. version "P2.30" date 
> 02/25/2022
> bios0: ASRock B550 Phantom Gaming-ITX/ax
> efi0 at bios0: UEFI 2.7
> efi0: American Megatrends rev 0x50011
> acpi0 at bios0: ACPI 6.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP SSDT SSDT SSDT FIDT MCFG AAFT HPET BGRT TPM2 SSDT 
> CRAT CDIT SSDT SSDT SSDT WSMT APIC SSDT SSDT SSDT SSDT FPDT
> acpi0: wakeup devices GPP6(S4) GP17(S4) XHC0(S4) XHC1(S4) GPP0(S4) GPP5(S4) 
> GPP3(S4)
> acpitimer0 at acpi0: 3579545 Hz, 32 bits
> acpimcfg0 at acpi0
> acpimcfg0: addr 0xf000, bus 0-127
> acpihpet0 at acpi0: 14318180 Hz
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: AMD Ryzen 7 PRO 5750GE with Radeon Graphics, 3200.00 MHz, 19-50-00, 
> patch 0a5c
> cpu0: 
> FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,HTT,SSE3,PCLMUL,MWAIT,SSSE3,FMA3,CX16,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,NXE,MMXX,FFXSR,PAGE1GB,RDTSCP,LONG,LAHF,CMPLEG,SVM,EAPICSP,AMCR8,ABM,SSE4A,MASSE,3DNOWP,OSVW,IBS,SKINIT,TCE,TOPEXT,CPCTR,DBKP,PCTRL3,MWAITX,ITSC,FSGSBASE,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,PQM,RDSEED,ADX,SMAP,CLFLUSHOPT,CLWB,SHA,UMIP,PKU,IBPB,IBRS,STIBP,STIBP_ALL,IBRS_PREF,IBRS_SM,SSBD,XSAVEOPT,XSAVEC,XGETBV1,XSAVES
> cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 512KB 
> 64b/line 8-way L2 cache, 16MB 64b/line 16-way L3 cache
> cpu0: smt 0, core 0, package 0
> mtrr: Pentium Pro MTRR support, 8 var ranges, 88 fixed ranges
> cpu0: apic clock running at 100MHz
> cpu0: mwait min=64, max=64, C-substates=1.1, IBE
> cpu1 at mainbus0: apid 2 (application processor)
> cpu1: AMD Ryzen 7 PRO 5750GE with Radeon Graphics, 3200.00 MHz, 19-50-00, 
> patch 0a5c
> cpu1: 
> 

Re: panic: rw_enter: vmmaplk locking against myself

2023-06-29 Thread Martin Pieuchot
On 29/06/23(Thu) 11:17, Stefan Sperling wrote:
> On Thu, Jun 29, 2023 at 10:59:32AM +0200, Martin Pieuchot wrote:
> > On 28/06/23(Wed) 15:47, Moritz Buhl wrote:
> > > Dear bugs@,
> > > 
> > > with the following snapshot I had two panics on my x270 recently.
> > 
> > This is a bug in iwm(4) suggesting a missing SPL protection.
> > 
> > > sysctl kern.version
> > > kern.version=OpenBSD 7.3-current (GENERIC.MP) #1256: Thu Jun 22 10:53:02 
> > > MDT 2023
> > > dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > > 
> > > Below are transcribed pictures of my laptop screen.
> > > 
> > > panic: rw_enter: vmmaplk locking against myself
> > > Stopped at  db_enter+0x14:  popq  %rbp
> > > TID   PID UID PRFLAGS PFLAGS  CPU COMMAND
> > > *258766   67401   1000   0x212   0x400   0K  firefox
> > >  465097   28019   0   0x14000 0x200   1   drmwq
> > > db_enter () at db_enter+0x14
> > > panic(820e78b0) at panic+0xc3
> > > rw_enter(fd87449a0f60,2) at rw_enter+0x26f
> > > uvmfault_lookup(800044cc3a30,0) at uvmfault_lookup+0x8a
> > > uvm_fault_check(800044cc3a30, 800044cc3a68,800044cc3a90) at 
> > > uvm_fault_check+0x36
> > > uvm_fault(fd87449a0e78,ab6ed8ea000,0,1) at uvm_fault+0xfb
> > > kpageflttrap(800044cc3bb0, ab6ed8ea088) at kpageflttrap+0x171
> > > kerntrap(800044cc3bb0) at kerntrap+0x95
> > > alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> > > _rb_min(823f89a8,80278060) at _rb_min+0x23
> > > ieee80211_clean_inactive_nodes(80277048,a) at 
> > > ieee80211_clean_inactive_nodes+0x4c
> > 
> > Looks like corruption in the RB-tree used inside
> > ieee80211_clean_inactive_nodes().
> > 
> > Since this is coming from an interrupt handler it suggests a missing spl
> > dance.
> 
> iwm_intr already runs at IPL_NET. What else would be required?

Are we sure all accesses to `ic_tree' are run under KERNEL_LOCK()+splnet()?



Re: panic: rw_enter: vmmaplk locking against myself

2023-06-29 Thread Martin Pieuchot
On 28/06/23(Wed) 15:47, Moritz Buhl wrote:
> Dear bugs@,
> 
> with the following snapshot I had two panics on my x270 recently.

This is a bug in iwm(4) suggesting a missing SPL protection.

> sysctl kern.version
> kern.version=OpenBSD 7.3-current (GENERIC.MP) #1256: Thu Jun 22 10:53:02 MDT 
> 2023
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
> Below are transcribed pictures of my laptop screen.
> 
> panic: rw_enter: vmmaplk locking against myself
> Stopped at  db_enter+0x14:  popq  %rbp
> TID   PID UID PRFLAGS PFLAGS  CPU COMMAND
> *258766   67401   1000   0x212   0x400   0K  firefox
>  465097   28019   0   0x14000 0x200   1   drmwq
> db_enter () at db_enter+0x14
> panic(820e78b0) at panic+0xc3
> rw_enter(fd87449a0f60,2) at rw_enter+0x26f
> uvmfault_lookup(800044cc3a30,0) at uvmfault_lookup+0x8a
> uvm_fault_check(800044cc3a30, 800044cc3a68,800044cc3a90) at 
> uvm_fault_check+0x36
> uvm_fault(fd87449a0e78,ab6ed8ea000,0,1) at uvm_fault+0xfb
> kpageflttrap(800044cc3bb0, ab6ed8ea088) at kpageflttrap+0x171
> kerntrap(800044cc3bb0) at kerntrap+0x95
> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> _rb_min(823f89a8,80278060) at _rb_min+0x23
> ieee80211_clean_inactive_nodes(80277048,a) at 
> ieee80211_clean_inactive_nodes+0x4c

Looks like corruption in the RB-tree used inside ieee80211_clean_inactive_nodes().

Since this is coming from an interrupt handler it suggests a missing spl
dance.

> ieee80211_end_scan(80277048) at ieee80211_end_scan+0xc8
> iwm_rx_pkt(80277000,802f6210,800044cc3e10) at 
> iwm_rx_pkt+0x871
> iwm_notif_intr(8027700) at iwm_notif_intr+0xd3
> end trace frame: 0x800044cc3eb0, count: 0
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{0}> show reg
> rdi   0
> rsi   0x14
> rbp   0x800044cc3720
> rbx   0
> rdx   0x20
> rcx   0x20
> rax   0x30
> r80
> r90
> r10   0xfff800044cc3528
> r11   0xbe1d4bcaa3793568
> r12   0x82481990  cpu_info_full_primary+0x2990
> r13   0x800044bbee04
> r14   0
> r15   0x820e78b0  cmd680_setup_channel.udma_tbl+0x67e3
> rip   0x81d55bc4  db_enter+0x14
> cs0x8
> rflags0x282
> rsp   0x800044cc3720
> ss0x10
> db_enter+0x14:popq%rbp
> 
> 
> the previous panic looked similar except that there was a panic
> during that panic:
> ddb{0}> bt
> db_enter() at db_enter+0x14
> panic(820a0212) at panic+0xc3
> __assert(8211a18a,820eab20,3f,8215092f) at 
> __assert+0x29
> _kernel_lock() at _kernel_lock+0x10f
> selwakeup(819cc710) at selwakeup+0x15
> ptsstart(819a6c00) at ptsstart+0x7d
> tputchar(73,819a6c00) at tputchar+0x88
> kputchar(73,5,0) at kputchar+0x8d
> printf(8214c975) at printf+0x74
> splassert_fail(0,7,821069d9) at splassert_fail+0x46
> assertwaitok() at assertwaitok+0x40
> mi_switch() at mi_switch+0x44
> sleep_finish(800022f4ec50,1) at sleep_finish+0x102
> msleep(83cb5410,83cb5418,0,820b8504,3e8) at 
> msleep+0xcb
> drm_atomic_helper_wait_for_flip_done(80257078,839ea000) at 
> drm_atomic_helper_wait_for_flip_done+0xcf
> intel_atomic_commit_tail(839ea000) at intel_atomic_commit_tail+0xc26
> intel_atomic_commit(80257078,839ea000,0) at 
> intel_atomic_commit+0x33d
> drm_atomic_commit(839ea000) at drm_atomic_commit+0xa7
> drm_client_modeset_commit_atomic(812dee00,1,0) at 
> drm_client_modeset_commit_atomic+0x178
> drm_client_modeset_commit_locked(812dee00) at 
> drm_client_modeset_commit_locked+0x59
> drm_fb_helper_restore_fbdev_mode_unlocked(812dee00) at 
> drm_fb_helper_restore_fbdev_mode_unlocked+0x48
> intel_fbdev_restore_mode(800257078) at intel_fbdev_restore_mode+0x37
> db_ktrap(4,0,800022f4f130) at db_ktrap+0x30
> kerntrap(800022f4130) at kerntrap+0xa8
> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> _rb_min(8220f7b8,80278060) at _rb_min+0x23
> ieee80211_clean_inactive_nodes(80277048,a) at 
> ieee80211_clean_inactive_nodes+0x4c
> ieee80211_end_scan(80277048) at ieee80211_end_scan+0xc8
> iwm_rx_pkt(80277000,802f6500,800022f4f390) at 
> iwm_rx_pkt+0x871
> iwm_notif_intr(80277000) at iwm_notif_intr+0xd3
> intr_handler(800022f4f490,80234e80) at intr_handler+0x72
> Xintr_ioapic_edge23_untramp() at Xintr_ioapic_edge23_untramp+0x18f
> acpicpu_idle() at acpicpu_idle+0x203
> sched_idle(82464ff0) at sched_idle+0x280
> end trace frame: 0x0, count: -36
> ddb{0}> show panic
> *cpu0: kernel diagnostic assertion "__mp_lock_held(&kernel_lock, curcpu()) == 
> 0" failed: file "/usr/src/sys/kern/kern_lock.c" line 63

Re: Sparc64 livelock/system freeze w/cpu traces

2023-06-28 Thread Martin Pieuchot
On 28/06/23(Wed) 08:58, Claudio Jeker wrote:
> On Tue, Jun 27, 2023 at 08:18:15PM -0400, Kurt Miller wrote:
> > On Jun 27, 2023, at 1:52 PM, Kurt Miller  wrote:
> > > 
> > > On Jun 14, 2023, at 12:51 PM, Vitaliy Makkoveev  wrote:
> > >> 
> > >> On Tue, May 30, 2023 at 01:31:08PM +0200, Martin Pieuchot wrote:
> > >>> So it seems the java process is holding the `sysctl_lock' for too long
> > >>> and block all other sysctl(2).  This seems wrong to me.  We should come
> > >>> up with a clever way to prevent vslocking too much memory.  A single
> > >>> lock obviously doesn't fly with that many CPUs. 
> > >>> 
> > >> 
> > >> We vslock memory to prevent context switch while doing copyin() and
> > >> copyout(), right? This is required to avoid context switches within
> > >> foreach loops over kernel-lock-protected lists. But this seems not to be
> > >> required for simple sysctl_int() calls or rwlock-protected data. So
> > >> sysctl_lock acquisition and the uvm_vslock() calls could be avoided for a
> > >> significant count of mibs and pushed deep down for the rest.
> > > 
> > > I’m back on -current testing and have some additional findings that
> > > may help a bit. The memory leak fix had no effect on this issue. -current
> > > behavior is as I previously described. When java trips the issue, it 
> > > goes into a state where many threads are all running at 100% cpu but 
> > > does not make forward progress. I’m going to call this state run-away java
> > > process. Java is calling sched_yield(2) when in this state.
> > > 
> > > When java is in run-away state, a different process can trip
> > > the next stage where processes block waiting on sysctllk indefinitely.
> > > Top with process arguments is one, pgrep and ps -axl also trip this.
> > > My last test on -current java was stuck in run-away state for 7 hours
> > > 45 minutes before cron daily ran and caused the lockups.
> > > 
> > > I did a test with -current + locking sched_yield() back up with the
> > > kernel lock. The behavior changed slightly. Java still enters run-away
> > > state occasionally but eventually does make forward progress and 
> > > complete. When java is in run-away state the sysctllk issue can still
> > > be tripped, but if it is not tripped java eventually completes. For 
> > > about 200 invocations of a java command that usually takes 50 seconds
> > > to complete, 4 times java entered run-away state but eventually completed:
> > > 
> > > Typically it runs like this:
> > >0m51.16s real 5m09.37s user 0m49.96s system
> > > 
> > > The exceptions look like this:
> > >    1m11.15s real    5m35.88s user    13m20.47s system
> > >   27m18.93s real   31m13.19s user   754m48.41s system
> > >   13m44.44s real   19m56.11s user   501m39.73s system
> > >   19m23.72s real   24m40.97s user   629m08.16s system
> > > 
> > > Testing -current with dumbsched.3 behaves the same as -current described
> > > above.
> > > 
> > > One other thing I observed so far is what happens when egdb is
> > > attached to the run-away java process. egdb stops the process
> > > using ptrace(2) PT_ATTACH. Now if I issue a command that would
> > > typically lock up the system like top displaying command line
> > > arguments, the system does not lock up. I think this rules out
> > > the kernel memory is fragmented theory.
> > > 
> > > Switching cpu’s in ddb tends to lock up ddb so I have limited
> > > info but here what I have from -current lockup and -current
> > > with dumbsched.3 lockup. 
> > 
> > Another data point to support the idea of a missing wakeup; when
> > java is in run-away state, if I send SIGSTOP followed by SIGCONT
> > it dislodges it from run-away state and returns to normal operation.
> 
> I doubt this is a missing wakeup. It is more that the system is thrashing and
> not making progress. The SIGSTOP causes all threads to park which means
> that the thread not busy in its sched_yield() loop will finish its operation
> and then on SIGCONT progress is possible.
> 
> I need to recheck your ps output from ddb but I guess one of the threads
> is stuck in a different place. That is where we need to look.
> It may well be a bad interaction between SCHED_LOCK() and whatever else is
> going on.

Or simply a poor userland scheduling based on sched_yield()...

To me it seems there are two bugs in your report:

1/ a deadlock due to a single rwlock in sysctl(2)

2/ something unknown in java not making progress and calling
  sched_yield() and triggering 1/ 

While 1/ is well understood, 2/ isn't.  Why java is not making progress
is what should be understood.  Knowing where the sched_yield() is coming
from can help.



Re: Sparc64 livelock/system freeze w/cpu traces

2023-05-30 Thread Martin Pieuchot
On 25/05/23(Thu) 16:33, Kurt Miller wrote:
> On May 22, 2023, at 2:27 AM, Claudio Jeker  wrote:
> > I have seen these WITNESS warnings on other systems as well. I doubt this
> > is the problem. IIRC this warning is because sys_mount() is doing it wrong
> > but it is not really an issue since sys_mount is not called often.
> 
> Yup. I see that now that I have tested witness on several arches. They all
> show this lock order reversal right after booting the system. I guess this
> means what I am seeing isn’t something that witness detects.
> 
> On -current with my T4-1, I can reliably reproduce the issues I am seeing.
> While the problem is intermittent I can’t get very far into the jdk build 
> without
> tripping it. Instructions for reproducing the issue are:
> 
> Add wxallowed to /usr/local/ and /usr/ports (or wherever WRKOBJDIR has
> been changed to)
> 
> doas pkg_add jdk zip unzip cups-libs bash gmake libiconv giflib
> 
> cd /usr/ports/devel/jdk/1.8
> FLAVOR=native_bootstrap make
> 
> There are two stages to the problem. A java command (or javac or javah)
> gets stuck making forward progress and nearly all of its cpu time is in
> sys time category. You can see this in top as 1500-3000% CPU time on
> the java process. ktrace of the process in this state shows endless
> sched_yield() calls. Debugging shows many threads in
> pthread_cond_wait(3). The condition vars are configured to use
> CLOCK_MONOTONIC.
> 
> The second stage of the problem is when things lock up. While java is
> spinning in this sched_yield() state, if you display the process arguments in
> top (pressing the right arrow) you trip the lockups. top stops responding.
> getty will reprompt if enter is pressed, but locks up if a username is 
> entered.
> Most processes lock up when doing anything after this point. ddb ps at this
> stage shows top waiting on vmmaplk and the rest of the stuck processes
> waiting on sysctllk (sshd, systat, login).

So it seems the java process is holding the `sysctl_lock' for too long
and blocking all other sysctl(2) calls.  This seems wrong to me.  We should come
up with a clever way to prevent vslocking too much memory.  A single
lock obviously doesn't fly with that many CPUs. 

Once that's improved it should be easier to debug the java issue.
sched_yield() is called from _spinlock() in librthread.  sparc64 should
use the futex version of pthread_cond_wait(3) which doesn't rely on
_spinlock() and sched_yield(2).  So I'm puzzled.  This seems like a poor
man scheduling issue hoping that another thread/process will make
progress.  Can you figure out where this sched_yield() is coming from?

> I tried bisecting when this was introduced but as I go back in time with
> kernels it becomes more intermittent and I didn’t notice that so I would need
> to redo the bisecting. I can say I have seen the problem reproduce as far
> back as Feb 16th kernel. When I updated the jdk in late January I didn’t 
> notice
> it but it could have been a lucky build as I tend to only do one 
> native_bootstrap
> build of the jdk when updating as a way to test the resulting package.
> 
> Here is some sample output of top, systat and ddb ps output on -current in
> my last reproduction of the problem.
> 
> load averages: 15.27,  4.25,  1.68     oracle.intricatesoftware.com 16:13:10
> 64 processes: 62 idle, 2 on processor                  up 0 days 00:04:58
> 64 CPUs:  0.0% user,  1.8% nice, 55.5% sys,  0.4% spin,  0.1% intr, 42.2% idle
> Memory: Real: 143M/2676M act/tot Free: 13G Cache: 2334M Swap: 0K/16G
>   PID USERNAME PRI NICE  SIZE   RES STATE WAIT  TIMECPU COMMAND
> 38582 _pbuild   105  290M   64M onproc/19 fsleep   17:16 2892.14% javac
> 
>3 users Load 15.27 4.25 1.68oracle.intricatesof 
> 16:13:09
> 
> memory totals (in KB)PAGING   SWAPPING Interrupts
>real   virtual free   in  out   in  out12777 total
> Active   146752146752 13198032   opsvcons0
> All 2740160   2740160 29713104   pages5 
> vpci0:0
>   2 mpii0
> Proc:r  d  s  wCsw   Trp   Sys   Int   Sof  Flt   forks mpii1
> 18   148 78089 18666  8090 94097  18891   fkppw   2 em0
>   fksvm  31 ehci0
>0.1%Int   0.3%Spn  55.1%Sys   1.8%Usr  42.7%Idle   pwait   12737 clock
> |||||||||||   relck
> > rlkok
>   noram
> Namei Sys-cacheProc-cacheNo-cache ndcpy
> Calls hits%

Re: Sparc64 livelock/system freeze w/cpu traces

2023-05-12 Thread Martin Pieuchot
On 09/05/23(Tue) 20:02, Kurt Miller wrote:
> While building devel/jdk/1.8 on May 3rd snapshot I noticed the build freezing
> and processes getting stuck like ps. After enabling ddb.console I was able to
> reproduce the livelock and capture cpu traces. Dmesg at the end.
> Let me know if more information is needed as this appears to be rather
> reproducible on my T4-1.

It seems that all CPUs are waiting for the KERNEL_LOCK().  Doing ps /o
in ddb(4) should show us which CPU is currently holding it.  I can't
figure it out just by looking at the traces below.

> Cpu traces for cpus 0-63 were done in order; however, I moved the less
> interesting ones down below the more meaningful traces.
> 
> Stopped at  __mp_lock+0x70: bne,pt  __mp_lock+0x5c
> ddb{0}> bt
> task_add(1, 4001528d490, 1, 0, 1c00, 1ccdbf8) at task_add+0x6c
> ifiq_input(4001528d450, 20172b0, 400151c3380, 0, 1000, 1) at ifiq_input+0x1d0
> em_rxeof(40015292200, 59, 4001528d66c, 20172b0, 19a07c0, ff00) at 
> em_rxeof+0x2b4
> em_intr(0, 2, 40015297c00, 4, 40015297c20, 0) at em_intr+0xc4
> intr_handler(2017530, 400151c1100, 4d63d37, 0, 14ec228, ac) at 
> intr_handler+0x50
> sparc_intr_retry(1c1c158, 2017d2c, 400151afa00, 10, 400151afa20, 0) at 
> sparc_intr_retry+0x5c
> db_enter_ddb(1ca6000, 1ca6798, 130, 0, 0, 19a07c0) at db_enter_ddb+0xbc
> db_ktrap(101, 20179a8, 18a1420, 0, 1c75b66, 3b9ac800) at db_ktrap+0x104
> trap(20179a8, 101, 14ec224, 820006, 1c00, 1ccdbf8) at trap+0x2cc
> Lslowtrap_reenter(0, , 2017c58, 4, 0, 60) at 
> Lslowtrap_reenter+0xf8
> vcons_cnlookc(0, 2017d2c, 400151afa00, 10, 400151afa20, 0) at 
> vcons_cnlookc+0x84
> vcons_softintr(400151b8a00, 1c7e548, 20, 1, 40015297c20, 6) at 
> vcons_softintr+0x3c
> intr_handler(2017ec8, 400151afa00, 4c2210e, 1c7b3e8, 0, 6) at 
> intr_handler+0x50
> sparc_intr_retry(0, 0, 18a1420, a855edc1cc, 1c00, 12) at sparc_intr_retry+0x5c
> cpu_idle_cycle(1c7e528, 2018000, 1793b58, 1c7b3e8, 0, 19a07c0) at 
> cpu_idle_cycle+0x2c
> sched_idle(2018360, 4001516b600, 18a1420, 0, 1c75b66, 3b9ac800) at 
> sched_idle+0x158
> proc_trampoline(0, 0, 0, 0, 0, 0) at proc_trampoline+0x14
> ddb{0}> machine ddbcpu 1
> Stopped at  __mp_lock+0x6c: subcc   %g3, %g1, %g0
> ddb{1}> bt
> rw_enter(40013cdaf00, 19a07c0, 1c7f000, 0, 1c04000, 0) at rw_enter+0x1fc
> uvm_fault_lower_lookup(400e2c09cd0, 400e2c09d08, 400e2c09bd0, 0, 0, 4) at 
> uvm_fault_lower_lookup+0x2c
> uvm_fault_lower(40013ccd180, 400e2c09d08, 400e2c09bd0, 0, 1c1d000, 1ca88d0) 
> at uvm_fault_lower+0x3c
> uvm_fault(0, 400e2c09cd0, 0, 40013ccd180, 11f9be0, 1) at uvm_fault+0x1bc
> text_access_fault(400e2c09ed0, 9, a860ee37d0, 0, 0, 0) at 
> text_access_fault+0x114
> sun4v_texttrap(a8a09a5120, 52a, a855edc1c8, a855edc1cc, 0, 2b) at 
> sun4v_texttrap+0x1fc
> ddb{1}> machine ddbcpu 2
> Stopped at  __mp_lock+0x68: ld  [%o0 + 0x800], %g1
> ddb{2}> bt
> rw_enter(40013cdaf00, 19a07c0, 1c7f000, 0, 1c04000, 0) at rw_enter+0x1fc
> uvm_fault_lower_lookup(400e2c2dcd0, 400e2c2dd08, 400e2c2dbd0, 0, 0, 4) at 
> uvm_fault_lower_lookup+0x2c
> uvm_fault_lower(40013ccd180, 400e2c2dd08, 400e2c2dbd0, 0, 1c1d000, 1ca88d0) 
> at uvm_fault_lower+0x3c
> uvm_fault(0, 400e2c2dcd0, 0, 40013ccd180, 11f9be0, 1) at uvm_fault+0x1bc
> text_access_fault(400e2c2ded0, 9, a860ee37d0, 0, 0, 0) at 
> text_access_fault+0x114
> sun4v_texttrap(a8a09a5120, 52a, a855edc1c8, a855edc1cc, 0, 19a07c0) at 
> sun4v_texttrap+0x1fc
> ddb{2}> machine ddbcpu 0x3
> Stopped at  __mp_lock+0x68: ld  [%o0 + 0x800], %g1
> ddb{3}> bt
> mi_switch(1, 400e0968000, 18dc7e0, 0, 0, 19a07c0) at mi_switch+0x2ac
> sleep_finish(0, 1, 20, 1917b80, 0, 0) at sleep_finish+0x16c
> rw_enter(40013cdaf00, 19a07c0, 1c7f000, 0, 1c04000, 0) at rw_enter+0x21c
> uvm_fault_lower_lookup(400e2b75cd0, 400e2b75d08, 400e2b75bd0, 0, 0, 4) at 
> uvm_fault_lower_lookup+0x2c
> uvm_fault_lower(40013ccd180, 400e2b75d08, 400e2b75bd0, 0, 1c1d000, 1ca88d0) 
> at uvm_fault_lower+0x3c
> uvm_fault(0, 400e2b75cd0, 0, 40013ccd180, 11f9be0, 1) at uvm_fault+0x1bc
> text_access_fault(400e2b75ed0, 9, a860ee37d0, 0, 0, 0) at 
> text_access_fault+0x114
> sun4v_texttrap(a8a09a5120, 52a, a855edc1c8, a855edc1cc, 0, 19a07c0) at 
> sun4v_texttrap+0x1fc
> ddb{3}> machine ddbcpu 0x4
> Stopped at  __mp_lock+0x6c: subcc   %g3, %g1, %g0
> ddb{4}> bt
> rw_enter(40013cdaf00, 19a07c0, 1c7f000, 0, 1c04000, 0) at rw_enter+0x1fc
> uvm_fault_lower_lookup(400e2a09cd0, 400e2a09d08, 400e2a09bd0, 0, 0, 4) at 
> uvm_fault_lower_lookup+0x2c
> uvm_fault_lower(40013ccd180, 400e2a09d08, 400e2a09bd0, 0, 1c1d000, 1ca88d0) 
> at uvm_fault_lower+0x3c
> uvm_fault(0, 400e2a09cd0, 0, 40013ccd180, 11f9be0, 1) at uvm_fault+0x1bc
> text_access_fault(400e2a09ed0, 9, a860ee37d0, 0, 0, 0) at 
> text_access_fault+0x114
> sun4v_texttrap(a8a09a5120, 52a, a855edc1c8, a855edc1cc, 0, 19a07c0) at 
> sun4v_texttrap+0x1fc
> ddb{6}> machine ddbcpu 0x7
> Stopped at  __mp_lock+0x64: nop
> ddb{7}> bt
> rw_enter(40013cdaf00, 19a07c0, 

Re: Repeated crashes with OpenBSD 7.2 on Raspberry Pi 4 (arm64)

2023-02-20 Thread Martin Pieuchot
Hello Tomas,

On 19/02/23(Sun) 23:43, Tomas Vondra wrote:
> [...] 
> I think it's probably easier to just try PostgreSQL build and tests
> directly, without the buildfarm tooling. Ultimately that's what the
> buildfarm tooling is doing, except that it tests multiple branches.
> 
> I'd try cloning e.g. https://github.com/postgres/postgres, and then
> something like this:
> 
> 
> ./configure --enable-cassert --enable-debug --enable-nls --with-perl \
> --with-python --with-tcl --with-openssl --with-libxml \
> --with-libxslt --enable-tap-tests --with-icu
> 
> # build
> make -s -j4
> 
> # run tests in a loop
> while /bin/true; do make check-world; done
> 
> 
> The --enable-tap-tests may require a couple perl packages to support the TAP
> stuff. I don't have the list at hand, but I can share that tomorrow when I
> have access to the rpi4.

Thanks a lot for these explanations.  I ran the regression tests all
morning on my x13s without being able to trigger any panic.

So I'll try to get a rpi4 and see if I can reproduce it.

In the meantime if you could confirm you can still trigger the panic
with a -current snapshot on your machine, that would motivate me to look
at it.

Cheers,
Martin



Re: bbolt can freeze 7.2 from userspace

2023-02-20 Thread Martin Pieuchot
On 20/02/23(Mon) 03:59, Renato Aguiar wrote:
> [...] 
> I can't reproduce it anymore with this patch on 7.2-stable :)

Thanks a lot for testing!  Here's a better fix from Chuck Silvers.
That's what I believe we should commit.

The idea is to prevent siblings from modifying the vm_map by marking
it as "busy" in msync(2) instead of holding the exclusive lock while
sleeping.  This lets siblings make progress and stops possible writers.

Could you all confirm this also prevents the deadlock?  Thanks!
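The busy pattern can be modeled in userspace.  This is only an
illustrative sketch (all names are hypothetical, and the real kernel
primitives sleep rather than fail): while the map is marked busy,
readers still get through but writers are refused, which is what lets
sibling threads fault while msync(2) flushes.

```c
#include <assert.h>

/* Illustrative userspace model of the vm_map "busy" pattern; all
 * names are hypothetical and not kernel code. */
struct model_map {
	int busy;	/* long operation running with the lock dropped */
	int readers;	/* current shared holders */
};

static int
map_read_enter(struct model_map *m)
{
	/* readers are not blocked by a busy map */
	m->readers++;
	return 1;
}

static int
map_write_enter(struct model_map *m)
{
	/* writers must wait out both readers and the busy marker */
	if (m->busy || m->readers > 0)
		return 0;
	return 1;
}

static void
map_busy(struct model_map *m)
{
	m->busy = 1;
}

static void
map_unbusy(struct model_map *m)
{
	m->busy = 0;
}
```

The diff below follows the same shape: validate entries under the
exclusive lock, mark busy, drop the lock, flush, then unbusy.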

Index: uvm/uvm_map.c
===
RCS file: /cvs/src/sys/uvm/uvm_map.c,v
retrieving revision 1.312
diff -u -p -r1.312 uvm_map.c
--- uvm/uvm_map.c   13 Feb 2023 14:52:55 -  1.312
+++ uvm/uvm_map.c   20 Feb 2023 08:10:39 -
@@ -4569,8 +4569,7 @@ fail:
  * => never a need to flush amap layer since the anonymous memory has
  * no permanent home, but may deactivate pages there
  * => called from sys_msync() and sys_madvise()
- * => caller must not write-lock map (read OK).
- * => we may sleep while cleaning if SYNCIO [with map read-locked]
+ * => caller must not have map locked
  */
 
 int
@@ -4592,25 +4591,27 @@ uvm_map_clean(struct vm_map *map, vaddr_
if (start > end || start < map->min_offset || end > map->max_offset)
return EINVAL;
 
-   vm_map_lock_read(map);
+   vm_map_lock(map);
first = uvm_map_entrybyaddr(&map->addr, start);
 
/* Make a first pass to check for holes. */
for (entry = first; entry != NULL && entry->start < end;
entry = RBT_NEXT(uvm_map_addr, entry)) {
if (UVM_ET_ISSUBMAP(entry)) {
-   vm_map_unlock_read(map);
+   vm_map_unlock(map);
return EINVAL;
}
if (UVM_ET_ISSUBMAP(entry) ||
UVM_ET_ISHOLE(entry) ||
(entry->end < end &&
VMMAP_FREE_END(entry) != entry->end)) {
-   vm_map_unlock_read(map);
+   vm_map_unlock(map);
return EFAULT;
}
}
 
+   vm_map_busy(map);
+   vm_map_unlock(map);
error = 0;
for (entry = first; entry != NULL && entry->start < end;
entry = RBT_NEXT(uvm_map_addr, entry)) {
@@ -4722,7 +4723,7 @@ flush_object:
}
}
 
-   vm_map_unlock_read(map);
+   vm_map_unbusy(map);
return error;
 }
 



Re: Repeated crashes with OpenBSD 7.2 on Raspberry Pi 4 (arm64)

2023-02-19 Thread Martin Pieuchot
Hello Tomas,

Thanks for the report.

I'm setting up an arm64 machine to try to reproduce the crash.

Could you tell me what steps are required to run the reproducer you
quoted below?  I read the buildfarm wiki page and I'm not interested in
running a periodic cron job...

I cloned the git repo, downloaded the latest build-farm.X.tgz, the
client and data.  I installed gmake, bison and flex and I'm now reading
the conf file I need to edit but I'm not sure how to glue everything
together.  Any example of setup on OpenBSD would be appreciated.

Thanks,
Martin


On 05/12/22(Mon) 18:09, Tomas Vondra wrote:
> >Synopsis:Regular crashes on rpi4 when running PostgreSQL tests
> >Category:aarch64
> >Environment:
>   System  : OpenBSD 7.2
>   Details : OpenBSD 7.2-current (GENERIC.MP) #1896: Sat Nov 19 
> 21:38:32 MST 2022
>
> dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.arm64
>   Machine : arm64
> >Description:
> 
> When running PostgreSQL regression tests (using the community buildfarm 
> tooling)
> on a Raspberry Pi 4 machine, the system occasionally panics - this happens
> after a small number of hours. The system is significantly slower compared
> to rpi4 machines running Linux (by a factor of ~5x), so the whole test suite
> would finish in about 24 hours, but I have never seen that happen due to a
> crash.
> 
> I suspected perhaps this particular rpi4 is somehow broken, so I tried booting
> a Linux and ran the same set of tests - and that worked just fine. In fact, it
> completed ~10 rounds of testing over ~2 days, while on OpenBSD I can't get a
> single complete run.
> 
> Another thing I suspected is faulty SD card, so I moved the work directory to
> a USB flash drive and then to a reliable SSD (connected using a USB/SATA).
> The SSD did improve the performance somewhat (compared to running from USB
> flash drive) but the panics are still there, unfortunately.
> 
> I managed to collect a bunch of information following the ddb page for two
> crashes (I can try again, if more information is needed).
> 
> For the first crash I have only the stuff from the console:
> 
> Stopped at   panic+0x160  cmp  w21,#0x0
> TID      PID    UID  PRFLAGS  PFLAGS  CPU  COMMAND
> *178534  88804  1000        0       0    2  postgres
>  464655  67171  1000        0       0    0  postgres
>  470045  34591  1000        0       0    3  postgres
>  326421  84018  1000        0       0    3K postgres
> 
> db_enter() at panic+0x15c
> panic() at __assert+0x24
> panic() at uvm_fault_upper_lookup+0x258
> uvm_fault_upper() at uvm_fault+0xec
> uvm_fault() at udata_abort+0x128
> udata_abort() at do_el0_sync+0xdc
> do_el0_sync() at handle_el0_sync+0x74
> 
> For the second crash, I have more:
> 
> Stopped at   panic+0x160  cmp  w21,#0x0
> TID      PID    UID  PRFLAGS  PFLAGS  CPU  COMMAND
> *315901  52422  1000        0       0    0  postgres
>  286288  16150  1000        0       0    3  postgres
>  235152  96037     0  0x14000   0x200    1  zerothread
> 
> ddb{0}> bt
> db_enter() at panic+0x15c
> panic() at kdata_abort+0x168
> kdata_abort() at handle_el1h_sync+0x6c
> handle_el1h_sync() at pmap_copy_page+0x98
> pmap_copy_page() at pmap_copy_page+0x98
> pmap_copy_page() at uvm_fault_upper+0x13c
> uvm_fault_upper() at uvm_fault+0xb4
> uvm_fault() at udata_abort+0x128
> udata_abort() at do_el0_sync+0xdc
> do_el0_sync() at handle_el0_sync+0x74
> handle_el0_sync() at 0x1b02613208
> 
> ddb{0}> show uvm
> Current UVM status:
>   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
>   967776 VM pages: 44735 active, 183278 inactive, 1 wired, 344603 free 
> (43089 zero)
>   min 10% (25) anon, 10% (25) vnode, 5% (12) vtext
>   freemin=32259, free-target=43012, inactive-target=74248, 
> wired-max=322592
>   faults=87269298, traps=0, intrs=0, ctxswitch=19407000 fpuswitch=8
>   softint=20156649, syscalls=124374327, kmapent=21
>   fault counts:
> noram=0, noanon=0, noamap=0, pgwait=0, pgrele=0
> ok relocks(total)=332129(335474), anget(retries)=35295179(0), 
> amapcopy=8887865
> neighbor anon/obj pg=16711715/60577268, 
> gets(lock/unlock)=18290527/335627
> cases: anon=32635596, anoncow=2659583, obj=16018472, prcopy=2268557, 
> przero=33701122
>   daemon and swap counts:
> woke=10195, revs=27, scans=0, obscans=0, anscans=0
> busy=0, freed=0, reactivate=0, deactivate=28030
> pageouts=0, pending=0, nswget=0
> nswapdev=1
> swpages=517387, swpginuse=0, swpgonly=0 paging=0
>   kernel pointers:
> objs(kern)=0xff80010d6f78
> 
> ddb{0}> show bcstats
> Current Buffer Cache status:
>

Re: bbolt can freeze 7.2 from userspace

2023-02-18 Thread Martin Pieuchot
On 24/01/23(Tue) 04:40, Renato Aguiar wrote:
> Hi Martin,
> 
> "David Hill"  writes:
> 
> >
> > Yes, same result as before.  This patch does not seem to help.
> >
> 
> I could also reproduce it with patched 'current' :(

Here's another possible fix I came up with.  The idea is to deliberately
allow readers to starve writers for the VM map lock.

My previous fix didn't work because a writer could already be waiting on
the lock before calling vm_map_busy() and there's currently no way to
tell it to back out and stop preventing readers from making progress.

I don't see any simpler solution without rewriting the vm_map_lock()
primitive.

Could you confirm this works for you?  If so I'll document the new
RW_FORCEREAD flag and add a comment to explain why we need it in this
case.
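The effect of the new flag boils down to one bitmask decision.  Below is
a minimal model (the constant values are illustrative, not taken from
sys/rwlock.h): a reader normally backs off when the owner word has the
write-locked or writer-wanted bit set; RW_FORCEREAD drops the
writer-wanted bit from the check mask, so queued writers no longer stall
new readers, while an actual write holder still excludes them.

```c
#include <assert.h>

#define RWLOCK_WRLOCK	0x01UL	/* illustrative values only */
#define RWLOCK_WRWANT	0x02UL
#define RW_FORCEREAD	0x04UL

/* Can a reader enter given the owner word?  Mirrors the check-mask
 * tweak in rw_enter(): RW_FORCEREAD ignores waiting writers. */
static int
read_may_enter(unsigned long owner, unsigned long flags)
{
	unsigned long check = RWLOCK_WRLOCK | RWLOCK_WRWANT;

	if (flags & RW_FORCEREAD)
		check &= ~RWLOCK_WRWANT;	/* reader preempts writers */
	return (owner & check) == 0;
}
```

This is why the change deliberately starves writers: the fairness bit is
simply ignored on the forced-read path.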

Thanks,
Martin

Index: kern/kern_rwlock.c
===
RCS file: /cvs/src/sys/kern/kern_rwlock.c,v
retrieving revision 1.48
diff -u -p -r1.48 kern_rwlock.c
--- kern/kern_rwlock.c  10 May 2022 16:56:16 -  1.48
+++ kern/kern_rwlock.c  18 Feb 2023 07:48:23 -
@@ -225,7 +225,7 @@ rw_enter(struct rwlock *rwl, int flags)
 {
const struct rwlock_op *op;
struct sleep_state sls;
-   unsigned long inc, o;
+   unsigned long inc, o, check;
 #ifdef MULTIPROCESSOR
/*
 * If process holds the kernel lock, then we want to give up on CPU
@@ -248,10 +248,14 @@ rw_enter(struct rwlock *rwl, int flags)
 #endif
 
op = &rw_ops[(flags & RW_OPMASK) - 1];
+   check = op->check;
+
+   if (flags & RW_FORCEREAD)
+   check &= ~RWLOCK_WRWANT;
 
inc = op->inc + RW_PROC(curproc) * op->proc_mult;
 retry:
-   while (__predict_false(((o = rwl->rwl_owner) & op->check) != 0)) {
+   while (__predict_false(((o = rwl->rwl_owner) & check) != 0)) {
unsigned long set = o | op->wait_set;
int do_sleep;
 
Index: sys/rwlock.h
===
RCS file: /cvs/src/sys/sys/rwlock.h,v
retrieving revision 1.28
diff -u -p -r1.28 rwlock.h
--- sys/rwlock.h11 Jan 2021 18:49:38 -  1.28
+++ sys/rwlock.h18 Feb 2023 07:44:13 -
@@ -117,6 +117,7 @@ struct rwlock {
 #define RW_NOSLEEP 0x0040UL /* don't wait for the lock */
 #define RW_RECURSEFAIL 0x0080UL /* Fail on recursion for RRW locks. */
 #define RW_DUPOK   0x0100UL /* Permit duplicate lock */
+#define RW_FORCEREAD   0x0200UL /* Let a reader preempt writers.  */
 
 /*
  * for rw_status() and rrw_status() only: exclusive lock held by
Index: uvm/uvm_map.c
===
RCS file: /cvs/src/sys/uvm/uvm_map.c,v
retrieving revision 1.312
diff -u -p -r1.312 uvm_map.c
--- uvm/uvm_map.c   13 Feb 2023 14:52:55 -  1.312
+++ uvm/uvm_map.c   18 Feb 2023 07:46:23 -
@@ -5406,7 +5406,7 @@ void
 vm_map_lock_read_ln(struct vm_map *map, char *file, int line)
 {
if ((map->flags & VM_MAP_INTRSAFE) == 0)
-   rw_enter_read(&map->lock);
+   rw_enter(&map->lock, RW_READ|RW_FORCEREAD);
else
mtx_enter(>mtx);
LPRINTF(("map   lock: %p (at %s %d)\n", map, file, line));



Re: bbolt can freeze 7.2 from userspace

2023-01-29 Thread Martin Pieuchot
On 29/01/23(Sun) 14:36, Mark Kettenis wrote:
> > Date: Sun, 29 Jan 2023 12:31:22 +0100
> > From: Martin Pieuchot 
> > 
> > On 23/01/23(Mon) 22:57, David Hill wrote:
> > > On 1/20/23 09:02, Martin Pieuchot wrote:
> > > > > [...] 
> > > > > Ran it 20 times and all completed and passed.  I was also able to 
> > > > > interrupt
> > > > > it as well.   no issues.
> > > > > 
> > > > > Excellent!
> > > > 
> > > > Here's the best fix I could come up with.  We mark the VM map as "busy"
> > > > during the page fault just before the faulting thread releases the 
> > > > shared
> > > > lock.  This ensures no other thread will grab an exclusive lock until 
> > > > the
> > > > fault is finished.
> > > > 
> > > > I couldn't trigger the reproducer with this, can you?
> > > 
> > > Yes, same result as before.  This patch does not seem to help.
> > 
> > Is it the same as before?  I doubt it is.  On a 4-CPU machine I can't
> > trigger the race described in this thread.  On a 8-CPU one I now see all
> > threads sleeping on "thrsleep" except one in "kqread" and one in "wait".
> 
> I'm also seeing bbolt.test processes sleeping on "vmmaplk", "vmmapbsy"
> and "uvn_flsh", just like without the diff :(.  Well, maybe the
> "vmmapbsy" one is new...

"vmmapbsy" is new because vm_map_busy() is now being used.  If you're
seeing this one I need to understand if the faulting thread is being
blocked and where.

Can you enter ddb and get a trace of the threads?  I'm missing some
pieces of information, so I need fresh debug data.

Thanks to anyone that could get me more information about this.



Re: bbolt can freeze 7.2 from userspace

2023-01-29 Thread Martin Pieuchot
On 23/01/23(Mon) 22:57, David Hill wrote:
> On 1/20/23 09:02, Martin Pieuchot wrote:
> > > [...] 
> > > Ran it 20 times and all completed and passed.  I was also able to 
> > > interrupt
> > > it as well.   no issues.
> > > 
> > > Excellent!
> > 
> > Here's the best fix I could come up with.  We mark the VM map as "busy"
> > during the page fault just before the faulting thread releases the shared
> > lock.  This ensures no other thread will grab an exclusive lock until the
> > fault is finished.
> > 
> > I couldn't trigger the reproducer with this, can you?
> 
> Yes, same result as before.  This patch does not seem to help.

Is it the same as before?  I doubt it is.  On a 4-CPU machine I can't
trigger the race described in this thread.  On a 8-CPU one I now see all
threads sleeping on "thrsleep" except one in "kqread" and one in "wait".

I don't know what's happening.  I don't know how to debug Go code.  I
can't say if this is a race in UVM or something else.  The original race
reported in this thread seems fixed by the diff I sent.  I don't know if
there's another one or if I missed something.  I can't figure it out by
myself, I'd appreciate it if somebody else with some knowledge in Go could
give me a hand.

Joel, could you find some time to reproduce the hang with the diff below
applied?

To reproduce, just do:
$ doas pkg_add git go
$ git clone https://github.com/etcd-io/bbolt.git
$ cd bbolt
$ git checkout v1.3.6
$ go test -v -run TestSimulate_1op_10p

Index: uvm/uvm_fault.c
===
RCS file: /cvs/src/sys/uvm/uvm_fault.c,v
retrieving revision 1.133
diff -u -p -r1.133 uvm_fault.c
--- uvm/uvm_fault.c 4 Nov 2022 09:36:44 -   1.133
+++ uvm/uvm_fault.c 20 Jan 2023 13:52:58 -
@@ -1277,6 +1277,9 @@ uvm_fault_lower(struct uvm_faultinfo *uf
/* update rusage counters */
curproc->p_ru.ru_majflt++;
 
+   /* prevent siblings to grab an exclusive lock on the map */
+   vm_map_busy(ufi->map);
+
uvmfault_unlockall(ufi, amap, NULL);
 
counters_inc(uvmexp_counters, flt_get);
@@ -1293,13 +1296,16 @@ uvm_fault_lower(struct uvm_faultinfo *uf
KASSERT(result != VM_PAGER_PEND);
 
if (result == VM_PAGER_AGAIN) {
+   vm_map_unbusy(ufi->map);
tsleep_nsec(, PVM, "fltagain2",
MSEC_TO_NSEC(5));
return ERESTART;
}
 
-   if (!UVM_ET_ISNOFAULT(ufi->entry))
+   if (!UVM_ET_ISNOFAULT(ufi->entry)) {
+   vm_map_unbusy(ufi->map);
return (EIO);
+   }
 
uobjpage = PGO_DONTCARE;
uobj = NULL;
@@ -1308,6 +1314,7 @@ uvm_fault_lower(struct uvm_faultinfo *uf
 
/* re-verify the state of the world.  */
locked = uvmfault_relock(ufi);
+   vm_map_unbusy(ufi->map);
if (locked && amap != NULL)
amap_lock(amap);
 



Re: bbolt can freeze 7.2 from userspace

2023-01-20 Thread Martin Pieuchot
Hello David,

On 21/12/22(Wed) 11:37, David Hill wrote:
> On 12/21/22 11:23, Martin Pieuchot wrote:
> > On 21/12/22(Wed) 09:20, David Hill wrote:
> > > On 12/21/22 07:08, David Hill wrote:
> > > > On 12/21/22 05:33, Martin Pieuchot wrote:
> > > > > On 18/12/22(Sun) 20:55, Martin Pieuchot wrote:
> > > > > > On 17/12/22(Sat) 14:15, David Hill wrote:
> > > > > > > On 10/28/22 03:46, Renato Aguiar wrote:
> > > > > > > > Use of bbolt Go library causes 7.2 to freeze. I suspect
> > > > > > > > it is triggering some
> > > > > > > > sort of deadlock in mmap because threads get stuck at vmmaplk.
> > > > > > > > 
> > > > > > > > I managed to reproduce it consistently in a laptop with
> > > > > > > > 4 cores (i5-1135G7)
> > > > > > > > using one unit test from bbolt:
> > > > > > > > 
> > > > > > > >      $ doas pkg_add git go
> > > > > > > >      $ git clone https://github.com/etcd-io/bbolt.git
> > > > > > > >      $ cd bbolt
> > > > > > > >      $ git checkout v1.3.6
> > > > > > > >      $ go test -v -run TestSimulate_1op_10p
> > > > > > > > 
> > > > > > > > The test never ends and this is the 'top' report:
> > > > > > > > 
> > > > > > > >      PID  TID PRI NICE  SIZE   RES STATE
> > > > > > > > WAIT  TIMECPU COMMAND
> > > > > > > > 32181   438138 -18    0   57M   13M idle  uvn_fls
> > > > > > > > 0:00  0.00% bbolt.test
> > > > > > > > 32181   331169  10    0   57M   13M sleep/1   nanoslp
> > > > > > > > 0:00  0.00% bbolt.test
> > > > > > > > 32181   497390  10    0   57M   13M idle  vmmaplk
> > > > > > > > 0:00  0.00% bbolt.test
> > > > > > > > 32181   380477  14    0   57M   13M idle  vmmaplk
> > > > > > > > 0:00  0.00% bbolt.test
> > > > > > > > 32181   336950  14    0   57M   13M idle  vmmaplk
> > > > > > > > 0:00  0.00% bbolt.test
> > > > > > > > 32181   491043  14    0   57M   13M idle  vmmaplk
> > > > > > > > 0:00  0.00% bbolt.test
> > > > > > > > 32181   347071   2    0   57M   13M idle  kqread
> > > > > > > > 0:00  0.00% bbolt.test
> > > > > > > > 
> > > > > > > > After this, most commands just hang. For example,
> > > > > > > > running a 'ps | grep foo' in
> > > > > > > > another shell would do it.
> > > > > > > > 
> > > > > > > 
> > > > > > > I can reproduce this on MP, but not SP.  Here is /trace from
> > > > > > > ddb after using
> > > > > > > the ddb.trigger sysctl.  Is there any other information I
> > > > > > > could pull from
> > > > > > > DDB that may help?
> > > > > > 
> > > > > > Thanks for the useful report David!
> > > > > > 
> > > > > > The issue seems to be a deadlock between the `vmmaplk' and a 
> > > > > > particular
> > > > > > `vmobjlock'.  uvm_map_clean() calls uvn_flush() which sleeps with 
> > > > > > the
> > > > > > `vmmaplk' held.
> > > > > > 
> > > > > > I'll think a bit about this and try to come up with a fix ASAP.
> > > > > 
> > > > > I'm missing a piece of information.  All the threads in your report 
> > > > > seem
> > > > > to want a read version of the `vmmaplk' so they should not block.  
> > > > > Could
> > > > > you reproduce the hang with a WITNESS kernel and print 'show all 
> > > > > locks'
> > > > > in addition to all the information you've reported?
> > > > > 
> > > > 
> > > > Sure.  It's always the same; 2 processes (sysctl and bbolt.test) and 3
> > > > locks (sysctllk, kernel_lock, and vmmaplk) with bbolt.test always on the
> > > > uvn_flsh thread.
> > > > 
> > > > 
> > > > Process 98301 (sysctl) thread 0xfff..
> > > > exclusive rwlock sysctll

Re: bbolt can freeze 7.2 from userspace

2022-12-21 Thread Martin Pieuchot
On 21/12/22(Wed) 09:20, David Hill wrote:
> 
> 
> On 12/21/22 07:08, David Hill wrote:
> > 
> > 
> > On 12/21/22 05:33, Martin Pieuchot wrote:
> > > On 18/12/22(Sun) 20:55, Martin Pieuchot wrote:
> > > > On 17/12/22(Sat) 14:15, David Hill wrote:
> > > > > 
> > > > > 
> > > > > On 10/28/22 03:46, Renato Aguiar wrote:
> > > > > > Use of bbolt Go library causes 7.2 to freeze. I suspect
> > > > > > it is triggering some
> > > > > > sort of deadlock in mmap because threads get stuck at vmmaplk.
> > > > > > 
> > > > > > I managed to reproduce it consistently in a laptop with
> > > > > > 4 cores (i5-1135G7)
> > > > > > using one unit test from bbolt:
> > > > > > 
> > > > > >     $ doas pkg_add git go
> > > > > >     $ git clone https://github.com/etcd-io/bbolt.git
> > > > > >     $ cd bbolt
> > > > > >     $ git checkout v1.3.6
> > > > > >     $ go test -v -run TestSimulate_1op_10p
> > > > > > 
> > > > > > The test never ends and this is the 'top' report:
> > > > > > 
> > > > > >     PID  TID PRI NICE  SIZE   RES STATE
> > > > > > WAIT  TIMECPU COMMAND
> > > > > > 32181   438138 -18    0   57M   13M idle  uvn_fls  
> > > > > > 0:00  0.00% bbolt.test
> > > > > > 32181   331169  10    0   57M   13M sleep/1   nanoslp  
> > > > > > 0:00  0.00% bbolt.test
> > > > > > 32181   497390  10    0   57M   13M idle  vmmaplk  
> > > > > > 0:00  0.00% bbolt.test
> > > > > > 32181   380477  14    0   57M   13M idle  vmmaplk  
> > > > > > 0:00  0.00% bbolt.test
> > > > > > 32181   336950  14    0   57M   13M idle  vmmaplk  
> > > > > > 0:00  0.00% bbolt.test
> > > > > > 32181   491043  14    0   57M   13M idle  vmmaplk  
> > > > > > 0:00  0.00% bbolt.test
> > > > > > 32181   347071   2    0   57M   13M idle  kqread   
> > > > > > 0:00  0.00% bbolt.test
> > > > > > 
> > > > > > After this, most commands just hang. For example,
> > > > > > running a 'ps | grep foo' in
> > > > > > another shell would do it.
> > > > > > 
> > > > > 
> > > > > I can reproduce this on MP, but not SP.  Here is /trace from
> > > > > ddb after using
> > > > > the ddb.trigger sysctl.  Is there any other information I
> > > > > could pull from
> > > > > DDB that may help?
> > > > 
> > > > Thanks for the useful report David!
> > > > 
> > > > The issue seems to be a deadlock between the `vmmaplk' and a particular
> > > > `vmobjlock'.  uvm_map_clean() calls uvn_flush() which sleeps with the
> > > > `vmmaplk' held.
> > > > 
> > > > I'll think a bit about this and try to come up with a fix ASAP.
> > > 
> > > I'm missing a piece of information.  All the threads in your report seem
> > > to want a read version of the `vmmaplk' so they should not block.  Could
> > > you reproduce the hang with a WITNESS kernel and print 'show all locks'
> > > in addition to all the information you've reported?
> > > 
> > 
> > Sure.  It's always the same; 2 processes (sysctl and bbolt.test) and 3
> > locks (sysctllk, kernel_lock, and vmmaplk) with bbolt.test always on the
> > uvn_flsh thread.
> > 
> > 
> > Process 98301 (sysctl) thread 0xfff..
> > exclusive rwlock sysctllk r = 0 (0xf...)
> > exclusive kernel_lock _lock r = 0 (0xff..)
> > Process 32181 (bbolt.test) thread (0xff...) (438138)
> > shared rwlock vmmaplk r = 0 (0xf..)
> > 
> > To reproduce, just do:
> > $ doas pkg_add git go
> > $ git clone https://github.com/etcd-io/bbolt.git
> > $ cd bbolt
> > $ git checkout v1.3.6
> > $ go test -v -run TestSimulate_1op_10p
> > 
> > The test will hang almost instantly.
> > 
> 
> Not sure if this is a hint..
> 
> https://github.com/etcd-io/bbolt/blob/master/db.go#L27-L31
> 
> // IgnoreNoSync specifies whether the NoSync field of a DB is ignored when
> // syncing changes to a file.  This is required as some operating systems,
> // such as OpenBSD, do not have a unified buffe

Re: bbolt can freeze 7.2 from userspace

2022-12-21 Thread Martin Pieuchot
On 18/12/22(Sun) 20:55, Martin Pieuchot wrote:
> On 17/12/22(Sat) 14:15, David Hill wrote:
> > 
> > 
> > On 10/28/22 03:46, Renato Aguiar wrote:
> > > Use of bbolt Go library causes 7.2 to freeze. I suspect it is triggering 
> > > some
> > > sort of deadlock in mmap because threads get stuck at vmmaplk.
> > > 
> > > I managed to reproduce it consistently in a laptop with 4 cores 
> > > (i5-1135G7)
> > > using one unit test from bbolt:
> > > 
> > >$ doas pkg_add git go
> > >$ git clone https://github.com/etcd-io/bbolt.git
> > >$ cd bbolt
> > >$ git checkout v1.3.6
> > >$ go test -v -run TestSimulate_1op_10p
> > > 
> > > The test never ends and this is the 'top' report:
> > > 
> > >   PID      TID PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> > > 32181   438138 -18    0   57M   13M idle      uvn_fls   0:00  0.00% bbolt.test
> > > 32181   331169  10    0   57M   13M sleep/1   nanoslp   0:00  0.00% bbolt.test
> > > 32181   497390  10    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > > 32181   380477  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > > 32181   336950  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > > 32181   491043  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > > 32181   347071   2    0   57M   13M idle      kqread    0:00  0.00% bbolt.test
> > > 
> > > After this, most commands just hang. For example, running a 'ps | grep 
> > > foo' in
> > > another shell would do it.
> > > 
> > 
> > I can reproduce this on MP, but not SP.  Here is /trace from ddb after using
> > the ddb.trigger sysctl.  Is there any other information I could pull from
> > DDB that may help?
> 
> Thanks for the useful report David! 
> 
> The issue seems to be a deadlock between the `vmmaplk' and a particular
> `vmobjlock'.  uvm_map_clean() calls uvn_flush() which sleeps with the
> `vmmaplk' held. 
> 
> I'll think a bit about this and try to come up with a fix ASAP.

I'm missing a piece of information.  All the threads in your report seem
to want a read version of the `vmmaplk' so they should not block.  Could
you reproduce the hang with a WITNESS kernel and print 'show all locks'
in addition to all the information you've reported?



Re: bbolt can freeze 7.2 from userspace

2022-12-18 Thread Martin Pieuchot
On 17/12/22(Sat) 14:15, David Hill wrote:
> 
> 
> On 10/28/22 03:46, Renato Aguiar wrote:
> > Use of bbolt Go library causes 7.2 to freeze. I suspect it is triggering 
> > some
> > sort of deadlock in mmap because threads get stuck at vmmaplk.
> > 
> > I managed to reproduce it consistently in a laptop with 4 cores (i5-1135G7)
> > using one unit test from bbolt:
> > 
> >$ doas pkg_add git go
> >$ git clone https://github.com/etcd-io/bbolt.git
> >$ cd bbolt
> >$ git checkout v1.3.6
> >$ go test -v -run TestSimulate_1op_10p
> > 
> > The test never ends and this is the 'top' report:
> > 
> >    PID      TID PRI NICE  SIZE   RES STATE     WAIT      TIME    CPU COMMAND
> > 32181   438138 -18    0   57M   13M idle      uvn_fls   0:00  0.00% bbolt.test
> > 32181   331169  10    0   57M   13M sleep/1   nanoslp   0:00  0.00% bbolt.test
> > 32181   497390  10    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > 32181   380477  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > 32181   336950  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > 32181   491043  14    0   57M   13M idle      vmmaplk   0:00  0.00% bbolt.test
> > 32181   347071   2    0   57M   13M idle      kqread    0:00  0.00% bbolt.test
> > 
> > After this, most commands just hang. For example, running a 'ps | grep foo' 
> > in
> > another shell would do it.
> > 
> 
> I can reproduce this on MP, but not SP.  Here is /trace from ddb after using
> the ddb.trigger sysctl.  Is there any other information I could pull from
> DDB that may help?

Thanks for the useful report David! 

The issue seems to be a deadlock between the `vmmaplk' and a particular
`vmobjlock'.  uvm_map_clean() calls uvn_flush() which sleeps with the
`vmmaplk' held. 

I'll think a bit about this and try to come up with a fix ASAP.
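The shape of that hang can be reduced to a small state model.  This is a
sketch under the assumption of writer-priority rwlock fairness (names
hypothetical): once a writer queues, new readers are refused, so if the
existing shared holder is asleep in uvn_flush() waiting on work that one
of the refused readers would produce, nobody makes progress.

```c
#include <assert.h>

/* Hypothetical state model of a writer-priority rwlock such as the
 * vmmaplk; the real primitives sleep instead of returning failure. */
struct rwstate {
	int readers;	/* current shared holders */
	int wrwant;	/* a writer is queued */
};

static int
try_read(struct rwstate *s)
{
	if (s->wrwant)		/* fairness: don't starve the writer */
		return 0;
	s->readers++;
	return 1;
}

static int
try_write(struct rwstate *s)
{
	if (s->readers > 0) {
		s->wrwant = 1;	/* record the want and (would) sleep */
		return 0;
	}
	return 1;
}
```

In the reports above, the first reader is the thread stuck in
"uvn_flsh", the queued writer is another path wanting the map
exclusively, and every later page fault then wedges on "vmmaplk".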

> Stopped at  db_enter+0x10:  popq    %rbp
> ddb{3}> ps
>    PID     TID   PPID    UID  S       FLAGS  WAIT      COMMAND
> *50158  300210  75987      0  7         0x3            sysctl
>  19266  326894  80979   1000  3         0x3  vmmaplk   bbolt.test
>  19266  173202  80979   1000  3       0x483  nanoslp   bbolt.test
>  19266   53881  80979   1000  3       0x483  kqread    bbolt.test
>  19266  305124  80979   1000  3       0x403  uvn_flsh  bbolt.test
>  19266  409572  80979   1000  3       0x403  vmmaplk   bbolt.test
>  19266  471071  80979   1000  3       0x403  vmmaplk   bbolt.test
>  19266   75742  80979   1000  3       0x403  vmmaplk   bbolt.test
>  80979  246480  44618   1000  3        0x83  thrsleep  go
>  80979  127832  44618   1000  3       0x483  thrsleep  go
>  80979  259946  44618   1000  3       0x483  thrsleep  go
>  80979  301163  44618   1000  3       0x483  thrsleep  go
>  80979  179798  44618   1000  3       0x483  wait      go
>  80979  488795  44618   1000  3       0x483  thrsleep  go
>  80979   34313  44618   1000  3       0x483  thrsleep  go
>  80979  265681  44618   1000  3       0x483  thrsleep  go
>  80979  497706  44618   1000  3       0x483  thrsleep  go
>  80979  427226  44618   1000  3       0x483  kqread    go
>  94416  390071      1      0  3    0x100083  ttyin     getty
>   8978  261384      1      0  3    0x100083  ttyin     getty
>   9412  162712      1      0  3    0x100083  ttyin     getty
>  44618  141216      1   1000  3    0x10008b  sigsusp   ksh
>  75987  285267      1      0  3    0x10008b  sigsusp   ksh
>  55798  180352      1      0  3    0x100098  kqread    cron
>  97399    2603      1      0  3        0x80  kqread    apmd
>   2179  523954      1     99  3   0x1100090  kqread    sndiod
>  26099  499871      1    110  3    0x100090  kqread    sndiod
>  12661   11825  84402     95  3   0x1100092  kqread    smtpd
>  97311   87889  84402    103  3   0x1100092  kqread    smtpd
>  18428  154020  84402     95  3   0x1100092  kqread    smtpd
> 
>  ddb{3}> trace /t 0t326894
> sleep_finish(8000344b0bb0,1) at sleep_finish+0xfe
> rw_enter(fd821cbcb220,2) at rw_enter+0x232
> uvmfault_relock(8000344b0e10) at uvmfault_relock+0x6f
> uvm_fault_lower(8000344b0e10,8000344b0e48,8000344b0d90,0) at
> uvm_fault_lower+0x38a
> uvm_fault(fd821cbcb188,2e4182000,0,1) at uvm_fault+0x1b3
> upageflttrap(8000344b0f70,2e4182008) at upageflttrap+0x62
> usertrap(8000344b0f70) at usertrap+0x129
> recall_trap() at recall_trap+0x8
> end of kernel
> end trace frame: 0xc5d8d8, count: -8
> 
> ddb{3}> trace /t 0t173202
> sleep_finish(8000344b9380,1) at sleep_finish+0xfe
> tsleep(823bbde8,120,81f2bf4b,2) at tsleep+0xb2
> sys_nanosleep(800034349cf0,8000344b9490,8000344b94f0) at
> sys_nanosleep+0x12d
> syscall(8000344b9560) at syscall+0x384
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x27a3e8610, count: -5
> 
> ddb{3}> trace /t 0t53881
> 

Re: macppc panic: vref used where vget required

2022-11-09 Thread Martin Pieuchot
On 09/09/22(Fri) 14:41, Martin Pieuchot wrote:
> On 09/09/22(Fri) 12:25, Theo Buehler wrote:
> > > Yesterday gnezdo@ fixed a race in uvn_attach() that lead to the same
> > > assert.  Here's an rebased diff for the bug discussed in this thread,
> > > could you try again and let us know?  Thanks!
> > 
> > This seems to be stable now. It's been running for nearly 5 days.
> > Without gnezdo's fix it would blow up within at most 2 days.
> 
> Thanks!  I'm looking for oks then. 

Here's an alternative possible fix.  The previous one got reverted
because it exposes a bug on arm64 machines with Cortex-A72 CPUs.

The idea of the diff below is to flush data to physical pages that we keep
around when munmap(2) is called.  I hope that the page daemon does the right
thing and doesn't try to grab a reference to the vnode if all pages are PG_CLEAN.

Could you try that and tell me if this prevents the panic you're seeing?
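The intended invariant is that the page daemon only needs a vnode
reference to launder a dirty page, so flushing at detach keeps it away
from vref entirely.  A sketch of that invariant (flag value and names
are illustrative, not the real uvm structures):

```c
#include <assert.h>

#define PG_CLEAN	0x1	/* illustrative flag value */

struct model_page {
	int flags;
};

/* Model of the PGO_CLEANIT change: detach writes every persisting
 * page back so it is marked clean before the object is kept around. */
static void
detach_flush(struct model_page *pgs, int n)
{
	int i;

	for (i = 0; i < n; i++)
		pgs[i].flags |= PG_CLEAN;
}

/* The page daemon only has to take a vnode reference when it must
 * launder a dirty page; clean pages can simply be freed. */
static int
pagedaemon_needs_vref(const struct model_page *p)
{
	return (p->flags & PG_CLEAN) == 0;
}
```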

Index: uvm/uvm_vnode.c
===
RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
retrieving revision 1.130
diff -u -p -r1.130 uvm_vnode.c
--- uvm/uvm_vnode.c 20 Oct 2022 13:31:52 -  1.130
+++ uvm/uvm_vnode.c 9 Nov 2022 16:08:57 -
@@ -329,7 +329,7 @@ uvn_detach(struct uvm_object *uobj)
 */
if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
/* won't block */
-   uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
+   uvn_flush(uobj, 0, 0, PGO_CLEANIT|PGO_DEACTIVATE|PGO_ALLPAGES);
goto out;
}
 



Re: bse(4) media/link bug

2022-11-07 Thread Martin Pieuchot
On 07/11/22(Mon) 13:20, Martin Pieuchot wrote:
> On a raspberry pi4, with the following configuration :
> 
> $ cat /etc/hostname.bse0 
> dhcp
> 
> ...and with the cable directly connected to my laptop (amd64 w/ em(4)) I
> have to force the media type, with the command below, to make it work.
> 
> # ifconfig bse0 media 1000baseT mediaopt full-duplex

Actually it is worse than that.  It's completely broken and I can't use
it.



bse(4) media/link bug

2022-11-07 Thread Martin Pieuchot
On a raspberry pi4, with the following configuration :

$ cat /etc/hostname.bse0 
dhcp

...and with the cable directly connected to my laptop (amd64 w/ em(4)) I
have to force the media type, with the command below, to make it work.

# ifconfig bse0 media 1000baseT mediaopt full-duplex



arm64 (rockpro64) regression

2022-09-18 Thread Martin Pieuchot
The rockpro64 no longer boots in multi-user on -current.  It hangs after
displaying the following lines:

rkiis0 at mainbus0
rkiis1 at mainbus0

The 8/09 snapshot works, the next one from 11/09 doesn't.

bsd.rd still boots.

Dmesg below.

OpenBSD 7.2-beta (GENERIC.MP) #1815: Thu Sep  8 13:20:08 MDT 2022
dera...@arm64.openbsd.org:/usr/src/sys/arch/arm64/compile/GENERIC.MP
real mem  = 4038770688 (3851MB)
avail mem = 3837693952 (3659MB)
random: good seed from bootblocks
mainbus0 at root: Pine64 RockPro64 v2.1
psci0 at mainbus0: PSCI 1.1, SMCCC 1.2, SYSTEM_SUSPEND
cpu0 at mainbus0 mpidr 0: ARM Cortex-A53 r0p4
cpu0: 32KB 64b/line 2-way L1 VIPT I-cache, 32KB 64b/line 4-way L1 D-cache
cpu0: 512KB 64b/line 16-way L2 cache
cpu0: CRC32,SHA2,SHA1,AES+PMULL,ASID16
cpu1 at mainbus0 mpidr 1: ARM Cortex-A53 r0p4
cpu1: 32KB 64b/line 2-way L1 VIPT I-cache, 32KB 64b/line 4-way L1 D-cache
cpu1: 512KB 64b/line 16-way L2 cache
cpu1: CRC32,SHA2,SHA1,AES+PMULL,ASID16
cpu2 at mainbus0 mpidr 2: ARM Cortex-A53 r0p4
cpu2: 32KB 64b/line 2-way L1 VIPT I-cache, 32KB 64b/line 4-way L1 D-cache
cpu2: 512KB 64b/line 16-way L2 cache
cpu2: CRC32,SHA2,SHA1,AES+PMULL,ASID16
cpu3 at mainbus0 mpidr 3: ARM Cortex-A53 r0p4
cpu3: 32KB 64b/line 2-way L1 VIPT I-cache, 32KB 64b/line 4-way L1 D-cache
cpu3: 512KB 64b/line 16-way L2 cache
cpu3: CRC32,SHA2,SHA1,AES+PMULL,ASID16
cpu4 at mainbus0 mpidr 100: ARM Cortex-A72 r0p2
cpu4: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache
cpu4: 1024KB 64b/line 16-way L2 cache
cpu4: CRC32,SHA2,SHA1,AES+PMULL,ASID16
cpu5 at mainbus0 mpidr 101: ARM Cortex-A72 r0p2
cpu5: 48KB 64b/line 3-way L1 PIPT I-cache, 32KB 64b/line 2-way L1 D-cache
cpu5: 1024KB 64b/line 16-way L2 cache
cpu5: CRC32,SHA2,SHA1,AES+PMULL,ASID16
efi0 at mainbus0: UEFI 2.8
efi0: Das U-Boot rev 0x20211000
apm0 at mainbus0
agintc0 at mainbus0 sec shift 3:3 nirq 288 nredist 6 ipi: 0, 1, 2: 
"interrupt-controller"
agintcmsi0 at agintc0
syscon0 at mainbus0: "qos"
syscon1 at mainbus0: "qos"
syscon2 at mainbus0: "qos"
syscon3 at mainbus0: "qos"
syscon4 at mainbus0: "qos"
syscon5 at mainbus0: "qos"
syscon6 at mainbus0: "qos"
syscon7 at mainbus0: "qos"
syscon8 at mainbus0: "qos"
syscon9 at mainbus0: "qos"
syscon10 at mainbus0: "qos"
syscon11 at mainbus0: "qos"
syscon12 at mainbus0: "qos"
syscon13 at mainbus0: "qos"
syscon14 at mainbus0: "qos"
syscon15 at mainbus0: "qos"
syscon16 at mainbus0: "qos"
syscon17 at mainbus0: "qos"
syscon18 at mainbus0: "qos"
syscon19 at mainbus0: "qos"
syscon20 at mainbus0: "qos"
syscon21 at mainbus0: "qos"
syscon22 at mainbus0: "qos"
syscon23 at mainbus0: "qos"
syscon24 at mainbus0: "qos"
syscon25 at mainbus0: "power-management"
"power-controller" at syscon25 not configured
syscon26 at mainbus0: "syscon"
"io-domains" at syscon26 not configured
rkclock0 at mainbus0
rkclock1 at mainbus0
syscon27 at mainbus0: "syscon"
"io-domains" at syscon27 not configured
"usb2phy" at syscon27 not configured
"usb2phy" at syscon27 not configured
rkemmcphy0 at syscon27
"pcie-phy" at syscon27 not configured
rktcphy0 at mainbus0
rktcphy1 at mainbus0
rkpinctrl0 at mainbus0: "pinctrl"
rkgpio0 at rkpinctrl0
rkgpio1 at rkpinctrl0
rkgpio2 at rkpinctrl0
rkgpio3 at rkpinctrl0
rkgpio4 at rkpinctrl0
pwmreg0 at mainbus0
syscon28 at mainbus0: "syscon"
syscon29 at mainbus0: "syscon"
"fit-images" at mainbus0 not configured
rkdrm0 at mainbus0
drm0 at rkdrm0
"pmu_a53" at mainbus0 not configured
"pmu_a72" at mainbus0 not configured
agtimer0 at mainbus0: 24000 kHz
"xin24m" at mainbus0 not configured
rkpcie0 at mainbus0
rkpcie0: link training timeout
dwge0 at mainbus0: rev 0x35, address 7a:4c:3b:2e:91:f1
rgephy0 at dwge0 phy 0: RTL8169S/8110S/8211 PHY, rev. 6
dwmmc0 at mainbus0: 50 MHz base clock
sdmmc0 at dwmmc0: 4-bit, sd high-speed, dma
dwmmc1 at mainbus0: 50 MHz base clock
sdmmc1 at dwmmc1: 4-bit, sd high-speed, dma
sdhc0 at mainbus0
sdhc0: SDHC 3.0, 200 MHz base clock
sdmmc2 at sdhc0: 8-bit, sd high-speed, mmc high-speed, dma
ehci0 at mainbus0
usb0 at ehci0: USB revision 2.0
uhub0 at usb0 configuration 1 interface 0 "Generic EHCI root hub" rev 2.00/1.00 
addr 1
ohci0 at mainbus0: version 1.0
ehci1 at mainbus0
usb1 at ehci1: USB revision 2.0
uhub1 at usb1 configuration 1 interface 0 "Generic EHCI root hub" rev 2.00/1.00 
addr 1
ohci1 at mainbus0: version 1.0
rkdwusb0 at mainbus0: "usb"
xhci0 at rkdwusb0, xHCI 1.10
usb2 at xhci0: USB revision 3.0
uhub2 at usb2 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 
addr 1
rkdwusb1 at mainbus0: "usb"
xhci1 at rkdwusb1, xHCI 1.10
usb3 at xhci1: USB revision 3.0
uhub3 at usb3 configuration 1 interface 0 "Generic xHCI root hub" rev 3.00/1.00 
addr 1
"saradc" at mainbus0 not configured
rkiic0 at mainbus0
iic0 at rkiic0
escodec0 at iic0 addr 0x11
rkiic1 at mainbus0
iic1 at rkiic1
com0 at mainbus0: dw16550, 64 byte fifo
com1 at mainbus0: dw16550, 64 byte fifo
com1: console
"spi" at mainbus0 not configured
rktemp0 at mainbus0
rkiic2 at mainbus0
iic2 at rkiic2
rkpmic0 at iic2 addr 0x1b: RK808

Swap on sdhc(4) and dwmmc(4) is broken

2022-09-10 Thread Martin Pieuchot
On the rockpro64 as well as on the rpi4, biowait() returns an error
(B_ERROR) if too much swapping occurs; in both cases it seems to come
from sdmmc_complete_xs().  I see the following:

sdmmc_complete_xs: write error = 35
sdmmc_complete_xs: read error = 35
c++: B_ERROR after biowait()
c++: error 4 from uvmfault_anonget()
udata_abort: error 13
sdmmc_complete_xs: read error = 35
c++: B_ERROR after biowait()
c++: error 4 from uvmfault_anonget()
udata_abort: error 13

And:

sdmmc_complete_xs: read error = 60
c++: B_ERROR after biowait()
c++: error 4 from uvmfault_anonget()
udata_abort: error 13

kettenis@ suggested the kernel thread in the sdmmc stack cannot run
because the page daemon is holding the KERNEL_LOCK(), which makes the
transfers time out...



Re: macppc panic: vref used where vget required

2022-09-09 Thread Martin Pieuchot
On 09/09/22(Fri) 12:25, Theo Buehler wrote:
> > Yesterday gnezdo@ fixed a race in uvn_attach() that lead to the same
> > assert.  Here's an rebased diff for the bug discussed in this thread,
> > could you try again and let us know?  Thanks!
> 
> This seems to be stable now. It's been running for nearly 5 days.
> Without gnezdo's fix it would blow up within at most 2 days.

Thanks!  I'm looking for oks then. 

Index: uvm/uvm_vnode.c
===
RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
retrieving revision 1.127
diff -u -p -r1.127 uvm_vnode.c
--- uvm/uvm_vnode.c 31 Aug 2022 09:07:35 -  1.127
+++ uvm/uvm_vnode.c 1 Sep 2022 12:54:27 -
@@ -163,11 +163,8 @@ uvn_attach(struct vnode *vp, vm_prot_t a
 */
rw_enter(uvn->u_obj.vmobjlock, RW_WRITE);
if (uvn->u_flags & UVM_VNODE_VALID) {   /* already active? */
+   KASSERT(uvn->u_obj.uo_refs > 0);
 
-   /* regain vref if we were persisting */
-   if (uvn->u_obj.uo_refs == 0) {
-   vref(vp);
-   }
uvn->u_obj.uo_refs++;   /* bump uvn ref! */
rw_exit(uvn->u_obj.vmobjlock);
 
@@ -235,7 +232,7 @@ uvn_attach(struct vnode *vp, vm_prot_t a
KASSERT(uvn->u_obj.uo_refs == 0);
uvn->u_obj.uo_refs++;
oldflags = uvn->u_flags;
-   uvn->u_flags = UVM_VNODE_VALID|UVM_VNODE_CANPERSIST;
+   uvn->u_flags = UVM_VNODE_VALID;
uvn->u_nio = 0;
uvn->u_size = used_vnode_size;
 
@@ -248,7 +245,7 @@ uvn_attach(struct vnode *vp, vm_prot_t a
/*
 * add a reference to the vnode.   this reference will stay as long
 * as there is a valid mapping of the vnode.   dropped when the
-* reference count goes to zero [and we either free or persist].
+* reference count goes to zero.
 */
vref(vp);
if (oldflags & UVM_VNODE_WANTED)
@@ -321,16 +318,6 @@ uvn_detach(struct uvm_object *uobj)
 */
vp->v_flag &= ~VTEXT;
 
-   /*
-* we just dropped the last reference to the uvn.   see if we can
-* let it "stick around".
-*/
-   if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
-   /* won't block */
-   uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
-   goto out;
-   }
-
/* its a goner! */
uvn->u_flags |= UVM_VNODE_DYING;
 
@@ -380,7 +367,6 @@ uvn_detach(struct uvm_object *uobj)
/* wake up any sleepers */
if (oldflags & UVM_VNODE_WANTED)
wakeup(uvn);
-out:
rw_exit(uobj->vmobjlock);
 
/* drop our reference to the vnode. */
@@ -496,8 +482,8 @@ uvm_vnp_terminate(struct vnode *vp)
}
 
/*
-* done.   now we free the uvn if its reference count is zero
-* (true if we are zapping a persisting uvn).   however, if we are
+* done.   now we free the uvn if its reference count is zero.
+* however, if we are
 * terminating a uvn with active mappings we let it live ... future
 * calls down to the vnode layer will fail.
 */
@@ -505,14 +491,14 @@ uvm_vnp_terminate(struct vnode *vp)
if (uvn->u_obj.uo_refs) {
/*
 * uvn must live on it is dead-vnode state until all references
-* are gone.   restore flags.clear CANPERSIST state.
+* are gone.   restore flags.
 */
uvn->u_flags &= ~(UVM_VNODE_DYING|UVM_VNODE_VNISLOCKED|
- UVM_VNODE_WANTED|UVM_VNODE_CANPERSIST);
+ UVM_VNODE_WANTED);
} else {
/*
 * free the uvn now.   note that the vref reference is already
-* gone [it is dropped when we enter the persist state].
+* gone.
 */
if (uvn->u_flags & UVM_VNODE_IOSYNCWANTED)
panic("uvm_vnp_terminate: io sync wanted bit set");
@@ -1349,46 +1335,14 @@ uvm_vnp_uncache(struct vnode *vp)
}
 
/*
-* we have a valid, non-blocked uvn.   clear persist flag.
+* we have a valid, non-blocked uvn.
 * if uvn is currently active we can return now.
 */
-   uvn->u_flags &= ~UVM_VNODE_CANPERSIST;
if (uvn->u_obj.uo_refs) {
rw_exit(uobj->vmobjlock);
return FALSE;
}
 
-   /*
-* uvn is currently persisting!   we have to gain a reference to
-* it so that we can call uvn_detach to kill the uvn.
-*/
-   vref(vp);   /* seems ok, even with VOP_LOCK */
-   uvn->u_obj.uo_refs++;   /* value is now 1 */
-   rw_exit(uobj->vmobjlock);
-
-#ifdef VFSLCKDEBUG
-   /*
-* carry over sanity check from old vnode pager: the vnode should
-* be VOP_LOCK'd, and we confirm it here.
-*/
-   if ((vp->v_flag & 

Re: macppc panic: vref used where vget required

2022-09-01 Thread Martin Pieuchot
On 29/07/22(Fri) 14:22, Theo Buehler wrote:
> On Mon, Jul 11, 2022 at 01:05:19PM +0200, Martin Pieuchot wrote:
> > On 11/07/22(Mon) 07:50, Theo Buehler wrote:
> > > On Fri, Jun 03, 2022 at 03:02:36PM +0200, Theo Buehler wrote:
> > > > > Please do note that this change can introduce/expose other issues.
> > > > 
> > > > It seems that this diff causes occasional hangs when building snapshots
> > > > on my mac M1 mini. This happened twice in 10 builds, both times in
> > > > xenocara. Unfortunately, both times the machine became entirely
> > > > unresponsive and as I don't have serial console, that's all the info I
> > > > have...
> > > > 
> > > > This machine has been very reliable and built >50 snaps without any hang
> > > > over the last 2.5 months. I'm now trying snap builds in a loop without
> > > > the diff to make sure the machine doesn't hang due to another recent
> > > > kernel change.
> > > > 
> > > 
> > > A little bit of info on this. The first three lines were a bit garbled on
> > > the screen:
> > > 
> > > panic: kernel diagnostic assertion "uvn->_oppa jai c:  ke r  
> > > el   d iag no   tic a  s   rt n "   map   ==UL L  | | rw wr   
> > >   k
> > > ite held(amap->amap_lock)" failed: file "/ss/uvm/uvm_fault.c", line 846.
> > > ernel diagnostic assertion "!_kernel_lock_held
> > > Stopped at panic+0160:  cmp w21, #0x0  ailed: file "/sys/kern/
> > > TID     PID  UID  PRFLAGS  PFLAGS   CPU  COMMAND
> > >  411910  44540  210x11  0 3  make
> > > *436444  84241  210x13  0 6  sh
> > >  227952  53498  210x13  0 5  sh
> > >  258925  15765  210x101005  0 0  make
> > >  128459   9649  210x13  0 1  tradcpp
> > >  287213  64216  210x130x8 7  make
> > >  173587   461710000x10  0 2  tmux
> > >  126511  69919   0 0x14000  0x200 4  softnet
> > > db_enter() at panic+0x15c
> > > panic() at __assert+0x24
> > > uvm_fault() at uvm_fault_upper_lookup+0x258
> > > uvm_fault_upper() at uvm_fault+0xec
> > > uvm_fault() at udata_abort+0x128
> > > udata_abort() at do_el0_sync+0xdc
> > > do_el0_sync() at handle_el0_sync+0x74
> > > https://www.openbsd.org/ddb.html describes the minimum info required in 
> > > bug
> > > reports.  Insufficient info makes it difficult to find and fix bugs.
> > > ddb{6}> show panic
> > > *cpu0: kernel diagnostic assertion "uvn->u_obj.uo_refs == 0" failed: file 
> > >  "/sys/kern/uvn_vnode.c", line 231.
> > >  cpu6: kernel diagnostic assertion "amap == NULL || 
> > > rw_write_held(amap->am_lock)" failed: file  "/sys/uvm/uvm_fault", line 
> > > 846.
> > >  cpu3: kernel diagnostic assertion "!_kernel_lock_held()" failed: file 
> > > "/sys/kern/kern_fork.c", line 678
> > > ddb{6}> mach ddbcpu 0
> > > 
> > > After pressing enter here, the machine locked up completely.
> > 
> > It's hard for me to tell what's going on.  I believe the interesting
> > trace is the one from cpu0 that we don't have.  Can you easily reproduce
> > this?  I'm trying on amd64 without luck.  I'd glad if you could gather
> > more infos.
> 
> Sorry for the delay. I was only at home intermittently. I hit this three
> times:
> 
> panic: kernel diagnostic assertion "uvn->u_obj.uo_refs == 0" failed: file 
> "/sys/uvm/uvm_vnode.c", line 231

Yesterday gnezdo@ fixed a race in uvn_attach() that led to the same
assert.  Here's a rebased diff for the bug discussed in this thread;
could you try again and let us know?  Thanks!

Index: uvm/uvm_vnode.c
===
RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
retrieving revision 1.127
diff -u -p -r1.127 uvm_vnode.c
--- uvm/uvm_vnode.c 31 Aug 2022 09:07:35 -  1.127
+++ uvm/uvm_vnode.c 1 Sep 2022 12:54:27 -
@@ -163,11 +163,8 @@ uvn_attach(struct vnode *vp, vm_prot_t a
 */
rw_enter(uvn->u_obj.vmobjlock, RW_WRITE);
if (uvn->u_flags & UVM_VNODE_VALID) {   /* already active? */
+   KASSERT(uvn->u_obj.uo_refs > 0);
 
-   /* regain vref if we were persisting */
-   if (uvn->u_obj.uo_refs == 0) {
-   v

Re: macppc panic: vref used where vget required

2022-07-11 Thread Martin Pieuchot
On 11/07/22(Mon) 07:50, Theo Buehler wrote:
> On Fri, Jun 03, 2022 at 03:02:36PM +0200, Theo Buehler wrote:
> > > Please do note that this change can introduce/expose other issues.
> > 
> > It seems that this diff causes occasional hangs when building snapshots
> > on my mac M1 mini. This happened twice in 10 builds, both times in
> > xenocara. Unfortunately, both times the machine became entirely
> > unresponsive and as I don't have serial console, that's all the info I
> > have...
> > 
> > This machine has been very reliable and built >50 snaps without any hang
> > over the last 2.5 months. I'm now trying snap builds in a loop without
> > the diff to make sure the machine doesn't hang due to another recent
> > kernel change.
> > 
> 
> A little bit of info on this. The first three lines were a bit garbled on
> the screen:
> 
> panic: kernel diagnostic assertion "uvn->_oppa jai c:  ke r  el   
> d iag no   tic a  s   rt n "   map   ==UL L  | | rw wr
>  k
> ite held(amap->amap_lock)" failed: file "/ss/uvm/uvm_fault.c", line 846.
> ernel diagnostic assertion "!_kernel_lock_held
> Stopped at panic+0160:  cmp w21, #0x0  ailed: file "/sys/kern/
> TID     PID  UID  PRFLAGS  PFLAGS   CPU  COMMAND
>  411910  44540  210x11  0 3  make
> *436444  84241  210x13  0 6  sh
>  227952  53498  210x13  0 5  sh
>  258925  15765  210x101005  0 0  make
>  128459   9649  210x13  0 1  tradcpp
>  287213  64216  210x130x8 7  make
>  173587   461710000x10  0 2  tmux
>  126511  69919   0 0x14000  0x200 4  softnet
> db_enter() at panic+0x15c
> panic() at __assert+0x24
> uvm_fault() at uvm_fault_upper_lookup+0x258
> uvm_fault_upper() at uvm_fault+0xec
> uvm_fault() at udata_abort+0x128
> udata_abort() at do_el0_sync+0xdc
> do_el0_sync() at handle_el0_sync+0x74
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{6}> show panic
> *cpu0: kernel diagnostic assertion "uvn->u_obj.uo_refs == 0" failed: file  
> "/sys/kern/uvn_vnode.c", line 231.
>  cpu6: kernel diagnostic assertion "amap == NULL || 
> rw_write_held(amap->am_lock)" failed: file  "/sys/uvm/uvm_fault", line 846.
>  cpu3: kernel diagnostic assertion "!_kernel_lock_held()" failed: file 
> "/sys/kern/kern_fork.c", line 678
> ddb{6}> mach ddbcpu 0
> 
> After pressing enter here, the machine locked up completely.

It's hard for me to tell what's going on.  I believe the interesting
trace is the one from cpu0 that we don't have.  Can you easily reproduce
this?  I'm trying on amd64 without luck.  I'd be glad if you could
gather more info.



Re: System frequently hangs, found commit that probably causes it

2022-07-06 Thread Martin Pieuchot
On 01/07/22(Fri) 07:13, Sebastien Marie wrote:
> On Mon, Jun 27, 2022 at 06:29:55PM +0200, Martin Pieuchot wrote:
> > On 27/06/22(Mon) 18:04, Caspar Schutijser wrote:
> > > On Sun, Jun 26, 2022 at 10:03:59PM +0200, Martin Pieuchot wrote:
> > > > On 26/06/22(Sun) 20:36, Caspar Schutijser wrote:
> > > > > A laptop of mine (dmesg below) frequently hangs. After some bisecting
> > > > > and extensive testing I think I found the commit that causes this:
> > > > > mpi@'s
> > > > > "Always acquire the `vmobjlock' before incrementing an object's 
> > > > > reference."
> > > > > commit from 2022-04-28.
> > > > > 
> > > > > My definition of "the system hangs": 
> > > > >  * Display is frozen
> > > > >  * Switching to ttyC0 using Ctrl+Alt+F1 doesn't do anything
> > > > >  * System does not respond to keyboard or mouse input
> > > > >  * Pressing the power button for 1-2 seconds doesn't achieve anything
> > > > > (usually this initiates a system shutdown)
> > > > >  * And also the fan starts spinning
> > > > > 
> > > > > The system sometimes hangs very soon after booting the system, I've
> > > > > seen it happen once while I was typing my username in xenodm to log 
> > > > > in.
> > > > > But sometimes it takes a couple of hours.
> > > > > 
> > > > > For some reason I put
> > > > > "@reboot while sleep 1 ; do sync ; done"
> > > > > in my crontab and it *seems* (I'm not sure) that the hangs occur more
> > > > > frequently this way. Not sure if that is useful information.
> > > > > 
> > > > > I don't see similar problems on my other machines.
> > > > > 
> > > > > It looks like when the system hangs, it's stuck spinning in the new
> > > > > code that was added in that commit; to confirm that I added some code
> > > > > (see the diff below) to enter ddb if it's spinning there for 10 
> > > > > seconds
> > > > > (and then it indeed enters ddb). If my thinking and diff make sense
> > > > > I think that indeed confirms that is the problem.
> > > > > 
> > > > > Any tips for debugging this?
> > > > 
> > > > I believe I introduced a deadlock.  If you can reproduce it could you
> > > > get us the output of `ps' in ddb(4) and the trace of all the active
> > > > processes.
> > > > 
> > > > I guess one is waiting for the KERNEL_LOCK() while holding the uobj's
> > > > vmobjlock.
> > > 
> > > "ps" output (pictures only):
> > > https://temp.schutijser.com/~caspar/2022-06-27-ddb/ps-1.jpg
> > > https://temp.schutijser.com/~caspar/2022-06-27-ddb/ps-2.jpg
> > > https://temp.schutijser.com/~caspar/2022-06-27-ddb/ps-3.jpg
> > > https://temp.schutijser.com/~caspar/2022-06-27-ddb/ps-4.jpg
> > > 
> > > 
> > > traces of active processes (I hope; if this is not correct I'm happy
> > > to run different commands; pictures and transcription follow):
> > > https://temp.schutijser.com/~caspar/2022-06-27-ddb/trace-1.jpg
> > > 
> > > ddb{1}> ps /o
> > > TID     PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
> > > *246699  86564   10000x2 01K sync
> > >  395058  12288 48   0x100012 00  unwind
> > > ddb{1}> trace /t 0t246699
> > > kernel: protection fault trap, code=0
> > > Faulted in DDB; continuing...
> > > ddb{1}> trace /t 0t395058
> > > uvm_fault(0xfd8448ab5338, 0x1, 0, 1) -> e
> > > kernel: page fault trap, code=0
> > > Faulted in DDB; continuing...
> > > ddb{1}>
> > 
> > Is it a hang or a panic/fault?  Here's a possible fix.  The idea is to
> > make the list private to the sync function such that we could sleep on
> > the lock without lock ordering reversal.
> > 
> > That means multiple sync could be started in parallel, this should be
> > fine as the objects are refcounted and only the first flush result in
> > I/O.
> 
> I think there is one drawback: you can't sleep while iterating on 
> LIST_FOREACH(uvn_wlist): if the list is modified while sleeping, on wakeup 
> pointers might be changed.
> 
> see below.
> 
> > Index: uvm/uvm_vnode.c
> > ===
> > RCS file: /cvs

Re: System frequently hangs, found commit that probably causes it

2022-06-27 Thread Martin Pieuchot
On 27/06/22(Mon) 18:04, Caspar Schutijser wrote:
> On Sun, Jun 26, 2022 at 10:03:59PM +0200, Martin Pieuchot wrote:
> > On 26/06/22(Sun) 20:36, Caspar Schutijser wrote:
> > > A laptop of mine (dmesg below) frequently hangs. After some bisecting
> > > and extensive testing I think I found the commit that causes this:
> > > mpi@'s
> > > "Always acquire the `vmobjlock' before incrementing an object's 
> > > reference."
> > > commit from 2022-04-28.
> > > 
> > > My definition of "the system hangs": 
> > >  * Display is frozen
> > >  * Switching to ttyC0 using Ctrl+Alt+F1 doesn't do anything
> > >  * System does not respond to keyboard or mouse input
> > >  * Pressing the power button for 1-2 seconds doesn't achieve anything
> > > (usually this initiates a system shutdown)
> > >  * And also the fan starts spinning
> > > 
> > > The system sometimes hangs very soon after booting the system, I've
> > > seen it happen once while I was typing my username in xenodm to log in.
> > > But sometimes it takes a couple of hours.
> > > 
> > > For some reason I put
> > > "@reboot while sleep 1 ; do sync ; done"
> > > in my crontab and it *seems* (I'm not sure) that the hangs occur more
> > > frequently this way. Not sure if that is useful information.
> > > 
> > > I don't see similar problems on my other machines.
> > > 
> > > It looks like when the system hangs, it's stuck spinning in the new
> > > code that was added in that commit; to confirm that I added some code
> > > (see the diff below) to enter ddb if it's spinning there for 10 seconds
> > > (and then it indeed enters ddb). If my thinking and diff make sense
> > > I think that indeed confirms that is the problem.
> > > 
> > > Any tips for debugging this?
> > 
> > I believe I introduced a deadlock.  If you can reproduce it could you
> > get us the output of `ps' in ddb(4) and the trace of all the active
> > processes.
> > 
> > I guess one is waiting for the KERNEL_LOCK() while holding the uobj's
> > vmobjlock.
> 
> "ps" output (pictures only):
> https://temp.schutijser.com/~caspar/2022-06-27-ddb/ps-1.jpg
> https://temp.schutijser.com/~caspar/2022-06-27-ddb/ps-2.jpg
> https://temp.schutijser.com/~caspar/2022-06-27-ddb/ps-3.jpg
> https://temp.schutijser.com/~caspar/2022-06-27-ddb/ps-4.jpg
> 
> 
> traces of active processes (I hope; if this is not correct I'm happy
> to run different commands; pictures and transcription follow):
> https://temp.schutijser.com/~caspar/2022-06-27-ddb/trace-1.jpg
> 
> ddb{1}> ps /o
> TID     PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
> *246699  86564   10000x2 01K sync
>  395058  12288 48   0x100012 00  unwind
> ddb{1}> trace /t 0t246699
> kernel: protection fault trap, code=0
> Faulted in DDB; continuing...
> ddb{1}> trace /t 0t395058
> uvm_fault(0xfd8448ab5338, 0x1, 0, 1) -> e
> kernel: page fault trap, code=0
> Faulted in DDB; continuing...
> ddb{1}>

Is it a hang or a panic/fault?  Here's a possible fix.  The idea is to
make the list private to the sync function such that we could sleep on
the lock without lock ordering reversal.

That means multiple syncs could be started in parallel; this should be
fine as the objects are refcounted and only the first flush results in
I/O.

Index: uvm/uvm_vnode.c
===
RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
retrieving revision 1.124
diff -u -p -r1.124 uvm_vnode.c
--- uvm/uvm_vnode.c 3 May 2022 21:20:35 -   1.124
+++ uvm/uvm_vnode.c 27 Jun 2022 16:21:28 -
@@ -69,11 +69,7 @@
  */
 
 LIST_HEAD(uvn_list_struct, uvm_vnode);
-struct uvn_list_struct uvn_wlist;  /* writeable uvns */
-
-SIMPLEQ_HEAD(uvn_sq_struct, uvm_vnode);
-struct uvn_sq_struct uvn_sync_q;   /* sync'ing uvns */
-struct rwlock uvn_sync_lock;   /* locks sync operation */
+struct uvn_list_struct uvn_wlist;  /* [K] writeable uvns */
 
 extern int rebooting;
 
@@ -116,10 +112,7 @@ const struct uvm_pagerops uvm_vnodeops =
 void
 uvn_init(void)
 {
-
	LIST_INIT(&uvn_wlist);
-   /* note: uvn_sync_q init'd in uvm_vnp_sync() */
-   rw_init_flags(&uvn_sync_lock, "uvnsync", RWL_IS_VNODE);
 }
 
 /*
@@ -1444,39 +1437,22 @@ uvm_vnp_setsize(struct vnode *vp, off_t 
  * uvm_vnp_sync: flush all dirty VM pages back to their backing vnodes.
  *
  * => called from sys_sync with no VM structures locked
- * => only one process can do a sync at a time (because the uvn
- *structure only has one queue

Re: System frequently hangs, found commit that probably causes it

2022-06-26 Thread Martin Pieuchot
On 26/06/22(Sun) 20:36, Caspar Schutijser wrote:
> A laptop of mine (dmesg below) frequently hangs. After some bisecting
> and extensive testing I think I found the commit that causes this:
> mpi@'s
> "Always acquire the `vmobjlock' before incrementing an object's reference."
> commit from 2022-04-28.
> 
> My definition of "the system hangs": 
>  * Display is frozen
>  * Switching to ttyC0 using Ctrl+Alt+F1 doesn't do anything
>  * System does not respond to keyboard or mouse input
>  * Pressing the power button for 1-2 seconds doesn't achieve anything
> (usually this initiates a system shutdown)
>  * And also the fan starts spinning
> 
> The system sometimes hangs very soon after booting the system, I've
> seen it happen once while I was typing my username in xenodm to log in.
> But sometimes it takes a couple of hours.
> 
> For some reason I put
> "@reboot while sleep 1 ; do sync ; done"
> in my crontab and it *seems* (I'm not sure) that the hangs occur more
> frequently this way. Not sure if that is useful information.
> 
> I don't see similar problems on my other machines.
> 
> It looks like when the system hangs, it's stuck spinning in the new
> code that was added in that commit; to confirm that I added some code
> (see the diff below) to enter ddb if it's spinning there for 10 seconds
> (and then it indeed enters ddb). If my thinking and diff make sense
> I think that indeed confirms that is the problem.
> 
> Any tips for debugging this?

I believe I introduced a deadlock.  If you can reproduce it, could you
get us the output of `ps' in ddb(4) and the traces of all the active
processes?

I guess one is waiting for the KERNEL_LOCK() while holding the uobj's
vmobjlock.
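For illustration only — hypothetical userland locks, not the actual kernel paths — such a hang is the classic two-lock inversion: one thread takes A then B while another takes B then A, and each ends up waiting on the lock the other holds.  The cure is a single global order, which both workers in this sketch obey:

```c
#include <pthread.h>

/* Hypothetical stand-ins: lock_a plays the role of the KERNEL_LOCK(),
 * lock_b the per-object vmobjlock. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;
static int counter;

/*
 * The deadlock arises when one thread holds B and waits for A while
 * another holds A and waits for B.  Enforcing "A before B" everywhere
 * makes that impossible, so this loop always terminates.
 */
static void *
worker(void *arg)
{
	int i;

	(void)arg;
	for (i = 0; i < 1000; i++) {
		pthread_mutex_lock(&lock_a);
		pthread_mutex_lock(&lock_b);
		counter++;
		pthread_mutex_unlock(&lock_b);
		pthread_mutex_unlock(&lock_a);
	}
	return NULL;
}
```

WITNESS exists precisely to catch paths that violate such an order at runtime.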

> Index: uvm/uvm_vnode.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
> retrieving revision 1.124
> diff -u -p -r1.124 uvm_vnode.c
> --- uvm/uvm_vnode.c   3 May 2022 21:20:35 -   1.124
> +++ uvm/uvm_vnode.c   26 Jun 2022 18:21:19 -
> @@ -1455,6 +1455,10 @@ uvm_vnp_sync(struct mount *mp)
>  {
>   struct uvm_vnode *uvn;
>   struct vnode *vp;
> + struct timespec start;
> + struct timespec limit = { 10, 0 };
> + struct timespec now, diff;
> + int i = 0;
>  
>   /*
>* step 1: ensure we are only ones using the uvn_sync_q by locking
> @@ -1472,9 +1476,18 @@ uvm_vnp_sync(struct mount *mp)
>   if (mp && vp->v_mount != mp)
>   continue;
>  
> + nanotime(&start);
>   /* Spin to ensure `uvn_wlist' isn't modified concurrently. */
>   while (rw_enter(uvn->u_obj.vmobjlock, RW_WRITE|RW_NOSLEEP)) {
>   CPU_BUSY_CYCLE();
> + if (++i % 4096 == 0) {
> + nanotime(&now);
> + timespecsub(&now, &start, &diff);
> + if (timespeccmp(&diff, &limit, >)) {
> + printf("i: %d\n", i);
> + db_enter();
> + }
> + }
>   }
>  
>   /*
> 
> 
>  2: ddb output
> 
> i: 175194112
> Stopped at    db_enter+0x10:  popq    %rbp
> ddb{1}> ddb{1}> db_enter() at db_enter+0x10
> uvm_vnp_sync(8087f400) at uvm_vnp_sync+0xc9
> sys_sync(8000fffc0a90,800033ba0b30,800033ba0b80) at sys_sync+0x85
> syscall(800033ba0bf0) at syscall+0x374
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7f78c0, count: -5
> ddb{1}> 
> 
> 
>  3: dmesg
> 
> OpenBSD 7.1-current (GENERIC.MP) #1: Sun Jun 26 19:19:30 CEST 2022
>   caspar@laptop:/sys/arch/amd64/compile/GENERIC.MP
> real mem = 17032646656 (16243MB)
> avail mem = 16499073024 (15734MB)
> random: good seed from bootblocks
> mpath0 at root
> scsibus0 at mpath0: 256 targets
> mainbus0 at root
> bios0 at mainbus0: SMBIOS rev. 3.0 @ 0xa9764000 (34 entries)
> bios0: vendor HP version "P96 Ver. 01.38" date 01/06/2021
> bios0: HP HP EliteBook 1040 G4
> acpi0 at bios0: ACPI 5.0
> acpi0: sleep states S0 S3 S4 S5
> acpi0: tables DSDT FACP SSDT RTMA UEFI SSDT TPM2 SSDT MSDM SLIC WSMT HPET 
> APIC MCFG SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT SSDT DBGP DBG2 DMAR NHLT 
> SSDT ASF! FPDT BGRT SSDT
> acpi0: wakeup devices GLAN(S4) XHC_(S3) XDCI(S4) HDAS(S4) RP01(S4) PXSX(S4) 
> TXHC(S3) RP02(S4) PXSX(S4) RP03(S4) PXSX(S4) RP04(S4) PXSX(S4) RP05(S0) 
> PXSX(S0) RP06(S4) [...]
> acpitimer0 at acpi0: 3579545 Hz, 24 bits
> acpihpet0 at acpi0: 2399 Hz
> acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
> cpu0 at mainbus0: apid 0 (boot processor)
> cpu0: Intel(R) Core(TM) i7-7500U CPU @ 2.70GHz, 2593.96 MHz, 06-8e-09
> cpu0: 
> 

Re: kern_event.c:839 assertion failed

2022-06-26 Thread Martin Pieuchot
On 20/06/22(Mon) 14:59, Visa Hankala wrote:
> On Mon, Jun 20, 2022 at 01:59:25PM +0200, Martin Pieuchot wrote:
> > On 19/06/22(Sun) 11:34, Visa Hankala wrote:
> > > On Fri, Jun 17, 2022 at 04:25:48PM +0300, Mikhail wrote:
> > > > I was debugging tog in lldb and in second tmux window opened another
> > > > bare tog instance, after a second I got this panic:
> > > > 
> > > > panic: kernel diagnostic assetion "p->p_kq->kq_refcnt.r_refs == 1"
> > > > failed file "/usr/src/sys/kern/kern_event.c", line 839
> > > > 
> > > > There were also couple of xterms and chrome launched.
> > > > 
> > > > There was an update of kern_event.c from 12 Jun - not sure if it's the
> > > > fix for this panic or not.
> > > > 
> > > > After the panic I updated to the latest snapshot, and can't reproduce it
> > > > anymore, but maybe someone will have a clue.
> > > 
> > > The 12 Jun kern_event.c commit is unrelated.
> > > 
> > > This report shows no kernel stack trace, so I don't know if the panic
> > > was caused by some unexpected thread exit path.
> > > 
> > > However, it looks that there is a problem with kqueue_task(). Even
> > > though the task holds a reference to the kqueue, the task should be
> > > cleared before the kqueue is deleted. Otherwise, the kqueue's lifetime
> > > can extend beyond that of the file descriptor table, causing
> > > a use-after-free in KQRELE(). In addition, the task clearing should
> > > avoid the unexpected reference count in kqpoll_exit().
> > 
> > Nice catch. 
> > 
> > > The lifetime bug can be lured out by adding a brief sleep between
> > > taskq_next_work() and (*work.t_func)(work.t_arg) in taskq_thread().
> > > With the sleep in place, regress/sys/kern/kqueue causes the following
> > > panic:
> > > 
> > > panic: pool_do_get: fdescpl free list modified: page 0xfd811cf0e000; 
> > > item addr 0xfd811cf0e888; offset 0x48=0xdead4113
> > > Stopped at  db_enter+0x10:  popq    %rbp
> > > TID     PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
> > > *338246  40644   10010x13  03K make
> > > db_enter() at db_enter+0x10
> > > panic(81f841ac) at panic+0xbf
> > > pool_do_get(823c2fb8,9,8000226dc904) at pool_do_get+0x35c
> > > pool_get(823c2fb8,9) at pool_get+0x96
> > > fdcopy(800013d0) at fdcopy+0x38
> > > process_new(80009500,800013d0,1) at process_new+0x107
> > > fork1(80006d30,1,81a6eab0,0,8000226dcb00,0) at 
> > > fork1+0x236
> > > syscall(8000226dcb70) at syscall+0x374
> > > Xsyscall() at Xsyscall+0x128
> > > end of kernel
> > > end trace frame: 0x7f7f28e0, count: 6
> > > 
> > > 
> > > The following patch replaces the task_del() with taskq_del_barrier()
> > > but also makes the barrier conditional to prior task usage. This avoids
> > > the barrier invocation in the typical case where there is no kqueue
> > > nesting (or poll(2)'ing or select(2)'ing of kqueues).
> > 
> > One comment about this below.
> > 
> > > The new barrier adds a lock order constraint. The locks that the thread
> > > can hold when calling kqueue_terminate() should not be taken by tasks
> > > that are run by systqmp. If this becomes a problem in the future,
> > > kqueue can have its own taskq.
> > 
> > Can this be enforced by some asserts or should we document it somewhere?
> 
> WITNESS detects locking errors with the barrier routine.
> 
> I will send a manual page patch.
> 
> > > Index: kern/kern_event.c
> > > ===
> > > RCS file: src/sys/kern/kern_event.c,v
> > > retrieving revision 1.189
> > > diff -u -p -r1.189 kern_event.c
> > > --- kern/kern_event.c 12 Jun 2022 10:34:36 -  1.189
> > > +++ kern/kern_event.c 19 Jun 2022 10:38:45 -
> > > @@ -1581,6 +1581,7 @@ void
> > >  kqueue_terminate(struct proc *p, struct kqueue *kq)
> > >  {
> > >   struct knote *kn;
> > > + int state;
> > >  
> > >   mtx_enter(&kq->kq_lock);
> > >  
> > > @@ -1593,11 +1594,17 @@ kqueue_terminate(struct proc *p, struct 
> > >   KASSERT(kn->kn_filter == EVFILT_MARKER);
> > >  
> > >   kq->kq_state |= KQ_DYING;
> > > + state = kq->kq_state;
> > >   kqueue_wakeup(kq);
> > 
> > Shouldn't we read `kq_state' after calling kqueue_wakeup()?  Or are we
> > sure this wakeup won't schedule a task?
> 
> The wakeup is there so that KQ_DYING takes effect. The task scheduling
> should not happen because kq->kq_sel.si_note should be empty.
> 
> I can move the read after the wakeup call if that makes the code
> clearer.

It's fine like that.  Thanks for explaining.  ok with me.



Re: kern_event.c:839 assertion failed

2022-06-20 Thread Martin Pieuchot
On 19/06/22(Sun) 11:34, Visa Hankala wrote:
> On Fri, Jun 17, 2022 at 04:25:48PM +0300, Mikhail wrote:
> > I was debugging tog in lldb and in second tmux window opened another
> > bare tog instance, after a second I got this panic:
> > 
> > panic: kernel diagnostic assertion "p->p_kq->kq_refcnt.r_refs == 1"
> > failed: file "/usr/src/sys/kern/kern_event.c", line 839
> > 
> > There were also couple of xterms and chrome launched.
> > 
> > There was an update of kern_event.c from 12 Jun - not sure if it's the
> > fix for this panic or not.
> > 
> > After the panic I updated to the latest snapshot, and can't reproduce it
> > anymore, but maybe someone will have a clue.
> 
> The 12 Jun kern_event.c commit is unrelated.
> 
> This report shows no kernel stack trace, so I don't know if the panic
> was caused by some unexpected thread exit path.
> 
> However, it looks like there is a problem with kqueue_task(). Even
> though the task holds a reference to the kqueue, the task should be
> cleared before the kqueue is deleted. Otherwise, the kqueue's lifetime
> can extend beyond that of the file descriptor table, causing
> a use-after-free in KQRELE(). In addition, the task clearing should
> avoid the unexpected reference count in kqpoll_exit().

Nice catch. 

> The lifetime bug can be lured out by adding a brief sleep between
> taskq_next_work() and (*work.t_func)(work.t_arg) in taskq_thread().
> With the sleep in place, regress/sys/kern/kqueue causes the following
> panic:
> 
> panic: pool_do_get: fdescpl free list modified: page 0xfd811cf0e000; item 
> addr 0xfd811cf0e888; offset 0x48=0xdead4113
> Stopped at  db_enter+0x10:  popq%rbp
>     TID    PID    UID  PRFLAGS  PFLAGS  CPU  COMMAND
> *338246  40644   1001     0x13       0   3K  make
> db_enter() at db_enter+0x10
> panic(81f841ac) at panic+0xbf
> pool_do_get(823c2fb8,9,8000226dc904) at pool_do_get+0x35c
> pool_get(823c2fb8,9) at pool_get+0x96
> fdcopy(800013d0) at fdcopy+0x38
> process_new(80009500,800013d0,1) at process_new+0x107
> fork1(80006d30,1,81a6eab0,0,8000226dcb00,0) at fork1+0x236
> syscall(8000226dcb70) at syscall+0x374
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7f28e0, count: 6
> 
> 
> The following patch replaces the task_del() with taskq_del_barrier()
> but also makes the barrier conditional to prior task usage. This avoids
> the barrier invocation in the typical case where there is no kqueue
> nesting (or poll(2)'ing or select(2)'ing of kqueues).

One comment about this below.

> The new barrier adds a lock order constraint. The locks that the thread
> can hold when calling kqueue_terminate() should not be taken by tasks
> that are run by systqmp. If this becomes a problem in the future,
> kqueue can have its own taskq.

Can this be enforced by some asserts or should we document it somewhere?

> Index: kern/kern_event.c
> ===================================================================
> RCS file: src/sys/kern/kern_event.c,v
> retrieving revision 1.189
> diff -u -p -r1.189 kern_event.c
> --- kern/kern_event.c 12 Jun 2022 10:34:36 -0000  1.189
> +++ kern/kern_event.c 19 Jun 2022 10:38:45 -0000
> @@ -1581,6 +1581,7 @@ void
>  kqueue_terminate(struct proc *p, struct kqueue *kq)
>  {
>   struct knote *kn;
> + int state;
>  
>   mtx_enter(&kq->kq_lock);
>  
> @@ -1593,11 +1594,17 @@ kqueue_terminate(struct proc *p, struct 
>   KASSERT(kn->kn_filter == EVFILT_MARKER);
>  
>   kq->kq_state |= KQ_DYING;
> + state = kq->kq_state;
>   kqueue_wakeup(kq);

Shouldn't we read `kq_state' after calling kqueue_wakeup()?  Or are we
sure this wakeup won't schedule a task?

>   mtx_leave(&kq->kq_lock);
>  
> + /*
> +  * Any knotes that were attached to this kqueue were deleted
> +  * by knote_fdclose() when this kqueue's file descriptor was closed.
> +  */
>   KASSERT(klist_empty(&kq->kq_sel.si_note));
> - task_del(systqmp, &kq->kq_task);
> + if (state & KQ_TASK)
> + taskq_del_barrier(systqmp, &kq->kq_task);
>  }
>  
>  int
> @@ -1623,7 +1630,6 @@ kqueue_task(void *arg)
>   mtx_enter(&kqueue_klist_lock);
>   KNOTE(&kq->kq_sel.si_note, 0);
>   mtx_leave(&kqueue_klist_lock);
> - KQRELE(kq);
>  }
>  
>  void
> @@ -1637,9 +1643,8 @@ kqueue_wakeup(struct kqueue *kq)
>   }
>   if (!klist_empty(&kq->kq_sel.si_note)) {
>   /* Defer activation to avoid recursion. */
> - KQREF(kq);
> - if (!task_add(systqmp, &kq->kq_task))
> - KQRELE(kq);
> + kq->kq_state |= KQ_TASK;
> + task_add(systqmp, &kq->kq_task);
>   }
>  }
>  
> Index: sys/eventvar.h
> ===================================================================
> RCS file: src/sys/sys/eventvar.h,v
> retrieving revision 1.14
> diff -u -p -r1.14 eventvar.h
> --- sys/eventvar.h  16 Mar 2022 14:38:43 -0000  1.14
> +++ sys/eventvar.h  19 Jun 2022 

Re: macppc panic: vref used where vget required

2022-06-03 Thread Martin Pieuchot
On 02/06/22(Thu) 13:54, Sebastien Marie wrote:
> On Tue, May 24, 2022 at 02:16:44PM +0200, Martin Pieuchot wrote:
> > On 19/05/22(Thu) 13:33, Alexander Bluhm wrote:
> > > On Tue, May 17, 2022 at 05:43:02PM +0200, Martin Pieuchot wrote:
> > > > Andrew, Alexander, could you test this and report back?
> > > 
> > > Panic "vref used where vget required" is still there.  As usual it
> > > needs a day to reproduce.  This time I was running without the vref
> > > history diff.
> > 
> > Thanks for testing.  Apparently calling uvm_vnp_terminate() in
> > getnewvnode() isn't good enough.  I suppose that in your case below the
> > pdaemon is trying to flush the pages before the vnode has been recycled,
> > so before uvm_vnp_terminate() has been called.
> > 
> > So either we prevent the vnode from being put on the free list or we get
> > rid of the persisting mechanism.  I don't fully understand what could be
> > the impact of always flushing the pages and why this hasn't been done in
> > the first place.  It seems the CANPERSIST logic has been "inherited" from
> > 4.4BSD's original vm.
> > 
> > On the other hand I feel even less comfortable with preventing vnodes
> > from being recycled until the pdaemon has freed related pages.
> > 
> > Diff below has been lightly tested; it gets rid of the extra CANPERSIST
> > reference.  That means pages associated with a vnode will now always be
> > flushed when uvn_detach() is called.  This turns uvm_vnp_uncache() into
> > a noop and opens the possibility of further simplifications.
> > 
> > I'd be happy to hear if this has any impact on the bug we're chasing.
> > Please do note that this change can introduce/expose other issues.
> 
> I don't know if you are looking for ok or not regarding this diff.

I'd appreciate it if it could be stress tested before that.

> For me, it makes sense: it simplifies uvm_vnode code, and seems to fix the
> vref problem.
> 
> Just some notes in case you want to commit it:
> 
> - UVM_VNODE_CANPERSIST define should be removed from uvm/uvm_vnode.h too

Indeed.

> - uvm_vnp_uncache(9) man page (src/share/man/man9/uvn_attach.9) should be 
>   amended: it mentions "uvm_vnp_uncache() function disables vnode vp from 
>   persisting"
> 
> - I wonder if UVM_VNODE_VALID could be removed too (it could be a separate
>   commit): once UVM_VNODE_CANPERSIST disappears, the UVM_VNODE_VALID flag
>   should be equivalent to having uo_refs > 0, if I understood the code properly
> 
> - more simplification might be possible inside uvm_vnp_terminate() or 
>   uvm_vnp_uncache(): they both have code paths for uo_refs == 0 and > 0 (and 
>   without UVM_VNODE_CANPERSIST, a uvn should always have refs or be "free").
>   uvm_vnp_uncache() might be deletable.

I agree we should then simplify/cleanup all this logic.  I'd like first
to be sure there is no regression.



Re: macppc panic: vref used where vget required

2022-06-03 Thread Martin Pieuchot
On 02/06/22(Thu) 07:29, Theo de Raadt wrote:
> So this basically converts the flag into a proper reference?

It completely gets rid of the extra reference.  UVM objects related to a
vnode are no longer kept alive after uvn_detach() has been called.

> If you go back to 4.4BSD, there's another aspect which was different:
> I believe vnodes weren't allocated dynamically, but came out of a fixed pool,
> and therefore the recycling behaviour was different.  Or maybe some
> kernel code had a subtle use-after-free mistake?

Indeed the PERSIST flag has been inherited/copied from 4.4BSD vm where
objects were kept in a global cache data structure.  It isn't clear to
me why this logic has been kept in UVM.



Re: macppc panic: vref used where vget required

2022-05-31 Thread Martin Pieuchot
On 24/05/22(Tue) 14:16, Martin Pieuchot wrote:
> On 19/05/22(Thu) 13:33, Alexander Bluhm wrote:
> > On Tue, May 17, 2022 at 05:43:02PM +0200, Martin Pieuchot wrote:
> > > Andrew, Alexander, could you test this and report back?
> > 
> > Panic "vref used where vget required" is still there.  As usual it
> > needs a day to reproduce.  This time I was running without the vref
> > history diff.
> 
> Thanks for testing.  Apparently calling uvm_vnp_terminate() in
> getnewvnode() isn't good enough.  I suppose that in your case below the
> pdaemon is trying to flush the pages before the vnode has been recycled,
> so before uvm_vnp_terminate() has been called.
> 
> So either we prevent the vnode from being put on the free list or we get
> rid of the persisting mechanism.  I don't fully understand what could be
> the impact of always flushing the pages and why this hasn't been done in
> the first place.  It seems the CANPERSIST logic has been "inherited" from
> 4.4BSD's original vm.
> 
> On the other hand I feel even less comfortable with preventing vnodes
> from being recycled until the pdaemon has freed related pages.
> 
> Diff below has been lightly tested; it gets rid of the extra CANPERSIST
> reference.  That means pages associated with a vnode will now always be
> flushed when uvn_detach() is called.  This turns uvm_vnp_uncache() into
> a noop and opens the possibility of further simplifications.
> 
> I'd be happy to hear if this has any impact on the bug we're chasing.
> Please do note that this change can introduce/expose other issues.

Any of you got the chance to try this diff?  Could you reproduce the
panic with it?

> Index: uvm/uvm_vnode.c
> ===================================================================
> RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
> retrieving revision 1.124
> diff -u -p -r1.124 uvm_vnode.c
> --- uvm/uvm_vnode.c   3 May 2022 21:20:35 -0000   1.124
> +++ uvm/uvm_vnode.c   20 May 2022 09:04:08 -0000
> @@ -162,12 +162,9 @@ uvn_attach(struct vnode *vp, vm_prot_t a
>* add it to the writeable list, and then return.
>*/
>   if (uvn->u_flags & UVM_VNODE_VALID) {   /* already active? */
> + KASSERT(uvn->u_obj.uo_refs > 0);
>  
>   rw_enter(uvn->u_obj.vmobjlock, RW_WRITE);
> - /* regain vref if we were persisting */
> - if (uvn->u_obj.uo_refs == 0) {
> - vref(vp);
> - }
>   uvn->u_obj.uo_refs++;   /* bump uvn ref! */
>   rw_exit(uvn->u_obj.vmobjlock);
>  
> @@ -234,7 +231,7 @@ uvn_attach(struct vnode *vp, vm_prot_t a
>   KASSERT(uvn->u_obj.uo_refs == 0);
>   uvn->u_obj.uo_refs++;
>   oldflags = uvn->u_flags;
> - uvn->u_flags = UVM_VNODE_VALID|UVM_VNODE_CANPERSIST;
> + uvn->u_flags = UVM_VNODE_VALID;
>   uvn->u_nio = 0;
>   uvn->u_size = used_vnode_size;
>  
> @@ -247,7 +244,7 @@ uvn_attach(struct vnode *vp, vm_prot_t a
>   /*
>* add a reference to the vnode.   this reference will stay as long
>* as there is a valid mapping of the vnode.   dropped when the
> -  * reference count goes to zero [and we either free or persist].
> +  * reference count goes to zero.
>*/
>   vref(vp);
>   if (oldflags & UVM_VNODE_WANTED)
> @@ -320,16 +317,6 @@ uvn_detach(struct uvm_object *uobj)
>*/
>   vp->v_flag &= ~VTEXT;
>  
> - /*
> -  * we just dropped the last reference to the uvn.   see if we can
> -  * let it "stick around".
> -  */
> - if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
> - /* won't block */
> - uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
> - goto out;
> - }
> -
>   /* its a goner! */
>   uvn->u_flags |= UVM_VNODE_DYING;
>  
> @@ -379,7 +366,6 @@ uvn_detach(struct uvm_object *uobj)
>   /* wake up any sleepers */
>   if (oldflags & UVM_VNODE_WANTED)
>   wakeup(uvn);
> -out:
>   rw_exit(uobj->vmobjlock);
>  
>   /* drop our reference to the vnode. */
> @@ -495,8 +481,8 @@ uvm_vnp_terminate(struct vnode *vp)
>   }
>  
>   /*
> -  * done.   now we free the uvn if its reference count is zero
> -  * (true if we are zapping a persisting uvn).   however, if we are
> +  * done.   now we free the uvn if its reference count is zero.
> +  * however, if we are
>* terminating a uvn with active mappings we let it live ... future
>* calls down to the vnode layer will fail.
>*/
> @@ -504,14 +490,14 @@ uvm_

Re: macppc panic: vref used where vget required

2022-05-24 Thread Martin Pieuchot
On 19/05/22(Thu) 13:33, Alexander Bluhm wrote:
> On Tue, May 17, 2022 at 05:43:02PM +0200, Martin Pieuchot wrote:
> > Andrew, Alexander, could you test this and report back?
> 
> Panic "vref used where vget required" is still there.  As usual it
> needs a day to reproduce.  This time I was running without the vref
> history diff.

Thanks for testing.  Apparently calling uvm_vnp_terminate() in
getnewvnode() isn't good enough.  I suppose that in your case below the
pdaemon is trying to flush the pages before the vnode has been recycled,
so before uvm_vnp_terminate() has been called.

So either we prevent the vnode from being put on the free list or we get
rid of the persisting mechanism.  I don't fully understand what could be
the impact of always flushing the pages and why this hasn't been done in
the first place.  It seems the CANPERSIST logic has been "inherited" from
4.4BSD's original vm.

On the other hand I feel even less comfortable with preventing vnodes
from being recycled until the pdaemon has freed related pages.

Diff below has been lightly tested; it gets rid of the extra CANPERSIST
reference.  That means pages associated with a vnode will now always be
flushed when uvn_detach() is called.  This turns uvm_vnp_uncache() into
a noop and opens the possibility of further simplifications.

I'd be happy to hear if this has any impact on the bug we're chasing.
Please do note that this change can introduce/expose other issues.

Index: uvm/uvm_vnode.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_vnode.c,v
retrieving revision 1.124
diff -u -p -r1.124 uvm_vnode.c
--- uvm/uvm_vnode.c 3 May 2022 21:20:35 -0000   1.124
+++ uvm/uvm_vnode.c 20 May 2022 09:04:08 -0000
@@ -162,12 +162,9 @@ uvn_attach(struct vnode *vp, vm_prot_t a
 * add it to the writeable list, and then return.
 */
if (uvn->u_flags & UVM_VNODE_VALID) {   /* already active? */
+   KASSERT(uvn->u_obj.uo_refs > 0);
 
rw_enter(uvn->u_obj.vmobjlock, RW_WRITE);
-   /* regain vref if we were persisting */
-   if (uvn->u_obj.uo_refs == 0) {
-   vref(vp);
-   }
uvn->u_obj.uo_refs++;   /* bump uvn ref! */
rw_exit(uvn->u_obj.vmobjlock);
 
@@ -234,7 +231,7 @@ uvn_attach(struct vnode *vp, vm_prot_t a
KASSERT(uvn->u_obj.uo_refs == 0);
uvn->u_obj.uo_refs++;
oldflags = uvn->u_flags;
-   uvn->u_flags = UVM_VNODE_VALID|UVM_VNODE_CANPERSIST;
+   uvn->u_flags = UVM_VNODE_VALID;
uvn->u_nio = 0;
uvn->u_size = used_vnode_size;
 
@@ -247,7 +244,7 @@ uvn_attach(struct vnode *vp, vm_prot_t a
/*
 * add a reference to the vnode.   this reference will stay as long
 * as there is a valid mapping of the vnode.   dropped when the
-* reference count goes to zero [and we either free or persist].
+* reference count goes to zero.
 */
vref(vp);
if (oldflags & UVM_VNODE_WANTED)
@@ -320,16 +317,6 @@ uvn_detach(struct uvm_object *uobj)
 */
vp->v_flag &= ~VTEXT;
 
-   /*
-* we just dropped the last reference to the uvn.   see if we can
-* let it "stick around".
-*/
-   if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
-   /* won't block */
-   uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
-   goto out;
-   }
-
/* its a goner! */
uvn->u_flags |= UVM_VNODE_DYING;
 
@@ -379,7 +366,6 @@ uvn_detach(struct uvm_object *uobj)
/* wake up any sleepers */
if (oldflags & UVM_VNODE_WANTED)
wakeup(uvn);
-out:
rw_exit(uobj->vmobjlock);
 
/* drop our reference to the vnode. */
@@ -495,8 +481,8 @@ uvm_vnp_terminate(struct vnode *vp)
}
 
/*
-* done.   now we free the uvn if its reference count is zero
-* (true if we are zapping a persisting uvn).   however, if we are
+* done.   now we free the uvn if its reference count is zero.
+* however, if we are
 * terminating a uvn with active mappings we let it live ... future
 * calls down to the vnode layer will fail.
 */
@@ -504,14 +490,14 @@ uvm_vnp_terminate(struct vnode *vp)
if (uvn->u_obj.uo_refs) {
/*
 * uvn must live on it is dead-vnode state until all references
-* are gone.   restore flags.clear CANPERSIST state.
+* are gone.   restore flags.
 */
uvn->u_flags &= ~(UVM_VNODE_DYING|UVM_VNODE_VNISLOCKED|
- UVM_VNODE_WANTED|UVM_VNODE_CANPERSIST);
+ UVM_VNODE_WANTED);
} else {
/*
 * free the uvn now.   note t

Re: macppc panic: vref used where vget required

2022-05-17 Thread Martin Pieuchot
On 06/05/22(Fri) 22:16, Alexander Bluhm wrote:
> Same with this diff.

Thanks for testing.  Here's a possible fix.  The idea is to call
uvm_vnp_terminate() when recycling a vnode.  This will flush any
pending pages that are still associated with the vnode.  Ironically that
is what the comment above uvm_vnp_terminate() says, even if this has
never been true.

I find this approach less intrusive than removing the CANPERSIST flag.
I'd suggest we do that later.

Andrew, Alexander, could you test this and report back?

Thanks!

Index: kern/vfs_subr.c
===================================================================
RCS file: /cvs/src/sys/kern/vfs_subr.c,v
retrieving revision 1.315
diff -u -p -r1.315 vfs_subr.c
--- kern/vfs_subr.c 27 Mar 2022 16:19:39 -0000  1.315
+++ kern/vfs_subr.c 17 May 2022 15:28:30 -0000
@@ -459,6 +459,10 @@ getnewvnode(enum vtagtype tag, struct mo
vp->v_flag = 0;
vp->v_socket = NULL;
}
+   /*
+* Clean out any VM data associated with the vnode.
+*/
+   uvm_vnp_terminate(vp);
cache_purge(vp);
vp->v_type = VNON;
vp->v_tag = tag;



Re: macppc panic: vref used where vget required

2022-05-05 Thread Martin Pieuchot
On 04/05/22(Wed) 18:30, Alexander Bluhm wrote:
> On Wed, May 04, 2022 at 05:58:14PM +0200, Martin Pieuchot wrote:
> > I don't understand the mechanism around UVM_VNODE_CANPERSIST.  I looked
> > for missing uvm_vnp_uncache() and found the following two.  I doubt
> > those are the ones triggering the bug because they are in NFS & softdep.
> 
> It crashes while compiling clang.
> 
> c++ -O2 -pipe  -fno-ret-protector -std=c++14 -fvisibility-inlines-hidden 
> -fno-exceptions -fno-rtti -Wall -W -Wno-unused-parameter -Wwrite-strings 
> -Wcast-qual  -Wno-missing-field-initializers -pedantic -Wno-long-long  
> -Wdelete-non-virtual-dtor -Wno-comment -fPIE  -MD -MP  
> -I/usr/src/gnu/usr.bin/clang/liblldbPluginExpressionParser/../../../llvm/llvm/include
>  -I/usr/src/gnu/usr.bin/clang/liblldbPluginExpressionParser/../include 
> -I/usr/src/gnu/usr.bin/clang/liblldbPluginExpressionParser/obj  
> -I/usr/src/gnu/usr.bin/clang/liblldbPluginExpressionParser/obj/../include 
> -DNDEBUG -D__STDC_LIMIT_MACROS -D__STDC_CONSTANT_MACROS  
> -D__STDC_FORMAT_MACROS -DLLVM_PREFIX="/usr" 
> -I/usr/src/gnu/usr.bin/clang/liblldbPluginExpressionParser/../../../llvm/lldb/include
>   
> -I/usr/src/gnu/usr.bin/clang/liblldbPluginExpressionParser/../../../llvm/lldb/source
>  
> -I/usr/src/gnu/usr.bin/clang/liblldbPluginExpressionParser/../../../llvm/clang/include
>  -c 
> /usr/src/gnu/usr.bin/clang/liblldbPluginExpressionParser/../../../llvm/lldb/source/Plugins/ExpressionParser/Clang/ClangExpressionParser.cpp
>  -o ClangExpressionParser.o
> Timeout, server ot26 not responding.
> 
> No softdep, but NFS client.  I use it to mount cvs mirror read-only.
> This file system should not be used during make build.

Hard to believe it is related to the diff below, can you reproduce it?



Re: macppc panic: vref used where vget required

2022-05-05 Thread Martin Pieuchot
On 04/05/22(Wed) 18:23, Mark Kettenis wrote:
> > Date: Wed, 4 May 2022 17:58:14 +0200
> > From: Martin Pieuchot 
> > 
> > On 04/05/22(Wed) 09:16, Sebastien Marie wrote:
> > > [...] 
> > > we don't have any vclean label ("vclean (inactive)" or "vclean 
> > > (active)"), so 
> > > vclean() was not called in this timeframe.
> > 
> > So we are narrowing down the issue:
> > 
> > 1. A file is opened
> > 2. Then mmaped
> > 3. Some of its pages are swapped to disk
> 
> Hmm, why does this happen?  Is this because the mmap(2) was done using
> MAP_PRIVATE?

I believe so, otherwise uvm_vnp_uncache() would have been called.

> But then what's the point of setting UVM_VNODE_CANPERSIST?

I don't know.  It looks to me like a way to not flush the data if a file
is munmap(2)ed then mmap(2)ed again, no?

It is like an extra UVM object reference which doesn't account for the
vnode reference.  That'd explain why uvm_vnp_sync() and uvn_attach() have
checks for uo_refs == 0.



Re: macppc panic: vref used where vget required

2022-05-04 Thread Martin Pieuchot
On 04/05/22(Wed) 09:16, Sebastien Marie wrote:
> [...] 
> we don't have any vclean label ("vclean (inactive)" or "vclean (active)"), so 
> vclean() was not called in this timeframe.

So we are narrowing down the issue:

1. A file is opened
2. Then mmaped
3. Some of its pages are swapped to disk
4. The process dies, closing the file
5. The reaper calls uvn_detach() on the vnode which has UVM_VNODE_CANPERSIST
   set.  This releases the last reference to the vnode without syncing the
   pages -> the vnode ends up on the free list
6. The page daemon tries to sync the pages and grabs a reference on the
   vnode, which has already been recycled.

I don't understand the mechanism around UVM_VNODE_CANPERSIST.  I looked
for missing uvm_vnp_uncache() and found the following two.  I doubt
those are the ones triggering the bug because they are in NFS & softdep.

So my question is should UVM_VNODE_CANPERSIST be cleared at some point
in this scenario?  If so, when?

What is the interaction between this flag and mmap pages which are on
swap?  In other words, is it safe to call vrele(9) in uvn_detach() if
uvn_flush() hasn't been called with PGO_FREE|PGO_ALLPAGES?  If yes, why?

What is this flag supposed to say?  Why is it always cleared before
VOP_REMOVE() & VOP_RENAME()?

Index: nfs/nfs_serv.c
===================================================================
RCS file: /cvs/src/sys/nfs/nfs_serv.c,v
retrieving revision 1.120
diff -u -p -r1.120 nfs_serv.c
--- nfs/nfs_serv.c  11 Mar 2021 13:31:35 -0000  1.120
+++ nfs/nfs_serv.c  4 May 2022 15:29:06 -0000
@@ -1488,6 +1488,9 @@ nfsrv_rename(struct nfsrv_descript *nfsd
error = -1;
 out:
if (!error) {
+   if (tvp) {
+   (void)uvm_vnp_uncache(tvp);
+   }
error = VOP_RENAME(fromnd.ni_dvp, fromnd.ni_vp, &fromnd.ni_cnd,
   tond.ni_dvp, tond.ni_vp, &tond.ni_cnd);
} else {
Index: ufs/ffs/ffs_inode.c
===================================================================
RCS file: /cvs/src/sys/ufs/ffs/ffs_inode.c,v
retrieving revision 1.81
diff -u -p -r1.81 ffs_inode.c
--- ufs/ffs/ffs_inode.c 12 Dec 2021 09:14:59 -  1.81
+++ ufs/ffs/ffs_inode.c 4 May 2022 15:32:15 -
@@ -172,11 +172,12 @@ ffs_truncate(struct inode *oip, off_t le
if (length > fs->fs_maxfilesize)
return (EFBIG);
 
-   uvm_vnp_setsize(ovp, length);
oip->i_ci.ci_lasta = oip->i_ci.ci_clen 
= oip->i_ci.ci_cstart = oip->i_ci.ci_lastw = 0;
 
if (DOINGSOFTDEP(ovp)) {
+   uvm_vnp_setsize(ovp, length);
+   (void) uvm_vnp_uncache(ovp);
if (length > 0 || softdep_slowdown(ovp)) {
/*
 * If a file is only partially truncated, then



Re: macppc panic: vref used where vget required

2022-04-28 Thread Martin Pieuchot
On 28/04/22(Thu) 16:54, Sebastien Marie wrote:
> On Thu, Apr 28, 2022 at 04:04:41PM +0200, Alexander Bluhm wrote:
> > On Wed, Apr 27, 2022 at 09:16:48AM +0200, Sebastien Marie wrote:
> > > Here is a new diff (sorry for the delay) which adds a new
> > > vnode_history_record() point inside uvn_detach() (when the 'uvn' object
> > > has the UVM_VNODE_CANPERSIST flag set).
> > 
> > [-- MARK -- Thu Apr 28 14:10:00 2022]
> > uvn_io: start: 0x23ae1400, type VREG, use 0, write 0, hold 0, flags 
> > (VBIOONFREELIST)
> > tag VT_UFS, ino 495247, on dev 0, 10 flags 0x100, effnlink 1, nlink 
> > 1
> > mode 0100660, owner 21, group 21, size 13647873
> > ==> vnode_history_print 0x23ae1400, next=6
> >  [3] c++[44194] usecount 2>1
> > #0  0x626946ec
> >  [4] reaper[10898] usecount 1>1
> > #0  entropy_pool0+0xf54
> 
> even if the stack trace is somewhat garbage, the "usecount 1>1" is due to
> VH_NOP (no increment nor decrement), so it is the vnode_history_record()
> newly added at:
> 
> @@ -323,6 +325,10 @@ uvn_detach(struct uvm_object *uobj)
>  * let it "stick around".
>  */
> if (uvn->u_flags & UVM_VNODE_CANPERSIST) {
> +   extern void vnode_history_record(struct vnode *, int);
> +
> +   vnode_history_record(vp, 0);
> +
> /* won't block */
> uvn_flush(uobj, 0, 0, PGO_DEACTIVATE|PGO_ALLPAGES);
> goto out;
> 
> mpi@, it confirms that uvn_flush() is called without PGO_FREE for this uvn.

Thanks!

Has vclean() been called for this vnode?  If so the problem might indeed
be related to the `uo_refs' fix I just committed, if not that might be
the bug.



Re: macppc panic: vref used where vget required

2022-04-19 Thread Martin Pieuchot
On 14/04/22(Thu) 18:29, Alexander Bluhm wrote:
> [...]
> vn_lock: v_usecount == 0: 0x23e6b910, type VREG, use 0, write 0, hold 0, 
> flags (VBIOONFREELIST)
> tag VT_UFS, ino 703119, on dev 0, 10 flags 0x100, effnlink 1, nlink 1
> mode 0100660, owner 21, group 21, size 13647873
> ==> vnode_history_print 0x23e6b910, next=5
>  [3] c++[68540] usecount 2>1
> #0  0x625703e0
>  [4] reaper[31684] usecount 1>0
> #0  splx+0x30
> #1  0xfffc
> #2  vrele+0x5c
> #3  uvn_detach+0x154

I wonder if the `uvn' has UVM_VNODE_CANPERSIST set at this point.
This would explain why the associated pages are not freed and end up in
the inactive list.

Is there an easy way to check that?

> #4  uvm_unmap_detach+0x1a4
> #5  uvm_map_teardown+0x184
> #6  uvmspace_free+0x60
> #7  uvm_exit+0x30
> #8  reaper+0x138
> #9  fork_trampoline+0x14
> 
> panic: vn_lock: v_usecount == 0
> Stopped at  db_enter+0x24:  lwz r11,12(r1)
>     TID    PID    UID  PRFLAGS  PFLAGS  CPU  COMMAND
>  433418  19000     21      0x2       0    0  c++
> *232867  36343      0  0x14000   0x200   1K  pagedaemon
> db_enter() at db_enter+0x20
> panic(9434f4) at panic+0x158
> vn_lock(49535400,202000) at vn_lock+0x1c4
> uvn_io(5db64f85,a994c4,a979f4,,e401) at uvn_io+0x254
> uvn_put(fec3fe2c,e7ebbdd4,23ca9af0,40e0880) at uvn_put+0x64
> uvm_pager_put(0,0,e7ebbd70,32c0c0,200,8000,0) at uvm_pager_put+0x15c
> uvmpd_scan_inactive(0) at uvmpd_scan_inactive+0x224
> uvmpd_scan() at uvmpd_scan+0x158
> uvm_pageout(5d4841c9) at uvm_pageout+0x398
> fork_trampoline() at fork_trampoline+0x14
> end trace frame: 0x0, count: 5
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> 
> ddb{1}> show panic
> *cpu1: vn_lock: v_usecount == 0
> 
> ddb{1}> trace
> db_enter() at db_enter+0x20
> panic(9434f4) at panic+0x158
> vn_lock(49535400,202000) at vn_lock+0x1c4
> uvn_io(5db64f85,a994c4,a979f4,,e401) at uvn_io+0x254
> uvn_put(fec3fe2c,e7ebbdd4,23ca9af0,40e0880) at uvn_put+0x64
> uvm_pager_put(0,0,e7ebbd70,32c0c0,200,8000,0) at uvm_pager_put+0x15c
> uvmpd_scan_inactive(0) at uvmpd_scan_inactive+0x224
> uvmpd_scan() at uvmpd_scan+0x158
> uvm_pageout(5d4841c9) at uvm_pageout+0x398
> fork_trampoline() at fork_trampoline+0x14
> end trace frame: 0x0, count: -10
> 
> ddb{1}> show register
> r0  0x8552f8panic+0x15c
> r10xe7ebbb60
> r2 0
> r3  0xafcc00cpu_info+0x4c0
> r4  0xb0uvm_small_amap_pool+0x130
> r5   0x1
> r6 0
> r70xe7bb9000
> r8 0
> r9  0x91ae69digits
> r10 0x14
> r11   0xb9c58763
> r12   0x972ab42e
> r130
> r14 0xaf51b8bcstats
> r15 0xa979d0uvmexp
> r160
> r170
> r180
> r19 0x202000rtable_match+0x4c
> r20   0x1000tlbdsmsize+0xf18
> r21   0x1000tlbdsmsize+0xf18
> r22 0xa80294netlock
> r23   0xe401
> r24   0x23ca9b08
> r25   0x23ca9b0c
> r26 0x925186digits+0xa31d
> r270
> r280
> r29 0xafcec0cpu_info+0x780
> r30 0x9434f4cy_pio_rec+0x10c93
> r31 0xa97afcuvmexp+0x12c
> lr  0x465b64db_enter+0x24
> cr0x48228204
> xer   0x2000
> ctr 0x84331copenpic_splx
> iar 0x465b64db_enter+0x24
> msr   0x9032tlbdsmsize+0x8f4a
> dar0
> dsisr  0
> db_enter+0x24:  lwz r11,12(r1)
> 
> ddb{1}> ps
   PID     TID   PPID    UID  S       FLAGS  WAIT      COMMAND
>  19000  433418  88497 21  7 0x2c++
>  88497  403151  58565 21  30x10008a  sigsusp   sh
>  41513  242275   9837 21  2 0x2c++
>   9837   71908  58565 21  30x10008a  sigsusp   sh
>  58565  163952  54207 21  30x10008a  sigsusp   make
>  54207   76575  55963 21  30x10008a  sigsusp   sh
>  55963  491542  28020 21  30x10008a  sigsusp   make
>  28020  11887215 21  30x10008a  sigsusp   make
>  17215  246222  89415 21  30x10008a  sigsusp   sh
>  89415  114664  17720 21  30x10008a  sigsusp   make
>  17720  164665  66695  0  30x10008a  sigsusp   sh
>  66695   16970  75998  0  30x10008a  sigsusp   make
>  75998  377562  93606  0  30x10008a  sigsusp   make
>  93606  189705  64097  0  30x10008a  sigsusp   sh
>  64097  134632  17259  0  30x82  piperdperl
>  17259  145120  30986  0  3   

Re: macppc panic: vref used where vget required

2022-04-13 Thread Martin Pieuchot
On 12/04/22(Tue) 14:58, Sebastien Marie wrote:
> [...] 
> uvn_io: start: 0x8000ffab9688, type VREG, use 0, write 0, hold 0, flags 
> (VBIOONFREELIST)
>   tag VT_UFS, ino 14802284, on dev 4, 30 flags 0x100, effnlink 1, nlink 1
>   mode 0100644, owner 858, group 1000, size 345603015
> => vnode_history_print 0x8000ffab9688, next=1

So we see the same crash in the pdaemon when trying to sync, at least,
an inactive page to disk...

>  [0]
> #0 vput
> #1 dofstatat
> #2 syscall
> #3 Xsyscall
>  [1>]
> #0 vput
> #1 dofstatat
> #2 syscall
> #3 Xsyscall
>  [2]
> #0 vget
> #1 ufs_ihashget
> #2 ffs_vget
> #3 ufs_lookup
> #4 VOP_LOOKUP
> #5 vfs_lookup
> #6 namei
> #7 vn_open
> #8 doopenat
> #9 syscall
> #10 Xsyscall
>  [3]
> #0 uvn_attach
> #1 uvm_mmapfile
> #2 sys_mmap
> #3 syscall
> #4 Xsyscall

The vnode is mmaped here. 

>  [4]
> #0 vput
> #1 vn_closefile
> #2 fdrop
> #3 closef
> #4 fdfree
> #5 exit1
> #6 single_thread_check_locked
> #7 userret
> #8 intr_user_exit

The process exit here, it seems single-threaded.

>  [5]
> #0 vrele
> #1 uvm_unmap_detach
> #2 uvm_map_teardown
> #3 uvmspace_free
> #4 reaper
> #5 proc_trampoline

The reaper unmaps all the memory and releases the reference grabbed
during mmap.

>  [6]
> #0 vget
> #1 ufs_ihashget
> #2 ffs_vget
> #3 ufs_lookup
> #4 VOP_LOOKUP
> #5 vfs_lookup
> #6 namei
> #7 dofstatat
> #8 syscall
> #9 Xsyscall

Who is doing this?  Could you print the pid of the process?  It seems
the vnode has been recycled at this point.

>  [7]
> #0 vput
> #1 dofstatat
> #2 syscall
> #3 Xsyscall
>  [8]
> #0 vget
> #1 ufs_ihashget
> #2 ffs_vget
> #3 ufs_lookup
> #4 VOP_LOOKUP
> #5 vfs_lookup
> #6 namei
> #7 dofstatat
> #8 syscall
> #9 Xsyscall

Same for this, has the vnode been recycled?


> panic: vn_lock: v_usecount == 0
> Stopped at   db_enter+0x10: popq %rbp
>     TID    PID  UID  PRFLAGS  PFLAGS  CPU  COMMAND
>  448838  87877  858        0   0x400    3  qbittorrent-nox
> *281933  50305    0  0x14000   0x200   1K  pagedaemon
> db_enter() at ...
> panic(...)
> vn_lock(800ffab9688,81)
> uvn_io(fd837a66bee8,...,1,90,1)
> uvm_pager_put(fd837a66bee8, )
> uvmpd_scan_inactive(...)
> uvmpd_scan()
> uvm_pageout(80005270)

This indicates a page hasn't been freed during uvn_detach() which is
called as part of uvm_unmap_detach() by the reaper... Weird.

> > But just in case, here is also the output of the 'show vnode'
> > command itself:
> > http://46.23.91.227:8098/show_vnode0.jpg
> > http://46.23.91.227:8098/show_vnode1.jpg
> > http://46.23.91.227:8098/show_vnode2.jpg
> > http://46.23.91.227:8098/show_vnode3.jpg
> > 
> > show mount:
> > http://46.23.91.227:8098/show_mount.jpg
> > 
> > show uvm:
> > http://46.23.91.227:8098/show_uvm.jpg
> > 
> > After I rebooted into OS from ddb and got distracted by some
> > personal stuff, qbittorrent continued rechecking from where it
> > stopped during the panic (about half of 27Gb). I was searching for
> > something in firefox at the time.
> > Very soon I saw the kernel panic again, now with a slightly different
> > content. I'm not sure if these things are related, but just in case,
> > I'll leave the data from the console for this panic as well:
> > 
> > http://46.23.91.227:8098/second_panic0.jpg
> > http://46.23.91.227:8098/second_panic_bcstats.jpg
> > http://46.23.91.227:8098/second_panic_precpu0.jpg
> > http://46.23.91.227:8098/second_panic_precpu1.jpg
> > http://46.23.91.227:8098/second_panic_uvm.jpg
> > 
> > (I suspect I should open a separate bug report for this problem?)
> 
> at this point, I dunno if it is related.
> 
> just transcribing the main elements:
> 
> panic: kernel diagnostic assertion "rv" failed: file "/sys/uvm/uvm_glue.c", 
> line 428
> 
> current process: aiodoned
> (others: 3 firefox processes)
> 
> trace:
> __assert(...)
> uvm_atopg(...)
> uvm_aio_aiodone(...)
> uvm_aiodone_daemon(...)
> 
> the assert itself is:
>417  /*
>418   * uvm_atopg: convert KVAs back to their page structures.
>419   */
>420  struct vm_page *
>421  uvm_atopg(vaddr_t kva)
>422  {
>423  struct vm_page *pg;
>424  paddr_t pa;
>425  boolean_t rv;
>426   
>427  rv = pmap_extract(pmap_kernel(), kva, );
>428  KASSERT(rv);
>429  pg = PHYS_TO_VM_PAGE(pa);
>430  KASSERT(pg != NULL);
>431  return (pg);
>432  }
> 
> Thanks.
> -- 
> Sebastien Marie
> 



Re: macppc panic: vref used where vget required

2022-04-11 Thread Martin Pieuchot
On 28/03/22(Mon) 13:35, Alexander Bluhm wrote:
> Hi,
> 
> There was a discussion about file system bugs with macppc.  My dual
> core macppc never completed a make release.  I get various panics.
> One of them is below.
> 
> bluhm
> 
> panic: vref used where vget required
> Stopped at  db_enter+0x24:  lwz r11,12(r1)
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
>  192060  78628 21 0x2  00  c++
> *132472  74971  0 0x14000  0x2001K pagedaemon
> db_enter() at db_enter+0x20
> panic(91373c) at panic+0x158
> vref(23b8fa20) at vref+0xac
> uvm_vnp_uncache(e7eb7c50) at uvm_vnp_uncache+0x88
> ffs_write(7ed2e423) at ffs_write+0x3b0
> VOP_WRITE(23b8fa20,e7eb7c50,40,1ff3f60) at VOP_WRITE+0x48
> uvn_io(7ed2e423,a93d0c,a93814,,e401) at uvn_io+0x264
> uvn_put(414b673a,e7eb7dd4,24f00070,5326e90) at uvn_put+0x64
> uvm_pager_put(0,0,e7eb7d70,6ee0b8,200,8000,0) at uvm_pager_put+0x15c
> uvmpd_scan_inactive(0) at uvmpd_scan_inactive+0x224
> uvmpd_scan() at uvmpd_scan+0x158
> uvm_pageout(7e932633) at uvm_pageout+0x398

The page daemon is trying to write a page on the inactive list back to
the disk.  Sadly it seems the vnode has already been recycled, which
indicates the page should already be on the disk.

That suggests uvm_pagefree() hasn't been called for this page. 

So how can this page be on the list if the vnode has been closed?  A
race between mmap/fault handler/close?  A page leak?  Something else?



vmm guests die during host's suspend/resume

2022-03-12 Thread Martin Pieuchot
I see the following in the dmesg:

vcpu_run_vmx: failed vmresume for unknown reason
vcpu_run_vmx: error code = 5, VMRESUME: non-launched VMCS


Then after resuming, guests need to be restarted and go through an fsck
check.

Dmesg below

OpenBSD 7.1-beta (GENERIC.MP) #401: Thu Mar  3 12:48:28 MST 2022
dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
real mem = 8238301184 (7856MB)
avail mem = 7971405824 (7602MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: SMBIOS rev. 2.7 @ 0xccbfd000 (65 entries)
bios0: vendor LENOVO version "N14ET26W (1.04 )" date 01/23/2015
bios0: LENOVO 20BS006BGE
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SLIC ASF! HPET ECDT APIC MCFG SSDT SSDT SSDT SSDT SSDT 
SSDT SSDT SSDT SSDT SSDT PCCT SSDT UEFI MSDM BATB FPDT UEFI DMAR
acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) XHCI(S3) EHC1(S3)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpiec0 at acpi0
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz, 2295.03 MHz, 06-3d-04
cpu0: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu0: 256KB 64b/line 8-way L2 cache
cpu0: smt 0, core 0, package 0
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.2.4.1.1.1, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz, 2294.70 MHz, 06-3d-04
cpu1: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu1: 256KB 64b/line 8-way L2 cache
cpu1: smt 1, core 0, package 0
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz, 2294.71 MHz, 06-3d-04
cpu2: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2: 256KB 64b/line 8-way L2 cache
cpu2: smt 0, core 1, package 0
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i7-5500U CPU @ 2.40GHz, 2294.70 MHz, 06-3d-04
cpu3: 
FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,EST,TM2,SSSE3,SDBG,FMA3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,MOVBE,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,PAGE1GB,RDTSCP,LONG,LAHF,ABM,3DNOWP,PERF,ITSC,FSGSBASE,TSC_ADJUST,BMI1,AVX2,SMEP,BMI2,ERMS,INVPCID,RDSEED,ADX,SMAP,PT,SRBDS_CTRL,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3: 256KB 64b/line 8-way L2 cache
cpu3: smt 1, core 1, package 0
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 40 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf800, bus 0-63
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG_)
acpiprt2 at acpi0: bus 3 (EXP1)
acpiprt3 at acpi0: bus 4 (EXP2)
acpiprt4 at acpi0: bus -1 (EXP3)
acpiprt5 at acpi0: bus -1 (EXP6)
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
acpipci0 at acpi0 PCI0: 0x 0x0011 0x0001
acpicmos0 at acpi0
acpibat0 at acpi0: BAT0 model "00HW003" serial   887 type LiP oem "SMP"
acpiac0 at acpi0: AC unit online
acpithinkpad0 at acpi0: version 1.0
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"INT340F" at acpi0 not configured
acpicpu0 at acpi0: C3(200@233 mwait.1@0x40), C2(200@148 mwait.1@0x33), 
C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C3(200@233 mwait.1@0x40), C2(200@148 mwait.1@0x33), 
C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C3(200@233 mwait.1@0x40), C2(200@148 mwait.1@0x33), 
C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C3(200@233 mwait.1@0x40), C2(200@148 mwait.1@0x33), 
C1(1000@1 mwait.1), PSS
acpipwrres0 

Re: witness: acquiring duplicate lock of same type: ">vmobjlock"

2022-02-21 Thread Martin Pieuchot
On 17/02/22(Thu) 16:22, Klemens Nanni wrote:
> On Wed, Feb 16, 2022 at 11:39:19PM +0100, Mark Kettenis wrote:
> > > Date: Wed, 16 Feb 2022 21:13:03 +
> > > From: Klemens Nanni 
> > > 
> > > Unmodified -current with WITNESS enabled booting into X on my X230:
> > > 
> > > wsdisplay0: screen 1-5 added (std, vt100 emulation)
> > > witness: acquiring duplicate lock of same type: ">vmobjlock"
> > >  1st uobjlk
> > >  2nd uobjlk
> > > Starting stack trace...
> > > witness_checkorder(fd83b625f9b0,9,0) at witness_checkorder+0x8ac
> > > rw_enter(fd83b625f9a0,1) at rw_enter+0x68
> > > uvm_obj_wire(fd843c39e948,0,4,800033b70428) at 
> > > uvm_obj_wire+0x46
> > > shmem_get_pages(88008500) at shmem_get_pages+0xb8
> > > __i915_gem_object_get_pages(88008500) at 
> > > __i915_gem_object_get_pages+0x6d
> > > i915_gem_fault(88008500,800033b707c0,10009b000,a43d6b1c000,800033b70740,1,35ba896911df1241,800aa078,800aa178)
> > >  at i915_gem_fault+0x203
> > > drm_fault(800033b707c0,a43d6b1c000,800033b70740,1,0,0,7eca45006f70ee0,800033b707c0)
> > >  at drm_fault+0x156
> > > uvm_fault(fd843a7cf480,a43d6b1c000,0,2) at uvm_fault+0x179
> > > upageflttrap(800033b70920,a43d6b1c000) at upageflttrap+0x62
> > > usertrap(800033b70920) at usertrap+0x129
> > > recall_trap() at recall_trap+0x8
> > > end of kernel
> > > end trace frame: 0x7f7dc7c0, count: 246
> > > End of stack trace.
> > > 
> > > The system works fine (unless booted with kern.witness.watch=3), so I'm
> > > posting it here for reference -- haven't had time to look into this.
> > 
> > Yes, this is expected.  The graphics buffers are implented as a uvm
> > object and this object is backed by an anonymous memory uvm_object
> > (aobj).  So I think the vmobjlock needs a RW_DUPOK flag.
> 
> I see, thanks for the hint.
> 
> I looked at drm first to see if I could easily add RW_DUPOK to their
> init/enter calls only such that RW_DUPOK for objlk is contained within
> drm, but that's neither easy nor needed.
> 
> uvm_obj_wire() is only called from sys/dev/pci/drm/ anyway, so we can
> just treat drm there.
> 
> The lock order reversal is about uvm_obj_wire() only and I haven't seen
> one in uvm_obj_unwire(), but my diff consequently adds RW_DUPOK to both
> as both are being used in drm.
> 
> This makes the witness report go away on my X230.
> 
> Does that RW_DUPOK deserve a comment?

If the commit message is clear enough I don't think so.

> Feedback? Objections? OK?

ok mpi@

> 
> 
> Index: uvm_object.c
> ===
> RCS file: /cvs/src/sys/uvm/uvm_object.c,v
> retrieving revision 1.24
> diff -u -p -r1.24 uvm_object.c
> --- uvm_object.c  17 Jan 2022 13:55:32 -  1.24
> +++ uvm_object.c  17 Feb 2022 16:12:54 -
> @@ -133,7 +133,7 @@ uvm_obj_wire(struct uvm_object *uobj, vo
>  
>   left = (end - start) >> PAGE_SHIFT;
>  
> - rw_enter(uobj->vmobjlock, RW_WRITE);
> + rw_enter(uobj->vmobjlock, RW_WRITE | RW_DUPOK);
>   while (left) {
>  
>   npages = MIN(FETCH_PAGECOUNT, left);
> @@ -147,7 +147,7 @@ uvm_obj_wire(struct uvm_object *uobj, vo
>   if (error)
>   goto error;
>  
> - rw_enter(uobj->vmobjlock, RW_WRITE);
> + rw_enter(uobj->vmobjlock, RW_WRITE | RW_DUPOK);
>   for (i = 0; i < npages; i++) {
>  
>   KASSERT(pgs[i] != NULL);
> @@ -197,7 +197,7 @@ uvm_obj_unwire(struct uvm_object *uobj, 
>   struct vm_page *pg;
>   off_t offset;
>  
> - rw_enter(uobj->vmobjlock, RW_WRITE);
> + rw_enter(uobj->vmobjlock, RW_WRITE | RW_DUPOK);
>   uvm_lock_pageq();
>   for (offset = start; offset < end; offset += 

Re: crash on booting GENERIC.MP since upgrade to Jan 18 snapshot

2022-01-31 Thread Martin Pieuchot
On 31/01/22(Mon) 19:18, Jonathan Gray wrote:
> On Mon, Jan 31, 2022 at 12:54:53AM -0700, Thomas Frohwein wrote:
> > On Sat, 29 Jan 2022 12:15:10 -0300
> > Martin Pieuchot  wrote:
> > 
> > > On 28/01/22(Fri) 23:03, Thomas Frohwein wrote:
> > > > On Sat, 29 Jan 2022 15:19:20 +1100
> > > > Jonathan Gray  wrote:
> > > >   
> > > > > does this diff to revert uvm_fault.c rev 1.124 change anything?  
> > > > 
> > > > Unfortunately no. Same pmap error as in the original bug report occurs
> > > > with a kernel with this diff.  
> > > 
> > > Could you submit a new bug report?  Could you manage to include ps and the
> > > trace of all the CPUs when the pmap corruption occurs?
> > 
> > See below
> > 
> > > 
> > > Do you have some steps to reproduce the corruption?  Which program is
> > > currently running?  Is it multi-threaded?  What is the simplest scenario
> > > to trigger the corruption?
> > 
> > It's during boot of the MP kernel. The only scenario I can provide is
> > booting this machine with an MP kernel from January 18 or newer. If I
> > boot SP kernel, or build an MP kernel with jsg@'s diff that adds
> > `pool_debug = 2`, the panic does _not_ occur.
> 
> That pool_debug change also avoids what Paul de Weerd sees on a
> Dell XPS 13 9305 with i7-1165G7 as does running SP
> 
> panic: pool_do_get: idrpl: page empty
> Stopped atdb_enter+0x10:  popq%rbp
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
>
> *293226   4683  0 0x14000  0x2000K drmwq  
> 
How can this error happen?  Does that mean there's corruption in the
pool?  Is some synchronisation incorrect or some lock missing?

David, you know the pool subsystem better than we do, do you have any
insight?  Thanks!

> db_enter() at db_enter+0x10
> panic(81f08e21) at panic+0xbf
> pool_do_get(823b3710,1,80001d9a11e4) at pool_do_get+0x2f6
> pool_get(823b3710,1) at pool_get+0x96
> idr_alloc(803cc2e0,80fba500,1,0,5) at idr_alloc+0x78
> __drm_mode_object_add(803cc078,80fba500,,1,8102dda0)
>  at __drm_mode_object_add+0xa6
> drm_property_create_blob(803cc078,80,8119ef80) at 
> drm_property_create_blob+0xa7
> drm_property_replace_global_blob(803cc078,80e9c950,80,8119ef80,80e9c828,8095a180)
>  at drm_property_replace_global_blob+0x84
> drm_connector_update_edid_property(80e9c800,8119ef80) at 
> drm_connector_update_edid_property+0x118
> intel_connector_update_modes(80e9c800,8119ef80) at 
> intel_connector_update_modes+0x15
> intel_dp_get_modes(80e9c800) at intel_dp_get_modes+0x33
> drm_helper_probe_single_connector_modes(80e9c800,f00,870) at 
> drm_helper_probe_single_connector_modes+0x353
> drm_client_modeset_probe(80edda00,f00,870) at 
> drm_client_modeset_probe+0x281
> drm_fb_helper_hotplug_event(80edda00) at 
> drm_fb_helper_hotplug_event+0xd3
> end trace frame: 0x80001d9a1800, count: 0
> 
> some tiger lake machines don't see either problem
> for example thinkpad x1 nano, framework laptop
> 
> > 
> > Here some new (hand-typed from a picture) output when I boot a freshly
> > downloaded snapshot MP kernel from January 30th (note this is an 8 core/16
> > hyperthreads CPU; I have _not_ enabled hyperthreading). I attached dmesg 
> > from
> > booting bsd.sp, too.
> > 
> > ... (boot, see dmesg in original bugs@ submission)
> > wsdisplay0: screen 1-5 added (std, vt100 emulation)
> > iwm0: hw rev 0x200, fw ver 36.ca7b901d.0, address [...]
> > va 7f7fb000 ppa ff000
> > panic: pmap_get_ptp: unmanaged user PTP
> > Stopped at db_enter+0x10: popq   %rbp
> > TID PID UID PRFLAGS PFLAGS CPU COMMAND
> > * 28644   1   0   0  0   2K swapper
> > db_enter() at db_enter+0x10
> > panic(81f3dd1f) at panic+0xbf
> > pmap_get_ptp(fd888e52ee58,7f7fb000) at pmap_get_ptp+0x303
> > pmap_enter(fd888e52ee58,7f7fb000,13d151000,3,22) at pmap_enter+0x188
> > uvm_fault_lower(8000156852a0,8000156852d8,800015685220,0) at 
> > uvm_fault_lower+0x63d
> > uvm_fault(fd888e52fdd0,7f7fb000,0,2) at uvm_fault+0x1b3
> > kpageflttrap(800015685420,7f7fbff5) at kpageflttrap+0x12c
> > kerntrap(800015685420) at kerntrap+0x91
> > alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> > copyout() at copyout+0x53
> > end trace frame: 0x0, count: 5
>

Re: crash on booting GENERIC.MP since upgrade to Jan 18 snapshot

2022-01-31 Thread Martin Pieuchot
On 31/01/22(Mon) 00:54, Thomas Frohwein wrote:
> On Sat, 29 Jan 2022 12:15:10 -0300
> Martin Pieuchot  wrote:
> 
> > On 28/01/22(Fri) 23:03, Thomas Frohwein wrote:
> > > On Sat, 29 Jan 2022 15:19:20 +1100
> > > Jonathan Gray  wrote:
> > >   
> > > > does this diff to revert uvm_fault.c rev 1.124 change anything?  
> > > 
> > > Unfortunately no. Same pmap error as in the original bug report occurs
> > > with a kernel with this diff.  
> > 
> > Could you submit a new bug report?  Could you manage to include ps and the
> > trace of all the CPUs when the pmap corruption occurs?
> 
> See below
> 
> > 
> > Do you have some steps to reproduce the corruption?  Which program is
> > currently running?  Is it multi-threaded?  What is the simplest scenario
> > to trigger the corruption?
> 
> It's during boot of the MP kernel. The only scenario I can provide is
> booting this machine with an MP kernel from January 18 or newer. If I
> boot SP kernel, or build an MP kernel with jsg@'s diff that adds
> `pool_debug = 2`, the panic does _not_ occur.

This indicates some race is present that is not triggered when more
context switches occur.

> Here some new (hand-typed from a picture) output when I boot a freshly
> downloaded snapshot MP kernel from January 30th (note this is an 8 core/16
> hyperthreads CPU; I have _not_ enabled hyperthreading). I attached dmesg from
> booting bsd.sp, too.

Thanks, so most CPUs already reached the idle loop and are not yet running
anything.

Nobody is holding the KERNEL_LOCK(); the faulting process obviously
isn't, and I don't understand which process it is.

Note that the corruption occurred on CPU2.  We don't know where it
occurred the previous time.  This is interesting to watch to understand
between which CPUs the race is occurring. 

> ... (boot, see dmesg in original bugs@ submission)
> wsdisplay0: screen 1-5 added (std, vt100 emulation)
> iwm0: hw rev 0x200, fw ver 36.ca7b901d.0, address [...]
> va 7f7fb000 ppa ff000
 
That's the faulting address, right?  It is the same as in the first
report.  It seems to be inside the level 1 page table range, isn't it?
What does that mean?

I don't understand which process is triggering the fault.  Maybe
somebody (jsg@?) could craft a diff to figure out whether this same
address faults, and which thread/context is faulting it, in SP and/or
with pool_debug = 2.

Something like:

if (va == 0x7f7fb000)
db_enter();

> panic: pmap_get_ptp: unmanaged user PTP
> Stopped at db_enter+0x10: popq   %rbp
> TID   PID UID PRFLAGS PFLAGS CPU COMMAND
> * 28644   1   0   0  0   2K swapper
> db_enter() at db_enter+0x10
> panic(81f3dd1f) at panic+0xbf
> pmap_get_ptp(fd888e52ee58,7f7fb000) at pmap_get_ptp+0x303
> pmap_enter(fd888e52ee58,7f7fb000,13d151000,3,22) at pmap_enter+0x188
> uvm_fault_lower(8000156852a0,8000156852d8,800015685220,0) at 
> uvm_fault_lower+0x63d
> uvm_fault(fd888e52fdd0,7f7fb000,0,2) at uvm_fault+0x1b3
> kpageflttrap(800015685420,7f7fbff5) at kpageflttrap+0x12c
> kerntrap(800015685420) at kerntrap+0x91
> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> copyout() at copyout+0x53
> end trace frame: 0x0, count: 5
> https://www.openbsd.org/ [...]
> ddb{2}> show panic
> *cpu2: pmap_get_ptp: unmanaged user PTP
> ddb{2}> mach ddbcpu 0
> Stopped atx86_ipi_db+0x12:leave
> x86_ipi_db(822acff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> acpicpu_idle() at acpicpu_idle+0x203
> sched_idle(f822acff0) at sched_idle+0x280
> end trace frame: 0x0, count: 10
> ddb{0}> mach ddbcpu 1
> Stopped atx86_ipi_db+0x12:leave
> x86_ipi_db(800015363ff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> acpicpu_idle() at acpicpu_idle+0x203
> sched_idle(800015363ff0) at sched_idle+0x280
> end trace frame: 0x0, count: 10
> ddb{1}> mach ddbcpu 3
> Stopped atx86_ipi_db+0x12:leave
> x86_ipi_db(800015375ff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> acpicpu_idle() at acpicpu_idle+0x203
> sched_idle(800015375ff0) at sched_idle+0x280
> end trace frame: 0x0, count: 10
> ddb{3}> mach ddbcpu 4
> Stopped atx86_ipi_db+0x12:leave
> x86_ipi_db(80001537eff0) at x86_ipi_db+0x12
> x86_ipi_handler() at x86_ipi_handler+0x80
> Xresume_lapic_ipi() at Xresume_lapic_ipi+0x23
> acpicpu_idle() at acpicpu_idle+0x203
> sched_idle(80001537eff0

Re: crash on booting GENERIC.MP since upgrade to Jan 18 snapshot

2022-01-29 Thread Martin Pieuchot
On 28/01/22(Fri) 23:03, Thomas Frohwein wrote:
> On Sat, 29 Jan 2022 15:19:20 +1100
> Jonathan Gray  wrote:
> 
> > does this diff to revert uvm_fault.c rev 1.124 change anything?
> 
> Unfortunately no. Same pmap error as in the original bug report occurs
> with a kernel with this diff.

Could you submit a new bug report?  Could you manage to include ps and the
trace of all the CPUs when the pmap corruption occurs?

Do you have some steps to reproduce the corruption?  Which program is
currently running?  Is it multi-threaded?  What is the simplest scenario
to trigger the corruption?



Re: panic: kernel diagnostic assertion "uvm_page_owner_locked_p(pg)" failed: file "/usr/src/sys/uvm/uvm_page.c", line 1064

2022-01-18 Thread Martin Pieuchot
Thanks for the report.

On 18/01/22(Tue) 22:46, Ralf Horstmann wrote:
> >Synopsis:panic: kernel diagnostic assertion 
> >"uvm_page_owner_locked_p(pg)" failed: file "/usr/src/sys/uvm/uvm_page.c", 
> >line 1064
> >Category:kernel
> >Environment:
>   System  : OpenBSD 7.0
>   Details : OpenBSD 7.0-current (GENERIC.MP) #248: Tue Jan 11 
> 10:12:07 MST 2022
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> >Description:
>   The panic typically happens while using some X programs, in
>   most cases chromium. It can take days of uptime and occasional
>   system usage for the problem to show, sometimes but not always
>   with suspend and resume between boot and panic.
> 
>   I have seen the same diagnostics assertion with snapshots from
>   2021-31-12, 2022-01-05 and now 2022-01-11.
> 
>   Even though I do have ddb enabled by default, the system does
>   not always enter ddb and print a backtrace. But for the most
>   recent case I have the following details and a backtrace
>   (typed from screen):
> 
>   Stopped at db_enter+0x10: popq %rbp
>   TIDPID  UIDPRFLAGSPFLAGS CPUCOMMAND
>   *310366  5500400x14000 0x200   0K   aiodoned
>508990  6254100x14000 0x200   2srdis
>   db_enter () at db_enter+0x10
>   panic(81e5053e) at panic+0xbf
>   __assert(81ebcdc6,81e23741,428,81e718b1) at 
> __assert+0x25
>   uvm_page_unbusy(800022738d90,10) at uvm_page_unbusy+0x20e
>   uvm_aio_aiodone(fd81cd592360) at uvm_aio_aiodone+0x252
>   uvm_aiodone_daemon(8000fffefa40) at uvm_aiodone_daemon+0x124
>   end trace frame: 0x0, count: 9

This is caused by an incorrect lock assertion.  I just committed a fix.

The problem can be triggered when swapping anon.  It should be fixed in
the next snapshot.

Thanks again,
Martin



Re: NULL dereference in uvm_fault_lower

2021-12-16 Thread Martin Pieuchot
On 16/12/21(Thu) 17:32, Benjamin Baier wrote:
> Hello,
> 
> i think there is a possible NULL dereference in uvm_fault_lower.
> 
> uvm_fault.c:1282 assigns
>   uobjpage = PGO_DONTCARE;
>   uobj = NULL;
> 
> and then tries to lock on line 1288
>   locked = uvmfault_relock(ufi);
> 
> if this fails it continues to line 1324 which will NULL dereference
>   if (locked == FALSE) {
>   rw_exit(uobj->vmobjlock);
>   return ERESTART;
>   }

Thanks, syzkaller also triggered this.  Here's a quick fix as a first
step.  Then we should investigate which pgo_get() value is returned and
why the current fault handler is trying to do promotion when NetBSD's
uvm_fault_lower_io() isn't.

Index: uvm/uvm_fault.c
===
RCS file: /cvs/src/sys/uvm/uvm_fault.c,v
retrieving revision 1.122
diff -u -p -r1.122 uvm_fault.c
--- uvm/uvm_fault.c 15 Dec 2021 12:53:53 -  1.122
+++ uvm/uvm_fault.c 16 Dec 2021 21:57:39 -
@@ -1322,7 +1322,8 @@ uvm_fault_lower(struct uvm_faultinfo *uf
}
 
if (locked == FALSE) {
-   rw_exit(uobj->vmobjlock);
+   if (uobjpage != PGO_DONTCARE)
+   rw_exit(uobj->vmobjlock);
return ERESTART;
}
 



Re: panic: kernel diagnostic assertion "!ISSET(rt->rt_flags, RTF_UP)" failed: file "/usr/src/sys/net/route.c", line 506

2021-11-28 Thread Martin Pieuchot
On 26/11/21(Fri) 17:08, Alexander Bluhm wrote:
> On Fri, Nov 26, 2021 at 12:22:39PM +0100, Claudio Jeker wrote:
> > Guess someone introduced a double rtfree() somewhere.
> > Only explenation for this panic.
> 
> Here is a report with OpenBSD 6.9.  Bug has been there for a long
> time.
> 
> https://marc.info/?l=openbsd-bugs=162435709704591=2

I wonder if there isn't a race with rtm_output() or a timeout.  It would
help a lot if one could monitor the routing messages to know which RTM_*
command is issued to the kernel prior to the panic.

It would also help if you could figure out which route (dst, src,
flags) is triggering the panic.



Re: ppp panic: locking against myself

2021-11-28 Thread Martin Pieuchot
On 08/09/21(Wed) 07:33, Anton Lindqvist wrote:
> On Tue, Sep 07, 2021 at 09:59:22PM -0500, j...@jcs.org wrote:
> > >Synopsis:  ppp panic: locking against myself
> > >Category:  kernel
> > >Environment:
> > System  : OpenBSD 6.9
> > Details : OpenBSD 6.9 (GENERIC) #2: Tue Aug 10 08:12:32 MDT 2021
> >  
> > r...@syspatch-69-i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
> > 
> > Architecture: OpenBSD.i386
> > Machine : i386
> > >Description:
> > Running pppd over a serial modem. (What year is it?)
> > 
> > Ran pkg_add vim--no_x11, came back a half hour later and it had
> > panicked while installing the last dependency.
> > 
> > com0: 2 silo overflows, 0 ibuf overflows
> > com0: 2 silo overflows, 0 ibuf overflows
> > com0: 2 silo overflows, 0 ibuf overflows
> > com0: 1 silo overflow, 0 ibuf overflows
> > com0: 4 silo overflows, 0 ibuf overflows
> > panic: mtx 0xd14b3054: locking against myself
> > Stopped at  db_enter+0x4:   popl%ebp
> > panic: mtx 0xd14b3054: locking against myself
> > Stopped at  db_enter+0x4:   popl%ebp
> > TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND  
> >  
> > * 67354   3343  0 0x14000  0x2000  softnet  
> >   
> > db_enter() at db_enter+0x4
> > panic(d0bc8c2b) at panic+0xd3
> > mtx_enter(d14b3054) at mtx_enter+0x4e
> > task_add(d14b3040,d0df4d7c) at task_add+0x1d
> > ppp_restart(d1511800) at ppp_restart+0x3a
> > pppstart(d17d2200) at pppstart+0x55
> > comintr(d14da000) at comintr+0x4a5
> > intr_handler(f17d69d8,d14b3740) at intr_handler+0x18
> > Xintr_legacy4_untramp() at Xintr_legacy4_untramp+0xfb
> > taskq_next_work(d14b3040,f17d6a40) at taskq_next_work+0x8d
> > taskq_thread(d14b3040) at taskq_thread+0x43
> > https://www.openbsd.org/ddb.html describes the minimum info required in bug
> > reports.  Insufficient info makes it difficult to find and fix bugs.
> 
> Looks like it's trying to schedule a task while already handling one.
> The mutex associated with each net task queue have their IPL set to
> IPL_NET whereas IPL_TTY is probably needed here.

This sounds reasonable, or even IPL_HIGH because the same could happen
in any "real" interrupt handler, no?



Re: riscv64 panic

2021-11-01 Thread Martin Pieuchot
On 31/10/21(Sun) 15:57, Jeremie Courreges-Anglas wrote:
> On Fri, Oct 08 2021, Jeremie Courreges-Anglas  wrote:
> > riscv64.ports was running dpb(1) with two other members in the build
> > cluster.  A few minutes ago I found it in ddb(4).  The report is short,
> > sadly, as the machine doesn't return from the 'bt' command.
> >
> > The machine is acting both as an NFS server and and NFS client.
> >
> > OpenBSD/riscv64 (riscv64.ports.openbsd.org) (console)
> >
> > login: panic: pool_anic:t: pol_ free l: p mod fiee liat m  oxifief:c a2e 
> > 07ff0ff fte21ade0 00f ifem c0d
> > 1 07f1f0ffcf2177 010=0 c16ce6 7x090xc52c !
> > 0x9066d21 919 xc1521
> > Stopped at  panic+0xfe: addia0,zero,256TIDPIDUID
> >  PR
> > FLAGS PFLAGS  CPU  COMMAND
> >   24243  43192 55 0x2  00  cc
> > *480349  52543  00x11  01  perl
> >  480803  72746 55 0x2  03  c++
> >  366351   3003 55 0x2  02K c++
> > panic() at panic+0xfa
> > panic() at pool_do_get+0x29a
> > pool_do_get() at pool_get+0x76
> > pool_get() at pmap_enter+0x128
> > pmap_enter() at uvm_fault_upper+0x1c2
> > uvm_fault_upper() at uvm_fault+0xb2
> > uvm_fault() at do_trap_user+0x120
> > https://www.openbsd.org/ddb.html describes the minimum info required in bug
> > reports.  Insufficient info makes it difficult to find and fix bugs.
> > ddb{1}> bt
> > panic() at panic+0xfa
> > panic() at pool_do_get+0x29a
> > pool_do_get() at pool_get+0x76
> > pool_get() at pmap_enter+0x128
> > pmap_enter() at uvm_fault_upper+0x1c2
> > uvm_fault_upper() at uvm_fault+0xb2
> > uvm_fault() at do_trap_user+0x120
> > do_trap_user() at cpu_exception_handler_user+0x7a
> > 
> 
> Another panic on riscv64-1, a new board which doesn't have RTC/I2C
> problems anymore and is acting as a dpb(1) cluster member/NFS client.

Why are both traces ending in pool_do_get()?  Are CPU0 and CPU1 there at
the same time?

This corruption, as well as the one above, arises in the top part of
the fault handler, which already runs concurrently.  Did you try putting
KERNEL_LOCK/UNLOCK() dances around uvm_fault() in trap.c?  That could
help figure out if something is still unsafe in riscv64's pmap.

> 
> panic: pool_do_get: rwobjpl fane c: sool_difget:  ragbjpx 
> ffef li22 6od0fi; d: pm gd 0 xfff222baa0e ^M8addo 
> ff0x020b6adf8f; 3cf0e 94ic  =p0 
> l d4_ef 85 0xof4cl fStopped at  panic+0xfe: addia0,zero,256
> TIDPIDUID PR
> FLAGS PFLAGS  CPU  COMMAND
> * 94448  18837 550x12  01  bzip2
>  139717  98504 55 0x2  00  perl
>  451857  10216 55 0x2  03  c++
>  215599  53280 55 0x2  02  c++
> panic() at panic+0xfa
> panic() at pool_do_get+0x29a
> pool_do_get() at pool_get+0x76
> pool_get() at _rw_obj_alloc_flags+0x1e
> _rw_obj_alloc_flags() at amap_alloc+0x3a
> amap_alloc() at amap_copy+0x2b6
> amap_copy() at uvm_fault_check+0x1ec
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb{1}> [-- jca@localhost attached -- Sun Oct 31 08:36:49 2021]
> 
> 
> ddb{1}> show panic
>  cpu0: pool_do_get: rwobjpl free list modified: page 0xffc22b6ad000; item 
> a
> ddr 0xffc22b6ade88; offset 0x0=0xcf4fef853c0094c7 != 0xcf4fef853cfc94c7
>  cpu3: pool_do_get: rwobjpl free list modified: page 0xffc22b6ad000; item 
> a
> ddr 0xffc22b6ade88; offset 0x0=0xcf4fef853c0094c7 != 0xcf4fef853cfc94c7
> *cpu1: pool_do_get: rwobjpl free list modified: page 0xffc22b6ad000; item 
> a
> ddr 0xffc22b6ade88; offset 0x0=0xcf4fef853c0094c7 != 0xcf4fef853cfc94c7
> ddb{1}> trace
> panic() at panic+0xfa
> panic() at pool_do_get+0x29a
> pool_do_get() at pool_get+0x76
> pool_get() at _rw_obj_alloc_flags+0x1e
> _rw_obj_alloc_flags() at amap_alloc+0x3a
> amap_alloc() at amap_copy+0x2b6
> amap_copy() at uvm_fault_check+0x1ec
> uvm_fault_check() at uvm_fault+0xd0
> uvm_fault() at do_trap_user+0x120
> do_trap_user() at cpu_exception_handler_user+0x7a
> address 0xfffe is invalid
> ddb{1}> mach ddbcpu 0
> Stopped at  ipi_intr+0x22:  c.lia0,1
> ipi_intr() at ipi_intr+0x1e
> ipi_intr() at riscv_cpu_intr+0x1e
> riscv_cpu_intr() at cpu_exception_handler_supervisor+0x78
> cpu_exception_handler_supervisor() at cnputc+0x2a
> cnputc() at db_putchar+0x322
> db_putchar() at kprintf+0xc36
> kprintf() at db_printf+0x4a
> ddb{0}> trace
> ipi_intr() at ipi_intr+0x1e
> ipi_intr() at riscv_cpu_intr+0x1e
> riscv_cpu_intr() at cpu_exception_handler_supervisor+0x78
> cpu_exception_handler_supervisor() at cnputc+0x2a
> cnputc() at db_putchar+0x322
> db_putchar() at kprintf+0xc36
> kprintf() at db_printf+0x4a
> db_printf() at panic+0x8a
> panic() at pool_do_get+0x29a
> pool_do_get() at pool_get+0x76
> pool_get() at _rw_obj_alloc_flags+0x1e
> _rw_obj_alloc_flags() at amap_alloc+0x3a
> 

Re: poll: hangs when polling closed fd for POLLHUP

2021-10-29 Thread Martin Pieuchot
On 29/10/21(Fri) 14:47, Martin Pieuchot wrote:
> On 29/10/21(Fri) 10:50, Anton Lindqvist wrote:
> > On Fri, Oct 29, 2021 at 09:13:58AM +0100, Larry Hynes wrote:
> > > Hi
> > > 
> > > In last two snapshots I installed, poll seems to hang in certain
> > > circumstances.
> > > 
> > > Here's a reproducer, from Leah Neukirchen (cc'ed on this mail):
> 
> Thanks, do you agree to license it under ISC, as in 
> /usr/share/misc/license.template, so we can put it in src/regress?
> 
> > > 
> > > --
> > > 
> > > #include 
> > > #include 
> > > #include 
> > > 
> > > int
> > > main()
> > > {
> > >   struct pollfd fds[1];
> > >   
> > >   fds[0].fd = 0;
> > >   fds[0].events = POLLIN | POLLHUP;
> > >   close(0);
> > > 
> > >   printf("%d\n", poll(fds, 1, -1));
> > >   printf("%d\n", fds[0].revents & POLLNVAL);
> > > }
> > > 
> > > --
> > > 
> > > When compiled and run, that should hang on latest snap. It would be
> > > expected to return immediately with POLLNVAL set.
> > 
> > I think poll() needs similar handling as select() recently gained.
> > Caution, only compile tested.
> 
> Here's a simpler fix, do not sleep at all if we already have something
> to report.

After some discussions with anton@ I believe we should instead go with a
version similar to what sys_select is doing.  It is necessary to call
ppollcollect() for FDs that are ready.

Here's anton@'s version without extra argument:

Index: kern/sys_generic.c
===================================================================
RCS file: /cvs/src/sys/kern/sys_generic.c,v
retrieving revision 1.138
diff -u -p -r1.138 sys_generic.c
--- kern/sys_generic.c  24 Oct 2021 11:23:22 -  1.138
+++ kern/sys_generic.c  29 Oct 2021 14:05:56 -
@@ -1128,7 +1128,7 @@ doppoll(struct proc *p, struct pollfd *f
 {
struct kqueue_scan_state scan;
struct pollfd pfds[4], *pl = pfds;
-   int error, nevents = 0;
+   int error, ncollected, nevents = 0;
size_t sz;
 
/* Standards say no more than MAX_OPEN; this is possibly better. */
@@ -1154,14 +1154,14 @@ doppoll(struct proc *p, struct pollfd *f
dosigsuspend(p, *sigmask &~ sigcantmask);
 
/* Register kqueue events */
-   *retval = ppollregister(p, pl, nfds, &nevents);
+   ncollected = ppollregister(p, pl, nfds, &nevents);
 
/*
 * The poll/select family of syscalls has been designed to
 * block when file descriptors are not available, even if
 * there's nothing to wait for.
 */
-   if (nevents == 0) {
+   if (nevents == 0 && ncollected == 0) {
uint64_t nsecs = INFSLP;
 
if (timeout != NULL) {
@@ -1184,7 +1184,7 @@ doppoll(struct proc *p, struct pollfd *f
struct kevent kev[KQ_NEVENTS];
int i, ready, count;
 
-   /* Maxium number of events per iteration */
+   /* Maximum number of events per iteration */
count = MIN(nitems(kev), nevents);
ready = kqueue_scan(&scan, count, kev, timeout, p, &error);
 #ifdef KTRACE
@@ -1193,7 +1193,7 @@ doppoll(struct proc *p, struct pollfd *f
 #endif
/* Convert back events that are ready. */
for (i = 0; i < ready; i++)
-   *retval += ppollcollect(p, &kev[i], pl, nfds);
+   ncollected += ppollcollect(p, &kev[i], pl, nfds);
 
/*
 * Stop if there was an error or if we had enough
@@ -1205,6 +1205,7 @@ doppoll(struct proc *p, struct pollfd *f
nevents -= ready;
}
kqueue_scan_finish(&scan);
+   *retval = ncollected;
 done:
/*
 * NOTE: poll(2) is not restarted after a signal and EWOULDBLOCK is



Re: poll: hangs when polling closed fd for POLLHUP

2021-10-29 Thread Martin Pieuchot
On 29/10/21(Fri) 10:50, Anton Lindqvist wrote:
> On Fri, Oct 29, 2021 at 09:13:58AM +0100, Larry Hynes wrote:
> > Hi
> > 
> > In last two snapshots I installed, poll seems to hang in certain
> > circumstances.
> > 
> > Here's a reproducer, from Leah Neukirchen (cc'ed on this mail):

Thanks, do you agree to license it under ISC, as in 
/usr/share/misc/license.template, so we can put it in src/regress?

> > 
> > --
> > 
> > #include 
> > #include 
> > #include 
> > 
> > int
> > main()
> > {
> > struct pollfd fds[1];
> > 
> > fds[0].fd = 0;
> > fds[0].events = POLLIN | POLLHUP;
> > close(0);
> > 
> > printf("%d\n", poll(fds, 1, -1));
> > printf("%d\n", fds[0].revents & POLLNVAL);
> > }
> > 
> > --
> > 
> > When compiled and run, that should hang on latest snap. It would be
> > expected to return immediately with POLLNVAL set.
> 
> I think poll() needs similar handling as select() recently gained.
> Caution, only compile tested.

Here's a simpler fix, do not sleep at all if we already have something
to report.

This works for me, ok?

Index: kern/sys_generic.c
===================================================================
RCS file: /cvs/src/sys/kern/sys_generic.c,v
retrieving revision 1.138
diff -u -p -r1.138 sys_generic.c
--- kern/sys_generic.c  24 Oct 2021 11:23:22 -  1.138
+++ kern/sys_generic.c  29 Oct 2021 12:26:48 -
@@ -1155,6 +1155,8 @@ doppoll(struct proc *p, struct pollfd *f
 
/* Register kqueue events */
*retval = ppollregister(p, pl, nfds, &nevents);
+   if (*retval != 0)
+   goto done;
 
/*
 * The poll/select family of syscalls has been designed to



Re: OpenBSD amd64 6.9 repeatable kernel panic starting X

2021-09-15 Thread Martin Pieuchot
On 13/09/21(Mon) 08:25, M Smith wrote:
> On 8/09/21 3:37 am, Martin Pieuchot wrote:
> > Hello,
> > 
> > Thanks for your bug report.
> > 
> > On 07/09/21(Tue) 15:18, M Smith wrote:
> > > > Synopsis:   OpenBSD amd64 6.9 repeatable kernel panic starting X
> > > > Category:   kernel
> > > > Environment:
> > > 
> > >   System  : OpenBSD 6.9
> > >   Details : OpenBSD 6.9 (GENERIC.MP) #4: Tue Aug 10 08:12:23 MDT 2021
> > >   
> > > r...@syspatch-69-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> > > 
> > >   Architecture: OpenBSD.amd64
> > >   Machine : amd64
> > > 
> > > > Description:
> > > 
> > >   I have been investigating a largely repeatable OpenBSD 6.9 
> > > amd64 panic.  Essentially the OS drops into the kernel debugger about 90% 
> > > of the time when starting X on specific hardware, and is doing so with 
> > > what seems like a memory related issue - possibly errant modification by 
> > > concurrent threads.
> > 
> > Indeed.  You're certainly hitting a VM/pmap bug.
> > 
> > >   The event is reproducible across two independent machines (both new).  
> > > Each machine has identical underlying hardware.  A memory checker run 
> > > overnight on one machine did not identify any underlying memory issues.
> > 
> > That points to something in your setup which exposes the bug.
> > 
> > >   The hardware: Avalue EMS-TGL-S85-A1-1R, CPU an 11th Gen Intel(R) 
> > > Core(TM) i7-1185G7E @ 2.80GHz with 2x 16GB memory boards (32GB in total).
> > > 
> > >   The mentioned possible errant memory modification, the assertion 
> > > underlying this panic 
> > > (https://www.sirranet.co.nz/openbsd_542456/69_panic.html) suggests that 
> > > kernel execution has failed to obtain a necessary exclusivity lock.  
> > > Various other panics differ in that many feature assertions based on 
> > > "pool_do_get ... offset ???" with the offset identifying the trigger 
> > > condition, hinting at a memory inconsistency.
> > > 
> > >   Testing on 7.0-current 
> > > (https://www.sirranet.co.nz/openbsd_542456/70_panic.html) sometimes 
> > > results in a panic on boot before invoking startX, other times the boot 
> > > fails to complete cleanly at the kernel linking step with the error 
> > > "reordering libraries ld in calloc(): chunk info corrupted" and similar 
> > > errors.  Whether these two events are related to the 6.9 panic is 
> > > anything but conclusive.
> > > 
> > >   I see others have posted what looks like the same issue.  I have posted 
> > > the above detail however as the assert identifying the lack of kernel 
> > > lock looks as though it may be of some value.
> > >   https://marc.info/?t=16176931482=1=2
> > >   https://marc.info/?t=16239060261=1=2
> > 
> > All those reports have an 11th Gen Intel CPU in common.
> > 
> > >   Any ideas would be greatly appreciated.
> > 
> > You could start by booting bsd.sp to rule out any HW problem.
> 
> Sorry for the delay in replying.
> 
> Both 6.9 and 7.0 crash when booting bsd.sp
> https://www.sirranet.co.nz/openbsd_542456/69_reply.html
> https://www.sirranet.co.nz/openbsd_542456/70_reply.html

That rules out any concurrency issue.

> > Does the corruption happen with a vanilla install, or does running a
> > particular program make it easier to trigger?
> 
> These are both basic installs. After a fresh install I have run fw_update,
> and on the 6.9 machine syspatch was run. Other than that we have enabled
> xenodm. No other software or packages are installed or running. The machines
> don't always crash on first boot, but after a handful of reboots they do.
> 
> > >   I can easily test/re-test on both 6.9 and 7.0-current).
> > 
> > Does it also happen if you disable drm at boot?
> > 
> 
> On both 6.9 and 7.0  if I disable drm the machine panics on reboot. (Images
> in the links above.)

Please make sure you also disable inteldrm(4); that's why you're still
getting a panic on 6.9.  This is to see whether the issue is related to
the graphics driver.



Re: __mp_lock_spin: 0xffffffff822d1120 lock spun out

2021-09-15 Thread Martin Pieuchot
On 15/09/21(Wed) 12:06, Paul de Weerd wrote:
> Hi all,
> 
> After some off-list advice from Patrick to enable MP_LOCKDEBUG in
> order to debug the hangs I reported [1], I did exactly that and was
> running a self-built kernel for some time.  This morning, I wanted to
> upgrade to the latest snapshot so I also cvs up'd and rebuilt my
> kernel with MP_LOCKDEBUG.  However, now I get __mp_lock_spin during
> boot:
> 
> root on sd2a (a0b80508b6693ba1.a) swap on sd2b dump on sd2b
> inteldrm0: 1920x1080, 32bpp
> wsdisplay0 at inteldrm0 mux 1
> __mp_lock_spin: 0x822d1120 lock spun out
> Stopped at  db_enter+0x10:  popq%rbp
> ddb{1}> trace
> db_enter() at db_enter+0x10
> __mp_lock(822d1120) at __mp_lock+0xa2
> __mp_acquire_count(822d1120,1) at __mp_acquire_count+0x38
> mi_switch() at mi_switch+0x299
> sleep_finish(8000226d4f80,1) at sleep_finish+0x11c
> msleep(8011d980,8011d998,20,81e828e3,0) at msleep+0xcc
> taskq_next_work(8011d980,8000226d5040) at taskq_next_work+0x61
> taskq_thread(8011d980) at taskq_thread+0x6c
> end trace frame: 0x0, count: -8

That means another CPU has been holding the KERNEL_LOCK() for too long.
When this happens it is more important to look at what the other CPUs
are doing, because one of them is the lock holder.  If you can reproduce
this, please include the output of "ps /o" and the trace from all the
CPUs.
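Concretely, a ddb(4) session of this shape would capture what is needed
(a sketch; the CPU numbers are machine-dependent):

```
ddb{1}> ps /o
ddb{1}> trace
ddb{1}> mach ddbcpu 0
ddb{0}> trace
ddb{0}> mach ddbcpu 2
ddb{2}> trace
```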

Note that the default value of MP_LOCKDEBUG might be too sensitive for
some workloads; WITNESS might not spot the same issue, but it does not
produce false positives.

Thanks,
Martin



Re: OpenBSD amd64 6.9 repeatable kernel panic starting X

2021-09-07 Thread Martin Pieuchot
Hello,

Thanks for your bug report.

On 07/09/21(Tue) 15:18, M Smith wrote:
> > Synopsis:   OpenBSD amd64 6.9 repeatable kernel panic starting X
> > Category:   kernel
> > Environment:
> 
>   System  : OpenBSD 6.9
>   Details : OpenBSD 6.9 (GENERIC.MP) #4: Tue Aug 10 08:12:23 MDT 2021
>   
> r...@syspatch-69-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC.MP
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> 
> > Description:
> 
>   I have been investigating a largely repeatable OpenBSD 6.9 
> amd64 panic.  Essentially the OS drops into the kernel debugger about 90% of 
> the time when starting X on specific hardware, and is doing so with what 
> seems like a memory related issue - possibly errant modification by 
> concurrent threads.

Indeed.  You're certainly hitting a VM/pmap bug.  

>   The event is reproducible across two independent machines (both new).  
> Each machine has identical underlying hardware.  A memory checker run 
> overnight on one machine did not identify any underlying memory issues.

That points to something in your setup which exposes the bug.

>   The hardware: Avalue EMS-TGL-S85-A1-1R, CPU an 11th Gen Intel(R) 
> Core(TM) i7-1185G7E @ 2.80GHz with 2x 16GB memory boards (32GB in total).
> 
>   The mentioned possible errant memory modification, the assertion 
> underlying this panic 
> (https://www.sirranet.co.nz/openbsd_542456/69_panic.html) suggests that 
> kernel execution has failed to obtain a necessary exclusivity lock.  Various 
> other panics differ in that many feature assertions based on "pool_do_get ... 
> offset ???" with the offset identifying the trigger condition, hinting at a 
> memory inconsistency.
> 
>   Testing on 7.0-current 
> (https://www.sirranet.co.nz/openbsd_542456/70_panic.html) sometimes results 
> in a panic on boot before invoking startX, other times the boot fails to 
> complete cleanly at the kernel linking step with the error "reordering 
> libraries ld in calloc(): chunk info corrupted" and similar errors.  Whether 
> these two events are related to the 6.9 panic is anything but conclusive.
> 
>   I see others have posted what looks like the same issue.  I have posted 
> the above detail however as the assert identifying the lack of kernel lock 
> looks as though it may be of some value.
>   https://marc.info/?t=16176931482=1=2
>   https://marc.info/?t=16239060261=1=2

All those reports have an 11th Gen Intel CPU in common.

>   Any ideas would be greatly appreciated.

You could start by booting bsd.sp to rule out any HW problem.

Does the corruption happen with a vanilla install, or does running a
particular program make it easier to trigger?
 
>   I can easily test/re-test on both 6.9 and 7.0-current).

Does it also happen if you disable drm at boot?



WITNESS kernel on sparc64

2021-09-05 Thread Martin Pieuchot
Kernels compiled with the WITNESS option do not boot on sparc64.  It's
not a size issue: I tried disabling drm to make sure the kernel size did
not grow.

It would help me a lot debugging SMP issues if this could be fixed.  I
don't have the knowledge to do it myself.

Here's the output.  Don't be fooled by the name, I copy the file as
bsd.upgrade to test development kernels:

Boot device: disk  File and args: 
OpenBSD IEEE 1275 Bootblock 2.1
..>> OpenBSD BOOT 1.21

ERROR: /iscsi-hba: No iscsi-network-bootpath property
upgrade detected: switching to /bsd.upgrade
Trying /bsd.upgrade...
|
Evaluating: 
Flushbuf error
Booting /virtual-devices@100/channel-devices@200/disk@0:a/bsd.upgrade
7207632@0x100+1328@0x16dfad0+156436@0x1c0+4037868@0x1c26314 
symbols @ 0xfe41c400 336191+165+477096+306048 start=0x100
[ using 1120528 bytes of bsd ELF symbol table ]
ERROR: Last Trap: Fast Data Access MMU Miss



Re: kernel: page fault trap in rw_status

2021-08-30 Thread Martin Pieuchot
On 06/08/21(Fri) 08:08, Theo Buehler wrote:
> > The diff below fixes this by making the newly allocated amap share the
> > "source" amap's lock.  This is not strictly necessary on OpenBSD since the
> > amap is only inserted on the global list at the end of amap_copy(), but it
> > satisfies the locking requirement of amap_wipeout() and is IMHO the
> > simplest solution.  It also has the advantage of reducing the
> > differences with NetBSD.
> 
> This diff causes a reproducible hang with mount_mfs(8) on my laptop. I
> use an mfs noperm partition as described in
> https://www.openbsd.org/faq/faq5.html#Release
> 
> $ tail -1 /etc/fstab
> swap /dest mfs rw,nosuid,noperm,-P/var/dest,-s1.5G,noauto 0 0
> 
> If I run 'top -S -s.1' on an otherwise idle system and do 'mount /dest',
> I see two mount_mfs(8) processes spin in fltamap and the pagedaemon is
> spinning in pgdaemon before the system locks up completely (this takes
> something between 1 and 20 seconds to happen).

I couldn't reproduce this hang here.  Do you also see it with the
smaller fix below?

Index: uvm/uvm_amap.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_amap.c,v
retrieving revision 1.89
diff -u -p -r1.89 uvm_amap.c
--- uvm/uvm_amap.c  26 Mar 2021 13:40:05 -  1.89
+++ uvm/uvm_amap.c  30 Aug 2021 08:31:21 -
@@ -618,6 +618,13 @@ amap_copy(struct vm_map *map, struct vm_
return;
srcamap = entry->aref.ar_amap;
 
+   /*
+* Make the new amap share the source amap's lock, and then lock
+* both.
+*/
+   amap->am_lock = srcamap->am_lock;
+   rw_obj_hold(amap->am_lock);
+
amap_lock(srcamap);
 
/*
@@ -655,7 +662,7 @@ amap_copy(struct vm_map *map, struct vm_
 
chunk = amap_chunk_get(amap, lcv, 1, PR_NOWAIT);
if (chunk == NULL) {
-   amap_unlock(srcamap);
+   /* amap_wipeout() releases the lock. */
amap->am_ref = 0;
amap_wipeout(amap);
return;
@@ -695,10 +702,10 @@ amap_copy(struct vm_map *map, struct vm_
 * If we referenced any anons, then share the source amap's lock.
 * Otherwise, we have nothing in common, so allocate a new one.
 */
-   KASSERT(amap->am_lock == NULL);
-   if (amap->am_nused != 0) {
-   amap->am_lock = srcamap->am_lock;
-   rw_obj_hold(amap->am_lock);
+   KASSERT(amap->am_lock == srcamap->am_lock);
+   if (amap->am_nused == 0) {
+   rw_obj_free(amap->am_lock);
+   amap->am_lock = NULL;
}
amap_unlock(srcamap);
 



Re: Crash when unplugging a UPS USB connection

2021-08-05 Thread Martin Pieuchot
On 05/08/21(Thu) 09:41, Aaron Bieber wrote:
> 
> Stuart Henderson writes:
> 
> > On 2021/07/13 01:29, Anindya Mukherjee wrote:
> >> I have a Cyber Power CP1500PFCLCDa UPS and get exactly the same crash on 
> >> the
> >> latest snapshot if the USB cable is unplugged. My dmesg is very similar so 
> >> I've
> >> omitted it, but I'd also be happy to help debug this issue.
> >> 
> >> Regards,
> >> Anindya
> >> 
> >
> > First try backing out the "Allow uhidev child devices to claim selective
> > report ids" commit from March.
> 
> Here is a diff that reverts:
> 
> 2021-03-08 6545f693 jcs   Add another Type Cover device
> 2021-03-08 1f85050a jcs   regen
> 2021-03-08 fc9d2605 jcs   Add Surface Pro Type Cover
> 2021-03-08 f31b43ce jcs   Allow uhidev child devices to claim selective 
> report ids
> 
> With them reverted, my UPS(s) no longer trigger panics on disconnect.

The bug lies in upd_match() and was introduced with the change
containing UHIDEV_CLAIM_MULTIPLE_REPORTID.

upd(4) owns all the report IDs of a USB device, so "uha.claimed" logic
similar to what was written for umt_match() is missing.

> diff --git a/sys/dev/usb/fido.c b/sys/dev/usb/fido.c
> index c6d846aaa84..77bd9b12175 100644
> --- a/sys/dev/usb/fido.c
> +++ b/sys/dev/usb/fido.c
> @@ -1,4 +1,4 @@
> -/*   $OpenBSD: fido.c,v 1.3 2021/03/08 14:35:57 jcs Exp $*/
> +/*   $OpenBSD: fido.c,v 1.2 2019/12/18 05:09:53 deraadt Exp $*/
>  
>  /*
>   * Copyright (c) 2019 Reyk Floeter 
> @@ -63,7 +63,7 @@ fido_match(struct device *parent, void *match, void *aux)
>   void *desc;
>   int   ret = UMATCH_NONE;
>  
> - if (uha->reportid == UHIDEV_CLAIM_MULTIPLE_REPORTID)
> + if (uha->reportid == UHIDEV_CLAIM_ALLREPORTID)
>   return (ret);
>  
>   /* Find the FIDO usage page and U2F collection */
> diff --git a/sys/dev/usb/ucycom.c b/sys/dev/usb/ucycom.c
> index ca8636f0a7f..ca6d6e9c6b2 100644
> --- a/sys/dev/usb/ucycom.c
> +++ b/sys/dev/usb/ucycom.c
> @@ -1,4 +1,4 @@
> -/*   $OpenBSD: ucycom.c,v 1.39 2021/03/08 14:35:57 jcs Exp $ */
> +/*   $OpenBSD: ucycom.c,v 1.38 2020/02/25 10:03:39 mpi Exp $ */
>  /*   $NetBSD: ucycom.c,v 1.3 2005/08/05 07:27:47 skrll Exp $ */
>  
>  /*
> @@ -165,7 +165,7 @@ ucycom_match(struct device *parent, void *match, void 
> *aux)
>  {
>   struct uhidev_attach_arg *uha = aux;
>  
> - if (uha->reportid == UHIDEV_CLAIM_MULTIPLE_REPORTID)
> + if (uha->reportid == UHIDEV_CLAIM_ALLREPORTID)
>   return (UMATCH_NONE);
>  
>   return (usb_lookup(ucycom_devs, uha->uaa->vendor, uha->uaa->product) != 
> NULL ?
> diff --git a/sys/dev/usb/ugold.c b/sys/dev/usb/ugold.c
> index 752ecff64d2..865618cc17d 100644
> --- a/sys/dev/usb/ugold.c
> +++ b/sys/dev/usb/ugold.c
> @@ -1,4 +1,4 @@
> -/*   $OpenBSD: ugold.c,v 1.17 2021/04/05 16:26:06 landry Exp $   */
> +/*   $OpenBSD: ugold.c,v 1.15 2020/08/17 04:26:57 gnezdo Exp $   */
>  
>  /*
>   * Copyright (c) 2013 Takayoshi SASANO 
> @@ -113,7 +113,7 @@ ugold_match(struct device *parent, void *match, void *aux)
>   int size;
>   void *desc;
>  
> - if (uha->reportid == UHIDEV_CLAIM_MULTIPLE_REPORTID)
> + if (uha->reportid == UHIDEV_CLAIM_ALLREPORTID)
>   return (UMATCH_NONE);
>  
>   if (usb_lookup(ugold_devs, uha->uaa->vendor, uha->uaa->product) == NULL)
> diff --git a/sys/dev/usb/uhid.c b/sys/dev/usb/uhid.c
> index 085c1523ccf..ba21e8cf96f 100644
> --- a/sys/dev/usb/uhid.c
> +++ b/sys/dev/usb/uhid.c
> @@ -1,4 +1,4 @@
> -/*   $OpenBSD: uhid.c,v 1.84 2021/03/08 14:35:57 jcs Exp $ */
> +/*   $OpenBSD: uhid.c,v 1.83 2021/01/29 16:59:41 sthen Exp $ */
>  /*   $NetBSD: uhid.c,v 1.57 2003/03/11 16:44:00 augustss Exp $   */
>  
>  /*
> @@ -115,7 +115,7 @@ uhid_match(struct device *parent, void *match, void *aux)
>  {
>   struct uhidev_attach_arg *uha = aux;
>  
> - if (uha->reportid == UHIDEV_CLAIM_MULTIPLE_REPORTID)
> + if (uha->reportid == UHIDEV_CLAIM_ALLREPORTID)
>   return (UMATCH_NONE);
>  
>   return (UMATCH_IFACECLASS_GENERIC);
> diff --git a/sys/dev/usb/uhidev.c b/sys/dev/usb/uhidev.c
> index cbd0b8336f6..2333c260d71 100644
> --- a/sys/dev/usb/uhidev.c
> +++ b/sys/dev/usb/uhidev.c
> @@ -1,5 +1,4 @@
>  /*   $OpenBSD: uhidev.c,v 1.92 2021/03/18 09:21:53 anton Exp $   */
> -/*   $NetBSD: uhidev.c,v 1.14 2003/03/11 16:44:00 augustss Exp $ */
>  
>  /*
>   * Copyright (c) 2001 The NetBSD Foundation, Inc.
> @@ -250,28 +249,21 @@ uhidev_attach(struct device *parent, struct device 
> *self, void *aux)
>  
>   uha.uaa = uaa;
>   uha.parent = sc;
> - uha.reportid = UHIDEV_CLAIM_MULTIPLE_REPORTID;
> - uha.nreports = nrepid;
> - uha.claimed = malloc(nrepid, M_TEMP, M_WAITOK|M_ZERO);
> + uha.reportid = UHIDEV_CLAIM_ALLREPORTID;
>  
> - /* Look for a driver claiming multiple report IDs first. */
> + /* Look for a driver claiming all report IDs first. */
>   dev = 

Re: kernel: page fault trap in rw_status

2021-08-05 Thread Martin Pieuchot
Hello Thomas,

Thanks a lot for your great bug report; see below for an explanation and
a possible fix.

On 04/08/21(Wed) 12:18, Thomas L. wrote:
> >Synopsis:  page fault trap in rw_status
> >Category:  kernel
> >Environment:
> System  : OpenBSD 6.9
> Details : OpenBSD 6.9 (GENERIC) #4: Mon Jun  7 08:20:14 MDT 2021
>  
> r...@syspatch-69-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> 
> Architecture: OpenBSD.amd64
> Machine : amd64
> >Description:
> One of my VMs crashed with a page fault trap
> ddb> show panic
> kernel page fault
> uvm_fault(0xfd801a844120, 0x0, 0, 1) -> e
> rw_status(0) at rw_status+0x11
> end trace frame: 0x8f9991e0, count: 0
> ddb> trace
> rw_status(0) at rw_status+0x11
> amap_wipeout(fd8005893740) at amap_wipeout+0x2a
> uvm_fault_check(8f999380,8f9993b8,8f9993e0) at 
> uvm_faul
> t_check+0x272
> uvm_fault(fd801a844120,7f7c,0,2) at uvm_fault+0xab
> upageflttrap(8f9994e0,7f7c04c8) at upageflttrap+0x59

The problem comes from an error path in amap_copy().  Your workload puts
the system under memory pressure, which makes amap_chunk_get() fail
while handling a page fault.  amap_wipeout() is then called on a newly
allocated amap which doesn't have a lock yet.

> [...]
> ddb> show uvm
> Current UVM status:
>   pagesize=4096 (0x1000), pagemask=0xfff, pageshift=12
>   121478 VM pages: 60248 active, 30019 inactive, 65 wired, 5 free (0 zero)
>   min  10% (25) anon, 10% (25) vnode, 5% (12) vtext
>   freemin=4049, free-target=5398, inactive-target=30036, wired-max=40492
>   faults=9051552, traps=9231593, intrs=26202481, ctxswitch=6058861 fpuswitch=0
>   softint=13077905, syscalls=25514901, kmapent=11
>   fault counts:
> noram=1191470, noanon=0, noamap=0, pgwait=7, pgrele=0
> ok relocks(total)=1652265(1652346), anget(retries)=4025885(1468035), 
> amapcopy=1540576
> neighbor anon/obj pg=1512054/4165883, gets(lock/unlock)=1652565/184269
> cases: anon=3835980, anoncow=189847, obj=1363156, prcopy=288413, 
> przero=3377927
>   daemon and swap counts:
> woke=1165876, revs=1187026, scans=1832267541, obscans=21522202, 
> anscans=1810745339
> busy=0, freed=746569, reactivate=0, deactivate=1694102
> pageouts=94994196, pending=91077, nswget=282429
> nswapdev=1
> swpages=262143, swpginuse=262142, swpgonly=240501 paging=0
>   kernel pointers:
> objs(kern)=0x82158318

The diff below fixes this by making the newly allocated amap share the
"source" amap's lock.  This is not strictly necessary on OpenBSD since the
amap is only inserted on the global list at the end of amap_copy(), but it
satisfies the locking requirement of amap_wipeout() and is IMHO the
simplest solution.  It also has the advantage of reducing the
differences with NetBSD.

The diff below includes other style fixes that I might commit
separately, but it is easier for me to include them since I checked
NetBSD's sources anyway.

Note that your workload, involving an application in ruby26, puts a lot
of pressure on the memory subsystem, so I won't be surprised if you find
other bugs.  In any case I'd be delighted to look at your bug reports,
thanks a lot!

Index: uvm/uvm_amap.c
===================================================================
RCS file: /cvs/src/sys/uvm/uvm_amap.c,v
retrieving revision 1.89
diff -u -p -r1.89 uvm_amap.c
--- uvm/uvm_amap.c  26 Mar 2021 13:40:05 -  1.89
+++ uvm/uvm_amap.c  5 Aug 2021 10:11:45 -
@@ -525,7 +525,6 @@ amap_wipeout(struct vm_amap *amap)
/*
 * Finally, destroy the amap.
 */
-   amap->am_ref = 0;   /* ... was one */
amap->am_nused = 0;
amap_unlock(amap);
amap_free(amap);
@@ -555,13 +554,17 @@ amap_copy(struct vm_map *map, struct vm_
int i, j, k, n, srcslot;
struct vm_amap_chunk *chunk = NULL, *srcchunk = NULL;
struct vm_anon *anon;
+   vsize_t len;
 
KASSERT(map != kernel_map); /* we use sleeping locks */
 
+   srcamap = entry->aref.ar_amap;
+   len = entry->end - entry->start;
+
/*
 * Is there an amap to copy?  If not, create one.
 */
-   if (entry->aref.ar_amap == NULL) {
+   if (srcamap == NULL) {
/*
 * Check to see if we have a large amap that we can
 * chunk.  We align startva/endva to chunk-sized
@@ -571,7 +574,7 @@ amap_copy(struct vm_map *map, struct vm_
 * that makes it grow or shrink dynamically with
 * the number of slots.
 */
-   if (atop(entry->end - entry->start) >= UVM_AMAP_LARGE) {
+   if (atop(len) >= UVM_AMAP_LARGE) {
if (canchunk) {
/* convert slots to bytes */
chunksize = UVM_AMAP_CHUNK << PAGE_SHIFT;
@@ -586,10 +589,10 @@ 

X1 carbon gen2 & flickering screen in X11

2021-06-18 Thread Martin Pieuchot
Default i386 install on a X1 carbon gen2, dmesg below, results in a
flickering screen in X11.  The experience is comparable to an high
refresh on an old CRT screen and makes X11 unusable.

Default Xorg.0.log attached.

I tried using the "intel" driver instead of the "modesetting" one via a
custom xorg.conf; this doesn't change anything.
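For completeness, the kind of minimal xorg.conf(5) override used for that
test (a sketch; the Identifier string is arbitrary):

```
Section "Device"
	Identifier "card0"
	Driver "intel"
EndSection
```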

OpenBSD 6.9-current (GENERIC.MP) #63: Sun Jun 13 22:33:26 MDT 2021
dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC.MP
real mem  = 3446788096 (3287MB)
avail mem = 3367546880 (3211MB)
random: good seed from bootblocks
mpath0 at root
scsibus0 at mpath0: 256 targets
mainbus0 at root
bios0 at mainbus0: date 04/29/13, BIOS32 rev. 0 @ 0xfc200, SMBIOS rev. 2.7 @ 
0xdae9d000 (68 entries)
bios0: vendor LENOVO version "G6ET96WW (2.56 )" date 04/29/2013
bios0: LENOVO 3460BR9
acpi0 at bios0: ACPI 5.0
acpi0: sleep states S0 S3 S4 S5
acpi0: tables DSDT FACP SLIC TCPA SSDT SSDT SSDT HPET APIC MCFG ECDT FPDT ASF! 
UEFI UEFI POAT SSDT SSDT DMAR UEFI SSDT DBG2
acpi0: wakeup devices LID_(S4) SLPB(S3) IGBE(S4) EXP2(S4) XHCI(S3) EHC1(S3) 
EHC2(S3) HDEF(S4)
acpitimer0 at acpi0: 3579545 Hz, 24 bits
acpihpet0 at acpi0: 14318179 Hz
acpimadt0 at acpi0 addr 0xfee0: PC-AT compat
cpu0 at mainbus0: apid 0 (boot processor)
cpu0: Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz ("GenuineIntel" 686-class) 1.70 
GHz, 06-3a-09
cpu0: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
mtrr: Pentium Pro MTRR support, 10 var ranges, 88 fixed ranges
cpu0: apic clock running at 99MHz
cpu0: mwait min=64, max=64, C-substates=0.2.1.1.2, IBE
cpu1 at mainbus0: apid 1 (application processor)
cpu1: Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz ("GenuineIntel" 686-class) 2.30 
GHz, 06-3a-09
cpu1: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu2 at mainbus0: apid 2 (application processor)
cpu2: Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz ("GenuineIntel" 686-class) 2.30 
GHz, 06-3a-09
cpu2: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
cpu3 at mainbus0: apid 3 (application processor)
cpu3: Intel(R) Core(TM) i5-3427U CPU @ 1.80GHz ("GenuineIntel" 686-class) 2.30 
GHz, 06-3a-09
cpu3: 
FPU,V86,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CFLUSH,DS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE,SSE3,PCLMUL,DTES64,MWAIT,DS-CPL,VMX,SMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,PCID,SSE4.1,SSE4.2,x2APIC,POPCNT,DEADLINE,AES,XSAVE,AVX,F16C,RDRAND,NXE,RDTSCP,LONG,LAHF,PERF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,IBRS,IBPB,STIBP,L1DF,SSBD,SENSOR,ARAT,XSAVEOPT,MELTDOWN
ioapic0 at mainbus0: apid 2 pa 0xfec0, version 20, 24 pins
acpimcfg0 at acpi0
acpimcfg0: addr 0xf800, bus 0-63
acpiec0 at acpi0
acpiprt0 at acpi0: bus 0 (PCI0)
acpiprt1 at acpi0: bus -1 (PEG_)
acpiprt2 at acpi0: bus 2 (EXP1)
acpiprt3 at acpi0: bus 3 (EXP2)
acpibtn0 at acpi0: LID_
acpibtn1 at acpi0: SLPB
"MSFT9000" at acpi0 not configured
"MSFT9001" at acpi0 not configured
"PNP0A08" at acpi0 not configured
acpicmos0 at acpi0
acpibat0 at acpi0: BAT0 model "45N1071" serial  2719 type LiP oem "SMP"
acpiac0 at acpi0: AC unit online
"LEN0078" at acpi0 not configured
acpithinkpad0 at acpi0: version 1.0
"PNP0C14" at acpi0 not configured
"PNP0C14" at acpi0 not configured
"INT33A0" at acpi0 not configured
acpicpu0 at acpi0: C2(350@80 mwait.1@0x20), C1(1000@1 mwait.1), PSS
acpicpu1 at acpi0: C2(350@80 mwait.1@0x20), C1(1000@1 mwait.1), PSS
acpicpu2 at acpi0: C2(350@80 mwait.1@0x20), C1(1000@1 mwait.1), PSS
acpicpu3 at acpi0: C2(350@80 mwait.1@0x20), C1(1000@1 mwait.1), PSS
acpipwrres0 at acpi0: PUBS, resource for XHCI, EHC1, EHC2
acpitz0 at acpi0: critical temperature is 200 degC
acpivideo0 at acpi0: VID_
acpivout0 at acpivideo0: LCD0
acpivideo1 at acpi0: VID_
bios0: ROM list: 0xc/0x1!
cpu0: Enhanced SpeedStep 2295 MHz: speeds: 1801, 1800, 1700, 1600, 1500, 1400, 
1300, 1200, 1100, 1000, 900, 800 MHz
pci0 at mainbus0 bus 0: configuration mode 1 (bios)
pchb0 at pci0 dev 0 function 0 "Intel Core 3G Host" rev 0x09
inteldrm0 at pci0 dev 2 function 0 "Intel HD Graphics 4000" rev 0x09
drm0 at inteldrm0
inteldrm0: 

Re: i386 pagedaemon panic pg->wire_count == 0

2021-04-29 Thread Martin Pieuchot
On 29/04/21(Thu) 16:59, Alexander Bluhm wrote:
> On Thu, Apr 29, 2021 at 04:17:05PM +0200, Martin Pieuchot wrote:
> > On 29/04/21(Thu) 12:07, Alexander Bluhm wrote:
> > > On Thu, Apr 29, 2021 at 11:08:30AM +0200, Mark Kettenis wrote:
> > > > > > panic: kernel diagnostic assertion "pg->wire_count == 0" failed: 
> > > > > > file "/usr/src/sys/uvm/uvm_page.c", line 1265
> > > > 
> > > > I suspect pmapae.c rev 1.61 causes this issue.  Does reverting that
> > > > commit "fix" the issue?
> > > > 
> > > > It won't really fix the issue as you may still hit the "can't locate PD 
> > > > page"
> > > > panic.
> > > 
> > > I think this diff prevents the panic.  But I need one more test run
> > > to be sure.
> 
> One test without and one with this diff.  Either panic or make build
> passes.  I am convinced that this triggers the bug.  And one of my
> i386 regress machines can easily reproduce it.  Console access for
> developers possible.
> 
> > This 4 pages pdir is never freed, so ok with me to revert this chunk if
> > it is the cause of the panic you see.
> 
> How to proceed?  Revert this chunk?  Or does someone want to look 
> into the underlying cause soon.

First revert, then look into the underlying cause.



Re: i386 pagedaemon panic pg->wire_count == 0

2021-04-29 Thread Martin Pieuchot
On 29/04/21(Thu) 12:07, Alexander Bluhm wrote:
> On Thu, Apr 29, 2021 at 11:08:30AM +0200, Mark Kettenis wrote:
> > > > panic: kernel diagnostic assertion "pg->wire_count == 0" failed: file 
> > > > "/usr/src/sys/uvm/uvm_page.c", line 1265
> > 
> > I suspect pmapae.c rev 1.61 causes this issue.  Does reverting that
> > commit "fix" the issue?
> > 
> > It won't really fix the issue as you may still hit the "can't locate PD 
> > page"
> > panic.
> 
> I think this diff prevents the panic.  But I need one more test run
> to be sure.

This 4 pages pdir is never freed, so ok with me to revert this chunk if
it is the cause of the panic you see.  

> One of my i386 machines triggers it during every make build, the
> other one is stable.
> 
> wire count is 1
> 
> struct vm_page at 0xd4fd3404 (76 bytes) {pageq = {tqe_next = (struct vm_page 
> *)0x, tqe_prev = 0x}, objt = {rbt_parent = (struct rb_entry 
> *)0xd267d084, rbt_left = (struct rb_entry *)0xd286028c, rbt_right = (struct 
> rb_entry *)0xd4fd33c0, rbt_color = 0x0}, uanon = (struct vm_anon *)0x0, 
> uobject = (struct uvm_object *)0xd0e58d0c, offset = 0x2552c000, pg_flags = 
> 0x324, pg_version = 0x1, wire_count = 0x1, phys_addr = 0xcfd1e000, fpgsz 
> = 0x0, mdpage = {pv_mtx = {mtx_owner = (volatile void *)0x0, mtx_wantipl = 
> 0x90, mtx_oldipl = 0x90}, pv_list = (struct pv_entry *)0x0}}
> 
> bluhm
> 
> Index: arch/i386/i386/pmapae.c
> ===
> RCS file: /mount/openbsd/cvs/src/sys/arch/i386/i386/pmapae.c,v
> retrieving revision 1.61
> diff -u -p -r1.61 pmapae.c
> --- arch/i386/i386/pmapae.c   24 Apr 2021 09:44:45 -  1.61
> +++ arch/i386/i386/pmapae.c   28 Apr 2021 19:30:13 -
> @@ -1938,20 +1938,7 @@ pmap_enter_special_pae(vaddr_t va, paddr
>   __func__, va);
>  
>   if (!pmap->pm_pdir_intel) {
> -#if notyet
> - /*
> -  * XXX mapping is established via pmap_kenter() and lost
> -  * after enabling PAE.
> -  */
> - vapd = (vaddr_t)km_alloc(4 * NBPG, _any, _zero,
> - _waitok);
> -#else
> - vapd = (vaddr_t)km_alloc(4 * NBPG, _any, _pageable,
> - _waitok);
> - if (vapd != 0)
> - bzero((void *)vapd, 4 * NBPG);
> -#endif
> - if (vapd == 0)
> + if ((vapd = uvm_km_zalloc(kernel_map, 4 * NBPG)) == 0)
>   panic("%s: kernel_map out of virtual space!", __func__);
>   pmap->pm_pdir_intel = vapd;
>   if (!pmap_extract(pmap, (vaddr_t)>pm_pdidx_intel,
> 



Re: kernel panic when invoking ddb from another tty than ttyC0

2021-04-21 Thread Martin Pieuchot
On 16/04/21(Fri) 16:50, Jérôme Frgacic wrote:
> Thanks for your answer. :)
> 
> > Could you set "sysctl kern.splassert=2" in order to get a useful stacktrace 
> > for this issue?  This is probably where some attention is required.
> 
> Sure, here is the new output I get.
> 
> splassert: assertwaitok: want 0 have 9
> Starting stack trace...
> assertwaitok() at assertwaitok+0x3c
> malloc(5b8,91,d) at malloc+0x55
> intel_atomic_state_alloc(80202078) at intel_atomic_state_alloc+0x2f
> drm_client_modeset_commit_atomic(80cf9a00,1) at 
> drm_client_modeset_commit_atomic+0x40
> drm_client_modeset_commit_locked(80cf9a00) at 
> drm_client_modeset_commit_locked+0x53
> drm_fb_helper_restore_fbdev_mode_unlocked(80cf9a00) at 
> drm_fb_helper_restore_fbdev_mode_unlocked+0x44
> intel_fbdev_restore_mode(80202078) at intel_fbdev_restore_mode+0x33
> db_ktrap(1,0,800022e1c730) at db_ktrap+0x2c
> kerntrap(800022e1c730) at kerntrap+0x8e
> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> db_enter() at db_enter+0x10
> internal_command(8064b600,800022e1c88c,f420,1b) at 
> internal_command+0x281
> wskbd_translate(82163d18,2,1) at wskbd_translate+0xdf
> wskbd_input(8064b600,2,1) at wskbd_input+0x80
> pckbcintr_internal(82175308,80647e80) at 
> pckbcintr_internal+0x11d
> intr_handler(800022e1c9c0,80647f80) at intr_handler+0x38

It seems that the DRM code that needs to be executed as a result of
switching console, in wsdisplay_enter_ddb(), isn't playing nicely with
ddb(4).

Getting rid of the SPL checks is easy but then there's the way CPUs are
parked and the relation with the sleeping points.

Maybe we should simply disable this key combination when entered from a
tty other than ttyC0 if drm(4) is used. 



Re: i386 panic: pmap_pinit_pd_pae: can't locate PD page

2021-04-21 Thread Martin Pieuchot
On 18/04/21(Sun) 15:22, Alexander Bluhm wrote:
> Hi,
> 
> Tonight one of my i386 regress tests machines crashed.  It happend
> before the tests started when some scripts were copied with scp
> onto the machine.  This is done once a day for years, I have never
> seen this panic before.  Note that it is an old single CPU machine
> runing GENERIC.  Nothing has changed in sys/arch/i386 for a month.
> 
> OpenBSD 6.9 (GENERIC) #773: Fri Apr 16 00:53:44 MDT 2021
> dera...@i386.openbsd.org:/usr/src/sys/arch/i386/compile/GENERIC
> 
> panic: pmap_pinit_pd_pae: can't locate PD page
> 
> Stopped at  db_enter+0x4:   popl%ebp
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
> *465474  54685  0   0  00  sshd
> db_enter() at db_enter+0x4
> panic(d0c47472) at panic+0xd3
> pmap_pinit_pd_pae(d2d75140) at pmap_pinit_pd_pae+0x354
> pmap_create() at pmap_create+0x93
> uvmspace_fork(ff7ff3d4) at uvmspace_fork+0x56
> process_new(d2d79b30,ff7ff3d4,1) at process_new+0xf2
> fork1(d339bcb0,1,d0845810,0,f5e2f9d8,0) at fork1+0x1ba
> sys_fork(d339bcb0,f5e2f9e0,f5e2f9d8) at sys_fork+0x39
> syscall(f5e2fa20) at syscall+0x28e
> Xsyscall_untramp() at Xsyscall_untramp+0xa9
> end of kernel

Next time, please include "show all pools" and "show uvmexp" if you hit
any memory/pmap related problem.



Re: kernel panic when invoking ddb from another tty than ttyC0

2021-04-16 Thread Martin Pieuchot
On 15/04/21(Thu) 22:35, Jérôme FRGACIC wrote:
> >Synopsis:kernel panic when invoking ddb from another tty than ttyC0
> >Category:kernel
> >Environment:
>   System  : OpenBSD 6.8
>   Details : OpenBSD 6.8 (GENERIC) #97: Sun Oct  4 18:00:46 MDT 2020
>
> dera...@amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> >Description:
>   When I try to invoke ddb with Ctrl-Alt-Esc from another tty than ttyC0, 
> the
> kernel panics.
>   You will find panic message, traceback and ps output below.
>   If needed, I still have the crash dump.
> >How-To-Repeat:
>   Enable ddb.console.
>   Go to any tty except ttyC0.
>   Enter Crtl-Alt-Esc.
> >Fix:
>   N/A
> 
> 
> errors before panic message:
> splassert: assertwaitok: want 0 have 9

Could you set "sysctl kern.splassert=2" in order to get a useful stack
trace for this issue?  This is probably where some attention is
required.

> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> splassert: assertwaitok: want 0 have 9
> 
> panic message:
> panic: kernel diagnostic assertion "p->p_wchan == NULL" failed: file
> "/usr/src/sys/kern/kern_sched.c", line 353
> 
> traceback:
> Stopped at  db_enter+0x10:  popq%rbp
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
> db_enter() at db_enter+0x10
> panic(81ddb0f2) at panic+0x12a
> __assert(81e3fe94,81dc952b,161,81dc296e) at
> __assert+0x2b
> sched_chooseproc() at sched_chooseproc+0x101
> mi_switch() at mi_switch+0x109
> msleep(801f7280,801f7298,20,81e06c00,0) at
> msleep+0x101
> taskq_next_work(801f7280,800022e46858) at taskq_next_work+0x61
> taskq_thread(801f7280) at taskq_thread+0x82
> end trace frame: 0x0, count: 7
> 
> ps output:
> ddb>PID TID   PPIDUID  S   FLAGS  WAIT  COMMAND
>  46716  222641  59719  0  30x1000b3  ttyin login_passwd
>  62320  332948  67717   1000  30x1000b0  poll  ssh-agent
>   8898  218895  67717   1000  30x82  selecttint2
>  63549   66422  67717   1000  30x82  nanosleep redshift
>  90460  262322  67717   1000  30x82  poll  xcompmgr
>  67717  150947  41045   1000  30x100082  poll  cwm
>  41045  177573  94871   1000  30x10008a  pause sh
>  94871  276330  55185  0  30x100080  wait  xenodm
>  39144  48  80037  0  30x100080  netio Xorg
>  80037  386456  55185 35  30x92  poll  Xorg
>  80037  462408  55185 35  3   0x492  poll  Xorg
>  38954  305428  1  0  30x100083  ttyin getty
>  75488  428761  1  0  30x100083  ttyin getty
>   5109  349049  1  0  30x100083  ttyin getty
>  608763318  1  0  30x100083  ttyin getty
>  59719  521598  1  0  30x83  netio login
>  55185  191659  1  0  30x88  pause xenodm
>  26243   59952  1  0  30x100098  poll  cron
>   5560   28593  1  0  30x80  kqreadapmd
>  63659  502916  1 99  30x100090  poll  sndiod
>  23271  107324  1110  30x100090  poll  sndiod
>  98469   32165  57416 95  30x100092  kqreadsmtpd
>  21012  225120  57416103  30x100092  kqreadsmtpd
>  60831  405428  57416 95  30x100092  kqreadsmtpd
>  75376  197549  57416 95  30x100092  kqreadsmtpd
>  54403  119307  57416 95  30x100092  kqreadsmtpd
>  134609150  57416 95  30x100092  kqreadsmtpd
>  57416  468565  1  0  30x100080  kqreadsmtpd
>  89850   30114  1  0  30x100080  poll  ntpd
>   7051  260526  81175 83  30x100092  poll  ntpd
>  81175  348924  1 83  30x100092  poll  ntpd
>   7581  421124  1 53  30x90  kqreadunbound
>  54721  150640  12052 74  3

firefox vs jitsi: stack exhaustion?

2021-04-08 Thread Martin Pieuchot
Firefox often crashes when somebody else connects to the jitsi I'm in.
The trace looks like a stack exhaustion; see below.

Does this ring a bell?

0  0x in ?? ()
#1  0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#2  
#3  0x in ?? ()
#4  0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#5  
#6  0x in ?? ()
#7  0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#8  
#9  0x in ?? ()
#10 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#11 
#12 0x in ?? ()
#13 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#14 
#15 0x in ?? ()
#16 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#17 
#18 0x in ?? ()
#19 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#20 
#21 0x in ?? ()
#22 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#23 
#24 0x in ?? ()
#25 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#26 
#27 0x in ?? ()
#28 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#29 
#30 0x in ?? ()
#31 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#32 
#33 0x in ?? ()
#34 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#35 
#36 0x in ?? ()
#37 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#38 
#39 0x in ?? ()
#40 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#41 
#42 0x in ?? ()
#43 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#44 
#45 0x in ?? ()
#46 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#47 
#48 0x in ?? ()
#49 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#50 
#51 0x in ?? ()
#52 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#53 
#54 0x in ?? ()
#55 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#56 
#57 0x in ?? ()
#58 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#59 
#60 0x in ?? ()
#61 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#62 
#63 0x in ?? ()
#64 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#65 
#66 0x in ?? ()
#67 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#68 
#69 0x in ?? ()
#70 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
  from /usr/local/lib/firefox/libxul.so.101.0
#71 
#72 0x in ?? ()
#73 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#74 
#75 0x in ?? ()
#76 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#77 
#78 0x in ?? ()
#79 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#80 
#81 0x in ?? ()
#82 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#83 
#84 0x in ?? ()
#85 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#86 
#87 0x in ?? ()
#88 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#89 
#90 0x in ?? ()
#91 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from /usr/local/lib/firefox/libxul.so.101.0
#92 
#93 0x in ?? ()
#94 0x02ae66ef2359 in WasmTrapHandler(int, siginfo_t*, void*) ()
   from 

Re: wg(4) crash

2021-03-20 Thread Martin Pieuchot
On 19/03/21(Fri) 20:15, Stuart Henderson wrote:
> Not a great report but I don't have much more to go on, machine had
> ddb.panic=0 and ddb hanged while printing the stack trace. Retyped by
> hand, may contain typos. Happened a few hours after setting up wg on it.
> 
> uvm_fault(0x82204e38, 0x20, 0, 1) -> e
> fatal page fault in supervisor mode
> trap type 6 code 0 rip 81752116 cs 8 rflags 10246 cr2 20 cpl 0 rsp 
> 00023b35eb0
> gsbase 0x820eaff0 kgsbase 0x0
> panic: trap type 6, code=0, pc=81752116
> Starting stack trace...
> panic(81ddc97a) at panic+0x11d
> kerntrap(800023b35e00) at kerntrap+0x114
> alltraps_kern_meltdown() at alltraps_kern_meltdown+0x7b
> wg_index_drop(812ae000,0) at wg_index_drop+0x96
> noise_create_initiation(

This is a NULL dereference at line 1981 of net/if_wg.c:

wg_index_drop(void *_sc, uint32_t key0)
{
...
/* We expect a peer */
peer = CONTAINER_OF(iter->i_value, struct wg_peer, p_remote);
...
}

Does that mean that `iter' is NULL and that `i_value' is at offset 0x20
in that struct?



Re: panic: uao_fin_swhash_elt: can't allocate entry

2021-02-23 Thread Martin Pieuchot
On 23/02/21(Tue) 07:53, Jonathan Matthew wrote:
> On Mon, Feb 22, 2021 at 01:48:01PM +, Stuart Henderson wrote:
> > Not much information on this but it's an unusual one so I thought I'd
> > post in case it's of interest to anyone. (Re-typed from a screen photo,
> > it's remote and used by non-technical people, this is all I have).
> > 
> > panic: uao_fin_swhash_elt: can't allocate entry
> 
> uao_find_swhash_elt():
> 
> /* allocate a new entry for the bucket and init/insert it in */
> elt = pool_get(_swhash_elt_pool, PR_NOWAIT | PR_ZERO);
> /*
>  * XXX We cannot sleep here as the hash table might disappear
>  * from under our feet.  And we run the risk of deadlocking
>  * the pagedeamon.  In fact this code will only be called by
>  * the pagedaemon and allocation will only fail if we
>  * exhausted the pagedeamon reserve.  In that case we're
>  * doomed anyway, so panic.
>  */
> if (elt == NULL)
> panic("%s: can't allocate entry", __func__);
> 
> so it sounds like the machine was so out of memory it couldn't swap.

Another hypothesis would be a kind of deadlock; showing "ps", "all pools"
and "uvmexp" would help get a better understanding.



Re: panic: uao_fin_swhash_elt: can't allocate entry

2021-02-23 Thread Martin Pieuchot
On 22/02/21(Mon) 13:48, Stuart Henderson wrote:
> Not much information on this but it's an unusual one so I thought I'd
> post in case it's of interest to anyone. (Re-typed from a screen photo,
> it's remote and used by non-technical people, this is all I have).
> 
> panic: uao_fin_swhash_elt: can't allocate entry
> Stopped at db_enter+0x10: popq %rbp
> TID   PID UID PRFLAGS PFLAGS  CPU COMMAND
> 38724523522   10010x100   0   sh
> *428940   98261   0   0x14000 0x200   1K  pagedaemon
> db_enter+0x10
> panic+0x12a
> uao_set_swslot(fd80c1ecc980,150,1f4d1) at uao_set_swslot+0x1a1
> uvmpd_scan_inactive(82188790) at uvmpd_scan_inactive+0x537
> uvmpd_scan+0x9f
> uvm_pageout(800053d0) at uvm_pageout+0x375
> end trace frame 0x0, count: 9

If it happens again, could you include "show uvmexp" and "show all pools"?



firefox pledge violation

2021-02-19 Thread Martin Pieuchot
Firefox from -current, tab crashes, kernel says:

firefox[86270]: pledge "", syscall 289

Trace is:

#0  shmget () at /tmp/-:3
#1  0x0b38d9347d7b in ?? () from /usr/X11R6/lib/modules/dri/swrast_dri.so
#2  0x0b38d994ac4b in ?? () from /usr/X11R6/lib/modules/dri/swrast_dri.so
#3  0x0b38d8c79eb0 in ?? () from /usr/X11R6/lib/modules/dri/swrast_dri.so
#4  0x0b38d8c7aa2b in ?? () from /usr/X11R6/lib/modules/dri/swrast_dri.so
#5  0x0b38d8ce44ed in ?? () from /usr/X11R6/lib/modules/dri/swrast_dri.so
#6  0x0b38d8ce553e in ?? () from /usr/X11R6/lib/modules/dri/swrast_dri.so
#7  0x0b38d8c7bfa1 in ?? () from /usr/X11R6/lib/modules/dri/swrast_dri.so
#8  0x0b38d925495a in ?? () from /usr/X11R6/lib/modules/dri/swrast_dri.so
#9  0x0b3808396dea in drisw_bind_context (context=0xb37ef35a600, 
old=, draw=, read=)
at /usr/xenocara/lib/mesa/mk/libGL/../../src/glx/drisw_glx.c:394
#10 0x0b380839b30e in MakeContextCurrent (dpy=0xb38afd6, 
draw=14680067, read=14680067, gc_user=0xb37ef35a600)
at /usr/xenocara/lib/mesa/mk/libGL/../../src/glx/glxcurrent.c:220
#11 0x0b38c7109b3a in mozilla::gl::GLContextGLX::MakeCurrentImpl() const ()
   from /usr/local/lib/firefox/libxul.so.99.0
#12 0x0b38c7113f1a in mozilla::gl::GLContext::InitImpl() ()
   from /usr/local/lib/firefox/libxul.so.99.0
#13 0x0b38c7113e58 in mozilla::gl::GLContext::Init() ()
   from /usr/local/lib/firefox/libxul.so.99.0
#14 0x0b38c7109aab in mozilla::gl::GLContextGLX::Init() ()
   from /usr/local/lib/firefox/libxul.so.99.0
---Type  to continue, or q  to quit---
#15 0x0b38c71098e5 in 
mozilla::gl::GLContextGLX::CreateGLContext(mozilla::gl::GLContextDesc const&, 
_XDisplay*, unsigned long, __GLXFBConfigRec*, bool, gfxXlibSurface*) () from 
/usr/local/lib/firefox/libxul.so.99.0
#16 0x0b38c710a8bc in 
mozilla::gl::GLContextProviderGLX::CreateHeadless(mozilla::gl::GLContextCreateDesc
 const&, nsTSubstring*) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#17 0x0b38c80d977b in mozilla::WebGLContext::CreateAndInitGL(bool, 
std::__1::vector >*) () from 
/usr/local/lib/firefox/libxul.so.99.0
#18 0x0b38c80da009 in 
mozilla::WebGLContext::Create(mozilla::HostWebGLContext&, 
mozilla::webgl::InitContextDesc const&, mozilla::webgl::InitContextResult*)
() from /usr/local/lib/firefox/libxul.so.99.0
#19 0x0b38c80699c1 in 
mozilla::ClientWebGLContext::CreateHostContext(mozilla::avec2 
const&) () from /usr/local/lib/firefox/libxul.so.99.0
#20 0x0b38c806c502 in mozilla::ClientWebGLContext::SetDimensions(int, int)
() from /usr/local/lib/firefox/libxul.so.99.0
#21 0x0b38c80677d7 in 
mozilla::dom::CanvasRenderingContextHelper::UpdateContext(JSContext*, 
JS::Handle, mozilla::ErrorResult&) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#22 0x0b38c8067579 in 
mozilla::dom::CanvasRenderingContextHelper::GetContext(JSContext*, 
nsTSubstring const&, JS::Handle, mozilla::ErrorResult&) () 
from /usr/local/lib/firefox/libxul.so.99.0
#23 0x0b38c7f48113 in mozilla::dom::HTMLCanvasElement_Binding::getContext(JS
Context*, JS::Handle, void*, JSJitMethodCallArgs const&) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#24 0x0b38c80034cc in bool 
mozilla::dom::binding_detail::GenericMethod(JSContext*, unsigned int, 
JS::Value*) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#25 0x0b38ca7695e5 in js::InternalCallOrConstruct(JSContext*, JS::CallArgs 
const&, js::MaybeConstruct, js::CallReason) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#26 0x0b38ca765cbb in Interpret(JSContext*, js::RunState&) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#27 0x0b38ca75c022 in js::RunScript(JSContext*, js::RunState&) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#28 0x0b38ca7696ec in js::InternalCallOrConstruct(JSContext*, JS::CallArgs 
const&, js::MaybeConstruct, js::CallReason) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#29 0x0b38ca769e2a in js::Call(JSContext*, JS::Handle, 
JS::Handle, js::AnyInvokeArgs const&, JS::MutableHandle, 
js::CallReason) () from /usr/local/lib/firefox/libxul.so.99.0
#30 0x0b38cad6ae6d in js::jit::InvokeFunction(JSContext*, 
JS::Handle, bool, bool, unsigned int, JS::Value*, 
JS::MutableHandle) ()
   from /usr/local/lib/firefox/libxul.so.99.0
#31 0x0b38cad6b20a in js::jit::InvokeFromInterpreterStub(JSContext*, 
js::jit::InterpreterStubExitFrameLayout*) ()
   from /usr/local/lib/firefox/libxul.so.99.0



Re: top over SSH runaway after network drop

2020-12-25 Thread Martin Pieuchot
Hello,

On 24/12/20(Thu) 12:35, th...@liquidbinary.com wrote:
> >Synopsis:If network drops while running top over SSH, runaway process
> >Category:minor, poor handling of failure mode
> >Environment:
>   System  : OpenBSD 6.7
>   Details : OpenBSD 6.7 (GENERIC) #5: Wed Oct 28 00:25:20 MDT 2020
>
> t...@syspatch-67-amd64.openbsd.org:/usr/src/sys/arch/amd64/compile/GENERIC
> 
>   Architecture: OpenBSD.amd64
>   Machine : amd64
> >Description:
>   If I SSH into any of various amd64 OpenBSD servers, virtual or 
> physical, 
> if I'm running a monitoring process like top, or multitail -f, on a remote 
> machine 
> over SSH and the network drops or client machine disconnects, the server 
> process 
> consumes nearly 100% of CPU and does not stop itself.  I can log back in and 
> kill the process, but until I do I have a CPU being consumed.  This affects 
> performance, possibly costing money on a virtual server.  This behavior is 
> years old.

Did you try to reproduce this bug on -current?  Is it still there?

If it is, could you please ktrace(1) the program consuming 100% of CPU
before killing it?  Then add the kdump(1) output to this bug report so
we have an idea of what it is doing and hopefully what needs to be
fixed.

Thanks for your report.



Re: 6.8 GENERIC MP#1 Kernel panic on ASUS VivoBook S510U

2020-12-21 Thread Martin Pieuchot
Thanks for the report.

On 21/12/20(Mon) 17:00, Aning wrote:
> It's the second mail i try to send to mailing list. After 12 hours i still 
> can't view the first one on marc.info
> It have 15 photo attachments, but all mail was less than 25 mg. Often 
> protonmail responds when email wasn't received, but not this time.
> I hope this gives me excuse to upload screen photos onto mega.co.nz, sorry i 
> have not established my own email service and ftp yet.
> 
> Anyway here all the screen photos of ddb: 
> https://mega.nz/folder/9cwCzLIL#CymzilZEOzuA9ugLPKiVeA

It seems that sleep_finish() is called with a mutex held.  If you can
hit this panic again, could you try to type "ps /o" after getting the
"trace" output?

From the output it is not clear which thread is running and since the
trace stops (starts) at sleep_finish(), I can't figure out which code
path we're dealing with.



Re: kernel panic when removing interface

2020-11-27 Thread Martin Pieuchot
On 27/11/20(Fri) 15:47, Denis Fondras wrote:
> > It is, I guess a fix should go in net/rtsock.c to prevent adding "-link"
> > entry on routing table different from ifp->if_rdomain.
> > 
> 
> I came up with this, which is more radical.

Which is not exactly what we want.  This will prevent adding any route
on a routing table different from rdomain.

What needs to be enforced is a check on requests coming from userland
trying to insert a "-link" route.  Such a check would have the benefit
of documenting that L2 entries should only be inserted in the rdomain
table of an interface.

> Index: route.c
> ===
> RCS file: /cvs/src/sys/net/route.c,v
> retrieving revision 1.397
> diff -u -p -r1.397 route.c
> --- route.c   29 Oct 2020 21:15:27 -  1.397
> +++ route.c   27 Nov 2020 09:39:53 -
> @@ -865,6 +865,8 @@ rtrequest(int req, struct rt_addrinfo *i
>   return (EINVAL);
>   ifa = info->rti_ifa;
>   ifp = ifa->ifa_ifp;
> + if (tableid != ifp->if_rdomain)
> + return (EINVAL);
>   if (prio == 0)
>   prio = ifp->if_priority + RTP_STATIC;
>  
> 



Re: kernel panic when removing interface

2020-11-27 Thread Martin Pieuchot
On 26/11/20(Thu) 20:38, Pierre Emeriaud wrote:
> Hello Martin
> 
> Le jeu. 26 nov. 2020 à 14:27, Martin Pieuchot  a écrit :
> >
> > >
> > > $ doas route -T1 add 192.0.2.2/32 -link -iface vlan12
> >
> > I wonder if the problem isn't in the validation of these parameters.
> >
> > Should we accept a L2 (-link) entry on a routing table which isn't the
> > routing domain?  If so why does the entry persist in the ARP cache?
> 
> Which arp entry are you referring to? The one from the route I added?

Yes.  In the kernel, ARP entries are represented as route entries, so
when you add a "-link" route it is an ARP entry.

> > Can you reproduce the problem if you don't specify T1?
> 
> No. The routes are correctly removed when the interface is destroyed.
> It only crashes when the routes are added to another (non-empty if
> that matters) rdomain, but again, this was a silly mistake on my side.

Still, silly mistakes should be prevented and not crash the kernel ;)

> I reported it as it might be of interest to fix this for the sake of
> it, but it causes almost no harm.

It is.  I guess a fix should go in net/rtsock.c to prevent adding a
"-link" entry on a routing table different from ifp->if_rdomain.

> PS: I've managed to crash my first router just by waiting a few
> seconds - no need to remove the route - same thing as the second
> router:
> ddb> show panic
> kernel diagnostic assertion "ifp != NULL" failed: file 
> "/usr/src/sys/netinet/if
> _ether.c", line 718
> 
> ddb> trace
> db_enter() at db_enter+0x10
> panic(81dc761f) at panic+0x12a
> __assert(81e321c2,81db9f2b,2ce,81d9e429) at 
> __assert+0x
> 2b
> arp_rtrequest(fd800baa10a8,fd800baa10a8,fd801aa63dc0) at 
> arp_rtrequ
> est
> arptimer(8216a090) at arptimer+0x67
> softclock_thread(8000ea40) at softclock_thread+0x13f
> end trace frame: 0x0, count: -6



Re: VPS crash to kernel panic on boot

2020-11-26 Thread Martin Pieuchot
On 26/11/20(Thu) 09:21, AIsha Tammy wrote:
> On 11/26/20 6:51 AM, Martin Pieuchot wrote:
> > On 25/11/20(Wed) 19:41, AIsha Tammy wrote:
> >> Replicable bug that has happened from sysupgrading to snapshot.
> >> VPS was working perfectly until this sysupgrade.
> >>
> >> VPS boots - drops to kernel panic ddb
> >>
> >> Seems to be some mutex issue?
> >> Had to manually copy information cuz weird web console, so my apologies
> >> if this isn't enough information.
> > What is the date of the snapshots?  If you can reproduce this could you
> > give us the output of the "trace" command?
> >
> > Thanks,
> > Martin
> >
> 
> Yes, reproducible crashes on multiple reboots.

Thanks, the diff below should fix it.  Could you test it?

Index: uvm/uvm_page.c
===
RCS file: /cvs/src/sys/uvm/uvm_page.c,v
retrieving revision 1.151
diff -u -p -r1.151 uvm_page.c
--- uvm/uvm_page.c  24 Nov 2020 13:49:09 -  1.151
+++ uvm/uvm_page.c  26 Nov 2020 17:17:55 -
@@ -180,7 +180,7 @@ uvm_page_init(vaddr_t *kvm_startp, vaddr
TAILQ_INIT(_active);
TAILQ_INIT(_inactive_swp);
TAILQ_INIT(_inactive_obj);
-   mtx_init(, IPL_NONE);
+   mtx_init(, IPL_VM);
mtx_init(, IPL_VM);
uvm_pmr_init();
 



Re: kernel panic when removing interface

2020-11-26 Thread Martin Pieuchot
On 24/11/20(Tue) 09:23, Pierre Emeriaud wrote:
> > Trying to use mgre(4), I found what looks like a reliable way to crash
> > the kernel which might be of interest.
> >
> > This machine is a one-month-old-current fairly light router, with inet
> > default within rdomain 1. I will upgrade to a more recent snap
> > shortly.
> 
> I just upgraded to OpenBSD 6.8-current (GENERIC) #181: Mon Nov 23
> 20:55:15 MST 2020 and the same thing happens with vlan(4):
> 
> $ doas ifconfig vlan12 inet 192.0.2.1/24 parent vio0 vnetid 12
> $ ifconfig vlan
> vlan12: flags=8843 mtu 1500
> lladdr 02:00:00:ef:3d:d7
> index 8 priority 0 llprio 3
> encap: vnetid 12 parent vio0 txprio packet rxprio outer
> groups: vlan
> media: Ethernet autoselect
> status: active
> inet 192.0.2.1 netmask 0xff00 broadcast 192.0.2.255
> 
> $ doas route -T1 add 192.0.2.2/32 -link -iface vlan12

I wonder if the problem isn't in the validation of these parameters.

Should we accept a L2 (-link) entry on a routing table which isn't the
routing domain?  If so why does the entry persist in the ARP cache?

Can you reproduce the problem if you don't specify T1? 

> add host 192.0.2.2/32: gateway vlan12
> 
> $ route -T1 -n show -inet
> DestinationGatewayFlags   Refs  Use   Mtu  Prio Iface
> 192.0.2.2  link#8 UHLS   00 - 8 vlan12
> 
> $ route -n show -inet
> Internet:
> DestinationGatewayFlags   Refs  Use   Mtu  Prio Iface
> 192.0.2/24 192.0.2.1  UCn00 - 4 vlan12
> 192.0.2.1  02:00:00:ef:3d:d7  UHLl   00 - 1 vlan12
> 192.0.2.255192.0.2.1  UHb00 - 1 vlan12
> 
> $ doas ifconfig vlan12 down
> $ doas ifconfig vlan12 destroy
> 
> $ route -T1 -n show -inet
> DestinationGatewayFlags   Refs  Use   Mtu  Prio Iface
> 192.0.2.2  link#8 UHLS   00 - 8 (null)
> 
> $ doas route -T1 del 192.0.2.2/32
> 
> login: panic: kernel diagnostic assertion "ifp != NULL" failed: file
> "/usr/src/sys/net/rtsock.c", line 975
> Stopped at  db_enter+0x10:  popq%rbp
> TIDPIDUID PRFLAGS PFLAGS  CPU  COMMAND
> *189431  84402  00x13  00  route
> db_enter() at db_enter+0x10
> panic(81dcc1d7) at panic+0x12a
> __assert(81e32678,81e40e69,3cf,81d9f5fd) at __assert+0x2b
> rtm_output(80071480,8e77ce80,8e77cdd8,40,1) at rtm_output+0x7ee
> route_output(fd801ef36c00,fd801af0d698,0,0) at route_output+0x3c3
> route_usrreq(fd801af0d698,9,fd801ef36c00,0,0,8e720540) at route_usrreq+0x21a
> sosend(fd801af0d698,0,8e77d0d8,0,0,0) at sosend+0x35b
> dofilewritev(8e720540,3,8e77d0d8,0,8e77d1b0) at dofilewritev+0x14d
> sys_write(8e720540,8e77d150,8e77d1b0) at sys_write+0x51
> syscall(8e77d220) at syscall+0x315
> Xsyscall() at Xsyscall+0x128
> end of kernel
> end trace frame: 0x7f7d35b0, count: 4
> https://www.openbsd.org/ddb.html describes the minimum info required in bug
> reports.  Insufficient info makes it difficult to find and fix bugs.
> ddb>
> 



Re: VPS crash to kernel panic on boot

2020-11-26 Thread Martin Pieuchot
On 25/11/20(Wed) 19:41, AIsha Tammy wrote:
> Replicable bug that has happened from sysupgrading to snapshot.
> VPS was working perfectly until this sysupgrade.
> 
> VPS boots - drops to kernel panic ddb
> 
> Seems to be some mutex issue?
> Had to manually copy information cuz weird web console, so my apologies
> if this isn't enough information.

What is the date of the snapshot?  If you can reproduce this, could you
give us the output of the "trace" command?

Thanks,
Martin



Re: uaudio device works on usb2 port; fails on usb3 port

2020-08-23 Thread Martin Pieuchot
On 21/08/20(Fri) 11:46, Marcus Glocker wrote:
> On Wed, 19 Aug 2020 20:31:05 +0200
> Marcus Glocker  wrote:
> 
> > On Wed, Aug 19, 2020 at 01:21:35PM +0200, Marcus Glocker wrote:
> > 
> > > On Wed, 19 Aug 2020 12:02:23 +0200
> > > Martin Pieuchot  wrote:
> > >   
> > > > On 18/08/20(Tue) 18:53, Marcus Glocker wrote:  
> > > > > On Wed, 12 Aug 2020 21:39:15 +0200
> > > > > Marcus Glocker  wrote:
> > > > > 
> > > > > > jmc was so nice to send me his trouble device over to do some
> > > > > > further investigations.  Just some updates on what I've
> > > > > > noticed today:
> > > > > > 
> > > > > > - The issue isn't specific to xhci(4).  I also see the same
> > > > > > issue on some of my ehci(4) machines when attaching this
> > > > > > device.
> > > > > > 
> > > > > > - It seems like the device gets into a 'corrupted state'
> > > > > > after running a couple of control transfers against it.
> > > > > > Initially they work fine, with smaller and larger transfer
> > > > > > sizes, and at one point the device hangs up and doesn't
> > > > > > recover until re-attaching it. While on some ehci(4) machines
> > > > > > the uhidev(4) attach works fine, after running lsusb against
> > > > > > the device, I see transfer errors coming up again;  On
> > > > > > xhci(4) namely XHCI_CODE_TXERR.
> > > > > > 
> > > > > > - Attaching a USB 2.0 hub doesn't make any difference, no
> > > > > > matter if attached to an xhci(4) or an ehci(4) controller.
> > > > > > 
> > > > > > Not sure what is going wrong with this little beast ...
> > > > > 
> > > > > OK, I give up :-)  Following my summary report.
> > > > > 
> > > > > This device seems to have issues with two control request types:
> > > > > 
> > > > > - UR_GET_STATUS, not called for this device from the kernel
> > > > > in the default code path.  But e.g. 'lsusb -v' will call it.
> > > > > 
> > > > > - UR_SET_IDLE, as called in uhidev_attach().
> > > > > 
> > > > > UR_GET_STATUS will stall the device for good on *all* controller
> > > > > drivers.
> > > > 
> > > > Does this also happen when the device attaches as ugen(4)?  If yes,
> > > > that would rule out concurrency issues that might happen when
> > > > using lsusb(1) while other transfers are in flight.  To test you
> > > > need to disable the current attaching driver in ukc.  
> > > 
> > > Yes, it does also happen when attaching the device to ugen(4).
> > > But honestly, I was playing around yesterday evening a bit further
> > > with this device, and I noticed that the device also stalls with
> > > lsusb when I remove the get status and get report request in the
> > > lsusb code.
> > > 
> > > Therefore I need to correct my statement, saying instead that *some*
> > > request in lsusb makes the device stall as well.  What I just found
> > > in the lsusb ChangeLog:
> > > 
> > > Added (somewhat dummy) Set_Protocol and Set_Idle requests to
> > > stream dumping setup.
> > > 
> > > I'll try to confirm if the stall really happens there.  At least
> > > that would be in line with our findings in the kernel.  
> > 
> > OK, I've tracked the two lsusb requests down finally which also stall
> > this device beside our set idle call in the kernel.
> > 
> > UR_GET_DESCRIPTOR, UDESC_DEVICE_QUALIFIER:
> > 
> > ret = usb_control_msg(fd, LIBUSB_ENDPOINT_IN |
> > LIBUSB_REQUEST_TYPE_STANDARD | LIBUSB_RECIPIENT_DEVICE,
> > LIBUSB_REQUEST_GET_DESCRIPTOR,
> > USB_DT_DEBUG << 8, 0,
> > buf, sizeof buf, CTRL_TIMEOUT);
> > 
> > UR_GET_DESCRIPTOR, UDESC_DEBUG:
> > 
> > ret = usb_control_msg(fd, LIBUSB_ENDPOINT_IN |
> > LIBUSB_REQUEST_TYPE_STANDARD | LIBUSB_RECIPIENT_DEVICE,
> > LIBUSB_REQUEST_GET_DESCRIPTOR,
> > USB_DT_DEBUG << 8, 0,
> > buf, sizeof buf, CTRL_TIMEOUT);
> > 
> > When you comment those two control requests out, lsusb -v runs
> > through.
> > 
> > If I wouldn't know better, I would say that this device isn't able to
> > handle UR_SET_IDLE, UDESC_DEVIC

Re: uaudio device works on usb2 port; fails on usb3 port

2020-08-19 Thread Martin Pieuchot
On 18/08/20(Tue) 18:53, Marcus Glocker wrote:
> On Wed, 12 Aug 2020 21:39:15 +0200
> Marcus Glocker  wrote:
> 
> > jmc was so nice to send me his trouble device over to do some further
> > investigations.  Just some updates on what I've noticed today:
> > 
> > - The issue isn't specific to xhci(4).  I also see the same issue on
> >   some of my ehci(4) machines when attaching this device.
> > 
> > - It seems like the device gets into a 'corrupted state' after
> >   running a couple of control transfers against it.  Initially they
> >   work fine, with smaller and larger transfer sizes, and at one point
> >   the device hangs up and doesn't recover until re-attaching it.
> > While on some ehci(4) machines the uhidev(4) attach works fine, after
> >   running lsusb against the device, I see transfer errors coming up
> >   again;  On xhci(4) namely XHCI_CODE_TXERR.
> > 
> > - Attaching a USB 2.0 hub doesn't make any difference, no matter if
> >   attached to an xhci(4) or an ehci(4) controller.
> > 
> > Not sure what is going wrong with this little beast ...
> 
> OK, I give up :-)  Following my summary report.
> 
> This device seems to have issues with two control request types:
> 
> - UR_GET_STATUS, not called for this device from the kernel in the
>   default code path.  But e.g. 'lsusb -v' will call it.
> 
> - UR_SET_IDLE, as called in uhidev_attach().
> 
> UR_GET_STATUS will stall the device for good on *all* controller
> drivers.

Does this also happen when the device attaches as ugen(4)?  If yes, that
would rule out concurrency issues that might happen when using lsusb(1)
while other transfers are in flight.  To test you need to disable the
current attaching driver in ukc.

> UR_SET_IDLE works only on ehci(4) - Don't ask me why.
> On all the other controller drivers the following UR_GET_REPORT request
> will fail, stalling the device as well.  I tried all kind of things to
> get the UR_SET_IDLE request working on xhci(4), but without any luck.

Does the device respond to GET_IDLE?

Is it a timing problem?  How much time does the device need to be idle?
Does introducing a delay before and/or after usbd_set_idle() change the
behavior? 

Did you try passing a non-0 duration parameter to the SET_IDLE command?

Taking a step back, why does a uaudio(4) need a UR_SET_IDLE?  This
tells the device to only respond to IN interrupt transfers when new
events occur, right?  Do all devices attaching to uhidev(4) want this
behavior?

> The good news is that when we skip the UR_SET_IDLE request on xhci(4),
> the following UR_GET_REPORT request works, and isoc transfers also work
> perfectly fine.  You can use the device for audio streaming.
> 
> Therefore the only thing I can offer is a quirk to skip the
> UR_SET_IDLE request when attaching this device.  On ehci(4) the
> device continues to work as before with this quirk.  Therefore I
> didn't include any code to only apply the quirk on non-ehci
> controllers.
> 
> I know it's not a nice solution, but at least it makes this device
> usable on xhci(4) while not impacting other things.

Maybe it is a step towards a real solution.  Should usbd_set_idle()
stay in uhidev(4) or, if it doesn't make sense for all devices, should
we move it into child drivers like ukbd(4), etc?

> If anyone is OK with that and has no better idea how to fix it, I'm
> happy to commit.
> 
> Cheers,
> Marcus
> 
> 
> Index: uhidev.c
> ===
> RCS file: /cvs/src/sys/dev/usb/uhidev.c,v
> retrieving revision 1.80
> diff -u -p -u -p -r1.80 uhidev.c
> --- uhidev.c  31 Jul 2020 10:49:33 -  1.80
> +++ uhidev.c  18 Aug 2020 13:36:13 -
> @@ -151,7 +151,8 @@ uhidev_attach(struct device *parent, str
>   sc->sc_ifaceno = uaa->ifaceno;
>   id = usbd_get_interface_descriptor(sc->sc_iface);
>  
> - usbd_set_idle(sc->sc_udev, sc->sc_ifaceno, 0, 0);
> + if (!(usbd_get_quirks(uaa->device)->uq_flags & UQ_NO_SET_IDLE))
> + usbd_set_idle(sc->sc_udev, sc->sc_ifaceno, 0, 0);
>  
>   sc->sc_iep_addr = sc->sc_oep_addr = -1;
>   for (i = 0; i < id->bNumEndpoints; i++) {
> Index: usb_quirks.c
> ===
> RCS file: /cvs/src/sys/dev/usb/usb_quirks.c,v
> retrieving revision 1.76
> diff -u -p -u -p -r1.76 usb_quirks.c
> --- usb_quirks.c  5 Jan 2020 00:54:13 -   1.76
> +++ usb_quirks.c  18 Aug 2020 13:36:13 -
> @@ -52,6 +52,7 @@ const struct usbd_quirk_entry {
>   u_int16_t bcdDevice;
>   struct usbd_quirks quirks;
>  } usb_quirks[] = {
> + { USB_VENDOR_MICROCHIP, USB_PRODUCT_MICROCHIP_SOUNDKEY, ANY, { UQ_NO_SET_IDLE }},
>   { USB_VENDOR_KYE, USB_PRODUCT_KYE_NICHE, 0x100, { UQ_NO_SET_PROTO}},
>   { USB_VENDOR_INSIDEOUT, USB_PRODUCT_INSIDEOUT_EDGEPORT4, 0x094, { UQ_SWAP_UNICODE}},
> Index: usb_quirks.h
> ===
> RCS file: /cvs/src/sys/dev/usb/usb_quirks.h,v
> retrieving 

Re: ipmi problem introduced with sys/conf.h 1.150 enodev->selfalse

2020-06-29 Thread Martin Pieuchot
On 28/06/20(Sun) 22:17, Stuart Henderson wrote:
> Thanks to Jens A. Griepentrog for reporting and bisecting, we discovered
> that sys/conf.h r1.150 broke /dev/ipmi. I found a machine to test on and
> reverting the commit fixes things, but given the commit message I guess
> the diff below (which also fixes it) might be better?

Thanks for the finding.  Your diff is indeed better and is ok mpi@.

Could you please commit the version below that adds a matching kqfilter
for `seltrue' as well?  That will allow us to keep the behavior when
switching poll(2) to use kqueue filters.

Index: sys/conf.h
===
RCS file: /cvs/src/sys/sys/conf.h,v
retrieving revision 1.152
diff -u -p -r1.152 conf.h
--- sys/conf.h  26 May 2020 07:53:00 -  1.152
+++ sys/conf.h  29 Jun 2020 07:22:40 -
@@ -473,8 +473,8 @@ extern struct cdevsw cdevsw[];
 #define cdev_ipmi_init(c,n) { \
dev_init(c,n,open), dev_init(c,n,close), (dev_type_read((*))) enodev, \
(dev_type_write((*))) enodev, dev_init(c,n,ioctl), \
-   (dev_type_stop((*))) enodev, 0, selfalse, \
-   (dev_type_mmap((*))) enodev, 0 }
+   (dev_type_stop((*))) enodev, 0, seltrue, (dev_type_mmap((*))) enodev, \
+   0, 0, seltrue_kqfilter }
 
 /* open, close, ioctl, mmap */
 #define cdev_kcov_init(c,n) { \



Re: X hangs

2020-06-09 Thread Martin Pieuchot
On 29/05/20(Fri) 15:57, Visa Hankala wrote:
> On Fri, May 29, 2020 at 04:27:46PM +0200, Alexandre Ratchov wrote:
> > On Thu, May 28, 2020 at 01:41:43PM +0100, Stuart Henderson wrote:
> > > uaudio0 at uhub7 port 2 configuration 1 interface 1 "GN Netcom GN 9350" 
> > > rev 2.00/1.00 addr 7
> > > uaudio0: class v1, full-speed, sync, channels: 1 play, 1 rec, 4 ctls
> > > audio1 at uaudio0
> > > uhidev0 at uhub7 port 2 configuration 1 interface 3 "GN Netcom GN 9350" 
> > > rev 2.00/1.00 addr 7
> > > uhidev0: iclass 3/0
> > > uhid0 at uhidev0: input=2, output=2, feature=0
> > > uaudio0: can't reset interface
> > > uaudio0: can't reset interface
> > > audio1 detached
> > > uaudio0 detached
> > > uhid0 detached
> > > uhidev0 detached
> > > RA\xaf\xdeRA\xaf\xdeRA\xaf\xdeRA\xaf\xdeRA\xaf\xdeRA\xaf\xdeRA\xaf\xde: 
> > > can't set interface
> > > kernel: protection fault trap, code=0
> > > Stopped at  uaudio_stream_close+0x8a:   movzbl  0x8(%r12),%esi
> > > ddb{3}> [-- sthen@localhost attached -- Thu May 28 11:58:19 2020]
> > > 
> > > ddb{3}> 
> > > ddb{3}> tr
> > > uaudio_stream_close(81dfb000,1) at uaudio_stream_close+0x8a
> > > uaudio_stream_open(81dfb000,1,801e8000,801eaa80,2a8,816f7630) at uaudio_stream_open+0x761
> > > uaudio_trigger_output(81dfb000,801e8000,801eaa80,2a8,816f7630,81e95c00) at uaudio_trigger_output+0x47
> > > audio_start_do(81e95c00) at audio_start_do+0xb5
> > > audioioctl(2a01,20004126,800035a74470,7,800034fe6750) at audioioctl+0x71
> > > VOP_IOCTL(fd867a72e9e0,20004126,800035a74470,7,fd84fea6f9c0,800034fe6750) at VOP_IOCTL+0x55
> > > vn_ioctl(fd867d490f10,20004126,800035a74470,800034fe6750) at vn_ioctl+0x75
> > > sys_ioctl(800034fe6750,800035a74580,800035a745e0) at sys_ioctl+0x2df
> > > syscall(800035a74650) at syscall+0x389
> > > Xsyscall() at Xsyscall+0x128
> > > end of kernel
> > 
> > According to dmesg, audio1 was detached, so we shouldn't enter
> > audio_start_do().
> > 
> > At this point the DVF_ACTIVE flag is clear; audioioctl() calls
> > device_lookup() which is supposed to return NULL in this case, so
> > ioctl() is supposed to return ENXIO, not attempt to start playback.
> 
> Let's assume that audio_start_do() started when the device was still
> attached to the system. In that case device_lookup() returned a pointer
> to a good softc. This is supported by the fact that audio_start_do() did
> not crash earlier.
> 
> Did usbd_set_interface() block for a moment, letting the detachment
> happen? The trace suggests that usbd_set_interface() failed, and when
> audio_start_do() resumed, sc pointed to freed memory.

The audio(4) driver has an unaccounted reference to uaudio(4)'s softc.
So when the USB thread responsible for detaching the device kicks in to
clean up the software state of a uaudio(4), it first spins on the
KERNEL_LOCK().  If any of the threads playing/recording audio sleeps
while holding an unaccounted reference to uaudio(4)'s softc, the above
issue can happen.

A way to fix this is to use usbd_ref_incr(9) and its counterpart
usbd_ref_wait(9) in uaudio_detach().

I'm not sure if it's possible for audio(4) to increment the reference
only once.  Is there a place where such increment/decrement can be put?

Otherwise every operation should do the dance.
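As an illustration of the refcount-and-wait idea behind usbd_ref_incr(9)
and usbd_ref_wait(9), here is a minimal single-threaded userland model.
The toydev_* names and structure are purely illustrative, not the kernel
API; real code sleeps until the count drops instead of polling it.

```c
#include <assert.h>

/* Toy softc: 'refs' counts users of the device, 'dying' is set by detach. */
struct toydev { int refs; int dying; };

/* Take a reference before any operation that may sleep on the device. */
static int
toydev_ref(struct toydev *d)
{
	if (d->dying)
		return -1;	/* detach already started: refuse new users */
	d->refs++;
	return 0;
}

/* Drop the reference once the operation is finished. */
static void
toydev_rele(struct toydev *d)
{
	d->refs--;
}

/* Detach may only free the softc once every reference is gone;
 * the real usbd_ref_wait(9) sleeps until that condition holds. */
static int
toydev_detach_can_free(struct toydev *d)
{
	d->dying = 1;
	return d->refs == 0;
}
```

With this scheme the detach path cannot free the softc while an audio
thread still holds a reference, which is exactly the use-after-free seen
in the uaudio_stream_close() trace above.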



Re: OpenBSD 6.7 crashes on APU2C4 with LTE modem Huawei E3372s-153 HiLink

2020-05-25 Thread Martin Pieuchot
On 25/05/20(Mon) 12:56, Gerhard Roth wrote:
> On 5/22/20 9:05 PM, Mark Kettenis wrote:
> > > From: Łukasz Lejtkowski 
> > > Date: Fri, 22 May 2020 20:51:57 +0200
> > > 
> > > Probably power supply 12 V is broken. Showing 16,87 V(Fluke 179) -
> > > too high. Should be 12,25-12,50 V. I replaced to the new one.
> > 
> > That might be why the device stops responding.  The fact that cleaning
> > up from a failed USB transaction leads to this panic is a bug though.
> > 
> > And somebody just posted a very similar panic with ure(4).  Something
> > in the network stack is holding a mutex when it shouldn't.
> 
> I think that holding the mutex is ok. The bug is calling the stop
> routine in case of errors.
> 
> This is what common foo_start() does:
> 
>   m_head = ifq_deq_begin(&ifp->if_snd);
>   if (foo_encap(sc, m_head, 0)) {
>           ifq_deq_rollback(&ifp->if_snd, m_head);
>           ...
>           return;
>   }
>   ifq_deq_commit(&ifp->if_snd, m_head);
> 
> Here, ifq_deq_begin() grabs a mutex and it is held while
> calling foo_encap().
> 
> For USB network interfaces foo_encap() mostly does this:
> 
>   err = usbd_transfer(sc->sc_xfer);
>   if (err != USBD_IN_PROGRESS) {
>   foo_stop(sc);
>   return EIO;
>   }
> 
> And foo_stop() calls usbd_abort_pipe() -> xhci_command_submit(),
> which might sleep.
> 
> How to fix? We could do the foo_encap() after the ifq_deq_commit(),
> possibly dropping the current mbuf if encap fails (who cares
> for the packets after foo_stop() anyway).

That's the approach taken by drivers using ifq_dequeue(9) instead of
ifq_deq_begin/commit().

> Or change all the drivers to follow the path that if_aue.c takes:
> 
>   err = usbd_transfer(c->aue_xfer);
>   if (err != USBD_IN_PROGRESS) {
>   ...
>   /* Stop the interface from process context. */
>   usb_add_task(sc->aue_udev, &sc->aue_stop_task);
>   return (EIO);
>   }

That's just trading the current problem for another one with higher
complexity.

> Any ideas, what's better? Or alternative proposals?

Using ifq_dequeue(9) would have the advantage of unifying the code base,
but it introduces a behavior change.  A simpler fix would be to call
foo_stop() in the error path after ifq_deq_rollback().
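The difference between the two queue disciplines can be sketched as a
rough userland model (the toyq_* names are hypothetical, not the actual
ifq(9) API): with deq_begin/commit the queue mutex is held across encap,
so encap must never sleep; with dequeue-first the mutex is dropped
before encap, and a failed encap simply drops the packet.

```c
#include <assert.h>

/* Toy packet queue standing in for ifq(9); 'locked' models the mutex. */
struct toyq { int pkts; int locked; };

/* dequeue-first discipline: commit the removal under the mutex, release
 * it, then encap.  On failure the packet is simply dropped. */
static int
toyq_send_dequeue(struct toyq *q, int encap_ok)
{
	if (q->pkts == 0)
		return 0;
	q->locked = 1;		/* grab mutex */
	q->pkts--;		/* commit removal immediately */
	q->locked = 0;		/* release before encap: encap may sleep */
	if (!encap_ok)
		return -1;	/* packet dropped, queue consistent */
	return 1;
}

/* begin/rollback discipline: the mutex stays held across encap, so
 * encap must not sleep; on failure the packet goes back on the queue. */
static int
toyq_send_begin_commit(struct toyq *q, int encap_ok)
{
	if (q->pkts == 0)
		return 0;
	q->locked = 1;		/* grab mutex (held across encap!) */
	if (!encap_ok) {
		q->locked = 0;	/* rollback: packet stays queued */
		return -1;
	}
	q->pkts--;		/* commit */
	q->locked = 0;
	return 1;
}
```

The panic above is the begin/commit variant with a foo_encap() that can
sleep in usbd_abort_pipe() while the mutex is held.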



wsemul_vt100 & wsmux's ioctl rwlock taken in interrupt context

2020-05-06 Thread Martin Pieuchot
The following backtrace, found by robert@'s syzkaller, exposes a
context/locking issue related to wsmux's ioctl rwlock:

  panic: acquiring blockable sleep lock with spinlock or critical section held 
(rwlock) wsmuxlk

trace:

panic+0x15c
witness_checkorder+0x10e0
rw_enter_read+0x66
wsmux_do_displayioctl+0x7e
wsdisplay_emulbell+0x68
wsemul_vt100_output_c0c1+0x2f5
wsemul_vt100_output+0x34e
wsdisplaystart+0x396
ttrstrt+0x4b
timeout_run+0xc4
softclock+0x175
softintr_dispatch+0x107
Xsoftclock+0x1f


Grabbing `sc_lock' should obviously not be possible from softclock context.
I'm not sure what's the best way to fix this issue.  timeout_set_proc(9)
will make the warning disappear but is it the right thing to do?

Are there other interrupt-context paths that can enter this code?

The lock has been introduced to prevent access to `sc_cld' in case a
thread was sleeping in the middle of an operation.  Are we sure those
sleeping points cannot be reached by entry points from interrupt
context?

Did we consider alternative fixes other than a lock?
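For reference, the idea behind timeout_set_proc(9) can be modeled in a
few lines: interrupt context never takes the blockable lock itself, it
only schedules the work, and a process-context worker takes the lock
later.  All toy_* names below are illustrative, not the kernel API.

```c
#include <assert.h>

/* Toy deferral: soft interrupt may only set a flag; the sleepable
 * work (taking the rwlock) runs later from process context. */
struct toywork { int pending; int lock_held; int done; };

/* What softclock may do: no sleeping, just schedule. */
static void
toy_softclock(struct toywork *w)
{
	w->pending = 1;
}

/* What the worker thread does: it is allowed to block on the rwlock. */
static void
toy_worker(struct toywork *w)
{
	if (!w->pending)
		return;
	w->lock_held = 1;	/* rw_enter_read(), may sleep */
	w->done = 1;		/* perform the deferred display ioctl work */
	w->lock_held = 0;	/* rw_exit_read() */
	w->pending = 0;
}
```

This silences the witness report because the rwlock is only ever taken
from a context that is allowed to sleep; whether that is the *right*
fix is the question raised above.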



Re: pty leak or corruption w/ openpty + dup2?

2020-05-06 Thread Martin Pieuchot
On 02/05/20(Sat) 16:02, Mark Kettenis wrote:
> > Date: Sat, 2 May 2020 11:33:17 +0200
> > From: Martin Pieuchot 
> > [...]
> > Do we see that the issue is caused by the order in which descriptors are
> > closed in fdfree()?  The current deadlock occurs because the duped master
> > has a higher fd number than the slave which means it is still open when the
> > slave is closed.
> 
> I'm sure we could construct an example where the file descriptors are
> in a different oder.  So changing the order is not going to help.

Obviously :)

> > But why would that be a problem?  By default *close() functions,
> > including ttylclose() are blocking.  So any exiting process might end up
> > hanging in fdfree().  Diff below illustrates that by forcing all *close()
> > during exit1() to be non-blocking, it also fixes the issue.
> 
> I very much fear that is going to have unintended side-effects with
> output not being flushed properly.  And the process could still
> deadlock itself by using close(2) directly isn't it?

Indeed.

> > Does it make sense to close fds as non-blocking when exiting?  What
> > should a dying thread wait for?  What can be the cons of such approach? 
> > 
> > Now regarding your fix, why does it make sense to wait 5sec instead of
> > indefinitely?  Did you look at r1.263 of NetBSD's kern/tty.c?  If we go
> > with this change could you please change the 'timo' suffix and variables
> > to 'nsec' and use uint64_t instead of int?
> 
> r1.263 was reverted in r1.264.  Then r1.265 is the commit quoted by anton@.
> There is also r2.267 which adds an additional fix to r1.265.
> 
> In ttywait(), NetBSD only calls ttyflush() if there is a timeout. That
> makes sense, because we have ttywflush() to combine the wait and flush
> so ttywait() shouldn't flush when there is no error.

Updated diff below reflecting those changes.  I'm still questioning the
5sec timeout, but it is without doubt an improvement over the current
behavior.
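The semantics of the change can be modeled in a few lines of userland C
(the toy_* names are illustrative; the real code sleeps in
ttysleep_nsec() and returns EWOULDBLOCK when the timeout fires, at which
point ttywflush() discards the undrained output):

```c
#include <assert.h>

#define TOY_EWOULDBLOCK 35	/* stands in for EWOULDBLOCK */

/* Model of a drain wait with a deadline: each tick either drains a
 * character or, once the time budget is spent, the wait times out
 * and the remaining output is flushed instead of blocking forever. */
static int
toy_wait_drain(int pending, int budget, int *flushed)
{
	*flushed = 0;
	while (pending > 0 && budget > 0) {
		pending--;	/* consumer made progress */
		budget--;
	}
	if (pending > 0) {
		*flushed = 1;	/* ttywflush(): give up, discard output */
		return TOY_EWOULDBLOCK;
	}
	return 0;
}
```

With an infinite budget (the old INFSLP behavior) a reader that never
drains means the close never returns, which is the fdfree() deadlock.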

The previously mentioned test, as well as a modified version closing the
slave before exit(2), now hangs for 5 seconds instead of deadlocking
indefinitely.

I believe we want that for release, ok?

Index: kern/tty.c
===
RCS file: /cvs/src/sys/kern/tty.c,v
retrieving revision 1.154
diff -u -p -r1.154 tty.c
--- kern/tty.c  7 Apr 2020 13:27:51 -   1.154
+++ kern/tty.c  6 May 2020 07:44:53 -
@@ -80,6 +80,8 @@ void  filt_ttyrdetach(struct knote *kn);
 intfilt_ttywrite(struct knote *kn, long hint);
 void   filt_ttywdetach(struct knote *kn);
 void   ttystats_init(struct itty **, size_t *);
+intttywait_nsec(struct tty *tp, uint64_t nsecs);
+intttysleep_nsec(struct tty *, void *, int, char *, uint64_t);
 
 /* Symbolic sleep message strings. */
 char ttclos[]  = "ttycls";
@@ -1202,10 +1204,10 @@ ttnread(struct tty *tp)
 }
 
 /*
- * Wait for output to drain.
+ * Wait for output to drain, or if this times out, flush it.
  */
 int
-ttywait(struct tty *tp)
+ttywait_nsec(struct tty *tp, uint64_t nsecs)
 {
int error, s;
 
@@ -1219,7 +1221,10 @@ ttywait(struct tty *tp)
(ISSET(tp->t_state, TS_CARR_ON) || ISSET(tp->t_cflag, CLOCAL))
&& tp->t_oproc) {
SET(tp->t_state, TS_ASLEEP);
-   error = ttysleep(tp, &tp->t_outq, TTOPRI | PCATCH, ttyout);
+   error = ttysleep_nsec(tp, &tp->t_outq, TTOPRI | PCATCH,
+   ttyout, nsecs);
+   if (error == EWOULDBLOCK)
+   ttyflush(tp, FWRITE);
if (error)
break;
} else
@@ -1229,6 +1234,12 @@ ttywait(struct tty *tp)
return (error);
 }
 
+int
+ttywait(struct tty *tp)
+{
+   return (ttywait_nsec(tp, INFSLP));
+}
+
 /*
  * Flush if successfully wait.
  */
@@ -1237,7 +1248,8 @@ ttywflush(struct tty *tp)
 {
int error;
 
-   if ((error = ttywait(tp)) == 0)
+   error = ttywait_nsec(tp, SEC_TO_NSEC(5));
+   if (error == 0 || error == EWOULDBLOCK)
ttyflush(tp, FREAD);
return (error);
 }
@@ -2281,11 +2293,18 @@ tputchar(int c, struct tty *tp)
 int
 ttysleep(struct tty *tp, void *chan, int pri, char *wmesg)
 {
+
+   return (ttysleep_nsec(tp, chan, pri, wmesg, INFSLP));
+}
+
+int
+ttysleep_nsec(struct tty *tp, void *chan, int pri, char *wmesg, uint64_t nsecs)
+{
int error;
short gen;
 
gen = tp->t_gen;
-   if ((error = tsleep_nsec(chan, pri, wmesg, INFSLP)) != 0)
+   if ((error = tsleep_nsec(chan, pri, wmesg, nsecs)) != 0)
return (error);
return (tp->t_gen == gen ? 0 : ERESTART);
 }



Re: pty leak or corruption w/ openpty + dup2?

2020-05-02 Thread Martin Pieuchot
On 02/05/20(Sat) 10:40, Anton Lindqvist wrote:
> On Fri, May 01, 2020 at 05:17:36PM +0200, Martin Pieuchot wrote:
> > On 01/05/20(Fri) 12:13, Anton Lindqvist wrote:
> > > The order in which the pty master/slave is closed seems to be the
> > > trigger here. While not duping the master, it's closed before the slave.
> > > In the opposite scenario, the slave is closed before the master. While
> > > closing the slave, it ends up here expressed as a simplified backtrace:
> > > 
> > >   tsleep()
> > >   ttysleep()
> > >   ttywait()
> > >   ttywflush()
> > >   ttylclose()
> > >   ptsclose()
> > >   fdfree()
> > >   exit1()
> > > 
> > > In order words, it ends up doing a tsleep(INFSLP) causing the thread to
> > > hang. Note that this is not the case when the master is closed before
> > > the slave since `tp->t_oproc == NULL' causing ttywait() to bail early.
> > 
> > Why is the sleeper never awakened?  Does that mean a ttwakeup() is missing?
> 
> In this case, the process is single threaded, about to exit and the only
> consumer of the pty. I don't see how it could be any other process
> responsibility to perform the wakeup.

Do we see that the issue is caused by the order in which descriptors are
closed in fdfree()?  The current deadlock occurs because the duped master
has a higher fd number than the slave which means it is still open when the
slave is closed.

But why would that be a problem?  By default *close() functions,
including ttylclose() are blocking.  So any exiting process might end up
hanging in fdfree().  Diff below illustrates that by forcing all *close()
during exit1() to be non-blocking, it also fixes the issue.

Does it make sense to close fds as non-blocking when exiting?  What
should a dying thread wait for?  What can be the cons of such approach? 

Now regarding your fix, why does it make sense to wait 5sec instead of
indefinitely?  Did you look at r1.263 of NetBSD's kern/tty.c?  If we go
with this change could you please change the 'timo' suffix and variables
to 'nsec' and use uint64_t instead of int?

Index: kern/vfs_vnops.c
===
RCS file: /cvs/src/sys/kern/vfs_vnops.c,v
retrieving revision 1.114
diff -u -p -r1.114 vfs_vnops.c
--- kern/vfs_vnops.c8 Apr 2020 08:07:51 -   1.114
+++ kern/vfs_vnops.c2 May 2020 09:18:28 -
@@ -601,6 +601,7 @@ vn_closefile(struct file *fp, struct pro
 {
struct vnode *vp = fp->f_data;
struct flock lf;
+   unsigned int flag;
int error;
 
KERNEL_LOCK();
@@ -611,7 +612,10 @@ vn_closefile(struct file *fp, struct pro
lf.l_type = F_UNLCK;
(void) VOP_ADVLOCK(vp, (caddr_t)fp, F_UNLCK, &lf, F_FLOCK);
}
-   error = vn_close(vp, fp->f_flag, fp->f_cred, p);
+   flag = fp->f_flag;
+   if (p != NULL && p->p_flag & P_WEXIT)
+   flag |= O_NONBLOCK;
+   error = vn_close(vp, flag, fp->f_cred, p);
KERNEL_UNLOCK();
return (error);
 }



Re: pty leak or corruption w/ openpty + dup2?

2020-05-01 Thread Martin Pieuchot
On 01/05/20(Fri) 12:13, Anton Lindqvist wrote:
> The order in which the pty master/slave is closed seems to be the
> trigger here. While not duping the master, it's closed before the slave.
> In the opposite scenario, the slave is closed before the master. While
> closing the slave, it ends up here expressed as a simplified backtrace:
> 
>   tsleep()
>   ttysleep()
>   ttywait()
>   ttywflush()
>   ttylclose()
>   ptsclose()
>   fdfree()
>   exit1()
> 
> In order words, it ends up doing a tsleep(INFSLP) causing the thread to
> hang. Note that this is not the case when the master is closed before
> the slave since `tp->t_oproc == NULL' causing ttywait() to bail early.

Why is the sleeper never awakened?  Does that mean a ttwakeup() is missing?

> NetBSD does a sleep with a timeout in ttywflush(). I've applied the same
> approach in the diff below which does fix the hang.

This seems like a racy workaround for a bug that we do not fully
understand.  If this is a proper solution I'd be happy to understand
why.  If we go with such a fix we should be using a value in "nsecs"
instead of ticks, and INFSLP should be used instead of 0.  We should
refrain from introducing new usages of `hz' ;)



Re: Xorg hangs on recent snapshots

2020-05-01 Thread Martin Pieuchot
Hello Mark,

Thanks for the report.

On 01/05/20(Fri) 16:51, Mark Patruck wrote:
> Problem:
> 
> With amdgpu(4) enabled, everything runs fine and smooth for minutes,
> sometimes hours (especially if you don't start lots of programs), but
> all of a sudden X freezes. That means, you can move your mouse, ssh in,
> also top and other programs are still running, but you have to kill -9
> X, to get back to business. This only applies for Polaris 11-see Results
> below.

Such a 'freeze' is a symptom.  If you can ssh into the machine when that
happens, a useful piece of information would be the output of:
# ps -Sx -Owchan

similarly the output of "ps -S" would show where current threads are
blocking.

Another interesting piece of information would be the output of 'dmesg'
at that moment.  The kernel might have printed some valuable
information when something went wrong.

Maybe /var/log/Xorg.0.log would also contain valuable information.

These pieces of information might help us pinpoint the underlying
problem.

> [...] 
> I know about this thread on freedesktop.org [1], but again...
> before buying sth new, i'd like to know about your findings.
> 
> [1] https://bugs.freedesktop.org/show_bug.cgi?id=105733#c75

Do you know if it's the issue you're experiencing?


