Re: thundering pipe herds

2020-11-23 Thread David Holland
On Tue, Nov 24, 2020 at 09:09:59AM +1100, matthew green wrote:
 > > @@ -395,9 +401,8 @@ pipeunlock(struct pipe *pipe)
 > >KASSERT(pipe->pipe_state & PIPE_LOCKFL);
 > >  
 > >pipe->pipe_state &= ~PIPE_LOCKFL;
 > > -  if (pipe->pipe_state & PIPE_LWANT) {
 > > -  pipe->pipe_state &= ~PIPE_LWANT;
 > > -  cv_broadcast(&pipe->pipe_lkcv);
 > > +  if (pipe->pipe_waiters > 0) {
 > > +  cv_signal(&pipe->pipe_lkcv);
 > >}
 > >  }
 > 
 > this part misses the while loop from the freebsd version i think?

...shouldn't, that would turn it back into broadcast.

(except not even that, because the decrement doesn't happen until
wakeup so it'd loop forever)

-- 
David A. Holland
dholl...@netbsd.org


Re: zfs panic in zfs:vdev_disk_open.part.4

2020-11-23 Thread Yorick Hardy
Dear David,

On 2020-11-22, David Brownlee wrote:
> I'm seeing a (new?) panic on netbsd-9 with zfs. It seems to trigger
> when a newly created zfs pool attempts to be mounted:
> 
> panic: vrelel: bad ref count
> cpu0: Begin traceback...
> vpanic() at netbsd:vpanic+0x160
> vcache_reclaim() at netbsd:vcache_reclaim
> vrelel() at netbsd:vrelel+0x22e
> vdev_disk_open.part.4() at zfs:vdev_disk_open.part.4+0x44e
> vdev_open() at zfs:vdev_open+0x9e
> vdev_open_children() at zfs:vdev_open_children+0x39
> vdev_root_open() at zfs:vdev_root_open+0x33
> vdev_open() at zfs:vdev_open+0x9e
> vdev_create() at zfs:vdev_create+0x1b
> spa_create() at zfs:spa_create+0x28c
> zfs_ioc_pool_create() at zfs:zfs_ioc_pool_create+0x19b
> zfsdev_ioctl() at zfs:zfsdev_ioctl+0x265
> nb_zfsdev_ioctl() at zfs:nb_zfsdev_ioctl+0x38
> VOP_IOCTL() at netbsd:VOP_IOCTL+0x54
> vn_ioctl() at netbsd:vn_ioctl+0xa5
> sys_ioctl() at netbsd:sys_ioctl+0x5ab
> syscall() at netbsd:syscall+0x157
> --- syscall (number 54) ---
> 7e047af6822a:
> cpu0: End traceback...
> 
> Anyone seeing anything similar? (I continue to have a bunch of other
> boxes which use zfs without issue.)
> 
> David

I don't know if it helps, but it looks like vn_close(vp,..) should be
called instead of vrele(vp) at

 
https://nxr.netbsd.org/xref/external/cddl/osnet/dist/uts/common/fs/zfs/vdev_disk.c#218

and

 
https://nxr.netbsd.org/xref/external/cddl/osnet/dist/uts/common/fs/zfs/vdev_disk.c#250


-- 
Kind regards,

Yorick Hardy


re: thundering pipe herds

2020-11-23 Thread matthew green
> @@ -395,9 +401,8 @@ pipeunlock(struct pipe *pipe)
>   KASSERT(pipe->pipe_state & PIPE_LOCKFL);
>  
>   pipe->pipe_state &= ~PIPE_LOCKFL;
> - if (pipe->pipe_state & PIPE_LWANT) {
> - pipe->pipe_state &= ~PIPE_LWANT;
> - cv_broadcast(&pipe->pipe_lkcv);
> + if (pipe->pipe_waiters > 0) {
> + cv_signal(&pipe->pipe_lkcv);
>   }
>  }

this part misses the while loop from the freebsd version i think?


.mrg.


thundering pipe herds

2020-11-23 Thread David Holland
mjg at freebsd says that their pipes have a thundering herd problem
that manifests in make's token pipe stuff at high -j values. We have
the same logic, and it's easily fixed.

Completely untested patch follows; does anyone object to this assuming
that it works (possibly after minor changes)?

(pipe_waiters is inserted where it is so it occupies what's currently
struct padding on 64-bit machines)



Index: sys/pipe.h
===================================================================
RCS file: /cvsroot/src/sys/sys/pipe.h,v
retrieving revision 1.36
diff -u -p -r1.36 pipe.h
--- sys/pipe.h  22 Aug 2018 01:05:24 -  1.36
+++ sys/pipe.h  23 Nov 2020 21:23:07 -
@@ -93,8 +93,7 @@ struct pipemapping {
 #define PIPE_DIRECTR   0x080   /* Pipe direct read request (setup complete) */
 #define PIPE_LOCKFL    0x100   /* Process has exclusive access to
                                   pointers/data. */
-#define PIPE_LWANT     0x200   /* Process wants exclusive access to
-                                  pointers/data. */
+/* unused              0x200   */
 #define PIPE_RESTART   0x400   /* Return ERESTART to blocked syscalls */
 
 /*
@@ -114,6 +113,7 @@ struct pipe {
struct  timespec pipe_mtime;/* time of last modify */
struct  timespec pipe_btime;/* time of creation */
pid_t   pipe_pgid;  /* process group for sigio */
+   u_int   pipe_waiters;   /* number of waiters pending */
struct  pipe *pipe_peer;/* link with other direction */
u_int   pipe_state; /* pipe status info */
int pipe_busy;  /* busy flag, to handle rundown */
Index: kern/sys_pipe.c
===================================================================
RCS file: /cvsroot/src/sys/kern/sys_pipe.c,v
retrieving revision 1.148
diff -u -p -r1.148 sys_pipe.c
--- kern/sys_pipe.c 26 Apr 2019 17:24:23 -  1.148
+++ kern/sys_pipe.c 23 Nov 2020 21:23:07 -
@@ -371,13 +371,19 @@ pipelock(struct pipe *pipe, bool catch_p
KASSERT(mutex_owned(pipe->pipe_lock));
 
while (pipe->pipe_state & PIPE_LOCKFL) {
-   pipe->pipe_state |= PIPE_LWANT;
+   pipe->pipe_waiters++;
+   KASSERT(pipe->pipe_waiters != 0); /* just in case */
if (catch_p) {
error = cv_wait_sig(&pipe->pipe_lkcv, pipe->pipe_lock);
-   if (error != 0)
+   if (error != 0) {
+   KASSERT(pipe->pipe_waiters > 0);
+   pipe->pipe_waiters--;
return error;
+   }
} else
cv_wait(&pipe->pipe_lkcv, pipe->pipe_lock);
+   KASSERT(pipe->pipe_waiters > 0);
+   pipe->pipe_waiters--;
}
 
pipe->pipe_state |= PIPE_LOCKFL;
@@ -395,9 +401,8 @@ pipeunlock(struct pipe *pipe)
KASSERT(pipe->pipe_state & PIPE_LOCKFL);
 
pipe->pipe_state &= ~PIPE_LOCKFL;
-   if (pipe->pipe_state & PIPE_LWANT) {
-   pipe->pipe_state &= ~PIPE_LWANT;
-   cv_broadcast(&pipe->pipe_lkcv);
-   pipe->pipe_state &= ~PIPE_LWANT;
+   if (pipe->pipe_waiters > 0) {
+   cv_signal(&pipe->pipe_lkcv);
}
 }
 


-- 
David A. Holland
dholl...@netbsd.org


Re: USB lockup

2020-11-23 Thread Edgar Fuß
> So, during the partial lockup, I see
>   ohci_softintr#63@0: add TD 0x80013ec2de20
>   ohci_softintr#63@0: add TD 0x80013ec2dea0
(that should be  ohci_softintr#63@0: add TD 0x80013ec2dfa0)
>   ohci_softintr#63@0: add TD 0x80013ec2dee0
So I think it's endlessly looping in the "Reverse the done list." loop 
in ohci_softintr(), so the td list must have gone circular, no?


Re: USB lockup

2020-11-23 Thread Edgar Fuß
> The ddb backtrace usually is
> bus_space_read_4()
> bintime()
> ohci_softintr()
> usb_soft_intr()
> softint_dispatch()
> 
> The system call causing the lock-up is a USB_DEVICEINFO ioctl on /dev/usb0 
> with udi_addr=2, which corresponds to ugen0.
I tried a -current kernel from nyftp today, and it locks up the same way.