Re: avoid uvideo lockup when nothing is transferred

2020-02-28 Thread Martin Pieuchot
On 16/01/20(Thu) 01:07, Vadim Zhukov wrote:
> Hi all.
> 
> This fixes an issue I'm seeing with a uvideo(4), that doesn't like
> commands we're sending to it. The camera simply sends nothing,
> and since we're sleeping forever (xfer timeout is 0, which is
> USBD_NO_TIMEOUT), we never get out from 'while (bulk_running)' loop,
> visible in the scond chunk of the diff. This not only makes video(1)
> and other V4L2 users lockup, but also results in process freeze upon
> detach: it calls uvideo_vs_close(), which in turn calls
> usbd_ref_wait(), which doesn't return because we still have something
> in queue.
> 
> Also, in case of error we keep bulk_running set. This seems just
> a leftover, as well as the first chunk: if we failed to create kthread,
> there won't be anything running under bulk_running==1 for sure.
> 
> I must thank kettenis@ for help during diagnosis and mpi@ for a patch
> for related issue.
> 
> OK?

Comments below.

> Index: uvideo.c
> ===
> RCS file: /cvs/src/sys/dev/usb/uvideo.c,v
> retrieving revision 1.205
> diff -u -p -r1.205 uvideo.c
> --- uvideo.c  14 Oct 2019 09:20:48 -  1.205
> +++ uvideo.c  15 Jan 2020 21:54:43 -
> @@ -2026,6 +2027,7 @@ uvideo_vs_start_bulk(struct uvideo_softc
>   error = kthread_create(uvideo_vs_start_bulk_thread, sc, NULL,
>   DEVNAME(sc));
>   if (error) {
> + sc->sc_vs_cur->bulk_running = 0;
>   printf("%s: can't create kernel thread!", DEVNAME(sc));
>   return (error);
>   }

This is unrelated to your issue and could already be committed, if the
thread cannot be created we won't be able to detach the device.

> @@ -2044,6 +2046,12 @@ uvideo_vs_start_bulk_thread(void *arg)
>   while (sc->sc_vs_cur->bulk_running) {
>   size = UGETDW(sc->sc_desc_probe.dwMaxPayloadTransferSize);
>  
> + /*
> +  * We can't wait infinitely, since otherwise we'll
> +  * block forever if camera stops (or don't even starts)
> +  * sending frames. Use '2*' multiplier to compensate
> +  * possible lags.
> +  */
>   usbd_setup_xfer(
>   sc->sc_vs_cur->bxfer.xfer,
>   sc->sc_vs_cur->pipeh,
> @@ -2051,10 +2059,11 @@ uvideo_vs_start_bulk_thread(void *arg)
>   sc->sc_vs_cur->bxfer.buf,
>   size,
>   USBD_NO_COPY | USBD_SHORT_XFER_OK | USBD_SYNCHRONOUS,
> - 0,
> + 2 * 1000 / (sc->sc_frame_rate || 1),
>   NULL);
>   error = usbd_transfer(sc->sc_vs_cur->bxfer.xfer);

Avoiding infinite sleeps sounds like an improvement.  However I'm unsure
about fiddling with `sc_frame_rate' here.

This field is not consistently set/used in the driver.  Why not simply use
a big hammer and put a 2sec (2000ms) timeout here?  If somebody wants to
improve this it can look at `dwFrameInterval' value and/or `sc_frame_rate'
and makes all of that shine :o)

>   if (error != USBD_NORMAL_COMPLETION) {
> + sc->sc_vs_cur->bulk_running = 0;

Setting this field to 0 before calling usbd_ref_decr() is not optimal.
This is fine because of the KERNEL_LOCK().  Using the reverse order
of operation is generally what we want, so here's some pseudo-code.

  bulk_running = 1
  usbd_ref_incr()
  while (...) {
  }
  usbd_ref_decr()
  bulk_running = 0

This could be committed with the fix above :o)



avoid uvideo lockup when nothing is transferred

2020-01-15 Thread Vadim Zhukov
Hi all.

This fixes an issue I'm seeing with a uvideo(4), that doesn't like
commands we're sending to it. The camera simply sends nothing,
and since we're sleeping forever (xfer timeout is 0, which is
USBD_NO_TIMEOUT), we never get out from 'while (bulk_running)' loop,
visible in the scond chunk of the diff. This not only makes video(1)
and other V4L2 users lockup, but also results in process freeze upon
detach: it calls uvideo_vs_close(), which in turn calls
usbd_ref_wait(), which doesn't return because we still have something
in queue.

Also, in case of error we keep bulk_running set. This seems just
a leftover, as well as the first chunk: if we failed to create kthread,
there won't be anything running under bulk_running==1 for sure.

I must thank kettenis@ for help during diagnosis and mpi@ for a patch
for related issue.

OK?

--
WBR,
  Vadim Zhukov


Index: uvideo.c
===
RCS file: /cvs/src/sys/dev/usb/uvideo.c,v
retrieving revision 1.205
diff -u -p -r1.205 uvideo.c
--- uvideo.c14 Oct 2019 09:20:48 -  1.205
+++ uvideo.c15 Jan 2020 21:54:43 -
@@ -2026,6 +2027,7 @@ uvideo_vs_start_bulk(struct uvideo_softc
error = kthread_create(uvideo_vs_start_bulk_thread, sc, NULL,
DEVNAME(sc));
if (error) {
+   sc->sc_vs_cur->bulk_running = 0;
printf("%s: can't create kernel thread!", DEVNAME(sc));
return (error);
}
@@ -2044,6 +2046,12 @@ uvideo_vs_start_bulk_thread(void *arg)
while (sc->sc_vs_cur->bulk_running) {
size = UGETDW(sc->sc_desc_probe.dwMaxPayloadTransferSize);
 
+   /*
+* We can't wait infinitely, since otherwise we'll
+* block forever if camera stops (or don't even starts)
+* sending frames. Use '2*' multiplier to compensate
+* possible lags.
+*/
usbd_setup_xfer(
sc->sc_vs_cur->bxfer.xfer,
sc->sc_vs_cur->pipeh,
@@ -2051,10 +2059,11 @@ uvideo_vs_start_bulk_thread(void *arg)
sc->sc_vs_cur->bxfer.buf,
size,
USBD_NO_COPY | USBD_SHORT_XFER_OK | USBD_SYNCHRONOUS,
-   0,
+   2 * 1000 / (sc->sc_frame_rate || 1),
NULL);
error = usbd_transfer(sc->sc_vs_cur->bxfer.xfer);
if (error != USBD_NORMAL_COMPLETION) {
+   sc->sc_vs_cur->bulk_running = 0;
DPRINTF(1, "%s: error in bulk xfer: %s!\n",
DEVNAME(sc), usbd_errstr(error));
break;