On 25/05/20(Mon) 12:56, Gerhard Roth wrote:
> On 5/22/20 9:05 PM, Mark Kettenis wrote:
> > > From: Łukasz Lejtkowski <[email protected]>
> > > Date: Fri, 22 May 2020 20:51:57 +0200
> > >
> > > Probably power supply 12 V is broken. Showing 16,87 V(Fluke 179) -
> > > too high. Should be 12,25-12,50 V. I replaced to the new one.
> >
> > That might be why the device stops responding. The fact that cleaning
> > up from a failed USB transaction leads to this panic is a bug though.
> >
> > And somebody just posted a very similar panic with ure(4). Something
> > in the network stack is holding a mutex when it shouldn't.
>
> I think that holding the mutex is ok. The bug is calling the stop
> routine in case of errors.
>
> This is what common foo_start() does:
>
> m_head = ifq_deq_begin(&ifp->if_snd);
> if (foo_encap(sc, m_head, 0)) {
> ifq_deq_rollback(&ifp->if_snd, m_head);
> ...
> return;
> }
> ifq_deq_commit(&ifp->if_snd, m_head);
>
> Here, ifq_deq_begin() grabs a mutex and it is held while
> calling foo_encap().
>
> For USB network interfaces foo_encap() mostly does this:
>
> err = usbd_transfer(sc->sc_xfer);
> if (err != USBD_IN_PROGRESS) {
> foo_stop(sc);
> return EIO;
> }
>
> And foo_stop() calls usbd_abort_pipe() -> xhci_command_submit(),
> which might sleep.
>
> How to fix? We could do the foo_encap() after the ifq_deq_commit(),
> possibly dropping the current mbuf if encap fails (who cares
> for the packets after foo_stop() anyway).
That's the approach taken by drivers using ifq_dequeue(9) instead of
ifq_deq_begin/commit().
> Or change all the drivers to follow the path that if_aue.c takes:
>
> err = usbd_transfer(c->aue_xfer);
> if (err != USBD_IN_PROGRESS) {
> ...
> /* Stop the interface from process context. */
> usb_add_task(sc->aue_udev, &sc->aue_stop_task);
> return (EIO);
> }
That's just trading the current problem for another one with higher
complexity.
> Any ideas, what's better? Or alternative proposals?
Using ifq_dequeue(9) would have the advantage of unifying the code base.
It introduces a behavior change. A simpler fix would be to call
foo_stop() in the error path after ifq_deq_rollback().