Re: [RFT PATCH] xhci: Fix use-after-free regression in xhci clear hub TT implementation

Johan Hovold Mon, 14 Oct 2019 03:17:03 -0700

On Fri, Oct 11, 2019 at 03:58:42PM +0300, Mathias Nyman wrote:
> commit ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer") schedules work
> to clear TT buffer, but causes a use-after-free regression at the same time
> 
> Make sure hub_tt_work finishes before endpoint is disabled, otherwise
> the work will dereference already freed endpoint and device related
> pointers.
> 
> This was triggered when usb core failed to read the configuration
> descriptor of a FS/LS device during enumeration.
> xhci driver queued clear_tt_work while usb core freed and reallocated
> a new device for the next enumeration attempt.
> 
> EHCI driver implements ehci_endpoint_disable() that makes sure
> clear_tt_work has finished before it returns, but xhci lacks this support.
> usb core will call hcd->driver->endpoint_disable() callback before
> disabling endpoints, so we want this in xhci as well.
> 
> The added xhci_endpoint_disable() is based on ehci_endpoint_disable()
> 
> Fixes: ef513be0a905 ("usb: xhci: Add Clear_TT_Buffer")
> Cc: <[email protected]> # v5.3
> Reported-by: Johan Hovold <[email protected]>
> Signed-off-by: Mathias Nyman <[email protected]>
> ---
>  drivers/usb/host/xhci.c | 43 +++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 43 insertions(+)
> 
> diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
> index 5cfbf9a04494..6e817686d04f 100644
> --- a/drivers/usb/host/xhci.c
> +++ b/drivers/usb/host/xhci.c
> @@ -3071,6 +3071,48 @@ void xhci_cleanup_stalled_ring(struct xhci_hcd *xhci, unsigned int ep_index,
>       }
>  }
>  
> +static void xhci_endpoint_disable(struct usb_hcd *hcd,
> +                               struct usb_host_endpoint *host_ep)
> +{
> +     struct xhci_hcd         *xhci;
> +     struct xhci_virt_device *vdev;
> +     struct xhci_virt_ep     *ep;
> +     struct usb_device       *udev;
> +     unsigned long           flags;
> +     unsigned int            ep_index;
> +
> +     xhci = hcd_to_xhci(hcd);
> +rescan:
> +     spin_lock_irqsave(&xhci->lock, flags);
> +
> +     udev = (struct usb_device *)host_ep->hcpriv;
> +     if (!udev || !udev->slot_id)
> +             goto done;
> +
> +     vdev = xhci->devs[udev->slot_id];
> +     if (!vdev)
> +             goto done;
> +
> +     ep_index = xhci_get_endpoint_index(&host_ep->desc);
> +     ep = &vdev->eps[ep_index];
> +     if (!ep)
> +             goto done;
> +
> +     /* wait for hub_tt_work to finish clearing hub TT */
> +     if (ep->ep_state & EP_CLEARING_TT) {
> +             spin_unlock_irqrestore(&xhci->lock, flags);
> +             schedule_timeout_uninterruptible(1);
> +             goto rescan;
> +     }
> +
> +     if (ep->ep_state)
> +             xhci_dbg(xhci, "endpoint disable with ep_state 0x%x\n",
> +                      ep->ep_state);
> +done:
> +     host_ep->hcpriv = NULL;
> +     spin_unlock_irqrestore(&xhci->lock, flags);
> +}
> +


I used essentially the same reproducer as you did for debugging this
after I first hit it with an actually stalled control endpoint, and this
patch also works with my fault-injection hack.

I've reviewed the code and it looks good to me except for one mostly
theoretical issue. You need to check ep->hcpriv while holding the
xhci->lock in xhci_clear_tt_buffer_complete() or you could end up having
xhci_endpoint_disable() reschedule indefinitely while waiting for
EP_CLEARING_TT to be cleared on a sufficiently weakly ordered
system.
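
To illustrate the pattern, here is a minimal userspace sketch using pthreads. It is only an analogy and not the driver code: struct fake_ep, clear_tt_complete(), endpoint_disable() and run_demo() are hypothetical names, a pthread mutex stands in for xhci->lock, and EP_CLEARING_TT is redefined locally. The point is that the completion side reads hcpriv and clears the flag in the same locked section, so the disabling side, which polls the flag under the same lock, always gets to see it cleared.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

#define EP_CLEARING_TT 0x1u /* local stand-in for the xhci flag */

struct fake_ep {
	pthread_mutex_t lock;  /* plays the role of xhci->lock */
	unsigned int state;    /* plays the role of ep_state */
	void *hcpriv;          /* plays the role of host_ep->hcpriv */
};

static void sleep_ms(long ms)
{
	struct timespec ts = { .tv_sec = 0, .tv_nsec = ms * 1000000L };

	nanosleep(&ts, NULL);
}

/* Completion side: analogous to xhci_clear_tt_buffer_complete(). */
static void *clear_tt_complete(void *arg)
{
	struct fake_ep *ep = arg;

	sleep_ms(2); /* pretend the hub TT clear takes a while */

	pthread_mutex_lock(&ep->lock);
	/*
	 * hcpriv is read under the lock. The disabling side only clears
	 * it after it has seen the flag drop, so it is non-NULL here.
	 */
	assert(ep->hcpriv != NULL);
	ep->state &= ~EP_CLEARING_TT;
	pthread_mutex_unlock(&ep->lock);
	return NULL;
}

/* Disabling side: analogous to the polling loop in xhci_endpoint_disable(). */
static int endpoint_disable(struct fake_ep *ep)
{
	int rescans = 0;

	for (;;) {
		pthread_mutex_lock(&ep->lock);
		if (!(ep->state & EP_CLEARING_TT))
			break; /* leave the loop with the lock held */
		pthread_mutex_unlock(&ep->lock);
		sleep_ms(1);
		rescans++;
	}
	ep->hcpriv = NULL;
	pthread_mutex_unlock(&ep->lock);
	return rescans;
}

/* Returns 0 if the flag was cleared and hcpriv was reset, -1 otherwise. */
int run_demo(void)
{
	static int dummy_udev;
	struct fake_ep ep = { .state = EP_CLEARING_TT, .hcpriv = &dummy_udev };
	pthread_t worker;
	int rescans;

	pthread_mutex_init(&ep.lock, NULL);
	if (pthread_create(&worker, NULL, clear_tt_complete, &ep))
		return -1;
	rescans = endpoint_disable(&ep);
	pthread_join(worker, NULL);
	pthread_mutex_destroy(&ep.lock);

	return (rescans >= 0 && ep.state == 0 && ep.hcpriv == NULL) ? 0 : -1;
}
```

Calling run_demo() from a main() built with -pthread should return 0. If the completion side instead read hcpriv outside the lock and returned early on NULL, as the code removed in the diff below does, the flag would stay set and the polling loop would never terminate.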

Since cfbb8a84c2d2 ("xhci: Fix NULL pointer dereference in
xhci_clear_tt_buffer_complete()") isn't needed anymore and is slightly
misleading, I suggest amending the patch with the following:

diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c
index 9b1e15fe2c8e..6c17e3fe181a 100644
--- a/drivers/usb/host/xhci.c
+++ b/drivers/usb/host/xhci.c
@@ -5280,20 +5280,13 @@ static void xhci_clear_tt_buffer_complete(struct usb_hcd *hcd,
        unsigned int ep_index;
        unsigned long flags;
 
-       /*
-        * udev might be NULL if tt buffer is cleared during a failed device
-        * enumeration due to a halted control endpoint. Usb core might
-        * have allocated a new udev for the next enumeration attempt.
-        */
-
        xhci = hcd_to_xhci(hcd);
+
+       spin_lock_irqsave(&xhci->lock, flags);
        udev = (struct usb_device *)ep->hcpriv;
-       if (!udev)
-               return;
        slot_id = udev->slot_id;
        ep_index = xhci_get_endpoint_index(&ep->desc);
 
-       spin_lock_irqsave(&xhci->lock, flags);
        xhci->devs[slot_id]->eps[ep_index].ep_state &= ~EP_CLEARING_TT;
        xhci_ring_doorbell_for_active_rings(xhci, slot_id, ep_index);
        spin_unlock_irqrestore(&xhci->lock, flags);

Feel free to add my:

Suggested-by: Johan Hovold <[email protected]>
Reviewed-by: Johan Hovold <[email protected]>
Tested-by: Johan Hovold <[email protected]>

Johan
