Hi, Wesley Cheng <[email protected]> writes: > On 9/6/2020 11:20 PM, Felipe Balbi wrote: >> >> Hi, >> >> Wesley Cheng <[email protected]> writes: >>> diff --git a/drivers/usb/dwc3/ep0.c b/drivers/usb/dwc3/ep0.c >>> index 59f2e8c31bd1..456aa87e8778 100644 >>> --- a/drivers/usb/dwc3/ep0.c >>> +++ b/drivers/usb/dwc3/ep0.c >>> @@ -197,7 +197,7 @@ int dwc3_gadget_ep0_queue(struct usb_ep *ep, struct >>> usb_request *request, >>> int ret; >>> >>> spin_lock_irqsave(&dwc->lock, flags); >>> - if (!dep->endpoint.desc) { >>> + if (!dep->endpoint.desc || !dwc->pullups_connected) { >> >> this looks odd. If we don't have pullups connected, we shouldn't have a >> descriptor, likewise if we don't have a a description, we haven't been >> enumerated, therefore we shouldn't have pullups connected. >> >> What am I missing here? >> > > Hi Felipe, > > When we > echo "" > /sys/kernel/config/usb_gadget/g1/UDC > > This triggers the usb_gadget_disconnect() routine to execute. > > int usb_gadget_disconnect(struct usb_gadget *gadget) > { > ... > ret = gadget->ops->pullup(gadget, 0); > if (!ret) { > gadget->connected = 0; > gadget->udc->driver->disconnect(gadget); > } > > So it is possible that we've already disabled the pullup before running > the disable() callbacks in the function drivers. The disable()
we used to have usage counts for those, are they gone? I think they're
still there.
> callbacks usually are the ones responsible for calling usb_ep_disable(),
> where we clear the desc field. This means there is a brief period where
> the pullups_connected = 0, but we still have valid ep desc, as it has
> not been disabled yet.
this is a valid point, though
> Also, for function drivers like mass storage, the fsg_disable() routine
> defers the actual usb_ep_disable() call to the fsg_thread, so its not
> always ensured that the disconnect() execution would result in the
> usb_ep_disable() to occur synchronously.
also a good point.
>>> @@ -1926,6 +1926,21 @@ static int dwc3_gadget_set_selfpowered(struct
>>> usb_gadget *g,
>>> return 0;
>>> }
>>>
>>> +static void dwc3_stop_active_transfers(struct dwc3 *dwc)
>>> +{
>>> + u32 epnum;
>>> +
>>> + for (epnum = 2; epnum < DWC3_ENDPOINTS_NUM; epnum++) {
>>
>> dwc3 knows the number of endpoints available in the HW. Use dwc->num_eps
>> instead.
>>
>
> Sure, will do.
>
>>> @@ -1971,6 +1986,8 @@ static int dwc3_gadget_run_stop(struct dwc3 *dwc, int
>>> is_on, int suspend)
>>> return 0;
>>> }
>>>
>>> +static void __dwc3_gadget_stop(struct dwc3 *dwc);
>>> +
>>> static int dwc3_gadget_pullup(struct usb_gadget *g, int is_on)
>>> {
>>> struct dwc3 *dwc = gadget_to_dwc(g);
>>> @@ -1994,9 +2011,37 @@ static int dwc3_gadget_pullup(struct usb_gadget *g,
>>> int is_on)
>>> }
>>> }
>>>
>>> + /*
>>> + * Synchronize and disable any further event handling while controller
>>> + * is being enabled/disabled.
>>> + */
>>> + disable_irq(dwc->irq_gadget);
>>
>> why isn't dwc3_gadget_disable_irq() enough?
>>
>>> spin_lock_irqsave(&dwc->lock, flags);
>>
>> spin_lock_irqsave() will disable interrupts, why disable_irq() above?
>>
>
> In the discussion I had with Thinh, the concern was that with the newly
> added code to override the lpos here, if the interrupt routine
> (dwc3_check_event_buf()) runs, then it will reference the lpos for
that's running in hardirq context. All interrupts are disabled while
that runs, there's no risk of race, right?
> copying the event buffer contents to the event cache, and potentially
> process events. There is no locking in place, so it could be possible
> to have both run in parallel.
Is this academic or have you actually found a situation where this
could, indeed, happen? The spin_lock_irqsave() should be enough to
synchronize dwc3_gadget_pullup() and the interrupt handler.
> Hence, the reason if there was already a pending IRQ triggered, the
> dwc3_gadget_disable_irq() won't ensure the IRQ is handled. We can do
> something like:
> if (!is_on)
> dwc3_gadget_disable_irq()
> synchronize_irq()
> spin_lock_irqsave()
> if(!is_on) {
> ...
>
> But the logic to only apply this on the pullup removal case is a little
> messy. Also, from my understanding, the spin_lock_irqsave() will only
> disable the local CPU IRQs, but not the interrupt line on the GIC, which
> means other CPUs can handle it, unless we explicitly set the IRQ
> affinity to CPUX.
Yeah, the way I understand this can't really happen. But I'm open to
being educated. Maybe Alan can explain if this is really possibility?
>>> + dwc3_stop_active_transfers(dwc);
>>> + __dwc3_gadget_stop(dwc);
>>> +
>>> + count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>>> + count &= DWC3_GEVNTCOUNT_MASK;
>>> + if (count > 0) {
>>> + dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
>>> + dwc->ev_buf->lpos = (dwc->ev_buf->lpos + count) %
>>> + dwc->ev_buf->length;
>>> + }
>>
>> don't duplicate code. Add a patch before this extracting this into
>> helper and use it for both irq handler and gadget pullup.
>>
>
> We actually removed this call in the IRQ handler, as if we ensure that
> the IRQ routine has fully complete and won't trigger anymore, then this
> sequence will handle clearing of the event count.
oh, makes sense :-)
--
balbi
signature.asc
Description: PGP signature

