Hi,

John Youn <[email protected]> writes:
>>> John Youn <[email protected]> writes:
>>>>> Thinh Nguyen <[email protected]> writes:
>>>>>> The dwc3 driver can overwite its previous events if its top half IRQ
>>>>>> handler gets invoked again before processing the events in the cache. We
>>>>>
>>>>> interrupts are masked, why would top half get invoked again? Is this,
>>>>> perhaps, related to DWC3 3.00a which has the "Interrupt line doesn't
>>>>> lower when masked" problem? We've added a lot of code to workaround that
>>>>> problem and, apparently, it wasn't enough.
>>>>
>>>> No, it is not related to that. We verified with PCIe traces. The
>>>> interrupt line gets deasserted after we mask it. And we put the
>>>> masking as close to the beginning of the top-half as possible.
>>>>
>>>>>
>>>>> In any case, there's no way top half would be invoked again in a
>>>>> properly working DWC3.
>>>>
>>>> Yet we still see it sometimes. Usually it doesn't create a problem,
>>>
>>> that's fair, but it's not for the reason you're describing :-) There
>>> might be another problem going on, because since we masked the interrupt
>>> and cleared all events, IRQ shouldn't be raised at all; unless, as I
>>> mentioned on the other subthread, the IRQ line is shared.
>>>
>>>> but if there happens to be a new event there, we get the failure.
>>>>
>>>> We didn't trace into that part of the kernel so we can't explain why.
>>>> But if there is any chance the interrupt line deassertion wasn't
>>>> detected in time, whatever part of the kernel that thinks it is still
>>>> asserted might just call our top-half again. This could be a totally
>>>> wrong assumption, but it doesn't seem too far-fetched.
>>>
>>> The kernel doesn't detect IRQ line assertion/deassertion. CPU gets an
>>> exception when that happens and calls Kernel IRQ handler vector. That
>>> will, in turn, figure out which line triggered, call the handler and so
>>> on.
>>
>> We're talking about PCIe though, where interrupt assertion and
>> deassertion are packets. So I would imagine the kernel has to do
>> something and there could be some latency associated with that.
>
> Also, another thing is that the device uses legacy, level-triggered,
> PCIe interrupts, so for as long as the interrupt is asserted, the TH
> is called repeatedly.

yes, and that's why we have:

> static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> {
> 	struct dwc3 *dwc = evt->dwc;
> 	u32 amount;
> 	u32 count;
> 	u32 reg;
> 	if (pm_runtime_suspended(dwc->dev)) {
> 		pm_runtime_get(dwc->dev);
> 		disable_irq_nosync(dwc->irq_gadget);
> 		dwc->pending_events = true;
> 		return IRQ_HANDLED;
> 	}
>
> 	count = dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
> 	count &= DWC3_GEVNTCOUNT_MASK;

check how many events are pending in the event buffer.

> 	if (!count)
> 		return IRQ_NONE;
>
> 	evt->count = count;
> 	evt->flags |= DWC3_EVENT_PENDING;
>
> 	/* Mask interrupt */
> 	reg = dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
> 	reg |= DWC3_GEVNTSIZ_INTMASK;

mask interrupt generation

> 	dwc3_writel(dwc->regs, DWC3_GEVNTSIZ(0), reg);
>
> 	amount = min(count, evt->length - evt->lpos);
> 	memcpy(evt->cache + evt->lpos, evt->buf + evt->lpos, amount);
>
> 	if (amount < count)
> 		memcpy(evt->cache, evt->buf, count - amount);
>
> 	dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);

clear ALL events from event buffer. This brings the line down, so we
shouldn't re-enter.

> 	return IRQ_WAKE_THREAD;
> }
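
For anyone following along, the two memcpy() calls above are a ring-buffer
copy: the first copies from lpos up to the end of the buffer, the second
picks up whatever wrapped around to the start. A minimal userspace sketch
of just that step follows. The struct evt_buf type and the copy_pending()
name are made up for illustration and only mirror the fields the quoted
code touches; this is not driver code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the dwc3_event_buffer fields used above. */
struct evt_buf {
	uint8_t *buf;     /* written by the controller */
	uint8_t *cache;   /* private copy for the bottom half */
	uint32_t length;  /* size of buf and of cache, in bytes */
	uint32_t lpos;    /* offset of the first unprocessed byte */
};

/* Copy count pending bytes starting at lpos, wrapping at length. */
static void copy_pending(struct evt_buf *evt, uint32_t count)
{
	uint32_t room = evt->length - evt->lpos;
	uint32_t amount = count < room ? count : room;

	/* More pending bytes than the buffer holds would be a bug. */
	assert(count <= evt->length);

	/* Part 1: from lpos up to the end of the buffer. */
	memcpy(evt->cache + evt->lpos, evt->buf + evt->lpos, amount);

	/* Part 2: the remainder that wrapped to the start, if any. */
	if (amount < count)
		memcpy(evt->cache, evt->buf, count - amount);
}
```

With a 16-byte buffer, lpos at 12 and 8 bytes pending, the first copy
covers offsets 12..15 and the second covers offsets 0..3.
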
> So we mask the interrupt in the TH and a short time later, the
> interrupt de-assertion packet is sent on PCIe bus and if that's not
> seen right away we may already have another call to TH before the BH
> gets scheduled.

not sure this can happen. If that's the case, every PCI driver would
have all sorts of tricks to cope with this, not only dwc3 :-)

Bjorn, is this something that can happen on PCIe?
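
To make the failure mode under discussion concrete, here is a small
userspace model of it. It assumes, purely for the sake of argument, that
the top half can run a second time before the bottom half has consumed the
cached events; whether that can really happen on PCIe is the open question.
All model_* names and the guard flag are hypothetical, and the state is
reduced to a byte count instead of a real event cache.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_EVENT_PENDING (1u << 0)  /* models DWC3_EVENT_PENDING */

enum model_irqreturn {
	MODEL_IRQ_NONE,
	MODEL_IRQ_HANDLED,
	MODEL_IRQ_WAKE_THREAD,
};

struct model_evt {
	uint32_t hw_count;  /* models GEVNTCOUNT: bytes queued by the controller */
	uint32_t count;     /* bytes the bottom half still has to process */
	uint32_t flags;
};

static enum model_irqreturn model_top_half(struct model_evt *evt, bool guard)
{
	/* Hypothetical guard: cached events not consumed yet, leave them alone. */
	if (guard && (evt->flags & MODEL_EVENT_PENDING))
		return MODEL_IRQ_HANDLED;

	if (!evt->hw_count)
		return MODEL_IRQ_NONE;

	evt->count = evt->hw_count;  /* overwrites an earlier, unprocessed count */
	evt->flags |= MODEL_EVENT_PENDING;
	evt->hw_count = 0;           /* models the write-back to GEVNTCOUNT */
	return MODEL_IRQ_WAKE_THREAD;
}

/* Returns how many bytes the bottom half got to process. */
static uint32_t model_bottom_half(struct model_evt *evt)
{
	uint32_t processed = evt->count;

	/* In this model the thread only runs with events pending. */
	assert(evt->flags & MODEL_EVENT_PENDING);

	evt->count = 0;
	evt->flags &= ~MODEL_EVENT_PENDING;
	return processed;
}
```

Starting with 8 bytes queued, then 4 more arriving before the bottom half
runs: without the guard the second top-half call replaces count with 4 and
the first 8 bytes are never processed. With the guard the second call
returns early, the bottom half sees all 8 bytes, and the 4 new ones stay
queued in hw_count for the next interrupt.
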

Quick summary of the problem:

John and Thinh are experiencing a re-entrant top-half handler even
though we have cleared pending IRQ status _and_ masked Interrupts. SNPS
is using an FPGA model of the latest DWC3 core under x86.

I have never seen this behavior on ARM or any of the x86 devices
containing this core (and this includes all the newest x86 cores, see
drivers/usb/dwc3/dwc3-pci.c for PCI IDs if you care enough :-)

Anyway, from my point of view, this is either a bug in IRQ subsystem
which only John and Thinh can reproduce at this moment, or a regression
with DWC3 IP Core :-s

--
balbi
