Hi,

Thinh Nguyen <thinh.ngu...@synopsys.com> writes:
>> (Thinh, for whatever I didn't receive your email via the list, replying
>> to myself)
>
> Could it be because of the attached file?
>
>> 
>> Felipe Balbi <ba...@kernel.org> writes:
>>>> Hmm, can you apply this patch and provide output when the failure
>>>> happens? I suggest setting up a 100MiB trace buffer. You don't need to
>>>> enable any tracers nor any event. trace_printk() is always enabled.
>>>
>>> Sure.
>>>
>>> Attached is a 7z with the full kernel trace with the failure.
>> 
>> Thank you :-)
>> 
>>> More often than not, we will see something like below, where the TH
>>> coming in back to back with 0 event count on the second TH.
>>>
>>> 2990.496588: dwc3_interrupt: TH
>>> 2990.496588: dwc3_interrupt: evt->flags 00000000
>>> 2990.496590: dwc3_readl: addr ffffc90001b2840c value 00000004
>>> 2990.496590: dwc3_interrupt: GEVNTCOUNT 00000004
>>> 2990.496590: dwc3_interrupt: 4 events pending
>>> 2990.496591: dwc3_readl: addr ffffc90001b28408 value 00001000
>>> 2990.496591: dwc3_interrupt: GEVNTSIZ 00001000
>>> 2990.496592: dwc3_writel: addr ffffc90001b28408 value 80001000
>>> 2990.496592: dwc3_interrupt: copying events to cache
>>> 2990.496592: dwc3_writel: addr ffffc90001b2840c value 00000004
>>> 2990.496614: dwc3_interrupt: TH
>>> 2990.496614: dwc3_interrupt: evt->flags 00000001
>>> 2990.496615: dwc3_readl: addr ffffc90001b2840c value 00000000
>>> 2990.496615: dwc3_interrupt: GEVNTCOUNT 00000000
>>> 2990.496616: dwc3_interrupt: 0 events pending
>>> 2990.496616: dwc3_thread_interrupt: BH
>>> 2990.496616: dwc3_event: event (00006084): ep1out: Transfer In-Progress
>>>
>>>
>>> Here is the snippet of the attached trace where the TH is called twice
>>> with a new event coming in and overwrite the old one:
>>>
>>> 2990.601357: dwc3_readl: addr ffffc90001b2883c value 00030007
>>> 2990.601357: dwc3_gadget_ep_cmd: ep1in: cmd 'Update Transfer' [196615]..
>>> 2990.601360: dwc3_interrupt: TH
>>> 2990.601360: dwc3_interrupt: evt->flags 00000000
>>> 2990.601362: dwc3_readl: addr ffffc90001b2840c value 00000004
>>> 2990.601362: dwc3_interrupt: GEVNTCOUNT 00000004
>>> 2990.601362: dwc3_interrupt: 4 events pending
>>> 2990.601364: dwc3_readl: addr ffffc90001b28408 value 00001000
>>> 2990.601364: dwc3_interrupt: GEVNTSIZ 00001000
>>> 2990.601364: dwc3_writel: addr ffffc90001b28408 value 80001000
>>> 2990.601364: dwc3_interrupt: copying events to cache
>>> 2990.601364: dwc3_writel: addr ffffc90001b2840c value 00000004
>>> 2990.601369: dwc3_interrupt: TH
>>> 2990.601369: dwc3_interrupt: evt->flags 00000001
>>> 2990.601369: dwc3_readl: addr ffffc90001b2840c value 00000004
>>> 2990.601370: dwc3_interrupt: GEVNTCOUNT 00000004
>>> 2990.601370: dwc3_interrupt: 4 events pending
>>> 2990.601372: dwc3_readl: addr ffffc90001b28408 value 80001000
>>> 2990.601372: dwc3_interrupt: GEVNTSIZ 80001000
>>> 2990.601372: dwc3_writel: addr ffffc90001b28408 value 80001000
>>> 2990.601372: dwc3_interrupt: copying events to cache
>>> 2990.601372: dwc3_writel: addr ffffc90001b2840c value 00000004
>>> 2990.601375: dwc3_thread_interrupt: BH
>>> 2990.601376: dwc3_event: event (00004086): ep1in: Transfer In-Progress..
>>>
>>> The device hangs at the end of the log.
>
> I repeated the test with USB 3.0 IP version 2.90a and USB 3.1 IP version 
> 1.60, and I found the same result where the TH is occasionally called 
> twice back-to-back.
>
>> 
>> Right, found this in the logs. So the only thing that I can think of
>> here, is that we need to flush posted writes. Does this help?
>> 
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 79e7a3480d51..6f06738273f2 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -3099,6 +3099,10 @@ static irqreturn_t dwc3_check_event_buf(struct 
>> dwc3_event_buffer *evt)
>>   
>>          dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
>>   
>> +       /* flush posted writes */
>> +       dwc3_readl(dwc->regs, DWC3_GEVNTCOUNT(0));
>> +       dwc3_readl(dwc->regs, DWC3_GEVNTSIZ(0));
>> +
>>          return IRQ_WAKE_THREAD;
>>   }
>>   
>> 
>
> With the change above, I don't see the TH get called twice in in a row. 
> However, I also tested with replacing the code above with a udelay(4), 
> from reading the kernel trace, to take in account the latency of reading 
> the registers. I also don't see TH get called twice in a row.
>
> The extra latency from reading the registers above can mask this issue.

okay, so it is a problem with posted writes. Can you send above as a
formal patch? That's something we can backport to every stable
kernel. And the real reason here _IS_ caused by posted writes. There's
no spurious interrupt happening.

-- 
balbi

Attachment: signature.asc
Description: PGP signature

Reply via email to