On 04.06.2018 18:28, Sudip Mukherjee wrote:
On Thu, May 24, 2018 at 04:35:34PM +0300, Mathias Nyman wrote:

The logs show two rings with the same TRB segment DMA address, which will
completely mess up the transfer:

While allocating rings, the enqueue pointers for the two rings are the same:

461.859315: xhci_ring_alloc: ISOC efa4e580: enq 0x0000000033386000(0x0000000033386000) deq 0x0000000033386000(0x0000000033386000) segs 2 stream 0 ...bs
461.859320: xhci_ring_alloc: ISOC f0ce1f00: enq 0x0000000033386000(0x0000000033386000) deq 0x0000000033386000(0x0000000033386000) segs 2 stream 0 ...

So something goes really wrong when allocating or setting up the rings in one 
of these functions:

To verify and rule out dma_pool_zalloc(), could you apply the attached patch 
and reproduce with new logs?

I spoke too soon in yesterday's mail. We were able to reproduce it
in the automated tests. The log and the trace are at:
https://drive.google.com/open?id=1h-3r-1lfjg8oblBGkzdRIq8z3ZNgGZx-

Could you please have a look at it?


Odd and unlikely, but to me this looks like an issue in allocating DMA memory
from the pool using dma_pool_zalloc().

Adding people with DMA knowledge to cc, maybe someone knows what is going on.

Here's the story:
Sudip sees USB issues on an Intel Atom-based board with a 4.14.2 kernel.
All tracing points to dma_pool_zalloc() returning the same DMA address block on
consecutive calls.

In the failing case, the dma_pool_zalloc() calls are 3 to 6 us apart.

<...>-26362 [002] ....  1186.756739: xhci_ring_mem_detail: MATTU xhci_segment_alloc dma @ 0x000000002d92b000
<...>-26362 [002] ....  1186.756745: xhci_ring_mem_detail: MATTU xhci_segment_alloc dma @ 0x000000002d92b000
<...>-26362 [002] ....  1186.756748: xhci_ring_mem_detail: MATTU xhci_segment_alloc dma @ 0x000000002d92b000
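
For anyone who wants to scan a longer trace for the same symptom, here is a
minimal userspace sketch in C. It is not part of the xhci driver or of the
attached patch. The helper names parse_dma_addr() and count_repeated_addrs()
are made up for illustration, and the check assumes each trace entry has been
rejoined onto a single line and that no segment is freed between the
allocations being compared (otherwise address reuse would be legitimate).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: extract the hex address that follows "dma @ " in one
 * trace line. Returns 1 and stores the address on success, 0 otherwise.
 */
int parse_dma_addr(const char *line, uint64_t *addr)
{
    static const char marker[] = "dma @ ";
    const char *p = strstr(line, marker);
    char *end;

    if (!p)
        return 0;
    p += sizeof(marker) - 1;
    /* Base 16 lets strtoull() accept the leading "0x" printed by %pad. */
    *addr = strtoull(p, &end, 16);
    return end != p;
}

/*
 * Hypothetical helper: count entries whose address already appeared earlier
 * in the array. Quadratic, which is acceptable for the few segments that
 * make up a ring.
 */
size_t count_repeated_addrs(const uint64_t *addrs, size_t n)
{
    size_t i, j, repeats = 0;

    assert(addrs != NULL || n == 0);
    for (i = 1; i < n; i++) {
        for (j = 0; j < i; j++) {
            if (addrs[i] == addrs[j]) {
                repeats++;
                break;
            }
        }
    }
    return repeats;
}
```

Feeding the three addresses from the excerpt above into count_repeated_addrs()
reports two repeats; a healthy run with no frees in between should report none.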

dma_pool_zalloc() is called from xhci_segment_alloc() in 
drivers/usb/host/xhci-mem.c
see:
https://elixir.bootlin.com/linux/v4.14.2/source/drivers/usb/host/xhci-mem.c#L52

The prints above are custom traces added right after dma_pool_zalloc():
@@ -44,10 +44,15 @@ static struct xhci_segment *xhci_segment_alloc(struct xhci_hcd *xhci,
                return NULL;
        }
+       xhci_dbg_trace(xhci, trace_xhci_ring_mem_detail,
+                      "MATTU xhci_segment_alloc dma @ %pad", &dma);
+

Any idea what's going on?
dma_pool_alloc() has a comment saying that it drops &pool->lock if it needs to
allocate a page. Could that be related?

Thanks
-Mathias

