Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller

2020-11-27 Thread Hans de Goede
Hi,

On 11/24/20 11:27 AM, Christoph Hellwig wrote:
> On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> +Cc Christoph Hellwig 
>>
>> Christoph, this is still an issue, so I've been looking around a bit and
>> think this might have something to do with the dma-mapping-5.10 changes.
>>
>> Do you have any suggestions to debug this, or is it time to do a git bisect
>> on this before 5.10 ships with a regression?
> 
> Given the DMAR prefix, this seems to be about using intel-iommu + bounce
> buffering for external devices.  I can't really think of anything specific
> in 5.10 related to that, so maybe you'll need to bisect.
> 
> I doubt this means we are actually leaking swiotlb buffers, so while
> I'm pretty sure we broke something in lower layers this also means
> xhci doesn't handle swiotlb operation very gracefully in general.

I've done a git bisect, and the result is somewhat surprising. The git-bisect
points to:

commit 558033c2828f ("uas: fix sdev->host->dma_dev")

Use scsi_add_host_with_dma() instead of scsi_add_host().

When the scsi request queue is initialized/allocated, hw_max_sectors is clamped
to the dma max mapping size. Therefore, the correct device that should be used
for the clamping needs to be set.

The same clamping is still needed in uas as hw_max_sectors could be changed
there. The original clamping would be invalidated in such cases.

I do have a UAS drive connected to the thunderbolt-dock, so I guess that this
change is causing the UAS driver to gobble up all available swiotlb space.
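
For readers following along, the clamping described in the commit message can
be modelled in plain userspace C. This is only a sketch under stated
assumptions, not the uas or SCSI code: clamp_hw_max_sectors() is a
hypothetical helper, a sector is taken to be 512 bytes, and the 256 KiB
mapping limit used in the note below is an illustrative value, not one
measured on this machine.

```c
#include <stddef.h>

/* Assumption: 512-byte sectors, i.e. a shift of 9. */
#define SECTOR_SHIFT_SKETCH 9

/*
 * Hypothetical helper: clamp a host's max_sectors to what the DMA device
 * can map in a single mapping. max_mapping_bytes stands in for the DMA max
 * mapping size of whichever device the host is registered against.
 */
unsigned int clamp_hw_max_sectors(unsigned int max_sectors,
                                  size_t max_mapping_bytes)
{
    size_t limit = max_mapping_bytes >> SECTOR_SHIFT_SKETCH;

    return limit < max_sectors ? (unsigned int)limit : max_sectors;
}
```

With a 256 KiB mapping limit, clamp_hw_max_sectors(2048, 256 * 1024) yields
512 sectors, while a host that only asks for 240 sectors is left unchanged.
Which device supplies the mapping limit therefore decides how large a single
request can get.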

Regards,

Hans



Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller

2020-11-24 Thread Mathias Nyman
On 24.11.2020 12.31, Hans de Goede wrote:
> Hi,
> 
> On 11/24/20 11:27 AM, Christoph Hellwig wrote:
>> On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
>>> Hi,
>>>
>>> +Cc Christoph Hellwig 
>>>
>>> Christoph, this is still an issue, so I've been looking around a bit and
>>> think this might have something to do with the dma-mapping-5.10 changes.
>>>
>>> Do you have any suggestions to debug this, or is it time to do a git bisect
>>> on this before 5.10 ships with a regression?
>>
>> Given the DMAR prefix, this seems to be about using intel-iommu + bounce
>> buffering for external devices.  I can't really think of anything specific
>> in 5.10 related to that, so maybe you'll need to bisect.
>>
>> I doubt this means we are actually leaking swiotlb buffers, so while
>> I'm pretty sure we broke something in lower layers this also means
>> xhci doesn't handle swiotlb operation very gracefully in general.

Can't think of any xhci change since 5.9 that would cause this.
It's possible there's some underlying xhci issue the 5.10 dma-mapping
changes reveal.

> 
> Ok, I've re-arranged my schedule a bit so that I have time to bisect this
> tomorrow, so with some luck I will be able to provide info on which commit
> introduced this issue tomorrow around the end of the day.

Thanks for looking into it.

-Mathias



Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller

2020-11-24 Thread Hans de Goede
Hi,

On 11/24/20 11:27 AM, Christoph Hellwig wrote:
> On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
>> Hi,
>>
>> +Cc Christoph Hellwig 
>>
>> Christoph, this is still an issue, so I've been looking around a bit and
>> think this might have something to do with the dma-mapping-5.10 changes.
>>
>> Do you have any suggestions to debug this, or is it time to do a git bisect
>> on this before 5.10 ships with a regression?
> 
> Given the DMAR prefix, this seems to be about using intel-iommu + bounce
> buffering for external devices.  I can't really think of anything specific
> in 5.10 related to that, so maybe you'll need to bisect.
> 
> I doubt this means we are actually leaking swiotlb buffers, so while
> I'm pretty sure we broke something in lower layers this also means
> xhci doesn't handle swiotlb operation very gracefully in general.

Ok, I've re-arranged my schedule a bit so that I have time to bisect this
tomorrow, so with some luck I will be able to provide info on which commit
introduced this issue tomorrow around the end of the day.

Regards,

Hans



Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller

2020-11-24 Thread Christoph Hellwig
On Mon, Nov 23, 2020 at 03:49:09PM +0100, Hans de Goede wrote:
> Hi,
> 
> +Cc Christoph Hellwig 
> 
> Christoph, this is still an issue, so I've been looking around a bit and
> think this might have something to do with the dma-mapping-5.10 changes.
> 
> Do you have any suggestions to debug this, or is it time to do a git bisect
> on this before 5.10 ships with a regression?

Given the DMAR prefix, this seems to be about using intel-iommu + bounce
buffering for external devices.  I can't really think of anything specific
in 5.10 related to that, so maybe you'll need to bisect.

I doubt this means we are actually leaking swiotlb buffers, so while
I'm pretty sure we broke something in lower layers this also means
xhci doesn't handle swiotlb operation very gracefully in general.


Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller

2020-11-23 Thread Hans de Goede
Hi,

+Cc Christoph Hellwig 

Christoph, this is still an issue, so I've been looking around a bit and
think this might have something to do with the dma-mapping-5.10 changes.

Do you have any suggestions to debug this, or is it time to do a git bisect
on this before 5.10 ships with a regression?

Regards,

Hans




On 11/10/20 12:36 PM, Hans de Goede wrote:
> Hi All,
> 
> Not sure if this is an XHCI driver problem at all, but I needed to start
> somewhere with reporting this so I went with:
> 
> scripts/get_maintainer.pl -f drivers/usb/host/xhci-pci.c
> 
> And added a Cc: linux-...@vger.kernel.org as a bonus.
> 
> I'm seeing the following errors and very slow network performance with
> the USB NIC in a Lenovo Thunderbolt gen 2 dock.
> 
> Note that the USB NIC is connected to the XHCI controller which is
> embedded inside the dock and is connected over thunderbolt!
> 
> So the errors are:
> 
> [ 1148.744205] swiotlb_tbl_map_single: 6 callbacks suppressed
> [ 1148.744210] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.744218] xhci_hcd :0a:00.0: DMAR: Device bounce map: 16ea@1411c dir 1 --- failed
> [ 1148.744226] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.744368] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.744375] xhci_hcd :0a:00.0: DMAR: Device bounce map: 16ea@10aabc000 dir 1 --- failed
> [ 1148.744381] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.745141] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.745148] xhci_hcd :0a:00.0: DMAR: Device bounce map: 118e@1411c dir 1 --- failed
> [ 1148.745155] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.951282] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.951388] xhci_hcd :0a:00.0: DMAR: Device bounce map: 118e@140988000 dir 1 --- failed
> [ 1148.951420] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.013342] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.013357] xhci_hcd :0a:00.0: DMAR: Device bounce map: 1d2a@1411c dir 1 --- failed
> [ 1151.013373] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.018660] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 18 (slots)
> [ 1151.018696] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@1411c dir 1 --- failed
> [ 1151.018711] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.223022] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.223102] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.223133] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.228810] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.228870] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.228898] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.234792] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.234852] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.234882] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> 
> etc.
> 
> This happens as soon as I generate any serious amount of outgoing network
> traffic, e.g. rsyncing files to another machine.
> 
> Regards,
> 
> Hans
> 



Re: 5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller

2020-11-18 Thread Hans de Goede
Hi All,

On 11/10/20 12:36 PM, Hans de Goede wrote:
> Hi All,
> 
> Not sure if this is an XHCI driver problem at all, but I needed to start
> somewhere with reporting this so I went with:
> 
> scripts/get_maintainer.pl -f drivers/usb/host/xhci-pci.c
> 
> And added a Cc: linux-...@vger.kernel.org as a bonus.
> 
> I'm seeing the following errors and very slow network performance with
> the USB NIC in a Lenovo Thunderbolt gen 2 dock.
> 
> Note that the USB NIC is connected to the XHCI controller which is
> embedded inside the dock and is connected over thunderbolt!

Ping? This is still happening and although the errors are not fatal,
outgoing network performance is very bad.

I know a lot of Linux users use thunderbolt docks and for some
reason almost all thunderbolt docks seem to be using USB-attached
NICs inside, so this is going to hit a lot of users if we do not
get this fixed before 5.10 gets released!

Regards,

Hans





> So the errors are:
> 
> [ 1148.744205] swiotlb_tbl_map_single: 6 callbacks suppressed
> [ 1148.744210] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.744218] xhci_hcd :0a:00.0: DMAR: Device bounce map: 16ea@1411c dir 1 --- failed
> [ 1148.744226] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.744368] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.744375] xhci_hcd :0a:00.0: DMAR: Device bounce map: 16ea@10aabc000 dir 1 --- failed
> [ 1148.744381] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.745141] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.745148] xhci_hcd :0a:00.0: DMAR: Device bounce map: 118e@1411c dir 1 --- failed
> [ 1148.745155] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1148.951282] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1148.951388] xhci_hcd :0a:00.0: DMAR: Device bounce map: 118e@140988000 dir 1 --- failed
> [ 1148.951420] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.013342] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.013357] xhci_hcd :0a:00.0: DMAR: Device bounce map: 1d2a@1411c dir 1 --- failed
> [ 1151.013373] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.018660] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 18 (slots)
> [ 1151.018696] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@1411c dir 1 --- failed
> [ 1151.018711] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.223022] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.223102] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.223133] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.228810] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.228870] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.228898] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> [ 1151.234792] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
> [ 1151.234852] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
> [ 1151.234882] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
> 
> etc.
> 
> This happens as soon as I generate any serious amount of outgoing network
> traffic, e.g. rsyncing files to another machine.
> 
> Regards,
> 
> Hans
> 



5.10 regression, many XHCI swiotlb buffer is full / DMAR: Device bounce map failed errors on thunderbolt connected XHCI controller

2020-11-10 Thread Hans de Goede
Hi All,

Not sure if this is an XHCI driver problem at all, but I needed to start
somewhere with reporting this so I went with:

scripts/get_maintainer.pl -f drivers/usb/host/xhci-pci.c

And added a Cc: linux-...@vger.kernel.org as a bonus.

I'm seeing the following errors and very slow network performance with
the USB NIC in a Lenovo Thunderbolt gen 2 dock.

Note that the USB NIC is connected to the XHCI controller which is
embedded inside the dock and is connected over thunderbolt!

So the errors are:

[ 1148.744205] swiotlb_tbl_map_single: 6 callbacks suppressed
[ 1148.744210] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1148.744218] xhci_hcd :0a:00.0: DMAR: Device bounce map: 16ea@1411c dir 1 --- failed
[ 1148.744226] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1148.744368] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1148.744375] xhci_hcd :0a:00.0: DMAR: Device bounce map: 16ea@10aabc000 dir 1 --- failed
[ 1148.744381] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1148.745141] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1148.745148] xhci_hcd :0a:00.0: DMAR: Device bounce map: 118e@1411c dir 1 --- failed
[ 1148.745155] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1148.951282] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1148.951388] xhci_hcd :0a:00.0: DMAR: Device bounce map: 118e@140988000 dir 1 --- failed
[ 1148.951420] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.013342] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1151.013357] xhci_hcd :0a:00.0: DMAR: Device bounce map: 1d2a@1411c dir 1 --- failed
[ 1151.013373] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.018660] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 18 (slots)
[ 1151.018696] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@1411c dir 1 --- failed
[ 1151.018711] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.223022] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1151.223102] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
[ 1151.223133] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.228810] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1151.228870] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
[ 1151.228898] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11
[ 1151.234792] xhci_hcd :0a:00.0: swiotlb buffer is full (sz: 8192 bytes), total 32768 (slots), used 16 (slots)
[ 1151.234852] xhci_hcd :0a:00.0: DMAR: Device bounce map: 11da@10aabc000 dir 1 --- failed
[ 1151.234882] r8152 4-2.1.2:1.0 ens1u2u1u2: failed tx_urb -11

etc.

This happens as soon as I generate any serious amount of outgoing network
traffic, e.g. rsyncing files to another machine.
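
To put the numbers in the messages above in perspective, here is a small C
sketch that converts them to bytes. It assumes 2 KiB swiotlb slots (a slot
shift of 11, matching IO_TLB_SHIFT in mainline); the helper names are made up
for illustration and are not kernel functions.

```c
#include <stddef.h>

/* Assumption: swiotlb slots are 2 KiB each (1 << 11). */
#define SLOT_SHIFT_SKETCH 11

/* Bytes covered by a given number of swiotlb slots. */
size_t slots_to_bytes(size_t slots)
{
    return slots << SLOT_SHIFT_SKETCH;
}

/* Slots needed for a mapping of the given size, rounded up. */
size_t bytes_to_slots(size_t bytes)
{
    size_t slot_size = (size_t)1 << SLOT_SHIFT_SKETCH;

    return (bytes + slot_size - 1) / slot_size;
}
```

Under that assumption the 32768 total slots amount to 64 MiB, the 16 used
slots to 32 KiB, and each failing 8192-byte request needs 4 slots. Going by
the log alone, the buffer is reported as full while nearly all of it is
nominally unused.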

Regards,

Hans