Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 10/06/2022 16:37, John Garry via iommu wrote:
>> On 6/9/22 10:54, John Garry wrote:
>>> ok, but do you have a system where the UFS host controller is behind
>>> an IOMMU? I had the impression that UFS controllers would be mostly
>>> found in embedded systems and IOMMUs are not as common there.
>>
>> Modern phones have an IOMMU. Below one can find an example from a
>> Pixel 6 phone. The UFS storage controller is not controlled by the
>> IOMMU as far as I can see but I wouldn't be surprised if the security
>> team would ask us one day to enable the IOMMU for the UFS controller.
>
> OK, then unfortunately it seems that you have no method to test. I
> might be able to test USB MSC but I am not even sure I can get DMA
> mappings whose length exceeds the IOVA rcache limit there.

I was able to do some testing on USB MSC for an XHCI controller. The
result is that limiting the max HW sectors there does not affect
performance in normal conditions.

However, if I hack the USB driver and fiddle with the request queue
settings, then it can:
- lift max_sectors limit in usb_stor_host_template 120KB -> 256KB
- lift request queue read_ahead_kb 128KB -> 256KB

In this scenario I can get 42.5MB/s read throughput, as opposed to
39.5MB/s in normal conditions. Since .can_queue=1 for that host, it
would not fall foul of some issues I experience in IOVA allocator
performance, so limiting max_sectors would not be required for that
reason.

So this is an artificial test, but it may be worth considering only
applying this DMA mapping optimal max_sectors limit to SAS controllers,
which I know can benefit.

Christoph, any opinion?

thanks,
John
___
iommu mailing list
iommu@lists.linux-foundation.org
https://lists.linuxfoundation.org/mailman/listinfo/iommu
Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 09/06/2022 21:34, Bart Van Assche wrote:
> On 6/9/22 10:54, John Garry wrote:
>> ok, but do you have a system where the UFS host controller is behind
>> an IOMMU? I had the impression that UFS controllers would be mostly
>> found in embedded systems and IOMMUs are not as common there.
>
> Modern phones have an IOMMU. Below one can find an example from a
> Pixel 6 phone. The UFS storage controller is not controlled by the
> IOMMU as far as I can see but I wouldn't be surprised if the security
> team would ask us one day to enable the IOMMU for the UFS controller.

OK, then unfortunately it seems that you have no method to test. I
might be able to test USB MSC but I am not even sure I can get DMA
mappings whose length exceeds the IOVA rcache limit there.

Thanks,
John
Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 6/9/22 10:54, John Garry wrote:
> ok, but do you have a system where the UFS host controller is behind
> an IOMMU? I had the impression that UFS controllers would be mostly
> found in embedded systems and IOMMUs are not as common there.

Modern phones have an IOMMU. Below one can find an example from a
Pixel 6 phone. The UFS storage controller is not controlled by the
IOMMU as far as I can see but I wouldn't be surprised if the security
team would ask us one day to enable the IOMMU for the UFS controller.

# (cd /sys/class/iommu && ls */devices)
1a09.sysmmu/devices: 1900.aoc
1a51.sysmmu/devices: 1a44.lwis_csi
1a54.sysmmu/devices: 1aa4.lwis_pdp
1a88.sysmmu/devices: 1a84.lwis_g3aa
1ad0.sysmmu/devices: 1ac4.lwis_ipp 1ac8.lwis_gtnr_align
1b08.sysmmu/devices: 1b45.lwis_itp
1b78.sysmmu/devices:
1b7b.sysmmu/devices: 1b76.lwis_mcsc
1b7e.sysmmu/devices:
1baa.sysmmu/devices: 1a4e.lwis_votf 1ba4.lwis_gdc
1bad.sysmmu/devices: 1ba6.lwis_gdc
1bb0.sysmmu/devices: 1ba8.lwis_scsc
1bc7.sysmmu/devices: 1bc4.lwis_gtnr_merge
1bca.sysmmu/devices:
1bcd.sysmmu/devices:
1bd0.sysmmu/devices:
1bd3.sysmmu/devices:
1c10.sysmmu/devices: 1c30.drmdecon 1c302000.drmdecon
1c11.sysmmu/devices:
1c12.sysmmu/devices:
1c66.sysmmu/devices: 1c64.g2d
1c69.sysmmu/devices:
1c71.sysmmu/devices: 1c70.smfc
1c87.sysmmu/devices: 1c8d.MFC-0 mfc
1c8a.sysmmu/devices:
1ca4.sysmmu/devices: 1cb0.bigocean
1cc4.sysmmu/devices: 1ce0.abrolhos

Bart.
Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 09/06/2022 18:18, Bart Van Assche wrote:
>>> SCSI host bus adapters that support 64-bit DMA may support much
>>> larger transfer sizes than 128 KiB.
>>
>> Indeed, and that is my problem today, as my storage controller is
>> generating DMA mapping lengths which exceed 128K and they slow
>> everything down.
>>
>> If you say that SRP enjoys best performance with larger transfers
>> then can you please test this with an IOMMU enabled (iommu group type
>> DMA or DMA-FQ)?
>
> Hmm ... what exactly do you want me to test? Do you perhaps want me to
> measure how much performance drops with an IOMMU enabled?

Yes, I would like to know of any performance change with an IOMMU
enabled, and then with an IOMMU enabled and including my series.

> I don't have access anymore to the SRP setup I referred to in my
> previous email. But I do have access to devices that boot from UFS
> storage. For these devices we need to transfer 2 MiB per request to
> achieve full bandwidth.

ok, but do you have a system where the UFS host controller is behind an
IOMMU? I had the impression that UFS controllers would be mostly found
in embedded systems and IOMMUs are not as common there.

Thanks,
John
Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 6/9/22 01:00, John Garry wrote:
> On 08/06/2022 22:07, Bart Van Assche wrote:
>> On 6/8/22 10:50, John Garry wrote:
>>> Please note that this limit only applies if we have an IOMMU enabled
>>> for the scsi host dma device. Otherwise we are limited by dma direct
>>> or swiotlb max mapping size, as before.
>>
>> SCSI host bus adapters that support 64-bit DMA may support much
>> larger transfer sizes than 128 KiB.
>
> Indeed, and that is my problem today, as my storage controller is
> generating DMA mapping lengths which exceed 128K and they slow
> everything down.
>
> If you say that SRP enjoys best performance with larger transfers then
> can you please test this with an IOMMU enabled (iommu group type DMA
> or DMA-FQ)?

Hmm ... what exactly do you want me to test? Do you perhaps want me to
measure how much performance drops with an IOMMU enabled?

I don't have access anymore to the SRP setup I referred to in my
previous email. But I do have access to devices that boot from UFS
storage. For these devices we need to transfer 2 MiB per request to
achieve full bandwidth.

Thanks,

Bart.
Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 08/06/2022 22:07, Bart Van Assche wrote:
> On 6/8/22 10:50, John Garry wrote:
>> Please note that this limit only applies if we have an IOMMU enabled
>> for the scsi host dma device. Otherwise we are limited by dma direct
>> or swiotlb max mapping size, as before.
>
> SCSI host bus adapters that support 64-bit DMA may support much larger
> transfer sizes than 128 KiB.

Indeed, and that is my problem today, as my storage controller is
generating DMA mapping lengths which exceed 128K and they slow
everything down.

If you say that SRP enjoys best performance with larger transfers then
can you please test this with an IOMMU enabled (iommu group type DMA or
DMA-FQ)?

Thanks,
John
Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 6/8/22 10:50, John Garry wrote:
> Please note that this limit only applies if we have an IOMMU enabled
> for the scsi host dma device. Otherwise we are limited by dma direct
> or swiotlb max mapping size, as before.

SCSI host bus adapters that support 64-bit DMA may support much larger
transfer sizes than 128 KiB.

Thanks,

Bart.
Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 08/06/2022 18:33, Bart Van Assche wrote:
> On 6/6/22 02:30, John Garry wrote:
>> +	if (dma_dev->dma_mask) {
>> +		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
>> +				dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
>> +	}
>
> Since IOVA_RANGE_CACHE_MAX_SIZE = 6 this limits max_sectors to 2**6 *
> PAGE_SIZE or 256 KiB if the page size is 4 KiB.

It's actually 128K for 4K page size, as any IOVA size is rounded up to
a power of 2 when testing if we may cache it, which means anything
>128K would round up to 256K and cannot be cached.

> I think that's too small. Some (SRP) storage arrays require much
> larger transfers to achieve optimal performance.

Have you tried to achieve this optimal performance with an IOMMU
enabled?

Please note that this limit only applies if we have an IOMMU enabled
for the scsi host dma device. Otherwise we are limited by dma direct or
swiotlb max mapping size, as before.

Thanks,
John
Re: [PATCH v3 3/4] scsi: core: Cap shost max_sectors according to DMA optimum mapping limits
On 6/6/22 02:30, John Garry wrote:
> +	if (dma_dev->dma_mask) {
> +		shost->max_sectors = min_t(unsigned int, shost->max_sectors,
> +				dma_opt_mapping_size(dma_dev) >> SECTOR_SHIFT);
> +	}

Since IOVA_RANGE_CACHE_MAX_SIZE = 6 this limits max_sectors to 2**6 *
PAGE_SIZE or 256 KiB if the page size is 4 KiB.

I think that's too small. Some (SRP) storage arrays require much larger
transfers to achieve optimal performance.

Thanks,

Bart.