Re: [PATCH v3 0/2] iommu/virtio: Enable IOMMU_CAP_DEFERRED_FLUSH
On Mon, Nov 20, 2023 at 03:51:55PM +0100, Niklas Schnelle wrote:
> Niklas Schnelle (2):
>       iommu/virtio: Make use of ops->iotlb_sync_map
>       iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush
>
>  drivers/iommu/virtio-iommu.c | 33 ++++++++++++++++++++++++++++++++-
>  1 file changed, 32 insertions(+), 1 deletion(-)

Applied, thanks.
Re: [PATCH v3 0/2] iommu/virtio: Enable IOMMU_CAP_DEFERRED_FLUSH
Hi Niklas,

On Mon, Nov 20, 2023 at 03:51:55PM +0100, Niklas Schnelle wrote:
> Hi All,
>
> Previously I used virtio-iommu as a non-s390x test vehicle[0] for the
> single queue flushing scheme introduced by my s390x DMA API conversion
> series[1]. For this I modified virtio-iommu to a) use .iotlb_sync_map
> and b) enable IOMMU_CAP_DEFERRED_FLUSH. It turned out that deferred
> flush and even just the introduction of ops->iotlb_sync_map yield a
> performance uplift[2] even with per-CPU queues. So here is a small
> series of these two changes.
>
> The code is also available on the b4/viommu-deferred-flush branch of my
> kernel.org git repository[3].
>
> Note on testing: I tested this series on my AMD Ryzen 3900X workstation
> using QEMU 8.1.2, a pass-through NVMe and Intel 82599 NIC VFs. For the
> NVMe I saw an increase of about 10% in IOPS and 30% in read bandwidth
> compared with v6.7-rc2. One odd thing though is that QEMU seemed to make
> the entire guest resident/pinned once I passed through a PCI device.
> I seem to remember this wasn't the case with my last version but not
> sure which QEMU version I used back then.

That's probably expected, now that boot-bypass is enabled by default: on
VM boot, endpoints are able to do DMA to the entire guest-physical
address space, until a virtio-iommu driver disables global bypass in the
config space (at which point the pinned memory is hopefully reclaimed by
the host). QEMU enables it by default to mimic other IOMMU
implementations, and to allow running firmware or OS that don't support
virtio-iommu. It can be disabled with boot-bypass=off.

> @Jean-Philippe: I didn't include your R-b's as I changed back to the
> nr_endpoints check and this is like 30% of the patches.

Thank you for the patches. For the series:

Reviewed-by: Jean-Philippe Brucker
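For reference, the boot-bypass property mentioned above is set on the virtio-iommu device itself. A sketch of a QEMU invocation with bypass disabled might look like the following (the machine type, accelerator, and VFIO host address are illustrative, not taken from the thread):

```
# Disable boot-time bypass: endpoints cannot DMA to the whole
# guest-physical address space before the guest virtio-iommu driver
# takes over (so the host need not pin all guest memory up front).
qemu-system-x86_64 \
    -machine q35,accel=kvm \
    -device virtio-iommu-pci,boot-bypass=off \
    -device vfio-pci,host=0000:01:00.0 \
    ...
```

The trade-off is that with bypass off, firmware or an OS without virtio-iommu support cannot use the passed-through device at boot.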
[PATCH v3 0/2] iommu/virtio: Enable IOMMU_CAP_DEFERRED_FLUSH
Hi All,

Previously I used virtio-iommu as a non-s390x test vehicle[0] for the
single queue flushing scheme introduced by my s390x DMA API conversion
series[1]. For this I modified virtio-iommu to a) use .iotlb_sync_map
and b) enable IOMMU_CAP_DEFERRED_FLUSH. It turned out that deferred
flush and even just the introduction of ops->iotlb_sync_map yield a
performance uplift[2] even with per-CPU queues. So here is a small
series of these two changes.

The code is also available on the b4/viommu-deferred-flush branch of my
kernel.org git repository[3].

Note on testing: I tested this series on my AMD Ryzen 3900X workstation
using QEMU 8.1.2, a pass-through NVMe and Intel 82599 NIC VFs. For the
NVMe I saw an increase of about 10% in IOPS and 30% in read bandwidth
compared with v6.7-rc2. One odd thing though is that QEMU seemed to make
the entire guest resident/pinned once I passed through a PCI device.
I seem to remember this wasn't the case with my last version but not
sure which QEMU version I used back then.

@Jean-Philippe: I didn't include your R-b's as I changed back to the
nr_endpoints check and this is like 30% of the patches.
Thanks,
Niklas

[0] https://lore.kernel.org/lkml/20230726111433.1105665-1-schne...@linux.ibm.com/
[1] https://lore.kernel.org/lkml/20230825-dma_iommu-v12-0-413445599...@linux.ibm.com/
[2] https://lore.kernel.org/lkml/20230802123612.GA6142@myrica/

Signed-off-by: Niklas Schnelle
---
Changes in v3:
- Removed NULL check from viommu_sync_req() (Jason)
- Went back to checking for 0 endpoints in IOTLB ops (Robin)
- Rebased on v6.7-rc2 which includes necessary iommu-dma changes
- Link to v2: https://lore.kernel.org/r/20230918-viommu-sync-map-v2-0-f33767f6c...@linux.ibm.com

Changes in v2:
- Check for viommu == NULL in viommu_sync_req() instead of for 0 endpoints in ops (Jean-Philippe)
- Added comment where viommu can be NULL (me)
- Link to v1: https://lore.kernel.org/r/20230825-viommu-sync-map-v1-0-56bdcfaa2...@linux.ibm.com

To: Jean-Philippe Brucker
To: Joerg Roedel
To: Will Deacon
To: Jason Gunthorpe
To: Robin Murphy
Cc: virtualization@lists.linux-foundation.org
Cc: io...@lists.linux.dev
Cc: linux-ker...@vger.kernel.org
Cc: Niklas Schnelle

---
Niklas Schnelle (2):
      iommu/virtio: Make use of ops->iotlb_sync_map
      iommu/virtio: Add ops->flush_iotlb_all and enable deferred flush

 drivers/iommu/virtio-iommu.c | 33 ++++++++++++++++++++++++++++++++-
 1 file changed, 32 insertions(+), 1 deletion(-)
---
base-commit: 98b1cc82c4affc16f5598d4fa14b1858671b2263
change-id: 20230825-viommu-sync-map-1bf0cc4fdc15

Best regards,
--
Niklas Schnelle