On Sun, May 25, 2025 at 04:07:03PM -0300, Jason Gunthorpe wrote:
> On Tue, May 20, 2025 at 03:42:24PM -0700, Shyam Saini wrote:
> > Hi Jason,
> >
> > apologies for the delayed response.
> >
> > > On Wed, Apr 16, 2025 at 11:04:27AM -0700, Jacob Pan wrote:
> > >
> > > > Per last discussion "SMMU driver have a list of potential addresses and
> > > > select the first one that does not intersect with the non-working IOVA
> > > > ranges.". If we don't know what the "non-working IOVA" is, how do we
> > > > know it does not intersect the "potential addresses"?
> > >
> > > I had understood from previous discussions that this platform is
> > > properly creating IOMMU_RESV_RESERVED regions for the IOVA that
> > > doesn't work. Otherwise everything is broken..
> > >
> > > Presumably that happens through iommu_dma_get_resv_regions() calling
> > > of_iommu_get_resv_regions() on a DT platform. There is a schema
> > > describing how to do this, so platform firmware should be able to do it..
> > >
> > > So the fix seems trivial enough to me:
> > >
> > > diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > index b4c21aaed1266a..ebba18579151bc 100644
> > > --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
> > > @@ -3562,17 +3562,29 @@ static int arm_smmu_of_xlate(struct device *dev,
> > >  static void arm_smmu_get_resv_regions(struct device *dev,
> > >  				      struct list_head *head)
> > >  {
> > > -	struct iommu_resv_region *region;
> > > -	int prot = IOMMU_WRITE | IOMMU_NOEXEC | IOMMU_MMIO;
> > > -
> > > -	region = iommu_alloc_resv_region(MSI_IOVA_BASE, MSI_IOVA_LENGTH,
> > > -					 prot, IOMMU_RESV_SW_MSI, GFP_KERNEL);
> > > -	if (!region)
> > > -		return;
> > > -
> > > -	list_add_tail(&region->list, head);
> > > +	static const u64 msi_bases[] = { MSI_IOVA_BASE, 0x12340000 };
> > >
> > >  	iommu_dma_get_resv_regions(dev, head);
> >
> > my understand is, this hook is not called for all the devices, eg: pcie dts
> > node
> > doesn't use [1] "iommus" property instead it uses "iommu-map" property
> > as a consequence, [1] while loop exits prematurely and
> > iommu_dma_get_resv_regions()
> > is not called, so there is no IOVA reservation for the pcie device.
>
> I can't really understand this sentance.
>
> The above is the only place that creates a IOMMU_RESV_SW_MSI so it is
> definately called and used, right? If not where does your
> IOMMU_RESV_SW_MSI come from?

Code tracing and printks in that code path suggest that
iommu_dma_get_resv_regions() is called by the vfio-pci driver. I didn't
mention vfio-pci in my last reply since it doesn't have an associated
device tree node, sorry about that.

By enabling this [1] dev_dbg message I get this:

  vfio-pci 0000:01:00.2: device is behind an iommu

In the case of the 0000:01:00.2 device, when it invokes
iommu_dma_get_resv_regions(), the code hits this path [2].

>
> This function is also the only thing that computes the reserved ranges
> that iommu_get_resv_regions() returns.
>
> As above, I've asked a few times now if your resv_regions() is
> correct, meaning there is a reserved range covering the address space
> that doesn't have working translation. That means
> iommu_get_resv_regions() returns such a range.

Sorry about missing that. I see the MSI IOVA being reserved:

  cat /sys/kernel/iommu_groups/*/reserved_regions
  0x0000000008000000 0x00000000080fffff msi
  0x0000000008000000 0x00000000080fffff msi
  0x0000000008000000 0x00000000080fffff msi
  0x0000000008000000 0x00000000080fffff msi
  [output trimmed]

>
> If you don't have that then you have a bigger platform problem, IMHO,
> as vfio/iommufd only respect reserved ranges.
>
> Otherwise, what is the issue you see, exactly? Did you even try it?
>

Yes, I tried that. This is what my dts node looked like:

  reserved-memory {
  	faulty_iova: resv_faulty {
  		iommu-addresses = <&pcieX 0x8000000 0x100000>;
  	};
  	..
  	..
  };

  &pcieX {
  	memory-region = <&faulty_iova>;
  };

I see it working for the devices which call iommu_get_resv_regions().
E.g. if I specify faulty_iova for the dma controller dts node, then I see
an additional entry in the related group, say Y:

  /sys/kernel/iommu_groups/Y/reserved_regions

Did I misunderstand? I appreciate your help on this.

Thanks,
Shyam

[1] https://elixir.bootlin.com/linux/v6.15-rc7/source/drivers/of/device.c#L170
[2] https://elixir.bootlin.com/linux/v6.15-rc7/source/drivers/iommu/of_iommu.c#L145
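
For reference, the selection idea quoted at the top of the thread (keep a
list of candidate MSI bases and take the first one that does not intersect
a non-working IOVA range) can be sketched in plain userspace C. This is
only an illustration under stated assumptions, not the kernel patch:
struct resv_range, ranges_overlap() and pick_msi_base() are hypothetical
names, 0x12340000 is the placeholder second candidate from the quoted diff,
and the MSI window values are taken from the sysfs output above
(0x08000000-0x080fffff).

```c
#include <stddef.h>
#include <stdint.h>

/* MSI window as seen in the reserved_regions output (assumed here). */
#define MSI_IOVA_BASE   0x8000000ULL
#define MSI_IOVA_LENGTH 0x100000ULL

/* Hypothetical stand-in for one reserved IOVA range; not the kernel struct. */
struct resv_range {
	uint64_t start;
	uint64_t length;
};

/* Non-zero if the half-open ranges [a, a+alen) and [b, b+blen) overlap. */
static int ranges_overlap(uint64_t a, uint64_t alen,
			  uint64_t b, uint64_t blen)
{
	return a < b + blen && b < a + alen;
}

/*
 * Return the first candidate base whose MSI window does not intersect any
 * reserved range, or UINT64_MAX if every candidate collides.
 */
static uint64_t pick_msi_base(const uint64_t *cands, size_t ncands,
			      const struct resv_range *resv, size_t nresv)
{
	for (size_t i = 0; i < ncands; i++) {
		int clash = 0;

		for (size_t j = 0; j < nresv; j++) {
			if (ranges_overlap(cands[i], MSI_IOVA_LENGTH,
					   resv[j].start, resv[j].length)) {
				clash = 1;
				break;
			}
		}
		if (!clash)
			return cands[i];
	}
	return UINT64_MAX;
}
```

With the faulty range from my dts node (0x8000000, length 0x100000) passed
as the only reserved range and { MSI_IOVA_BASE, 0x12340000 } as candidates,
the first candidate collides and the second one is picked. With no
reserved ranges, MSI_IOVA_BASE is picked as today. This only works if the
faulty range actually shows up in the device's reserved regions, which is
the part I am unsure about for the pcie device.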