On 2026/6/1 7:51, Jason Gunthorpe wrote:
On Sun, May 31, 2026 at 05:36:29PM +0800, Guanghui Feng wrote:
Add iova_to_phys_length callback to struct iommu_domain_ops alongside
the existing iova_to_phys. The new callback returns both the physical
address and the PTE mapping page size in a single page table walk.
Add iommu_iova_to_phys_length() core function that:
- Checks ops->iova_to_phys_length first (preferred path)
- Falls back to ops->iova_to_phys for unmigrated drivers
This enables callers like VFIO to efficiently traverse IOVA space
by actual mapping granularity instead of fixed PAGE_SIZE steps.
Signed-off-by: Guanghui Feng <[email protected]>
Acked-by: Shiqiang Zhang <[email protected]>
Acked-by: Simon Guo <[email protected]>
---
drivers/iommu/iommu.c | 34 ++++++++++++++++++++++++++++++++--
include/linux/iommu.h | 9 +++++++++
2 files changed, 41 insertions(+), 2 deletions(-)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index d1a9e713d3a0..43323229a1df 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -2545,15 +2545,45 @@ void iommu_detach_group(struct iommu_domain *domain,
struct iommu_group *group)
}
EXPORT_SYMBOL_GPL(iommu_detach_group);
-phys_addr_t iommu_iova_to_phys(struct iommu_domain *domain, dma_addr_t iova)
+/**
+ * iommu_iova_to_phys_length - Translate IOVA and return mapping page size
+ * @domain: IOMMU domain to query
+ * @iova: IO virtual address to translate
+ * @mapped_length: Output parameter for the PTE page size (e.g. 4KB/2MB/1GB)
+ *
+ * Like iommu_iova_to_phys() but additionally returns the page size of the
+ * PTE mapping at @iova through @mapped_length.
+ *
+ * Return: The physical address for the given IOVA, or 0 if no translation.
+ */
When introducing the new function I would like to fix this 0 error as
well; it should return PHYS_MAX on error.
Existing implementations such as arm_smmu_iova_to_phys and
DOMAIN_NS(iova_to_phys) all return 0 to indicate an invalid translation,
so the new function also uses 0 for the invalid state to maintain
compatibility.
+phys_addr_t iommu_iova_to_phys_length(struct iommu_domain *domain,
+ dma_addr_t iova,
+ size_t *mapped_length)
{
+ if (mapped_length)
+ *mapped_length = 0;
+
if (domain->type == IOMMU_DOMAIN_IDENTITY)
return iova;
if (domain->type == IOMMU_DOMAIN_BLOCKED)
return 0;
Any domain that doesn't have an op should fail; blocked is just one example.
This follows the existing iommu_iova_to_phys implementation, which
returns a physical address of 0 for invalid states.
- return domain->ops->iova_to_phys(domain, iova);
+ if (domain->ops->iova_to_phys_length)
+ return domain->ops->iova_to_phys_length(domain, iova,
+ mapped_length);
+
+ /* Fallback to legacy iova_to_phys without length info */
+ if (domain->ops->iova_to_phys)
+ return domain->ops->iova_to_phys(domain, iova);
If it falls back it should return something sensible for the length.
I suggest you approach the patch plan a little differently: the first
patches should implement the new function and an iommupt
implementation.
Arrange things so the normal iova_to_phys calls the new function if it
is available and discards the length.
Then convert callers that can take advantage of it. Have the fallback
path also compute the length by iterating internally.
Finally one patch per driver implementing the new op, this could even
be a second series.
Don't remove iova_to_phys(), it is fine for things that don't need the
length.
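For illustration only, a fallback that computes the length by iterating
internally might look like the user-space sketch below. Everything in it
is an assumption rather than code from the patch: the typedefs,
MOCK_PAGE_SIZE, the max_len bound, fallback_iova_to_phys_length() and
mock_legacy() are hypothetical names. It keeps 0 as the failure value, as
in the posted patch, and treats the length as the number of physically
contiguous bytes starting at the IOVA.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

#define MOCK_PAGE_SIZE 4096ULL	/* assumed minimum page size */

/* Hypothetical stand-in for a driver's legacy iova_to_phys op. */
typedef phys_addr_t (*iova_to_phys_fn)(dma_addr_t iova);

/*
 * Translate @iova with the legacy callback, then probe page by page
 * while the physical addresses stay contiguous. The contiguous byte
 * count, capped at @max_len, is stored in @mapped_length.
 */
static phys_addr_t fallback_iova_to_phys_length(iova_to_phys_fn legacy,
						dma_addr_t iova,
						size_t max_len,
						size_t *mapped_length)
{
	phys_addr_t phys = legacy(iova);
	size_t len;

	if (mapped_length)
		*mapped_length = 0;
	if (!phys)
		return 0;	/* 0 means "no translation" in this sketch */
	if (!mapped_length)
		return phys;	/* caller does not want the length */

	/* Bytes from @iova to the end of its minimum-size page. */
	len = MOCK_PAGE_SIZE - (iova & (MOCK_PAGE_SIZE - 1));

	/* Extend while the next page is mapped and contiguous. */
	while (len < max_len && legacy(iova + len) == phys + len)
		len += MOCK_PAGE_SIZE;

	if (len > max_len)
		len = max_len;
	*mapped_length = len;
	return phys;
}

/*
 * Mock mapping: three contiguous pages at IOVA 0x100000 -> 0x200000,
 * followed by one page that maps to a non-contiguous address.
 */
static phys_addr_t mock_legacy(dma_addr_t iova)
{
	const dma_addr_t base = 0x100000;
	dma_addr_t off;

	if (iova < base)
		return 0;
	off = iova - base;
	if (off < 3 * MOCK_PAGE_SIZE)
		return 0x200000 + off;
	if (off < 4 * MOCK_PAGE_SIZE)
		return 0x900000 + (off - 3 * MOCK_PAGE_SIZE);
	return 0;
}
```

The cost is one legacy walk per minimum-size page, so this only helps
callers structurally; the real gain comes once a driver implements the
new op natively.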
Does this mean retaining iommu_iova_to_phys but implementing it on top
of domain->ops->iova_to_phys_length, passing NULL for mapped_length?
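If that reading is right, the arrangement could be sketched roughly as
below. This is a user-space mock, not kernel code: struct mock_domain,
struct mock_domain_ops, mock_iova_to_phys() and the demo_* helpers are
hypothetical names, and the sketch assumes every driver op tolerates a
NULL mapped_length. It also still returns 0 when no translation exists.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;
typedef uint64_t dma_addr_t;

struct mock_domain;

/* Hypothetical subset of the domain ops discussed in this thread. */
struct mock_domain_ops {
	phys_addr_t (*iova_to_phys)(struct mock_domain *d, dma_addr_t iova);
	phys_addr_t (*iova_to_phys_length)(struct mock_domain *d,
					   dma_addr_t iova,
					   size_t *mapped_length);
};

struct mock_domain {
	const struct mock_domain_ops *ops;
};

/*
 * Plain translation: prefer the new op and discard the length by
 * passing NULL, otherwise use the legacy op. A domain with neither
 * op fails the lookup.
 */
static phys_addr_t mock_iova_to_phys(struct mock_domain *d, dma_addr_t iova)
{
	if (d->ops->iova_to_phys_length)
		return d->ops->iova_to_phys_length(d, iova, NULL);
	if (d->ops->iova_to_phys)
		return d->ops->iova_to_phys(d, iova);
	return 0;
}

#define DEMO_BLOCK_SIZE	0x200000ULL	/* one 2 MiB block */
#define DEMO_PHYS_BASE	0x40000000ULL

/* Demo driver op: a single 2 MiB block mapped at IOVA 0. */
static phys_addr_t demo_walk(struct mock_domain *d, dma_addr_t iova,
			     size_t *mapped_length)
{
	(void)d;
	if (iova >= DEMO_BLOCK_SIZE)
		return 0;
	if (mapped_length)	/* NULL means the caller ignores the size */
		*mapped_length = DEMO_BLOCK_SIZE;
	return DEMO_PHYS_BASE + iova;
}

/* Demo legacy op for a driver that has not been converted yet. */
static phys_addr_t demo_legacy(struct mock_domain *d, dma_addr_t iova)
{
	(void)d;
	return 0x80000000ULL + iova;
}

static const struct mock_domain_ops demo_new_ops = {
	.iova_to_phys_length = demo_walk,
};
static const struct mock_domain_ops demo_legacy_ops = {
	.iova_to_phys = demo_legacy,
};
static const struct mock_domain_ops demo_empty_ops = { 0 };

static struct mock_domain demo_new = { .ops = &demo_new_ops };
static struct mock_domain demo_legacy_only = { .ops = &demo_legacy_ops };
static struct mock_domain demo_no_ops = { .ops = &demo_empty_ops };
```

With this shape, callers that do not need the length keep calling the
existing function unchanged, and only callers such as VFIO switch to the
length-returning variant.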
Jason