On Wed, May 21, 2025 at 07:14:45PM +0800, Zhenzhong Duan wrote:
> +static const MemoryListener iommufd_s2domain_memory_listener = {
> +    .name = "iommufd_s2domain",
> +    .priority = 1000,
> +    .region_add = iommufd_listener_region_add_s2domain,
> +    .region_del = iommufd_listener_region_del_s2domain,
> +};

Would you mind elaborating When and how vtd does all S2 mappings?

On ARM, the default vfio_memory_listener could capture the entire
guest RAM and add to the address space. So what we do is basically
reusing the vfio_memory_listener:
https://lore.kernel.org/qemu-devel/20250311141045.66620-13-shameerali.kolothum.th...@huawei.com/
 
The thing is that when a VFIO device is attached to the container
upon a nesting configuration, the ->get_address_space op should
return the system address space as S1 nested HWPT isn't allocated
yet. Then all the iommu as routines in vfio_listener_region_add()
would be skipped, ending up with mapping the guest RAM in S2 HWPT
correctly. Not until the S1 nested HWPT is allocated by the guest
OS (after guest boots), can the ->get_address_space op return the
iommu address space.

With this address space shift, S2 mappings can be simply captured
and done by vfio_memory_listener. Then, such an s2domain listener
would be largely redundant.

So the second question is:
Does vtd have to own this iommufd_s2domain_memory_listener? IOW,
does vtd_host_dma_iommu() have to return the iommu address space
all the time?

> +static int vtd_create_s1_hwpt(HostIOMMUDeviceIOMMUFD *idev,
> +                              VTDS2Hwpt *s2_hwpt, VTDHwpt *hwpt,
> +                              VTDPASIDEntry *pe, Error **errp)
> +{
> +    struct iommu_hwpt_vtd_s1 vtd;
> +    uint32_t hwpt_id, s2_hwpt_id = s2_hwpt->hwpt_id;
> +
> +    vtd_init_s1_hwpt_data(&vtd, pe);
> +
> +    if (!iommufd_backend_alloc_hwpt(idev->iommufd, idev->devid,
> +                                    s2_hwpt_id, 0, IOMMU_HWPT_DATA_VTD_S1,
> +                                    sizeof(vtd), &vtd, &hwpt_id, errp)) {
> +        return -EINVAL;
> +    }
> +
> +    hwpt->hwpt_id = hwpt_id;
> +
> +    return 0;
> +}
> +
> +static void vtd_destroy_s1_hwpt(HostIOMMUDeviceIOMMUFD *idev, VTDHwpt *hwpt)
> +{
> +    iommufd_backend_free_id(idev->iommufd, hwpt->hwpt_id);
> +}

I think you did some substantial work to isolate the get_hw_info
part inside the iommufd backend code, which looks nice and clean
as the vIOMMU code simply does iodc->get_cap().

However, that then makes these direct raw backend function calls
very awkward :-/

In my view, the way to make sense is either:
* We don't do any isolation, but just call raw backend functions
  in vIOMMU code
* We do every isolation, and never call raw backend functions in
  vIOMMU code
?

Thanks
Nicolin

Reply via email to