RE: [PATCH rfcv3 15/21] intel_iommu: Bind/unbind guest page table to host

Duan, Zhenzhong Mon, 16 Jun 2025 03:17:58 -0700


>-----Original Message-----
>From: Liu, Yi L <yi.l....@intel.com>
>Subject: Re: [PATCH rfcv3 15/21] intel_iommu: Bind/unbind guest page table to host
>
>On 2025/6/16 16:54, Duan, Zhenzhong wrote:
>>
>>
>>> -----Original Message-----
>>> From: Nicolin Chen <nicol...@nvidia.com>
>>> Subject: Re: [PATCH rfcv3 15/21] intel_iommu: Bind/unbind guest page table to host
>>>
>>> On Mon, Jun 16, 2025 at 03:24:06AM +0000, Duan, Zhenzhong wrote:
>>>> Hi @Liu, Yi L @Nicolin Chen, for emulated/passthru devices
>>>> behind the same pcie-pci bridge, I have an idea: add
>>>> a new PCI callback:
>>>>
>>>> AddressSpace * (*get_address_space_extend)(PCIBus *bus,
>>>> void *opaque, int devfn, bool accel_dev);
>>>>
>>>> which passes in the real bus/devfn and a new param accel_dev, which
>>>> is true for a vfio device.
>>>
>>> Just =y for all vfio (passthrough) devices?
>>>
>
>TBH, it's a bit hacky to me in concept. It may be cleaner to detect
>and block such topology.


OK, then we don't need get_address_space_extend(). Will do in v2.
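
A rough sketch of what "detect and block" could look like, only to make the
check concrete. It is not QEMU code: SketchDev, its fields and
sketch_topology_blocked() are hypothetical names, and it assumes devices
behind the same pcie-pci bridge are seen by the vIOMMU under one alias BDF
and that the conflict only matters with flts=on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device record; not a QEMU type. */
typedef struct {
    uint16_t alias_bdf;   /* BDF the vIOMMU sees (assumed: the bridge alias) */
    bool is_passthrough;  /* true for a VFIO device backed by iommufd */
} SketchDev;

/*
 * Return true if the topology should be rejected: with flts=on, a
 * passthrough device shares its alias BDF with at least one emulated device.
 */
bool sketch_topology_blocked(const SketchDev *devs, size_t n, bool flts_on)
{
    assert(devs != NULL || n == 0);

    if (!flts_on) {
        return false; /* assumption: sharing an AS is fine with flts=off */
    }
    for (size_t i = 0; i < n; i++) {
        if (!devs[i].is_passthrough) {
            continue;
        }
        for (size_t j = 0; j < n; j++) {
            if (!devs[j].is_passthrough &&
                devs[j].alias_bdf == devs[i].alias_bdf) {
                return true; /* emulated and passthrough share one alias */
            }
        }
    }
    return false;
}
```

In a real implementation the check would more likely run when the passthrough
device is attached, and fail the attach with an error instead of returning
a bool.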

>
>BTW, @Nic, I suppose nesting vSMMUv3 does not have this concern, since
>you will put the passthru devices under a separate vIOMMU, which should
>ensure that the emulated devices won't share an AS with passthrough
>devices, right?
>
>>> ARM tentatively does this for get_address_space using Shameer's
>>> trick to detect if the device is a passthrough VFIO one:
>>>
>>>     PCIDevice *pdev = pci_find_device(bus, pci_bus_num(bus), devfn);
>>>     bool has_iommufd = !!object_property_find(OBJECT(pdev), "iommufd");
>>>
>>>     if (smmu->nested && ... && has_iommufd) {
>>>         return &sdev->as_sysmem;
>>>     }
>>>
>>> So, I guess "accel_dev" could be just:
>>>     !!object_property_find(OBJECT(pdev), "iommufd")
>>> ?
>>
>> You are right, we don't need param accel_dev. Below should work:
>>
>> object_dynamic_cast(OBJECT(hiod), TYPE_HOST_IOMMU_DEVICE_IOMMUFD)
>>
>>>
>>>> VT-d implements this callback and returns a separate AS for a vfio
>>>> device if it's under a pcie-pci bridge and flts=on;
>>>> otherwise it falls back to calling .get_address_space(). This way
>>>> emulated devices and passthru devices behind the same pcie-pci
>>>> bridge can have different AS.
>>>
>>> Again, if "vfio-device" tag with "iommufd" property is enough to
>>> identify devices to separate their address spaces, perhaps the
>>> existing get_address_space is enough.
>>
>> We need get_address_space_extend() to pass the real BDF.
>> get_address_space passes the group's BDF, which makes pci_find_device
>> return the wrong device.
>>
>>>
>>>> If the above idea is acceptable, then the only obstacle is
>>>> ERRATA_772415. Maybe we can let VFIO check this erratum and bypass
>>>> RO mappings from the beginning?
>>>
>>> Yes. There can be some communication between vIOMMU and the VFIO
>>> core.
>>>
>>>> Or we just block this VFIO device from running with flts=on if
>>>> ERRATA_772415 is present, and suggest running with flts=off?
>>>
>>> That sounds like a simpler solution, so long as nobody complains
>>> about this limitation :)
>>
>> I plan to apply this simpler solution unless there is an objection, because
>> I don't want to bring complexity to VFIO just for an erratum. I remember
>> ERRATA_772415 exists only on old SPR; @Liu, Yi L can correct me if I'm wrong.
>
>Hmm. I'm fine with passing some info to vfio so that vfio can skip RO
>mappings. Is there other info that VFIO needs to get from the vIOMMU? I hope
>we start adding such a mechanism for a normal requirement. :)

I can think of ERRATA_772415 and the NESTED capability. NESTED is used for
creating the VFIO default HWPT in stage2 mode.
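
To make these two pieces of info concrete, a minimal sketch follows. It
assumes the vIOMMU reports them to VFIO as a flags word; the flag names and
helper functions are hypothetical and do not exist in QEMU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits a vIOMMU could report to VFIO; not QEMU names. */
enum {
    SKETCH_VIOMMU_ERRATA_772415 = 1u << 0, /* host IOMMU has the erratum */
    SKETCH_VIOMMU_NESTED        = 1u << 1, /* vIOMMU uses nested translation */
};

/* With the erratum reported, VFIO would skip read-only mappings. */
bool sketch_skip_mapping(uint32_t viommu_flags, bool readonly)
{
    return readonly && (viommu_flags & SKETCH_VIOMMU_ERRATA_772415) != 0;
}

/* With NESTED reported, the default HWPT would be created in stage2 mode. */
bool sketch_default_hwpt_is_stage2(uint32_t viommu_flags)
{
    return (viommu_flags & SKETCH_VIOMMU_NESTED) != 0;
}
```

A flags word keeps the interface small and easy to extend if further info
is needed later; a callback per item would work as well.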

Thanks
Zhenzhong
