>-----Original Message-----
>From: Steven Sistare <steven.sist...@oracle.com>
>Subject: Re: [PATCH V3 39/42] vfio/iommufd: reconstruct hwpt
>
>On 5/18/2025 11:25 PM, Duan, Zhenzhong wrote:
>>> -----Original Message-----
>>> From: Steve Sistare <steven.sist...@oracle.com>
>>> Subject: [PATCH V3 39/42] vfio/iommufd: reconstruct hwpt
>>>
>>> Save the hwpt_id in vmstate.  In realize, skip its allocation from
>>> iommufd_cdev_attach -> iommufd_cdev_attach_container ->
>>> iommufd_cdev_autodomains_get.
>>>
>>> Rebuild userland structures to hold hwpt_id by calling
>>> iommufd_cdev_rebuild_hwpt at post load time.  This depends on hw_caps, which
>>> was restored by the post_load call to vfio_device_hiod_create_and_realize.
>>>
>>> Signed-off-by: Steve Sistare <steven.sist...@oracle.com>
>>> ---
>>> hw/vfio/cpr-iommufd.c      |  7 +++++++
>>> hw/vfio/iommufd.c          | 24 ++++++++++++++++++++++--
>>> hw/vfio/trace-events       |  1 +
>>> hw/vfio/vfio-iommufd.h     |  3 +++
>>> include/hw/vfio/vfio-cpr.h |  1 +
>>> 5 files changed, 34 insertions(+), 2 deletions(-)
>>>
>>> diff --git a/hw/vfio/cpr-iommufd.c b/hw/vfio/cpr-iommufd.c
>>> index 24cdf10..6d3f4e0 100644
>>> --- a/hw/vfio/cpr-iommufd.c
>>> +++ b/hw/vfio/cpr-iommufd.c
>>> @@ -110,6 +110,12 @@ static int vfio_device_post_load(void *opaque, int version_id)
>>>          error_report_err(err);
>>>          return false;
>>>      }
>>> +    if (!vbasedev->mdev) {
>>> +        VFIOIOMMUFDContainer *container = container_of(vbasedev->bcontainer,
>>> +                                                       VFIOIOMMUFDContainer,
>>> +                                                       bcontainer);
>>> +        iommufd_cdev_rebuild_hwpt(vbasedev, container);
>>> +    }
>>>      return true;
>>> }
>>>
>>> @@ -121,6 +127,7 @@ static const VMStateDescription vfio_device_vmstate = {
>>>      .needed = cpr_needed_for_reuse,
>>>      .fields = (VMStateField[]) {
>>>          VMSTATE_INT32(devid, VFIODevice),
>>> +        VMSTATE_UINT32(cpr.hwpt_id, VFIODevice),
>>>          VMSTATE_END_OF_LIST()
>>>      }
>>> };
>>> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
>>> index d980684..ec79c83 100644
>>> --- a/hw/vfio/iommufd.c
>>> +++ b/hw/vfio/iommufd.c
>>> @@ -318,6 +318,7 @@ static bool iommufd_cdev_detach_ioas_hwpt(VFIODevice *vbasedev, Error **errp)
>>> static void iommufd_cdev_use_hwpt(VFIODevice *vbasedev, VFIOIOASHwpt *hwpt)
>>> {
>>>      vbasedev->hwpt = hwpt;
>>> +    vbasedev->cpr.hwpt_id = hwpt->hwpt_id;
>>>      vbasedev->iommu_dirty_tracking = iommufd_hwpt_dirty_tracking(hwpt);
>>>      QLIST_INSERT_HEAD(&hwpt->device_list, vbasedev, hwpt_next);
>>> }
>>> @@ -373,6 +374,23 @@ static bool iommufd_cdev_make_hwpt(VFIODevice *vbasedev,
>>>      return true;
>>> }
>>>
>>> +void iommufd_cdev_rebuild_hwpt(VFIODevice *vbasedev,
>>> +                               VFIOIOMMUFDContainer *container)
>>> +{
>>> +    VFIOIOASHwpt *hwpt;
>>> +    int hwpt_id = vbasedev->cpr.hwpt_id;
>>> +
>>> +    trace_iommufd_cdev_rebuild_hwpt(container->be->fd, hwpt_id);
>>> +
>>> +    QLIST_FOREACH(hwpt, &container->hwpt_list, next) {
>>> +        if (hwpt->hwpt_id == hwpt_id) {
>>> +            iommufd_cdev_use_hwpt(vbasedev, hwpt);
>>> +            return;
>>> +        }
>>> +    }
>>> +    iommufd_cdev_make_hwpt(vbasedev, container, hwpt_id, false, NULL);
>>> +}
>>> +
>>> static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>>>                                           VFIOIOMMUFDContainer *container,
>>>                                           Error **errp)
>>> @@ -567,7 +585,8 @@ static bool iommufd_cdev_attach(const char *name, VFIODevice *vbasedev,
>>>              vbasedev->iommufd != container->be) {
>>>              continue;
>>>          }
>>> -        if (!iommufd_cdev_attach_container(vbasedev, container, &err)) {
>>> +        if (!vbasedev->cpr.reused &&
>>> +            !iommufd_cdev_attach_container(vbasedev, container, &err)) {
>>>              const char *msg = error_get_pretty(err);
>>>
>>>              trace_iommufd_cdev_fail_attach_existing_container(msg);
>>> @@ -605,7 +624,8 @@ skip_ioas_alloc:
>>>      bcontainer = &container->bcontainer;
>>>      vfio_address_space_insert(space, bcontainer);
>>>
>>> -    if (!iommufd_cdev_attach_container(vbasedev, container, errp)) {
>>> +    if (!vbasedev->cpr.reused &&
>>> +        !iommufd_cdev_attach_container(vbasedev, container, errp)) {
>>
>> All container attaching is bypassed in the new qemu. I have a concern that
>> the new qemu doesn't generate the same containers as the old qemu if there
>> is more than one container in the old qemu. Then devices can be attached to
>> the wrong container, or attaching can fail at post load.
>
>Yes, this relates to our discussion in patch 35.  Please explain, how can a
>single iommufd backend have multiple containers?

Similar to the legacy container case, there can be multiple containers in one
address space. If an existing mapping in one container conflicts with a new
device's reserved region, attaching to that container will fail and a new
container needs to be created to accommodate the new device's reserved region.
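
To illustrate the fallback pattern (schematic only; try_attach_to_container and
create_new_container_and_attach are made-up names here, the real loop is the
one in iommufd_cdev_attach quoted above):

/* Schematic: try each existing container in the address space.  An attach
 * can fail, e.g. when the new device's reserved regions conflict with
 * mappings already present in that container's IOAS, in which case a
 * fresh container has to be created for the device.
 */
static bool attach_with_fallback(VFIODevice *vbasedev, VFIOAddressSpace *space,
                                 Error **errp)
{
    VFIOContainerBase *bcontainer;
    Error *err = NULL;

    QLIST_FOREACH(bcontainer, &space->containers, next) {
        if (try_attach_to_container(vbasedev, bcontainer, &err)) {
            return true;        /* reuse an existing, compatible container */
        }
        error_free(err);        /* e.g. reserved-region conflict: try next */
        err = NULL;
    }
    return create_new_container_and_attach(vbasedev, space, errp);
}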

Maybe you need to do the same thing as you do for the legacy container, e.g.,
save the ioas_id just as you save container->fd, then check for an existing
ioas_id and restore the iommufd container based on that.
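
A rough sketch of that direction, modeled on what this patch already does for
hwpt_id; cpr.ioas_id and the lookup helper below are hypothetical, not part of
the posted series:

/* Hypothetical vmstate field, saved next to the existing hwpt_id:
 *     VMSTATE_UINT32(cpr.ioas_id, VFIODevice),
 *
 * Hypothetical lookup used on the reused path in iommufd_cdev_attach:
 * pick the container whose ioas_id matches the saved value instead of
 * skipping container selection entirely.
 */
static VFIOIOMMUFDContainer *
iommufd_cdev_find_saved_container(VFIOAddressSpace *space,
                                  IOMMUFDBackend *be, uint32_t ioas_id)
{
    VFIOContainerBase *bcontainer;

    QLIST_FOREACH(bcontainer, &space->containers, next) {
        VFIOIOMMUFDContainer *container =
            container_of(bcontainer, VFIOIOMMUFDContainer, bcontainer);

        if (container->be == be && container->ioas_id == ioas_id) {
            return container;   /* same container the old QEMU used */
        }
    }
    return NULL;                /* not found: restore it as a new container */
}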

Zhenzhong
