>-----Original Message-----
>From: Cédric Le Goater <c...@redhat.com>
>Sent: Thursday, September 21, 2023 5:45 PM
>Subject: Re: [PATCH v1 09/22] vfio/container: Introduce vfio_[attach/detach]_device
>
>On 8/30/23 12:37, Zhenzhong Duan wrote:
>> From: Eric Auger <eric.au...@redhat.com>
>>
>> We want the VFIO devices to be able to use two different
>> IOMMU callbacks, the legacy VFIO one and the new iommufd one.
>>
>> Introduce vfio_[attach/detach]_device which aim at hiding the
>> underlying IOMMU backend (IOCTLs, datatypes, ...).
>>
>> Once vfio_attach_device completes, the device is attached
>> to a security context and its fd can be used. Conversely
>> When vfio_detach_device completes, the device has been
>> detached to the security context.
>>
>> In this patch, only the vfio-pci device gets converted to use
>> the new API. Subsequent patches will handle other devices.
>>
>> Signed-off-by: Eric Auger <eric.au...@redhat.com>
>> Signed-off-by: Yi Liu <yi.l....@intel.com>
>> Signed-off-by: Zhenzhong Duan <zhenzhong.d...@intel.com>
>> ---
>>  hw/vfio/container.c           | 66 +++++++++++++++++++++++++++++++++++
>>  hw/vfio/pci.c                 | 50 ++++----------------------
>>  hw/vfio/trace-events          |  2 +-
>>  include/hw/vfio/vfio-common.h |  3 ++
>>  4 files changed, 76 insertions(+), 45 deletions(-)
>>
>> diff --git a/hw/vfio/container.c b/hw/vfio/container.c
>> index 175cdbbdff..74556da0c7 100644
>> --- a/hw/vfio/container.c
>> +++ b/hw/vfio/container.c
>> @@ -1083,3 +1083,69 @@ int vfio_eeh_as_op(AddressSpace *as, uint32_t op)
>>      }
>>      return vfio_eeh_container_op(container, op);
>>  }
>> +
>> +static int vfio_device_groupid(VFIODevice *vbasedev, Error **errp)
>> +{
>> +    char *tmp, group_path[PATH_MAX], *group_name;
>> +    int ret, groupid;
>> +    ssize_t len;
>> +
>> +    tmp = g_strdup_printf("%s/iommu_group", vbasedev->sysfsdev);
>> +    len = readlink(tmp, group_path, sizeof(group_path));
>> +    g_free(tmp);
>> +
>> +    if (len <= 0 || len >= sizeof(group_path)) {
>> +        ret = len < 0 ? -errno : -ENAMETOOLONG;
>> +        error_setg_errno(errp, -ret, "no iommu_group found");
>> +        return ret;
>> +    }
>> +
>> +    group_path[len] = 0;
>> +
>> +    group_name = basename(group_path);
>> +    if (sscanf(group_name, "%d", &groupid) != 1) {
>> +        error_setg_errno(errp, errno, "failed to read %s", group_path);
>> +        return -errno;
>> +    }
>> +    return groupid;
>> +}
>
>VFIO has 4 other routines reading the iommu_group from sysfs :
>
>   vfio_ccw_get_group()
>   vfio_ap_get_group()
>   vfio_base_device_init()
>   sysfs_find_group_file()
>
>which could use this helper. Thanks for introducing it !
>
>
>
>> +
>> +int vfio_attach_device(char *name, VFIODevice *vbasedev,
>> +                       AddressSpace *as, Error **errp)
>> +{
>> +    int groupid = vfio_device_groupid(vbasedev, errp);
>> +    VFIODevice *vbasedev_iter;
>> +    VFIOGroup *group;
>> +    int ret;
>> +
>> +    if (groupid < 0) {
>> +        return groupid;
>> +    }
>> +
>> +    group = vfio_get_group(groupid, as, errp);
>> +    if (!group) {
>> +        return -ENOENT;
>> +    }
>> +
>> +    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
>> +        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
>> +            error_setg(errp, "device is already attached");
>> +            vfio_put_group(group);
>> +            return -EBUSY;
>> +        }
>> +    }
>> +    ret = vfio_get_device(group, name, vbasedev, errp);
>> +    if (ret) {
>> +        vfio_put_group(group);
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +void vfio_detach_device(VFIODevice *vbasedev)
>> +{
>> +    VFIOGroup *group = vbasedev->group;
>> +
>> +    vfio_put_base_device(vbasedev);
>> +    vfio_put_group(group);
>> +}
>> diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c
>> index a205c6b113..34f65ecd17 100644
>> --- a/hw/vfio/pci.c
>> +++ b/hw/vfio/pci.c
>> @@ -2828,10 +2828,10 @@ static void vfio_populate_device(VFIOPCIDevice *vdev, Error **errp)
>>
>>  static void vfio_put_device(VFIOPCIDevice *vdev)
>>  {
>> +    vfio_detach_device(&vdev->vbasedev);
>> +
>>      g_free(vdev->vbasedev.name);
>>      g_free(vdev->msix);
>> -
>> -    vfio_put_base_device(&vdev->vbasedev);
>>  }
>>
>>  static void vfio_err_notifier_handler(void *opaque)
>> @@ -2978,13 +2978,9 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>>  {
>>      VFIOPCIDevice *vdev = VFIO_PCI(pdev);
>>      VFIODevice *vbasedev = &vdev->vbasedev;
>> -    VFIODevice *vbasedev_iter;
>> -    VFIOGroup *group;
>> -    char *tmp, *subsys, group_path[PATH_MAX], *group_name;
>> +    char *tmp, *subsys;
>>      Error *err = NULL;
>> -    ssize_t len;
>>      struct stat st;
>> -    int groupid;
>>      int i, ret;
>>      bool is_mdev;
>>      char uuid[UUID_FMT_LEN];
>> @@ -3015,38 +3011,7 @@ static void vfio_realize(PCIDevice *pdev, Error **errp)
>>      vbasedev->type = VFIO_DEVICE_TYPE_PCI;
>>      vbasedev->dev = DEVICE(vdev);
>>
>> -    tmp = g_strdup_printf("%s/iommu_group", vbasedev->sysfsdev);
>> -    len = readlink(tmp, group_path, sizeof(group_path));
>> -    g_free(tmp);
>> -
>> -    if (len <= 0 || len >= sizeof(group_path)) {
>> -        error_setg_errno(errp, len < 0 ? errno : ENAMETOOLONG,
>> -                         "no iommu_group found");
>> -        goto error;
>> -    }
>> -
>> -    group_path[len] = 0;
>> -
>> -    group_name = basename(group_path);
>> -    if (sscanf(group_name, "%d", &groupid) != 1) {
>> -        error_setg_errno(errp, errno, "failed to read %s", group_path);
>> -        goto error;
>> -    }
>> -
>> -    trace_vfio_realize(vbasedev->name, groupid);
>> -
>> -    group = vfio_get_group(groupid, pci_device_iommu_address_space(pdev), errp);
>> -    if (!group) {
>> -        goto error;
>> -    }
>> -
>> -    QLIST_FOREACH(vbasedev_iter, &group->device_list, next) {
>> -        if (strcmp(vbasedev_iter->name, vbasedev->name) == 0) {
>> -            error_setg(errp, "device is already attached");
>> -            vfio_put_group(group);
>> -            goto error;
>> -        }
>> -    }
>> +    trace_vfio_realize(vbasedev->name);
>
>I would move the trace event after vfio_attach_device() and print out the group.
>Or simply add trace events in vfio_detach/attach_device().
>
>This is a general comment on the VFIO PCI routines which do not use a
>'vfio_pci' prefix and I find it confusing, sometimes.
>Like this call stack :
>
>   vfio_put_device()
>     vfio_detach_device()
>       vfio_put_base_device()
>
>I think we should rename vfio_put_device() in vfio_pci_put_device(). This is
>not for this series.

Good suggestion! I have been confused by this function too.
I can help if you have not done that yet.

Thanks
Zhenzhong