When used by the vhost-vDPA bus driver for a VM, the control virtqueue should be shadowed via the userspace VMM (QEMU) instead of being assigned directly to the guest. This is because QEMU needs to know the device state in order to start and stop the device correctly (e.g. for live migration).

This requires isolating the memory mapping for the control virtqueue presented by vhost-vDPA, to prevent the guest from accessing it directly.

This series adds support for multiple address spaces in VDUSE devices, allowing selective virtqueue isolation through address space IDs (ASID).

The VDUSE device needs to report:
* Number of virtqueue groups
* Association of each virtqueue with its vq group
* Number of address spaces supported.

Then, the vDPA driver can modify the ASID assigned to each VQ group to isolate the memory AS. This aligns VDUSE with vdpa_sim and NVIDIA mlx5 devices, which already support ASID.

This helps to isolate the environments for the virtqueues that will not be assigned directly. E.g. in the case of virtio-net, the control virtqueue will not be assigned directly to the guest.

This series depends on the series that reworks the VDUSE mapping API:
https://lore-kernel.gnuweeb.org/all/20250924070045.10361-1-jasow...@redhat.com/

Also, to be able to test this series, the user needs to manually revert 56e71885b034 ("vduse: Temporarily fail if control queue feature requested").

PATCH v5:
* Properly return errno if copy_to_user returns >0 in VDUSE_IOTLB_GET_FD ioctl (Jason).
* Properly set domain bounce size to divide equally between nas (Jason).
* Revert core vdpa changes (Jason).
* Fix group == ngroup case in checking VQ_SETUP argument (Jason).
* Exclude "padding" member from the only >V1 members in vduse_dev_request.

PATCH v4:
* Consider config->nas == 0 and config->ngroups == 0 as a failure (Jason).
* Revert the "invalid vq group" concept and assume 0 if not set.
* Divide each domain bounce size between the device bounce size (Jason).
* Revert unneeded addr = NULL assignment (Jason).
* Change if (x && (y || z)) return to if (x) { if (y) return; if (z) return; } (Jason).
* Change a bad multiline comment, using @ character instead of * (Jason).

PATCH v3:
* Make the default group an invalid group as long as the VDUSE device does not set it to some valid u32 value. Modify the vdpa core to take that into account (Jason). Adapt all the virtio_map_ops callbacks to it.
* Make setting status DRIVER_OK fail if vq group is not valid.
* Create VDUSE_DEV_MAX_GROUPS and VDUSE_DEV_MAX_AS instead of using a magic number.
* Remove the _int name suffix from struct vduse_vq_group.
* Get the vduse domain through the vduse_as in the map functions (Jason).
* Squash the patch implementing the AS logic with the patch creating the vduse_as struct (Jason).

PATCH v2:
* Now the vq group is in the vduse_vq_config struct instead of issuing one VDUSE message per vq.
* Convert the use of mutex to rwlock (Xie Yongji).

PATCH v1:
* Fix: Remove BIT_ULL(VIRTIO_S_*), as _S_ is already the bit (Maxime).
* Use vduse_vq_group_int directly instead of an empty struct in union virtio_map.

RFC v3:
* Increase VDUSE_MAX_VQ_GROUPS to 0xffff (Jason). It was set to a lower value to reduce memory consumption, but vqs are already limited to that value and userspace VDUSE is able to allocate that many vqs. Also, it's a dynamic array now. Same with ASID.
* Move the valid vq groups range check to vduse_validate_config.
* Embed vduse_iotlb_entry into vduse_iotlb_entry_v2.
* Use array_index_nospec in VDUSE device ioctls.
* Move the umem mutex to the asid struct so there is no contention between ASIDs.
* Remove the descs vq group capability as it will not be used and we can add it on top.
* Do not ask for vq groups if the number of vq groups is < 2.
* Remove TODO about merging VDUSE_IOTLB_GET_FD ioctl with VDUSE_IOTLB_GET_INFO.

RFC v2:
* Cache group information in kernel, as we need to provide the vq map tokens properly.
* Add descs vq group to optimize SVQ forwarding and support indirect descriptors out of the box.
* Make iotlb entry the last one of vduse_iotlb_entry_v2 so the first part of the struct is the same.
* Fixes for issues detected when testing with OVS+VDUSE.
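
As an illustration only (this is not code from the series), the vq -> group -> ASID indirection described in this cover letter can be modelled in plain userspace C. All demo_* names and DEMO_MAX_VQS are hypothetical. The range checks mirror the changelog entries about nas == 0, ngroups == 0 and group == ngroups, and the equal split of the bounce buffer across address spaces is one reading of the v5 changelog entry, not the kernel implementation.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_VQS 8 /* hypothetical limit, unrelated to VDUSE_DEV_MAX_GROUPS */

/* Hypothetical model of a device exposing vq groups and address spaces. */
struct demo_dev {
	uint32_t nvqs;                     /* number of virtqueues */
	uint32_t ngroups;                  /* vq groups reported by the device */
	uint32_t nas;                      /* address spaces reported by the device */
	uint64_t bounce_size;              /* total bounce buffer size, in bytes */
	uint32_t vq_group[DEMO_MAX_VQS];   /* vq index -> group, 0 if never set */
	uint32_t group_asid[DEMO_MAX_VQS]; /* group -> ASID, 0 if never set */
};

/* Reject configurations with no groups, no address spaces, or too many entries. */
int demo_validate(const struct demo_dev *d)
{
	if (d->ngroups == 0 || d->nas == 0)
		return -EINVAL;
	if (d->nvqs > DEMO_MAX_VQS || d->ngroups > DEMO_MAX_VQS)
		return -EINVAL;
	return 0;
}

/* Device side: place a virtqueue in a group; group == ngroups is out of range. */
int demo_set_vq_group(struct demo_dev *d, uint32_t vq, uint32_t group)
{
	if (demo_validate(d) || vq >= d->nvqs || group >= d->ngroups)
		return -EINVAL;
	d->vq_group[vq] = group;
	return 0;
}

/* Driver side: bind a whole group to an address space. */
int demo_set_group_asid(struct demo_dev *d, uint32_t group, uint32_t asid)
{
	if (demo_validate(d) || group >= d->ngroups || asid >= d->nas)
		return -EINVAL;
	d->group_asid[group] = asid;
	return 0;
}

/* Resolve the address space that a virtqueue's mappings belong to. */
int demo_vq_asid(const struct demo_dev *d, uint32_t vq, uint32_t *asid)
{
	assert(asid != NULL);
	if (demo_validate(d) || vq >= d->nvqs)
		return -EINVAL;
	*asid = d->group_asid[d->vq_group[vq]];
	return 0;
}

/* Assumed policy: split the bounce buffer equally between address spaces. */
uint64_t demo_bounce_per_as(const struct demo_dev *d)
{
	return d->nas ? d->bounce_size / d->nas : 0;
}
```

In this model, a virtio-net device with two data virtqueues and a control virtqueue at index 2 would call demo_set_vq_group(&d, 2, 1), and the driver would then call demo_set_group_asid(&d, 1, 1). The data virtqueues stay in ASID 0 while the control virtqueue resolves to ASID 1.
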

Eugenio Pérez (6):
  vduse: make domain_lock an rwlock
  vduse: add v1 API definition
  vduse: add vq group support
  vduse: return internal vq group struct as map token
  vduse: add vq group asid support
  vduse: bump version number

 drivers/vdpa/vdpa_user/vduse_dev.c | 492 ++++++++++++++++++++++-------
 include/linux/virtio.h             |   6 +-
 include/uapi/linux/vduse.h         |  67 +++-
 3 files changed, 441 insertions(+), 124 deletions(-)

-- 
2.51.0