On Tue, 15 Mar 2022 21:43:15 +0000 Thanos Makatos <thanos.maka...@nutanix.com> wrote:
> > -----Original Message-----
> > From: Qemu-devel <qemu-devel-bounces+thanos.makatos=nutanix....@nongnu.org>
> > On Behalf Of Alex Williamson
> > Sent: 09 March 2022 22:35
> > To: John Johnson <john.g.john...@oracle.com>
> > Cc: qemu-devel@nongnu.org
> > Subject: Re: [RFC v4 01/21] vfio-user: introduce vfio-user protocol
> > specification
> >
> > On Tue, 11 Jan 2022 16:43:37 -0800
> > John Johnson <john.g.john...@oracle.com> wrote:
> > > +VFIO region info cap sparse mmap
> > > +""""""""""""""""""""""""""""""""
> > > +
> > > ++----------+--------+------+
> > > +| Name     | Offset | Size |
> > > ++==========+========+======+
> > > +| nr_areas | 0      | 4    |
> > > ++----------+--------+------+
> > > +| reserved | 4      | 4    |
> > > ++----------+--------+------+
> > > +| offset   | 8      | 8    |
> > > ++----------+--------+------+
> > > +| size     | 16     | 9    |
> > > ++----------+--------+------+
> >
> > Typo, I'm pretty sure size isn't 9 bytes.
> >
> > > +| ...      |        |      |
> > > ++----------+--------+------+
> > > +
> > > +* *nr_areas* is the number of sparse mmap areas in the region.
> > > +* *offset* and size describe a single area that can be mapped by the client.
> > > +  There will be *nr_areas* pairs of offset and size. The offset will be added to
> > > +  the base offset given in the ``VFIO_USER_DEVICE_GET_REGION_INFO`` to form the
> > > +  offset argument of the subsequent mmap() call.
> > > +
> > > +The VFIO sparse mmap area is defined in ``<linux/vfio.h>`` (``struct
> > > +vfio_region_info_cap_sparse_mmap``).
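
For readers following the offset arithmetic in the quoted text, here is a minimal C sketch of it, not taken from the patch. The struct and function names (``sparse_area``, ``sparse_mmap_body``, ``sparse_area_mmap_offset``) are hypothetical local mirrors of the table above; the real definitions live in ``<linux/vfio.h>``. It assumes *size* is 8 bytes, as the review comment suggests, and it leaves out the capability header that precedes these fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of one (offset, size) pair from the quoted table. */
struct sparse_area {
    uint64_t offset;  /* relative to the start of the region */
    uint64_t size;    /* length of the mappable area, assumed 8 bytes wide */
};

/* Hypothetical mirror of the table body; the capability header is omitted. */
struct sparse_mmap_body {
    uint32_t nr_areas;
    uint32_t reserved;
    struct sparse_area areas[];  /* nr_areas entries follow */
};

/* The field offsets should match the Offset column of the table. */
_Static_assert(offsetof(struct sparse_mmap_body, nr_areas) == 0, "nr_areas");
_Static_assert(offsetof(struct sparse_mmap_body, reserved) == 4, "reserved");
_Static_assert(offsetof(struct sparse_mmap_body, areas) == 8, "first offset");
_Static_assert(sizeof(struct sparse_area) == 16, "one offset/size pair");

/*
 * Compute the offset argument for the mmap() call that maps area `idx`:
 * the region's base offset (from the region info reply) plus the area's
 * own offset. Returns 0 on success, -1 if idx is out of range or the
 * addition would overflow.
 */
int sparse_area_mmap_offset(uint64_t region_base,
                            const struct sparse_area *areas,
                            uint32_t nr_areas, uint32_t idx,
                            uint64_t *out)
{
    assert(areas != NULL && out != NULL);
    if (idx >= nr_areas)
        return -1;
    if (areas[idx].offset > UINT64_MAX - region_base)
        return -1;
    *out = region_base + areas[idx].offset;
    return 0;
}
```

A client would then pass ``*out`` as the offset and ``areas[idx].size`` as the length of its mmap() call, once per area.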
> > > +
> > > +VFIO region type cap header
> > > +"""""""""""""""""""""""""""
> > > +
> > > ++------------------+---------------------------+
> > > +| Name             | Value                     |
> > > ++==================+===========================+
> > > +| id               | VFIO_REGION_INFO_CAP_TYPE |
> > > ++------------------+---------------------------+
> > > +| version          | 0x1                       |
> > > ++------------------+---------------------------+
> > > +| next             | <next>                    |
> > > ++------------------+---------------------------+
> > > +| region info type | VFIO region info type     |
> > > ++------------------+---------------------------+
> > > +
> > > +This capability is defined when a region is specific to the device.
> > > +
> > > +VFIO region info type cap
> > > +"""""""""""""""""""""""""
> > > +
> > > +The VFIO region info type is defined in ``<linux/vfio.h>``
> > > +(``struct vfio_region_info_cap_type``).
> > > +
> > > ++---------+--------+------+
> > > +| Name    | Offset | Size |
> > > ++=========+========+======+
> > > +| type    | 0      | 4    |
> > > ++---------+--------+------+
> > > +| subtype | 4      | 4    |
> > > ++---------+--------+------+
> > > +
> > > +The only device-specific region type and subtype supported by vfio-user is
> > > +``VFIO_REGION_TYPE_MIGRATION`` (3) and ``VFIO_REGION_SUBTYPE_MIGRATION`` (1).
> >
> > These should be considered deprecated from the kernel interface. I
> > hope there are plans for vfio-user to adopt the new interface that's
> > currently available in linux-next and intended for v5.18.
> >
> > ...
> > > +Unused VFIO ``ioctl()`` commands
> > > +--------------------------------
> > > +
> > > +The following VFIO commands do not have an equivalent vfio-user command:
> > > +
> > > +* ``VFIO_GET_API_VERSION``
> > > +* ``VFIO_CHECK_EXTENSION``
> > > +* ``VFIO_SET_IOMMU``
> > > +* ``VFIO_GROUP_GET_STATUS``
> > > +* ``VFIO_GROUP_SET_CONTAINER``
> > > +* ``VFIO_GROUP_UNSET_CONTAINER``
> > > +* ``VFIO_GROUP_GET_DEVICE_FD``
> > > +* ``VFIO_IOMMU_GET_INFO``
> > > +
> > > +However, once support for live migration for VFIO devices is finalized some
> > > +of the above commands may have to be handled by the client in their
> > > +corresponding vfio-user form. This will be addressed in a future protocol
> > > +version.
> >
> > As above, I'd go ahead and drop the migration region interface support,
> > it's being removed from the kernel. Dirty page handling might also be
> > something you want to pull back on as we're expecting in-kernel vfio to
> > essentially deprecate its iommu backends in favor of a new shared
> > userspace iommufd interface. We expect to have backwards compatibility
> > via that interface, but as QEMU migration support for vfio-pci devices
> > is experimental and there are desires not to consolidate dirty page
> > tracking behind the iommu interface in the new model, it's not clear if
> > the kernel will continue to expose the current dirty page tracking.
> >
> > AIUI, we're expecting to see patches officially proposing the iommufd
> > interface in the kernel "soon". Thanks,
>
> Are you referring to the "[RFC v2] /dev/iommu uAPI proposal" work
> (https://lkml.org/lkml/2021/7/9/89)?

There's a more recent proposal here:

https://lore.kernel.org/all/20210919063848.1476776-1-yi.l....@intel.com/

But I suspect based on discussions that it's evolved quite a lot since
then. Based on various test robot reports, I gather the current
pre-release is tracking in Yi's tree here:

https://github.com/luxis1999/iommufd

Thanks,
Alex