The logic of this commit looks correct to me. I have tested it on an Nvidia RTX A5000 and it works well!

Tested-by: Zhiyi Guo <zh...@redhat.com>
Reviewed-by: Zhiyi Guo <zh...@redhat.com>

Regards,
Zhiyi

On Wed, Jan 8, 2025 at 1:44 PM Laine Stump <la...@redhat.com> wrote:

> Ping. It would be nice to get this into the upcoming release.
>
>
> On 12/13/24 1:07 PM, Laine Stump wrote:
> > GPU vendors are moving away from using mdev to create virtual GPUs
> > towards using SRIOV VFs that are vGPUs. In both cases, once created
> > the vGPUs are assigned to guests via <hostdev> (i.e. VFIO device
> > assignment), and inside the guest the devices look identical, but mdev
> > vGPUs are located by QEMU/VFIO using a uuid, while VF vGPUs are
> > located with a PCI address. So although we generally require the
> > device on the source host to exactly match the device on the
> > destination host, in the case of mdev-created vGPU vs. VF vGPU
> > migration *can* potentially work, except that libvirt has a hard-coded
> > check that prevents us from even trying.
> >
> > This patch loosens up that check so that we will allow attempts to
> > migrate a guest from a source host that has mdev-created vGPUs to a
> > destination host that has VF vGPUs (and vice versa). The expectation
> > is that if this doesn't actually work then QEMU will fail and generate
> > an error that we can report.
> >
> > Based-on-patch-by: Zhiyi Guo <zh...@redhat.com>
> > Signed-off-by: Laine Stump <la...@redhat.com>
> > ---
> >
> > Zhiyi's original patch removed the check for subsys type completely,
> > and this worked. My modified patch keeps the check in place, but
> > allows it to pass if the src type is pci and dst is mdev, or vice
> > versa.
> >
> >  src/conf/domain_conf.c | 28 +++++++++++++++++++++-------
> >  1 file changed, 21 insertions(+), 7 deletions(-)
> >
> > diff --git a/src/conf/domain_conf.c b/src/conf/domain_conf.c
> > index 4ad8289b89..9d5fda0469 100644
> > --- a/src/conf/domain_conf.c
> > +++ b/src/conf/domain_conf.c
> > @@ -20647,13 +20647,27 @@ virDomainHostdevDefCheckABIStability(virDomainHostdevDef *src,
> >          return false;
> >      }
> >
> > -    if (src->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS &&
> > -        src->source.subsys.type != dst->source.subsys.type) {
> > -        virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> > -                       _("Target host device subsystem %1$s does not match source %2$s"),
> > -                       virDomainHostdevSubsysTypeToString(dst->source.subsys.type),
> > -                       virDomainHostdevSubsysTypeToString(src->source.subsys.type));
> > -        return false;
> > +    if (src->mode == VIR_DOMAIN_HOSTDEV_MODE_SUBSYS) {
> > +        virDomainHostdevSubsysType srcType = src->source.subsys.type;
> > +        virDomainHostdevSubsysType dstType = dst->source.subsys.type;
> > +
> > +        /* If the source and destination subsys types aren't the same,
> > +         * then migration can't be supported, *except* that it might
> > +         * be supported to migrate from subsys type 'pci' to 'mdev'
> > +         * and vice versa. (libvirt can't know for certain whether or
> > +         * not it will actually work, so we have to just allow it and
> > +         * count on QEMU to provide us with an error if it fails)
> > +         */
> > +
> > +        if (srcType != dstType
> > +            && ((srcType != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && srcType != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV)
> > +                || (dstType != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_PCI && dstType != VIR_DOMAIN_HOSTDEV_SUBSYS_TYPE_MDEV))) {
> > +            virReportError(VIR_ERR_CONFIG_UNSUPPORTED,
> > +                           _("Target host device subsystem type %1$s is not compatible with source subsystem type %2$s"),
> > +                           virDomainHostdevSubsysTypeToString(dstType),
> > +                           virDomainHostdevSubsysTypeToString(srcType));
> > +            return false;
> > +        }
> >      }
> >
> >      if (!virDomainDeviceInfoCheckABIStability(src->info, dst->info))
> >
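
For anyone else reviewing, here is how I read the relaxed condition, written out as a standalone predicate. This is only a sketch: the enum and both function names (SubsysType, isPciOrMdev, subsysTypesCompatible) are hypothetical stand-ins I made up for illustration, not libvirt's real definitions, and the USB/SCSI members are just there to have some non-vGPU types to compare against.

```c
#include <stdbool.h>

/* Hypothetical stand-in for virDomainHostdevSubsysType; the values and
 * the set of members are illustrative, not copied from libvirt. */
typedef enum {
    SUBSYS_TYPE_USB,
    SUBSYS_TYPE_PCI,
    SUBSYS_TYPE_SCSI,
    SUBSYS_TYPE_MDEV,
} SubsysType;

/* True for the two types that can both back a vGPU. */
static bool
isPciOrMdev(SubsysType type)
{
    return type == SUBSYS_TYPE_PCI || type == SUBSYS_TYPE_MDEV;
}

/* Logical negation of the error condition in the patch: identical types
 * always pass, and differing types pass only when both sides are pci or
 * mdev. Passing here does not mean migration will succeed; that is left
 * to QEMU. */
static bool
subsysTypesCompatible(SubsysType src, SubsysType dst)
{
    if (src == dst)
        return true;

    return isPciOrMdev(src) && isPciOrMdev(dst);
}
```

With this reading, pci -> mdev and mdev -> pci are let through, pci -> usb or scsi -> mdev are still rejected, and same-type pairs behave exactly as before the patch.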