* Zhang, Yulei (yulei.zh...@intel.com) wrote:
> 
> 
> > -----Original Message-----
> > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > Sent: Tuesday, April 17, 2018 10:35 PM
> > To: Zhang, Yulei <yulei.zh...@intel.com>
> > Cc: qemu-devel@nongnu.org; Tian, Kevin <kevin.t...@intel.com>;
> > joonas.lahti...@linux.intel.com; zhen...@linux.intel.com;
> > kwankh...@nvidia.com; Wang, Zhi A <zhi.a.w...@intel.com>;
> > dgilb...@redhat.com; quint...@redhat.com
> > Subject: Re: [RFC PATCH V4 3/4] vfio: Add SaveVMHanlders for VFIO device
> > to support live migration
> > 
> > On Tue, 17 Apr 2018 08:11:12 +0000
> > "Zhang, Yulei" <yulei.zh...@intel.com> wrote:
> > 
> > > > -----Original Message-----
> > > > From: Alex Williamson [mailto:alex.william...@redhat.com]
> > > > Sent: Tuesday, April 17, 2018 4:38 AM
> > > > To: Zhang, Yulei <yulei.zh...@intel.com>
> > > > Cc: qemu-devel@nongnu.org; Tian, Kevin <kevin.t...@intel.com>;
> > > > joonas.lahti...@linux.intel.com; zhen...@linux.intel.com;
> > > > kwankh...@nvidia.com; Wang, Zhi A <zhi.a.w...@intel.com>;
> > > > dgilb...@redhat.com; quint...@redhat.com
> > > > Subject: Re: [RFC PATCH V4 3/4] vfio: Add SaveVMHanlders for VFIO device
> > > > to support live migration
> > > >
> > > > On Tue, 10 Apr 2018 14:03:13 +0800
> > > > Yulei Zhang <yulei.zh...@intel.com> wrote:
> > > >
> > > > > Instead of using vm state description, add SaveVMHandlers for VFIO
> > > > > device to support live migration.
> > > > >
> > > > > > Introduce a new ioctl, VFIO_DEVICE_GET_DIRTY_BITMAP, to fetch the
> > > > > > bitmap of memory dirtied by the vfio device during the iterative
> > > > > > precopy stage, to shorten the system downtime afterward.
> > > > >
> > > > > > For vfio pci device state migration, during the system downtime it
> > > > > > will save the following states:
> > > > > > 1. pci configuration space addr0~addr5
> > > > > > 2. pci configuration space msi_addr msi_data
> > > > > > 3. pci device status fetched from the device driver
> > > > > >
> > > > > > And on the target side vfio_load will restore the same states:
> > > > > > 1. re-setup the pci bar configuration
> > > > > > 2. re-setup the pci device msi configuration
> > > > > > 3. restore the pci device status
> > > >
> > > > Interrupts are configured via ioctl, but I don't see any code here to
> > > > configure the device into the correct interrupt state.  How do we know
> > > > the target device is compatible with the source device?  Do we rely on
> > > > the vendor driver to implicitly include some kind of device and version
> > > > information and fail at the very end of the migration?  It seems like
> > > > we need to somehow front-load that sort of device compatibility
> > > > checking since a vfio-pci device can be anything (ex. what happens if a
> > > > user tries to migrate a GVT-g vGPU to an NVIDIA vGPU?).  Thanks,
> > > >
> > > > Alex
> > >
> > > We emulate the write to the pci configuration space in vfio_load(),
> > > which will help set up the interrupt state.
> > 
> > But you're only doing that for MSI, not MSI-X.  We cannot simply say
> > that we don't have any MSI-X devices right now and add it later, or we'll
> > end up with incompatible vmstate; we need to plan for how we'll support
> > it within the save state stream now.
> > 
> Agree with you.
> 
> > > For compatibility, I think currently the vendor driver would put a
> > > version number or device-specific info in the device state region, so
> > > during the pre-copy stage the target side will discover the difference
> > > and call off the migration.
> > 
> > Those sorts of things should be built into the device state region, we
> > shouldn't rely on vendor drivers to make these kinds of considerations,
> > we should build it into the API, which also allows QEMU to check state
> > compatibility before attempting a migration.  Thanks,
> > 
> > Alex
> 
> Not sure how to check the compatibility before attempting a migration.
> To my understanding, qemu doesn't know the configuration on the target side;
> the target may call off the migration when it finds the incoming device
> state isn't compatible, and the source vm will resume.

There are a few parts to this:
   a) Tools like libvirt need to be able to detect that the two
configurations are actually compatible before they try - so there should
be some way of exposing enough detail about the host/drivers for
something somewhere to be able to say 'yes it should work' before even
starting the migration.

   b) The migration stream should contain enough information to be able
to *cleanly* detect an incompatibility so that the failure is obvious;
so make sure you have enough information in there (preferably in the
'setup' part if it really is iterative).  Then when loading check that
data and print a clear reason why it's incompatible.

   c) We've got to try and keep that compatibility across versions - so
migrations to a newer QEMU, to newer drivers, or (depending on the
hardware) to newer hardware should be able to work if possible.  So the
stream might need to carry version identifiers to help with that.
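
To make (b) and (c) a bit more concrete, here is a rough sketch of the
sort of check I mean.  It is only an illustration under assumptions: the
DevStateHeader layout, the DEV_STATE_* constants and dev_state_check()
are all made-up names, not existing QEMU or VFIO API, and a real header
would probably need more fields than this.  The idea is just that the
source writes a small header in the 'setup' part, and the destination
validates it and produces a readable reason before any device state is
loaded.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical names throughout; none of this is existing QEMU/VFIO API. */
#define DEV_STATE_MAGIC       0x56464D53u  /* arbitrary marker value */
#define DEV_STATE_VERSION_MIN 1u           /* oldest format we can load */
#define DEV_STATE_VERSION_MAX 2u           /* newest format we can load */

/* Header the source would send in the 'setup' part of the stream. */
typedef struct DevStateHeader {
    uint32_t magic;      /* identifies this header layout */
    uint32_t version;    /* device state format version of the sender */
    uint16_t vendor_id;  /* PCI vendor ID of the source device */
    uint16_t device_id;  /* PCI device ID of the source device */
} DevStateHeader;

/*
 * Validate an incoming header against the local device.
 * Returns 0 if the stream looks loadable, -1 otherwise; on failure
 * 'reason' holds a human-readable explanation.
 */
int dev_state_check(const DevStateHeader *src,
                    uint16_t local_vendor, uint16_t local_device,
                    char *reason, size_t len)
{
    assert(src != NULL && reason != NULL && len > 0);
    reason[0] = '\0';

    if (src->magic != DEV_STATE_MAGIC) {
        snprintf(reason, len, "bad magic 0x%08x (expected 0x%08x)",
                 (unsigned)src->magic, (unsigned)DEV_STATE_MAGIC);
        return -1;
    }
    /* Accept a range of versions so older senders keep working. */
    if (src->version < DEV_STATE_VERSION_MIN ||
        src->version > DEV_STATE_VERSION_MAX) {
        snprintf(reason, len,
                 "unsupported stream version %u (target supports %u..%u)",
                 (unsigned)src->version,
                 (unsigned)DEV_STATE_VERSION_MIN,
                 (unsigned)DEV_STATE_VERSION_MAX);
        return -1;
    }
    if (src->vendor_id != local_vendor || src->device_id != local_device) {
        snprintf(reason, len,
                 "device mismatch: source %04x:%04x, target %04x:%04x",
                 (unsigned)src->vendor_id, (unsigned)src->device_id,
                 (unsigned)local_vendor, (unsigned)local_device);
        return -1;
    }
    return 0;
}
```

The load path would call dev_state_check() first and report 'reason' if
it returns -1, so the failure shows up early with a clear message rather
than at the very end of the migration.  Accepting a version range rather
than a single value is what would let a newer destination keep loading
streams from older sources.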

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
