Re: [Qemu-devel] [RFC V4 PATCH 0/4] vfio: Introduce live migation capability to

2018-04-13 Thread Zhang, Yulei
Hi Kirti, what do you think of the pre-copy interface in this series? 

Thanks,
Yulei
> -----Original Message-----
> From: Zhang, Yulei
> Sent: Tuesday, April 10, 2018 2:02 PM
> To: qemu-devel@nongnu.org
> Cc: Tian, Kevin ; joonas.lahti...@linux.intel.com;
> zhen...@linux.intel.com; kwankh...@nvidia.com; Wang, Zhi A
> ; alex.william...@redhat.com;
> dgilb...@redhat.com; quint...@redhat.com; Zhang, Yulei
> 
> Subject: [RFC V4 PATCH 0/4] vfio: Introduce live migation capability to
> 




[Qemu-devel] [RFC V4 PATCH 0/4] vfio: Introduce live migation capability to

2018-04-09 Thread Yulei Zhang
Summary

This RFC series resumes the discussion about how to introduce live
migration capability for vfio mdev devices.

A new region subtype, VFIO_REGION_SUBTYPE_DEVICE_STATE, is introduced
for migrating vfio device state. During initialization, qemu checks
whether the vfio device supports this region; if it does not, the
device remains non-migratable.
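
As a rough illustration of the probe described above, the sketch below
scans a list of region descriptors for the device state subtype and
reports whether the device can be treated as migratable. The RegionCap
struct, the EXAMPLE_SUBTYPE_DEVICE_STATE value and the function name
are hypothetical stand-ins, not the definitions used by the patches.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder value; the real subtype number is defined by the series. */
#define EXAMPLE_SUBTYPE_DEVICE_STATE 1u

/* Hypothetical, simplified description of one device region. */
typedef struct {
    uint32_t index;
    uint32_t subtype;
} RegionCap;

/* Return true if any region advertises the device state subtype. */
static bool device_is_migratable(const RegionCap *regions, size_t n)
{
    assert(regions != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (regions[i].subtype == EXAMPLE_SUBTYPE_DEVICE_STATE) {
            return true;
        }
    }
    /* No such region: the caller would register a migration blocker. */
    return false;
}
```
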

The new region is intended for saving and restoring mdev device state
during migration. Accesses to this region are trapped and forwarded to
the mdev device driver. The first byte of the region controls the
running state of the mdev device, so during migration, after stopping
the mdev device, qemu can retrieve the device-specific state from this
region and transfer it to the target VM side to restore the mdev
device.
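
To make the role of the control byte concrete, here is a minimal
in-memory sketch, assuming a region whose first byte is the run/stop
control and whose remaining bytes hold the device state. The region
size, the state values and all function names are assumptions made for
illustration; the real region is accessed through trapped reads and
writes handled by the mdev driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical encoding of the control byte at offset 0. */
enum { DEV_STATE_RUNNING = 0, DEV_STATE_STOPPED = 1 };

#define STATE_REGION_SIZE 64 /* assumed size, for illustration only */

/* In-memory stand-in for the trapped device state region. */
typedef struct {
    uint8_t bytes[STATE_REGION_SIZE];
} StateRegion;

/* Write the control byte to stop or start the device. */
static void region_set_run_state(StateRegion *r, uint8_t state)
{
    r->bytes[0] = state;
}

/*
 * Copy the device state (everything after the control byte) into out.
 * Returns the number of bytes copied, or 0 if the device is still
 * running or the buffer is too small.
 */
static size_t region_save(const StateRegion *r, uint8_t *out, size_t out_len)
{
    size_t payload = STATE_REGION_SIZE - 1;

    if (r->bytes[0] != DEV_STATE_STOPPED || out_len < payload) {
        return 0;
    }
    memcpy(out, r->bytes + 1, payload);
    return payload;
}

/* Load previously saved state into the target's region. */
static void region_restore(StateRegion *r, const uint8_t *in, size_t in_len)
{
    assert(in_len <= STATE_REGION_SIZE - 1);
    memcpy(r->bytes + 1, in, in_len);
}
```

On the source, the control byte is set to stopped before region_save()
is called; on the target, region_restore() is followed by setting the
control byte back to running.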

In addition, during the pre-copy period, qemu can fetch the dirty
bitmap of the vfio device iteratively through the
VFIO_DEVICE_GET_DIRTY_BITMAP ioctl, which shortens the system downtime
during the static copy.
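
One pre-copy iteration can be pictured as folding the device's dirty
bitmap into the migration dirty bitmap. The sketch below is a
simplified model under that assumption: it works on plain arrays of
64-bit words, and merge_dirty_bitmap is a hypothetical helper, not a
function from qemu or from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * OR the device dirty bitmap into the migration dirty bitmap, clear
 * the device bitmap for the next iteration, and return how many pages
 * became newly dirty in the migration bitmap.
 */
static size_t merge_dirty_bitmap(uint64_t *migration, uint64_t *device,
                                 size_t nwords)
{
    size_t newly_dirty = 0;

    assert(migration != NULL && device != NULL);
    for (size_t i = 0; i < nwords; i++) {
        /* Bits set by the device that were not already marked dirty. */
        uint64_t added = device[i] & ~migration[i];

        /* Count set bits by clearing the lowest one each round. */
        for (; added != 0; added &= added - 1) {
            newly_dirty++;
        }
        migration[i] |= device[i];
        device[i] = 0;
    }
    return newly_dirty;
}
```

Repeating this while the guest runs leaves fewer pages to transfer
after the device is stopped, which is where the downtime saving comes
from.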

Below is the vfio mdev device migration sequence.
Source VM side:
  start migration
        |
        V
  in the pre-copy stage, iteratively fetch the device dirty bitmap
  and add it to the qemu dirty list for migration
        |
        V
  get the cpu state change callback, write to the
  subregion's first byte to stop the mdev device
        |
        V
  query the dirty page bitmap from the iommu container
  and add it to the qemu dirty list for the last synchronization
        |
        V
  save the device status, read from the vfio device
  subregion, into the QEMUFile

Target VM side:
  restore the mdev device after getting the
  saved status context from the QEMUFile
        |
        V
  get the cpu state change callback, write to the
  subregion's first byte to start the mdev device
  and put it in running status
        |
        V
  finish migration
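
The ordering on the source side matters: the device must be stopped
before the last dirty-page synchronization, and the device state is
saved only after that. The sketch below encodes the source-side steps
as a small state machine that rejects out-of-order transitions. The
phase names and src_advance are hypothetical and exist only to
illustrate the sequence above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Source-side phases, in the order given by the sequence above. */
typedef enum {
    SRC_START,
    SRC_PRECOPY,        /* iterative device dirty bitmap fetch */
    SRC_DEVICE_STOPPED, /* control byte written to stop the device */
    SRC_FINAL_SYNC,     /* last dirty bitmap query from the container */
    SRC_STATE_SAVED     /* device state written to the QEMUFile */
} SrcPhase;

/*
 * Move to the next phase if the transition is allowed. Pre-copy may
 * repeat; every other step must follow its predecessor directly.
 */
static bool src_advance(SrcPhase *cur, SrcPhase next)
{
    assert(cur != NULL);

    bool repeat_precopy = (*cur == SRC_PRECOPY && next == SRC_PRECOPY);
    bool next_in_order = ((int)next == (int)*cur + 1);

    if (repeat_precopy || next_in_order) {
        *cur = next;
        return true;
    }
    return false;
}
```
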

V3->V4:
1. add migration_blocker if the device state region is not supported.
2. instead of using vmsd, register SaveVMHandlers for the VFIO device
   to leverage the pre-copy facility, and add a new ioctl for the VFIO
   device to fetch the dirty bitmap during pre-copy.
3. remove the intel vendor ID dependence for the device state
   subregion.

V2->V3:
1. rebase the patches onto the Qemu stable 2.10 branch.
2. use a common name for the subregion instead of one specific to
   intel IGD.

V1->V2:
Per Alex's suggestion:
1. use device subtype region instead of VFIO PCI fixed region.
2. remove unnecessary ioctl, use the first byte of subregion to 
   control the running state of mdev device.  
3. for dirty page synchronization, implement the interface with
   VFIOContainer instead of vfio pci device.

Yulei Zhang (4):
  vfio: introduce a new VFIO subregion for mdev device migration support
  vfio: Add vm status change callback to stop/restart the mdev device
  vfio: Add SaveVMHandlers for VFIO device to support live migration
  vfio: introduce new VFIO ioctl VFIO_IOMMU_GET_DIRTY_BITMAP

 hw/vfio/common.c  |  34 ++
 hw/vfio/pci.c | 240 --
 hw/vfio/pci.h |   2 +
 include/hw/vfio/vfio-common.h |   1 +
 linux-headers/linux/vfio.h|  43 +++-
 roms/seabios  |   2 +-
 6 files changed, 312 insertions(+), 10 deletions(-)

-- 
2.7.4