* Lan, Tianyu (tianyu....@intel.com) wrote:
> On 12/8/2015 12:50 AM, Michael S. Tsirkin wrote:
> > I thought about what this is doing at the high level, and I do see
> > some value in what you are trying to do, but I also think we need to
> > clarify the motivation a bit more. What you are saying is not really
> > what the patches are doing.
> >
> > And with that clearer understanding of the motivation in mind
> > (assuming it actually captures a real need), I would also like to
> > suggest some changes.
>
> Motivation:
> Most current solutions for migration with a passthrough device are based
> on PCI hotplug, but that has side effects and can't work for all devices.
>
> For NIC devices:
> The PCI hotplug solution can work around network device migration
> by switching between the VF and the PF.
>
> But switching the network interface introduces service down time.
>
> I tested the service down time by putting the VF and PV interfaces
> into a bonded interface and pinging the bonded interface while plugging
> and unplugging the VF:
> 1) About 100ms when adding the VF
> 2) About 30ms when removing the VF
>
> It also requires the guest to do the switch configuration. These are hard
> to manage and deploy for our customers. To maintain PV performance during
> migration, the host side also needs to assign a VF to the PV device. This
> affects scalability.
>
> These factors block SR-IOV NIC passthrough usage in cloud services and
> OPNFV, which require high network performance and stability.
Right, I'll agree that it's hard to do migration of a VM which uses an
SR-IOV device; and while I think it should be possible to bond a virtio
device to a VF for networking and then hotplug the SR-IOV device, I agree
it's hard to manage.

> For other kinds of devices, this approach is hard to make work.
> We are also adding migration support for the QAT (QuickAssist Technology)
> device.
>
> QAT device use case introduction:
> Server, networking, big data, and storage applications use QuickAssist
> Technology to offload servers from handling compute-intensive operations,
> such as:
> 1) Symmetric cryptography functions, including cipher operations and
>    authentication operations
> 2) Public key functions, including RSA, Diffie-Hellman, and elliptic
>    curve cryptography
> 3) Compression and decompression functions, including DEFLATE and LZS
>
> PCI hotplug will not work for such devices during migration, and these
> operations will fail when the device is unplugged.

I don't understand the QAT argument; if the device is purely an offload
engine for performance, then why can't you fall back to doing the same
operations in the VM or in QEMU if the card is unavailable? The tricky
bit is dealing with outstanding operations.

> So we are trying to implement a new solution which really migrates
> device state to the target machine and won't affect the user during
> migration, with low service down time.

Right, that's a good aim - the only question is how to do it.
It looks like this is always going to need some device-specific code;
the question I see is whether that lives in:
   1) qemu
   2) the host kernel
   3) the guest kernel driver

The objections to this series seem to be that it needs changes to (3);
I can see the worry that the guest kernel driver might not get a chance
to run at the right time in migration, and it's painful having to change
every guest driver (although your change is small).

My question is: at what stage of the migration process do you expect to
tell the guest kernel driver to do this?

If you do it at the start of the migration and quiesce the device, the
migration might take a long time (say 30 minutes) - are you intending
the device to be quiesced for that long? And where are you going to send
the traffic? If you are, then do you need to do it via this PCI trick,
or could you just do it via something higher level to quiesce the device?

Or are you intending to do it just near the end of the migration? But
then how do we know how long it will take the guest driver to respond?

It would be great if we could avoid changing the guest; but at least
your guest driver changes don't actually seem to be that hardware
specific; could your changes actually be moved to the generic PCI level
so they could be made to work for lots of drivers?

Dave

--
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK
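
To make that closing suggestion a little more concrete, here is a
minimal sketch of what a generic PCI-level hook could look like from a
guest driver's point of view. This is purely illustrative: the
pci_migration_ops structure and the pci_register_migration_ops() helper
are hypothetical names invented for this sketch; they are not an
existing kernel interface and not what the posted series implements.

/*
 * Hypothetical sketch: a generic migration hook at the PCI core level,
 * so each guest driver only describes how to quiesce and restore its
 * own device instead of open-coding migration awareness.
 */
#include <linux/pci.h>
#include <linux/module.h>

struct pci_migration_ops {
	/* Called before the final migration stage: stop DMA and interrupts. */
	int (*freeze)(struct pci_dev *pdev);
	/* Called on the destination: reprogram the device and resume. */
	int (*restore)(struct pci_dev *pdev);
};

/* Hypothetical PCI-core entry point (declaration only, for illustration). */
int pci_register_migration_ops(struct pci_dev *pdev,
			       const struct pci_migration_ops *ops);

static int my_vf_freeze(struct pci_dev *pdev)
{
	/* Driver-specific: disable queues, snapshot device state into
	 * guest memory that is migrated along with the VM. */
	return 0;
}

static int my_vf_restore(struct pci_dev *pdev)
{
	/* Driver-specific: reprogram the (possibly new) VF from the
	 * snapshot and restart traffic. */
	return 0;
}

static const struct pci_migration_ops my_vf_migration_ops = {
	.freeze  = my_vf_freeze,
	.restore = my_vf_restore,
};

static int my_vf_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
	/* Opt this device into the (hypothetical) migration callbacks. */
	return pci_register_migration_ops(pdev, &my_vf_migration_ops);
}

Whether the core would call freeze() at the start of migration or only
in the final stop-and-copy phase is exactly the timing question raised
above; the sketch only shows where such per-driver hooks might live.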