On Wed, Nov 11, 2020 at 04:18:34PM +0000, Thanos Makatos wrote:
> > VFIO Migration
> > ==============
> > This document describes how to ensure migration compatibility for VFIO
> > devices, including mdev and vfio-user devices.
>
> Is this something all VFIO/user devices will have to support? If it's not
> mandatory, how can a device advertise support?
The --print-migration-info-json command-line option described below must be
implemented by the vfio-user device emulation program. Similarly, VFIO/mdev
devices must provide the migration/ sysfs group.

If the device implementation does not expose these standard interfaces then
management tools can still attempt to migrate them, but there is no migration
compatibility check or algorithm for setting up the destination device. In
other words, migration will only succeed with some luck or by hardcoding
knowledge of the specific device implementation into the management tool.

> > Multiple device implementations can support the same device model. Doing
> > so means that the device implementations can offer migration compatibility
> > because they support the same hardware interface, device state
> > representation, and migration parameters.
>
> Does the above mean that a passthrough function can be migrated to a
> vfio-user program and vice versa? If so, then it's worth mentioning.

Yes, if they are migration compatible (they support the same device model and
migration parameters) then migration is possible. I'll make this clear in the
next revision.

Note that VFIO migration currently only works for mdev devices. Alex
Williamson mentioned that it could be extended to core VFIO PCI devices
(without mdev) in the future.

> > More complex device emulation programs may host multiple devices. The
> > interface for configuring these device emulation programs is not
> > standardized. Therefore, migrating these devices is beyond the scope of
> > this document.
>
> Most likely a device emulation program hosting multiple devices would allow
> some form of communication for control purposes (e.g. SPDK implements a
> JSON-RPC server). So maybe it's possible to define interacting with such
> programs in this document?

Yes, it's definitely possible. There needs to be agreement on the RPC
mechanism. QEMU implements QMP, SPDK has something similar but different,
gRPC/Protobuf is popular, and D-Bus is another alternative. I asked about RPC
mechanisms on the muser Slack instance to see if there was consensus, but it
seems to be a bit early for that.

Perhaps the most realistic option will be to define bindings to several RPC
mechanisms. That way everyone can use their preferred RPC mechanism, at the
cost of requiring management tools to support more than one (which some
already do, e.g. libvirt uses XDR itself but also implements QEMU's QMP).

> > The migration information JSON is printed to standard output by a
> > vfio-user device emulation program as follows:
> >
> > .. code:: bash
> >
> >    $ my-device --print-migration-info-json
> >
> > The device is instantiated by launching the destination process with the
> > migration parameter list from the source:
>
> Must 'my-device --print-migration-info-json' always generate the same
> migration information JSON? If so, then what if the output generated by
> 'my-device --print-migration-info-json' depends on additional arguments
> passed to 'my-device' when it was originally started?

Yes, it needs to be stable in the sense that you can invoke the program with
--print-migration-info-json and then expect launching the program to succeed
with migration parameters that are valid according to the JSON.
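To make that concrete, here is a rough sketch of the flow I have in mind. The
JSON fields and the --migration-parameters option below are placeholders I'm
inventing for illustration only; the real schema and launch interface are
whatever the document ends up defining:

  # Source host: ask the device emulation program for its migration
  # information (the JSON below is made up for this example).
  $ my-device --print-migration-info-json
  {
    "device-model": "example-nvme-ctrl",
    "migration-parameters": {
      "block-size": { "type": "integer", "values": [512, 4096] },
      "num-queues": { "type": "integer", "min": 1, "max": 16 }
    }
  }

  # Destination host: the management tool picks concrete values that are
  # valid on both sides and launches the destination process with them.
  $ my-device --migration-parameters '{"block-size": 4096, "num-queues": 8}'

The point is that the management tool only needs the JSON to check
compatibility and to construct the destination's parameter list.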
Running the same device emulation binary on different hosts can produce
different JSON. This is because the binary may rely on host hardware resources
or features (e.g. does this host have GPUs available?).

It gets trickier when considering host reboots. I think the JSON can change
between reboots. However, the management tools may cache the JSON, so there
needs to be a rule about when to refresh it.

Regarding additional command-line arguments, they can affect the JSON output.
For example, they could include the connection details to an iSCSI LUN and
affect the block size migration parameter. This leads to the same issue - can
they be cached by the management tool? The answer is the same - stability is
needed in the short term to avoid unexpected failures when launching the
program, but over the longer term we should allow JSON changes.
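One possible refresh rule, purely as a sketch (the cache location and the
choice of cache key are mine, not part of the proposal), is to regenerate the
JSON whenever the binary, its arguments, or the host boot changes:

  # Hypothetical cache refresh rule for a management tool (illustration
  # only): key the cached JSON on the binary, its arguments, and the host
  # boot ID so it is regenerated after upgrades, argument changes, and
  # reboots.
  binary=/usr/bin/my-device
  args="--backend /dev/nvme0n1"   # placeholder arguments
  key=$({ sha256sum "$binary"; echo "$args"; \
          cat /proc/sys/kernel/random/boot_id; } | sha256sum | cut -d' ' -f1)
  cache="/var/cache/mgmt-tool/migration-info-$key.json"

  if [ ! -e "$cache" ]; then
      "$binary" $args --print-migration-info-json > "$cache"
  fi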
Thanks for raising these points. I'll add details to the next revision.

Stefan