* Michael S. Tsirkin (m...@redhat.com) wrote:
> On Tue, Oct 05, 2021 at 12:10:08PM -0400, Eduardo Habkost wrote:
> > On Tue, Oct 05, 2021 at 03:01:05PM +0100, Dr. David Alan Gilbert wrote:
> > > * Michael S. Tsirkin (m...@redhat.com) wrote:
> > > > On Tue, Oct 05, 2021 at 02:18:40AM +0300, Roman Kagan wrote:
> > > > > On Mon, Oct 04, 2021 at 11:11:00AM -0400, Michael S. Tsirkin wrote:
> > > > > > On Mon, Oct 04, 2021 at 06:07:29PM +0300, Denis Plotnikov wrote:
> > > > > > > > It might be useful for the cases when a slow block layer should
> > > > > > > > be replaced with a more performant one on a running VM without
> > > > > > > > stopping it, i.e. with very low downtime, comparable to that of
> > > > > > > > a migration.
> > > > > > > 
> > > > > > > > It's possible to achieve that for two reasons:
> > > > > > > 
> > > > > > > > 1. The VMStates of "virtio-blk" and "vhost-user-blk" are almost
> > > > > > > >    the same. They consist of the identical VMSTATE_VIRTIO_DEVICE
> > > > > > > >    and differ from each other only in the values of the migration
> > > > > > > >    service fields.
> > > > > > > > 2. The device driver used in the guest is the same: virtio-blk.
> > > > > > > 
> > > > > > > In the series cross-migration is achieved by adding a new type.
> > > > > > > > The new type uses the virtio-blk VMState instead of the
> > > > > > > > vhost-user-blk specific VMState; it also implements migration
> > > > > > > > save/load callbacks to be compatible with the migration stream
> > > > > > > > produced by the "virtio-blk" device.
> > > > > > > 
> > > > > > > > Adding the new type instead of modifying the existing one is
> > > > > > > > convenient. It makes it easy to distinguish the new
> > > > > > > > virtio-blk-compatible vhost-user-blk device from the existing
> > > > > > > > non-compatible one using qemu machinery without any other
> > > > > > > > modifications. That gives all the variety of qemu device-related
> > > > > > > > constraints out of the box.
> > > > > > 
> > > > > > Hmm I'm not sure I understand. What is the advantage for the user?
> > > > > > What if vhost-user-blk became an alias for vhost-user-virtio-blk?
> > > > > > We could add some hacks to make it compatible for old machine types.
> > > > > 
> > > > > > The point is that virtio-blk and vhost-user-blk are not
> > > > > > migration-compatible ATM.  OTOH they are the same device from the
> > > > > > guest POV so there's nothing fundamentally preventing the migration
> > > > > > between the two.  In particular, we see it as a means to switch
> > > > > > between the storage backend transports via live migration without
> > > > > > disrupting the guest.
> > > > > 
> > > > > Migration-wise virtio-blk and vhost-user-blk have in common
> > > > > 
> > > > > - the content of the VMState -- VMSTATE_VIRTIO_DEVICE
> > > > > 
> > > > > The two differ in
> > > > > 
> > > > > - the name and the version of the VMStateDescription
> > > > > 
> > > > > - virtio-blk has an extra migration section (via .save/.load callbacks
> > > > >   on VirtioDeviceClass) containing requests in flight
> > > > > 
> > > > > > It looks like to become migration-compatible with virtio-blk,
> > > > > > vhost-user-blk has to start using VMStateDescription of virtio-blk and
> > > > > > provide compatible .save/.load callbacks.  It isn't entirely obvious
> > > > > > how to make this machine-type-dependent, so we came up with a simpler
> > > > > > idea of defining a new device that shares most of the implementation
> > > > > > with the original vhost-user-blk except for the migration stuff.
> > > > > > We're certainly open to suggestions on how to reconcile this under a
> > > > > > single vhost-user-blk device, as this would be more user-friendly
> > > > > > indeed.
> > > > > 
> > > > > We considered using a class property for this and defining the
> > > > > respective compat clause, but IIUC the class constructors (where .vmsd
> > > > > and .save/.load are defined) are not supposed to depend on class
> > > > > properties.
> > > > > 
> > > > > Thanks,
> > > > > Roman.
> > > > 
> > > > So the question is how to make vmsd depend on machine type.
> > > > CC Eduardo who poked at this kind of compat stuff recently,
> > > > paolo who looked at qom things most recently and dgilbert
> > > > for advice on migration.
> > > 
> > > I don't think I've seen anyone change vmsd name dependent on machine
> > > type; making fields appear/disappear is easy - that just ends up as a
> > > property on the device that's checked;  I guess if that property is
> > > global (rather than per instance) then you can check it in
> > > vhost_user_blk_class_init and swing the dc->vmsd pointer?
> > 
> > class_init can be called very early during QEMU initialization,
> > so it's too early to make decisions based on machine type.
> > 
> > Making a specific vmsd appear/disappear based on machine
> > configuration or state is "easy", by implementing
> > VMStateDescription.needed.  But this would require registering
> > both vmsds (one of them would need to be registered manually
> > instead of using DeviceClass.vmsd).
> > 
> > I don't remember what are the consequences of not using
> > DeviceClass.vmsd to register a vmsd, I only remember it was
> > subtle.  See commit b170fce3dd06 ("cpu: Register
> > VMStateDescription through CPUState") and related threads.  CCing
> > Philippe, who might remember the details here.
> > 
> > If that's an important use case, I would suggest allowing devices
> > to implement a DeviceClass.get_vmsd method, which would override
> > DeviceClass.vmsd if necessary.  Is the problem we're trying to
> > address worth the additional complexity?
> 
> The tricky part is that we generally don't support migration when
> command line is different on source and destination ...

The reality has always been a bit more subtle than that.
For example, it's fine if the path to a block device is different on the
source and destination; or if it's accessed by iSCSI on the destination,
say.  As long as what the guest sees and what the migration stream carries
are the same, then in principle it's OK - but that does start getting
trickier; also it would probably get interesting to let libvirt know
that this combo is OK.
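
To make that rule of thumb concrete, here is a toy sketch in C.  It is
not QEMU code: struct dev_config, its fields and migration_compatible()
are all hypothetical names invented for illustration, and real
compatibility depends on far more than these three fields.  It only
shows the idea that host-side details such as the backend path are
ignored, while the guest-visible device and the migration section
identity (name and version) must match.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, simplified model of one side of a migration.
 * None of these names exist in QEMU. */
struct dev_config {
    const char *guest_device;  /* what the guest driver binds to */
    const char *section_name;  /* name of the migration stream section */
    int section_version;       /* version of that section */
    const char *backend_path;  /* host-side detail, invisible to the guest */
};

/* Returns true when the guest-visible device and the migration section
 * identity agree.  backend_path is deliberately not compared. */
bool migration_compatible(const struct dev_config *src,
                          const struct dev_config *dst)
{
    return strcmp(src->guest_device, dst->guest_device) == 0 &&
           strcmp(src->section_name, dst->section_name) == 0 &&
           src->section_version == dst->section_version;
}
```

With this model, two configurations that differ only in backend_path
compare as compatible, whereas two that present the same guest device
but carry differently named or versioned sections do not, which is the
situation described for virtio-blk and vhost-user-blk earlier in the
thread.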

> So maybe the actual answer is that vhost-user-blk should really
> be a drive supplied to a virtio blk device, not a device
> itself?
> This way it's sane, and also matches what we do e.g. for net.

Hmm, a bit of a fudge; it's not quite the same as a drive, is it?  There's
almost another layer split in there.

Dave

> -- 
> MST
> 
-- 
Dr. David Alan Gilbert / dgilb...@redhat.com / Manchester, UK

