On Thu, Mar 14, 2019 at 11:24:22AM +0000, Daniel P. Berrangé wrote: > On Tue, Mar 12, 2019 at 12:49:35PM -0400, Michael S. Tsirkin wrote: > > On Thu, Feb 28, 2019 at 04:53:54PM +0800, elohi...@gmail.com wrote: > > > From: Xie Yongji <xieyon...@baidu.com> > > > > > > Since we now support the message VHOST_USER_GET_INFLIGHT_FD > > > and VHOST_USER_SET_INFLIGHT_FD. The backend is able to restart > > > safely because it can track inflight I/O in shared memory. > > > This patch allows qemu to reconnect the backend after > > > connection closed. > > > > > > Signed-off-by: Xie Yongji <xieyon...@baidu.com> > > > Signed-off-by: Ni Xun <ni...@baidu.com> > > > Signed-off-by: Zhang Yu <zhangy...@baidu.com> > > > --- > > > hw/block/vhost-user-blk.c | 205 +++++++++++++++++++++++------ > > > include/hw/virtio/vhost-user-blk.h | 4 + > > > 2 files changed, 167 insertions(+), 42 deletions(-) > > > > > static void vhost_user_blk_device_realize(DeviceState *dev, Error **errp) > > > { > > > VirtIODevice *vdev = VIRTIO_DEVICE(dev); > > > VHostUserBlk *s = VHOST_USER_BLK(vdev); > > > VhostUserState *user; > > > - struct vhost_virtqueue *vqs = NULL; > > > int i, ret; > > > + Error *err = NULL; > > > > > > if (!s->chardev.chr) { > > > error_setg(errp, "vhost-user-blk: chardev is mandatory"); > > > @@ -312,27 +442,28 @@ static void > > > vhost_user_blk_device_realize(DeviceState *dev, Error **errp) > > > } > > > > > > s->inflight = g_new0(struct vhost_inflight, 1); > > > - > > > - s->dev.nvqs = s->num_queues; > > > - s->dev.vqs = g_new(struct vhost_virtqueue, s->dev.nvqs); > > > - s->dev.vq_index = 0; > > > - s->dev.backend_features = 0; > > > - vqs = s->dev.vqs; > > > - > > > - vhost_dev_set_config_notifier(&s->dev, &blk_ops); > > > - > > > - ret = vhost_dev_init(&s->dev, s->vhost_user, > > > VHOST_BACKEND_TYPE_USER, 0); > > > - if (ret < 0) { > > > - error_setg(errp, "vhost-user-blk: vhost initialization failed: > > > %s", > > > - strerror(-ret)); > > > - goto virtio_err; > > > - } > > > + s->vqs = g_new(struct vhost_virtqueue, s->num_queues); > > > + s->watch = 0; > > > + s->should_start = false; > > > + s->connected = false; > > > + > > > + qemu_chr_fe_set_handlers(&s->chardev, NULL, NULL, > > > vhost_user_blk_event, > > > + NULL, (void *)dev, NULL, true); > > > + > > > +reconnect: > > > + do { > > > + if (qemu_chr_fe_wait_connected(&s->chardev, &err) < 0) { > > > + error_report_err(err); > > > + err = NULL; > > > + sleep(1); > > > > Seems arbitrary. Is this basically waiting until backend will reconnect? > > Why not block until event on the fd triggers? > > > > Also, it looks like this will just block forever with no monitor input > > and no way for user to figure out what is going on short of > > crashing QEMU. > > FWIW, the current vhost-user-net device does exactly the same thing > with calling qemu_chr_fe_wait_connected during its realize() function.
Hmm yes. It doesn't sleep for an arbitrary 1 sec so less of an eyesore :) > Can't say I'm thrilled about the existing usage either, but I don't > see a particularly nice alternative if it isn't possible to realize > the device without having a backend available. > > I don't think this an especially serious problem during cold startup > though since doesn't expect the monitor to become responsive until > the device model is fully realized & all backends connected. > > The real concern is if you need to hotplug this kind of device > at runtime. In that case blocking the main loop / monitor is a > serious problem that will actively harm both the guest OS and > mgmt applications. > > The vhost-user-net device will already suffer from that. Right. > To solve the problem with hotplug, the mgmt app would need to > plug the chardev backend, then wait for it to become connected, > and only then plug the device frontend. In that case we would not > want this loop. We'd expect it to use the already connected > chardev, and fail the realize() function if the chardev has > lost its connection. > > Regards, > Daniel I think as step one we should maybe just busy-wait with no sleep(1). But respecting reconnect would be nice too. And I think if we can't respect nowait we should detect it and fail, in both net and block ... > -- > |: https://berrange.com -o- https://www.flickr.com/photos/dberrange :| > |: https://libvirt.org -o- https://fstop138.berrange.com :| > |: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|