On Tue, May 12, 2020 at 11:32:50AM +0800, Jason Wang wrote: > > On 2020/5/11 下午5:25, Dima Stepanov wrote: > >On Mon, May 11, 2020 at 11:15:53AM +0800, Jason Wang wrote: > >>On 2020/4/30 下午9:36, Dima Stepanov wrote: > >>>If vhost-user daemon is used as a backend for the vhost device, then we > >>>should consider a possibility of disconnect at any moment. If such > >>>disconnect happened in the vhost_migration_log() routine the vhost > >>>device structure will be clean up. > >>>At the start of the vhost_migration_log() function there is a check: > >>> if (!dev->started) { > >>> dev->log_enabled = enable; > >>> return 0; > >>> } > >>>To be consistent with this check add the same check after calling the > >>>vhost_dev_set_log() routine. This in general help not to break a > >>>migration due the assert() message. But it looks like that this code > >>>should be revised to handle these errors more carefully. > >>> > >>>In case of vhost-user device backend the fail paths should consider the > >>>state of the device. In this case we should skip some function calls > >>>during rollback on the error paths, so not to get the NULL dereference > >>>errors. > >>> > >>>Signed-off-by: Dima Stepanov <dimas...@yandex-team.ru> > >>>--- > >>> hw/virtio/vhost.c | 39 +++++++++++++++++++++++++++++++++++---- > >>> 1 file changed, 35 insertions(+), 4 deletions(-) > >>> > >>>diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > >>>index 3ee50c4..d5ab96d 100644 > >>>--- a/hw/virtio/vhost.c > >>>+++ b/hw/virtio/vhost.c > >>>@@ -787,6 +787,17 @@ static int vhost_dev_set_features(struct vhost_dev > >>>*dev, > >>> static int vhost_dev_set_log(struct vhost_dev *dev, bool enable_log) > >>> { > >>> int r, i, idx; > >>>+ > >>>+ if (!dev->started) { > >>>+ /* > >>>+ * If vhost-user daemon is used as a backend for the > >>>+ * device and the connection is broken, then the vhost_dev > >>>+ * structure will be reset all its values to 0. > >>>+ * Add additional check for the device state. > >>>+ */ > >>>+ return -1; > >>>+ } > >>>+ > >>> r = vhost_dev_set_features(dev, enable_log); > >>> if (r < 0) { > >>> goto err_features; > >>>@@ -801,12 +812,19 @@ static int vhost_dev_set_log(struct vhost_dev *dev, > >>>bool enable_log) > >>> } > >>> return 0; > >>> err_vq: > >>>- for (; i >= 0; --i) { > >>>+ /* > >>>+ * Disconnect with the vhost-user daemon can lead to the > >>>+ * vhost_dev_cleanup() call which will clean up vhost_dev > >>>+ * structure. > >>>+ */ > >>>+ for (; dev->started && (i >= 0); --i) { > >>> idx = dev->vhost_ops->vhost_get_vq_index( > >> > >>Why need the check of dev->started here, can started be modified outside > >>mainloop? If yes, I don't get the check of !dev->started in the beginning of > >>this function. > >> > >No dev->started can't change outside the mainloop. The main problem is > >only for the vhost_user_blk daemon. Consider the case when we > >successfully pass the dev->started check at the beginning of the > >function, but after it we hit the disconnect on the next call on the > >second or third iteration: > > r = vhost_virtqueue_set_addr(dev, dev->vqs + i, idx, enable_log); > >The unix socket backend device will call the disconnect routine for this > >device and reset the structure. So the structure will be reset (and > >dev->started set to false) inside this set_addr() call. > > > I still don't get here. I think the disconnect can not happen in the middle > of vhost_dev_set_log() since both of them were running in mainloop. And even > if it can, we probably need other synchronization mechanism other than > simple check here. Disconnect isn't happened in the separate thread it is happened in this routine inside vhost_dev_set_log. When for instance vhost_user_write() call failed: vhost_user_set_log_base() vhost_user_write() vhost_user_blk_disconnect() vhost_dev_cleanup() vhost_user_backend_cleanup() So the point is that if we somehow got a disconnect with the vhost-user-blk daemon before the vhost_user_write() call then it will continue clean up by running vhost_user_blk_disconnect() function. I wrote a more detailed backtrace stack in the separate thread, which is pretty similar to what we have here: Re: [PATCH v2 4/5] vhost: check vring address before calling unmap The places are different but the problem is pretty similar.
So if vhost-user commands handshake then everything is fine and reconnect will work as expected. The only problem is how to handle reconnect properly between vhost-user command send/receive. As i wrote we have a test: - run src VM with vhost-usr-blk daemon used - run fio inside it - perform reconnect every X seconds (just kill and restart daemon), X is random - run dst VM - perform migration - fio should complete in dst VM And we cycle this test like forever. So it fails once per ~25 iteration. By adding some delays inside qemu we were able to make the race window larger. > > > > So > >we shouldn't call the clean up calls because this virtqueues were clean > >up in the disconnect call. But we should protect these calls somehow, so > >it will not hit SIGSEGV and we will be able to pass migration. > > > >Just to summarize it: > >For the vhost-user-blk devices we ca hit clean up calls twice in case of > >vhost disconnect: > >1. The first time during the disconnect process. The clean up is called > >inside it. > >2. The second time during roll back clean up. > >So if it is the case we should skip p2. > > > >>>dev, dev->vq_index + i); > >>> vhost_virtqueue_set_addr(dev, dev->vqs + i, idx, > >>> dev->log_enabled); > >>> } > >>>- vhost_dev_set_features(dev, dev->log_enabled); > >>>+ if (dev->started) { > >>>+ vhost_dev_set_features(dev, dev->log_enabled); > >>>+ } > >>> err_features: > >>> return r; > >>> } > >>>@@ -832,7 +850,15 @@ static int vhost_migration_log(MemoryListener > >>>*listener, int enable) > >>> } else { > >>> vhost_dev_log_resize(dev, vhost_get_log_size(dev)); > >>> r = vhost_dev_set_log(dev, true); > >>>- if (r < 0) { > >>>+ /* > >>>+ * The dev log resize can fail, because of disconnect > >>>+ * with the vhost-user-blk daemon. Check the device > >>>+ * state before calling the vhost_dev_set_log() > >>>+ * function. > >>>+ * Don't return error if device isn't started to be > >>>+ * consistent with the check above. > >>>+ */ > >>>+ if (dev->started && r < 0) { > >>> return r; > >>> } > >>> } > >>>@@ -1739,7 +1765,12 @@ int vhost_dev_start(struct vhost_dev *hdev, > >>>VirtIODevice *vdev) > >>> fail_log: > >>> vhost_log_put(hdev, false); > >>> fail_vq: > >>>- while (--i >= 0) { > >>>+ /* > >>>+ * Disconnect with the vhost-user daemon can lead to the > >>>+ * vhost_dev_cleanup() call which will clean up vhost_dev > >>>+ * structure. > >>>+ */ > >>>+ while ((--i >= 0) && (hdev->started)) { > >>> vhost_virtqueue_stop(hdev, > >>> vdev, > >>> hdev->vqs + i, > >> > >>This should be a separate patch. > >Do you mean i should split this patch to two patches? > > > Yes. > > Thanks Got it. Will do it in v3. No other comments mixed in below. Thanks. > > > > > >Thanks. > > > >>Thanks > >> >