> On Jul 28, 2023, at 3:48 AM, Li Feng <[email protected]> wrote:
>
> Thanks for your reply.
>
>> 2023年7月28日 上午5:21,Raphael Norwitz <[email protected]> 写道:
>>
>>
>>
>>> On Jul 25, 2023, at 6:19 AM, Li Feng <[email protected]> wrote:
>>>
>>> Thanks for your comments.
>>>
>>>> 2023年7月25日 上午1:21,Raphael Norwitz <[email protected]> 写道:
>>>>
>>>> Very excited to see this. High level looks good modulo a few small things.
>>>>
>>>> My major concern is around existing vhost-user-scsi backends which don’t
>>>> support VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD. IMO we should hide the
>>>> reconnect behavior behind a VHOST_USER_PROTOCOL_F_INFLIGHT_SHMFD check. We
>>>> may want to do the same for vhost-user-blk.
>>>>
>>>> The question is then what happens if the check is false. IIUC without an
>>>> inflight FD, if a device processes requests out of order, it’s not safe to
>>>> continue execution on reconnect, as there’s no way for the backend to know
>>>> how to replay IO. Should we permanently wedge the device or have QEMU fail
>>>> out? May be nice to have a toggle for this.
>>>
>>> Based on what MST said, is there anything else I need to do?
>>
>> I don’t think so.
>>
>>>>
>>>>> On Jul 21, 2023, at 6:51 AM, Li Feng <[email protected]> wrote:
>>>>>
>>>>> If the backend crashes and restarts, the device is broken.
>>>>> This patch adds reconnect for vhost-user-scsi.
>>>>>
>>>>> Tested with spdk backend.
>>>>>
>>>>> Signed-off-by: Li Feng <[email protected]>
>>>>> ---
>>>>> hw/block/vhost-user-blk.c | 2 -
>>>>> hw/scsi/vhost-scsi-common.c | 27 ++---
>>>>> hw/scsi/vhost-user-scsi.c | 163 +++++++++++++++++++++++++---
>>>>> include/hw/virtio/vhost-user-scsi.h | 3 +
>>>>> include/hw/virtio/vhost.h | 2 +
>>>>> 5 files changed, 165 insertions(+), 32 deletions(-)
>>>>>
>>>>> diff --git a/hw/block/vhost-user-blk.c b/hw/block/vhost-user-blk.c
>>>>> index eecf3f7a81..f250c740b5 100644
>>>>> --- a/hw/block/vhost-user-blk.c
>>>>> +++ b/hw/block/vhost-user-blk.c
>>>>> @@ -32,8 +32,6 @@
>>>>> #include "sysemu/sysemu.h"
>>>>> #include "sysemu/runstate.h"
>>>>>
>>>>> -#define REALIZE_CONNECTION_RETRIES 3
>>>>> -
>>>>> static const int user_feature_bits[] = {
>>>>> VIRTIO_BLK_F_SIZE_MAX,
>>>>> VIRTIO_BLK_F_SEG_MAX,
>>>>> diff --git a/hw/scsi/vhost-scsi-common.c b/hw/scsi/vhost-scsi-common.c
>>>>
>>>> Why can’t all the vhost-scsi-common stuff be moved to a separate change?
>>>
>>> I will move this code to separate patch.
>>>>
>>>> Especially the stuff introduced for vhost-user-blk in
>>>> 1b0063b3048af65dfaae6422a572c87db8575a92 should be moved out.
>>> OK.
>>>
>>>>
>>>>> index a06f01af26..08801886b8 100644
>>>>> --- a/hw/scsi/vhost-scsi-common.c
>>>>> +++ b/hw/scsi/vhost-scsi-common.c
>>>>> @@ -52,16 +52,22 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>>>>
>>>>> vsc->dev.acked_features = vdev->guest_features;
>>>>>
>>>>> - assert(vsc->inflight == NULL);
>>>>> - vsc->inflight = g_new0(struct vhost_inflight, 1);
>>>>> - ret = vhost_dev_get_inflight(&vsc->dev,
>>>>> - vs->conf.virtqueue_size,
>>>>> - vsc->inflight);
>>>>> + ret = vhost_dev_prepare_inflight(&vsc->dev, vdev);
>>>>> if (ret < 0) {
>>>>> - error_report("Error get inflight: %d", -ret);
>>>>> + error_report("Error setting inflight format: %d", -ret);
>>>>> goto err_guest_notifiers;
>>>>> }
>>>>>
>>>>> + if (!vsc->inflight->addr) {
>>>>> + ret = vhost_dev_get_inflight(&vsc->dev,
>>>>> + vs->conf.virtqueue_size,
>>>>> + vsc->inflight);
>>>>> + if (ret < 0) {
>>>>> + error_report("Error get inflight: %d", -ret);
>>>>> + goto err_guest_notifiers;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> ret = vhost_dev_set_inflight(&vsc->dev, vsc->inflight);
>>>>> if (ret < 0) {
>>>>> error_report("Error set inflight: %d", -ret);
>>>>> @@ -85,9 +91,6 @@ int vhost_scsi_common_start(VHostSCSICommon *vsc)
>>>>> return ret;
>>>>>
>>>>> err_guest_notifiers:
>>>>> - g_free(vsc->inflight);
>>>>> - vsc->inflight = NULL;
>>>>> -
>>>>> k->set_guest_notifiers(qbus->parent, vsc->dev.nvqs, false);
>>>>> err_host_notifiers:
>>>>> vhost_dev_disable_notifiers(&vsc->dev, vdev);
>>>>> @@ -111,12 +114,6 @@ void vhost_scsi_common_stop(VHostSCSICommon *vsc)
>>>>> }
>>>>> assert(ret >= 0);
>>>>>
>>>>
>>>> In the vhost-scsi (kernel backend) path, what will cleanup vsc->inflight
>>>> now?
>>> OK, we should check the vsc->inflight if it is null, the vhost-scsi doesn’t
>>> allocate the
>>> inflight object memory.
>>
>> Are you saying vhost-scsi never allocates inflight so we don’t need to check
>> for it?
> We have checked the vsc->inflight, and only if allocated, we send the
> get/set_inflight_fd.
> This works with vhost-user-scsi/vhost-scsi both.
So then it sounds like this code introduces a resource leak.
g_free(vsc->inflight) should be added to the vhost-scsi code in
vhost_scsi_stop().
>>
>>>
>>>>
>>>>> - if (vsc->inflight) {
>>>>> - vhost_dev_free_inflight(vsc->inflight);
>>>>> - g_free(vsc->inflight);
>>>>> - vsc->inflight = NULL;
>>>>> - }
>>>>> -
>>>>> vhost_dev_disable_notifiers(&vsc->dev, vdev);
>>>>> }
>>>>>
>>>>> diff --git a/hw/scsi/vhost-user-scsi.c b/hw/scsi/vhost-user-scsi.c
>>>>> index ee99b19e7a..e0e88b0c42 100644
>>>>> --- a/hw/scsi/vhost-user-scsi.c
>>>>> +++ b/hw/scsi/vhost-user-scsi.c
>>>>> @@ -89,14 +89,126 @@ static void vhost_dummy_handle_output(VirtIODevice
>>>>> *vdev, VirtQueue *vq)
>>>>> {
>>>>> }
>>>>>
>>>>> +static int vhost_user_scsi_connect(DeviceState *dev, Error **errp)
>>>>> +{
>>>>> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>>>> + VHostUserSCSI *s = VHOST_USER_SCSI(vdev);
>>>>> + VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>>>> + VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>>>>> + int ret = 0;
>>>>> +
>>>>> + if (s->connected) {
>>>>> + return 0;
>>>>> + }
>>>>> + s->connected = true;
>>>>> +
>>>>> + vsc->dev.num_queues = vs->conf.num_queues;
>>>>> + vsc->dev.nvqs = VIRTIO_SCSI_VQ_NUM_FIXED + vs->conf.num_queues;
>>>>> + vsc->dev.vqs = s->vhost_vqs;
>>>>> + vsc->dev.vq_index = 0;
>>>>> + vsc->dev.backend_features = 0;
>>>>> +
>>>>> + ret = vhost_dev_init(&vsc->dev, &s->vhost_user,
>>>>> VHOST_BACKEND_TYPE_USER, 0,
>>>>> + errp);
>>>>> + if (ret < 0) {
>>>>> + return ret;
>>>>> + }
>>>>> +
>>>>> + /* restore vhost state */
>>>>
>>>> Should this use virtio_device_should_start like vhost_user_blk?
>>> I will change this.
>>>>
>>>>> + if (virtio_device_started(vdev, vdev->status)) {
>>>>> + ret = vhost_scsi_common_start(vsc);
>>>>> + if (ret < 0) {
>>>>> + return ret;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> +static void vhost_user_scsi_event(void *opaque, QEMUChrEvent event);
>>>>> +
>>>>> +static void vhost_user_scsi_disconnect(DeviceState *dev)
>>>>> +{
>>>>> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>>>> + VHostUserSCSI *s = VHOST_USER_SCSI(vdev);
>>>>> + VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>>>> + VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>>>>> +
>>>>
>>>> I don’t think we want to execute vhost_scsi_common_stop() if the device
>>>> hasn’t been started. I remember that caused a number of races with the
>>>> vhost_user_blk connecting/disconnecting on startup.
>>>>
>>>> Let’s add a similar started_vu check?
>>> I will add it.
>>>>
>>>>> + if (!s->connected) {
>>>>> + return;
>>>>> + }
>>>>> + s->connected = false;
>>>>> +
>>>>> + vhost_scsi_common_stop(vsc);
>>>>> +
>>>>> + vhost_dev_cleanup(&vsc->dev);
>>>>> +
>>>>> + /* Re-instate the event handler for new connections */
>>>>> + qemu_chr_fe_set_handlers(&vs->conf.chardev, NULL, NULL,
>>>>> + vhost_user_scsi_event, NULL, dev, NULL,
>>>>> true);
>>>>> +}
>>>>> +
>>>>> +static void vhost_user_scsi_event(void *opaque, QEMUChrEvent event)
>>>>> +{
>>>>> + DeviceState *dev = opaque;
>>>>> + VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>>>> + VHostUserSCSI *s = VHOST_USER_SCSI(vdev);
>>>>> + VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>>>> + VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>>>>> + Error *local_err = NULL;
>>>>> +
>>>>> + switch (event) {
>>>>> + case CHR_EVENT_OPENED:
>>>>> + if (vhost_user_scsi_connect(dev, &local_err) < 0) {
>>>>> + error_report_err(local_err);
>>>>> + qemu_chr_fe_disconnect(&vs->conf.chardev);
>>>>> + return;
>>>>> + }
>>>>> + break;
>>>>> + case CHR_EVENT_CLOSED:
>>>>> + /* defer close until later to avoid circular close */
>>>>> + vhost_user_async_close(dev, &vs->conf.chardev, &vsc->dev,
>>>>> + vhost_user_scsi_disconnect);
>>>>> + break;
>>>>> + case CHR_EVENT_BREAK:
>>>>> + case CHR_EVENT_MUX_IN:
>>>>> + case CHR_EVENT_MUX_OUT:
>>>>> + /* Ignore */
>>>>> + break;
>>>>> + }
>>>>> +}
>>>>> +
>>>>> +static int vhost_user_scsi_realize_connect(VHostUserSCSI *s, Error
>>>>> **errp)
>>>>> +{
>>>>> + DeviceState *dev = &s->parent_obj.parent_obj.parent_obj.parent_obj;
>>>>> + VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>>>>> + int ret;
>>>>> +
>>>>> + s->connected = false;
>>>>> +
>>>>> + ret = qemu_chr_fe_wait_connected(&vs->conf.chardev, errp);
>>>>> + if (ret < 0) {
>>>>> + return ret;
>>>>> + }
>>>>> +
>>>>> + ret = vhost_user_scsi_connect(dev, errp);
>>>>> + if (ret < 0) {
>>>>> + qemu_chr_fe_disconnect(&vs->conf.chardev);
>>>>> + return ret;
>>>>> + }
>>>>> + assert(s->connected);
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> static void vhost_user_scsi_realize(DeviceState *dev, Error **errp)
>>>>> {
>>>>> VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>>>>> VHostUserSCSI *s = VHOST_USER_SCSI(dev);
>>>>> VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>>>> - struct vhost_virtqueue *vqs = NULL;
>>>>> Error *err = NULL;
>>>>> int ret;
>>>>> + int retries = REALIZE_CONNECTION_RETRIES;
>>>>>
>>>>> if (!vs->conf.chardev.chr) {
>>>>> error_setg(errp, "vhost-user-scsi: missing chardev");
>>>>> @@ -112,21 +224,31 @@ static void vhost_user_scsi_realize(DeviceState
>>>>> *dev, Error **errp)
>>>>> }
>>>>>
>>>>> if (!vhost_user_init(&s->vhost_user, &vs->conf.chardev, errp)) {
>>>>
>>>> Why execute vhost_user_cleanup() if vhost_user_init() fails?
>>> OK, move this line up in v2.
>>>
>>>>
>>>>> - goto free_virtio;
>>>>> + goto free_vhost;
>>>>> }
>>>>>
>>>>> - vsc->dev.nvqs = VIRTIO_SCSI_VQ_NUM_FIXED + vs->conf.num_queues;
>>>>> - vsc->dev.vqs = g_new0(struct vhost_virtqueue, vsc->dev.nvqs);
>>>>> - vsc->dev.vq_index = 0;
>>>>> - vsc->dev.backend_features = 0;
>>>>> - vqs = vsc->dev.vqs;
>>>>> + vsc->inflight = g_new0(struct vhost_inflight, 1);
>>>>> + s->vhost_vqs = g_new0(struct vhost_virtqueue,
>>>>> + VIRTIO_SCSI_VQ_NUM_FIXED +
>>>>> vs->conf.num_queues);
>>>>> +
>>>>> + assert(!*errp);
>>>>> + do {
>>>>> + if (*errp) {
>>>>> + error_prepend(errp, "Reconnecting after error: ");
>>>>> + error_report_err(*errp);
>>>>> + *errp = NULL;
>>>>> + }
>>>>> + ret = vhost_user_scsi_realize_connect(s, errp);
>>>>> + } while (ret < 0 && retries--);
>>>>>
>>>>> - ret = vhost_dev_init(&vsc->dev, &s->vhost_user,
>>>>> - VHOST_BACKEND_TYPE_USER, 0, errp);
>>>>> if (ret < 0) {
>>>>> - goto free_vhost;
>>>>> + goto free_vqs;
>>>>> }
>>>>>
>>>>> + /* we're fully initialized, now we can operate, so add the handler */
>>>>> + qemu_chr_fe_set_handlers(&vs->conf.chardev, NULL, NULL,
>>>>> + vhost_user_scsi_event, NULL, (void *)dev,
>>>>> + NULL, true);
>>>>> /* Channel and lun both are 0 for bootable vhost-user-scsi disk */
>>>>> vsc->channel = 0;
>>>>> vsc->lun = 0;
>>>>> @@ -134,10 +256,15 @@ static void vhost_user_scsi_realize(DeviceState
>>>>> *dev, Error **errp)
>>>>>
>>>>> return;
>>>>>
>>>>> +free_vqs:
>>>>> + g_free(s->vhost_vqs);
>>>>> + s->vhost_vqs = NULL;
>>>>> + g_free(vsc->inflight);
>>>>> + vsc->inflight = NULL;
>>>>> +
>>>>> free_vhost:
>>>>> vhost_user_cleanup(&s->vhost_user);
>>>>> - g_free(vqs);
>>>>> -free_virtio:
>>>>> +
>>>>> virtio_scsi_common_unrealize(dev);
>>>>> }
>>>>>
>>>>> @@ -146,16 +273,22 @@ static void vhost_user_scsi_unrealize(DeviceState
>>>>> *dev)
>>>>> VirtIODevice *vdev = VIRTIO_DEVICE(dev);
>>>>> VHostUserSCSI *s = VHOST_USER_SCSI(dev);
>>>>> VHostSCSICommon *vsc = VHOST_SCSI_COMMON(s);
>>>>> - struct vhost_virtqueue *vqs = vsc->dev.vqs;
>>>>> + VirtIOSCSICommon *vs = VIRTIO_SCSI_COMMON(dev);
>>>>>
>>>>> /* This will stop the vhost backend. */
>>>>> vhost_user_scsi_set_status(vdev, 0);
>>>>> + qemu_chr_fe_set_handlers(&vs->conf.chardev, NULL, NULL, NULL, NULL,
>>>>> NULL,
>>>>> + NULL, false);
>>>>>
>>>>> vhost_dev_cleanup(&vsc->dev);
>>>>> - g_free(vqs);
>>>>
>>>> Nit: Why not put vhost_dev_free_inflight next to the remaining inflight
>>>> cleanup?
>>> OK.
>>>>
>>>>> + vhost_dev_free_inflight(vsc->inflight);
>>>>> + g_free(s->vhost_vqs);
>>>>> + s->vhost_vqs = NULL;
>>>>> + g_free(vsc->inflight);
>>>>> + vsc->inflight = NULL;
>>>>>
>>>>
>>>> Curiosity - why reorder here? Is something in vhost_user_cleanup()
>>>> dependent on state freed in virtio_scsi_common_unrealize()?
>>>>
>>>> If so, should that go as a standalone fix?
>>>
>>> Because in vhost_user_scsi_realize, we initialize in order:
>>> virtio_scsi_common_realize
>>> vhost_user_init
>>>
>>> And in the error handler of vhost_user_scsi_realize, the uninitialize in
>>> order:
>>> vhost_user_cleanup
>>> virtio_scsi_common_unrealize
>>>
>>> I think in vhost_user_scsi_unrealize we should keep it the same order,
>>> right?
>>
>> I’m not saying it’s wrong. If there’s no dependency (i.e. this is not fixing
>> a bug, just a stylistic improvement) it can stay in the same change.
> OK.
>>
>>>
>>>>
>>>>> - virtio_scsi_common_unrealize(dev);
>>>>> vhost_user_cleanup(&s->vhost_user);
>>>>> + virtio_scsi_common_unrealize(dev);
>>>>> }
>>>>>
>>>>> static Property vhost_user_scsi_properties[] = {
>>>>> diff --git a/include/hw/virtio/vhost-user-scsi.h
>>>>> b/include/hw/virtio/vhost-user-scsi.h
>>>>> index 521b08e559..c66acc68b7 100644
>>>>> --- a/include/hw/virtio/vhost-user-scsi.h
>>>>> +++ b/include/hw/virtio/vhost-user-scsi.h
>>>>> @@ -29,6 +29,9 @@ OBJECT_DECLARE_SIMPLE_TYPE(VHostUserSCSI,
>>>>> VHOST_USER_SCSI)
>>>>> struct VHostUserSCSI {
>>>>> VHostSCSICommon parent_obj;
>>>>> VhostUserState vhost_user;
>>>>
>>>> See above - we should probably have started_vu here/
>>> I will add it.
>>>>
>>>> Maybe we should have some shared struct with vhost_user_blk for
>>>> connectivity params?
>>>
>>> In the future vhost-user-blk/scsi can be refactored to share the same code.
>>
>> Sure - this can be done at some point in the future.
>>
>>>>
>>>>> + bool connected;
>>>>> +
>>>>> + struct vhost_virtqueue *vhost_vqs;
>>>>> };
>>>>>
>>>>> #endif /* VHOST_USER_SCSI_H */
>>>>> diff --git a/include/hw/virtio/vhost.h b/include/hw/virtio/vhost.h
>>>>> index 6a173cb9fa..b904346fe1 100644
>>>>> --- a/include/hw/virtio/vhost.h
>>>>> +++ b/include/hw/virtio/vhost.h
>>>>> @@ -8,6 +8,8 @@
>>>>> #define VHOST_F_DEVICE_IOTLB 63
>>>>> #define VHOST_USER_F_PROTOCOL_FEATURES 30
>>>>>
>>>>
>>>> Should the macro name indicate that this is for vhost-user?
>>>>
>>>> VU_REALIZE_CONN_RETRIES?
>>> I will rename it in v2.
>>>
>>>>
>>>>> +#define REALIZE_CONNECTION_RETRIES 3
>>>>> +
>>>>> /* Generic structures common for any vhost based device. */
>>>>>
>>>>> struct vhost_inflight {
>>>>> --
>>>>> 2.41.0
>
> Any comments about other patches?
I’ll send shortly.