Re: [PATCH] vdpa: Avoid reset when stop device

Si-Wei Liu Fri, 25 Mar 2022 13:04:21 -0700



On 3/25/2022 12:18 AM, Michael Qiu wrote:

On 2022/3/25 14:32, Si-Wei Liu Wrote:
On 3/23/2022 2:20 AM, Jason Wang wrote:
Adding Eugenio,  and Ling Shan.

On Wed, Mar 23, 2022 at 4:58 PM <08005...@163.com> wrote:
From: Michael Qiu <qiud...@archeros.com>

Currently, when VM poweroff, it will trigger vdpa
device(such as mlx bluefield2 VF) reset twice, this leads
to below issue:

vhost VQ 2 ring restore failed: -22: Invalid argument (22)

This because in vhost_dev_stop(), qemu tries to stop the device,
then stop the queue: vhost_virtqueue_stop().
In vhost_dev_stop(), it resets the device, which clear some flags
in low level driver, and the driver finds
that the VQ is invalied, this is the root cause.

Actually, device reset will be called within func release()

To solve the issue, vdpa should set vring unready, and
remove reset ops in device stop: vhost_dev_start(hdev, false).
This is an interesting issue. Do you see a real issue except for the
above warnings.

The reason we "abuse" reset is that we don't have a stop uAPI for
vhost. We plan to add a status bit to stop the whole device in the
virtio spec, but considering it may take a while maybe we can first
introduce a new uAPI/ioctl for that.
Yep. What was missing here is a vdpa specific uAPI for per-virtqueuestop/suspend rather than spec level amendment to stop the wholedevice (including both vq and config space). For now we can have vDPAspecific means to control the vq, something vDPA hardware vendor mustsupport for live migration, e.g. datapath switching to shadow vq. Ibelieve the spec amendment may follow to define a bit for virtiofeature negotiation later on if needed (FWIW virtio-vdpa already doesset_vq_ready(..., 0) to stop the vq).
However, there's a flaw in this patch, see below.
Note that the stop doesn't just work for virtqueue but others like,
e.g config space. But considering we don't have config interrupt
support right now, we're probably fine.

Checking the driver, it looks to me only the IFCVF's set_vq_ready() is
problematic, Ling Shan, please have a check. And we probably need a
workaround for vp_vdpa as well.

Anyhow, this seems to be better than reset. So for 7.1:

Acked-by: Jason Wang <jasow...@redhat.com>
Signed-off-by: Michael Qiu<qiud...@archeros.com>
---
  hw/virtio/vhost-vdpa.c | 8 ++++----
  1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index c5ed7a3..d858b4f 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -719,14 +719,14 @@ static int vhost_vdpa_get_vq_index(structvhost_dev *dev, int idx)
      return idx;
  }

-static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev)
+static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev,unsigned int ready)
  {
      int i;
      trace_vhost_vdpa_set_vring_ready(dev);
      for (i = 0; i < dev->nvqs; ++i) {
          struct vhost_vring_state state = {
              .index = dev->vq_index + i,
-            .num = 1,
+            .num = ready,
          };
          vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
      }
@@ -1088,8 +1088,9 @@ static int vhost_vdpa_dev_start(structvhost_dev *dev, bool started)
          if (unlikely(!ok)) {
              return -1;
          }
-        vhost_vdpa_set_vring_ready(dev);
+        vhost_vdpa_set_vring_ready(dev, 1);
      } else {
+        vhost_vdpa_set_vring_ready(dev, 0);
          ok = vhost_vdpa_svqs_stop(dev);
          if (unlikely(!ok)) {
              return -1;
@@ -1105,7 +1106,6 @@ static int vhost_vdpa_dev_start(structvhost_dev *dev, bool started) memory_listener_register(&v->listener,&address_space_memory); return vhost_vdpa_add_status(dev,VIRTIO_CONFIG_S_DRIVER_OK);
      } else {
-        vhost_vdpa_reset_device(dev);
Unfortunately, the reset can't be be removed from here as this codepath usually involves virtio reset or status change for e.g. invokedvia virtio_net_set_status(... , 0). Ideally we should use theVhostOps.vhost_reset_device() to reset the vhost-vdpa device wherestatus change is involved after vhost_dev_stop() is done, but thisdistinction is not there yet as of today in all of the virtio devicesexcept vhost_user_scsi.
Alternatively we may be able to do something like below, stop thevirtqueue in vhost_vdpa_get_vring_base() in thevhost_virtqueue_stop() context. Only until the hardware vq isstopped, svq can stop and unmap then vhost-vdpa would reset thedevice status. It kinda works, but not in a perfect way...

As I indicated above, this is an less ideal way to address the issue youcame across about, without losing functionality or introducingregression to the code. Ideally it'd be best to get fixed in a cleanway, though that would a little more effort in code refactoring.Personally I feel that the error message you saw is somewhat benign anddon't think it caused any real problem. Did you see trouble if livingwith the bogus error message for the moment?


--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c

@@ -564,14 +564,14 @@ static int vhost_vdpa_get_vq_index(structvhost_dev *dev, int idx)

      return idx;
  }

-static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev)

+static int vhost_vdpa_set_vring_ready(struct vhost_dev *dev, intenable)

  {
      int i;
      trace_vhost_vdpa_set_vring_ready(dev);
      for (i = 0; i < dev->nvqs; ++i) {
          struct vhost_vring_state state = {
              .index = dev->vq_index + i,
-            .num = 1,
+            .num = enable,
          };
          vhost_vdpa_call(dev, VHOST_VDPA_SET_VRING_ENABLE, &state);
      }

@@ -641,7 +641,7 @@ static int vhost_vdpa_dev_start(struct vhost_dev*dev, bool started)


      if (started) {
          vhost_vdpa_host_notifiers_init(dev);
-        vhost_vdpa_set_vring_ready(dev);
+        vhost_vdpa_set_vring_ready(dev, 1);
      } else {
          vhost_vdpa_host_notifiers_uninit(dev, dev->nvqs);
      }

@@ -708,6 +708,9 @@ static int vhost_vdpa_get_vring_base(structvhost_dev *dev,

  {
      int ret;

+    /* Deactivate the queue (best effort) */
+    vhost_vdpa_set_vring_ready(dev, 0);
+


I don't think it's a good idea to change the state in "get" function,

get means "read" not "write".

Well, if you look at the context of vhost_vdpa_get_vring_base(), theonly caller is vhost_virtqueue_stop(). Without stopping the hardwareahead, it doesn't make sense to effectively get a used_index snapshotfor resuming/restarting the vq. It might be more obvious and sensible tosee that were to introduce another Vhost op to suspend the vq rightbefore the get_vring_base() call, though I wouldn't bother doing it.

      ret = vhost_vdpa_call(dev, VHOST_GET_VRING_BASE, ring);
      trace_vhost_vdpa_get_vring_base(dev, ring->index, ring->num);
      return ret;
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index 437347a..2e917d8 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c

@@ -1832,15 +1832,15 @@ void vhost_dev_stop(struct vhost_dev *hdev,VirtIODevice *vdev)

      /* should only be called after backend is connected */
      assert(hdev->vhost_ops);

-    if (hdev->vhost_ops->vhost_dev_start) {
-        hdev->vhost_ops->vhost_dev_start(hdev, false);
-    }
      for (i = 0; i < hdev->nvqs; ++i) {
          vhost_virtqueue_stop(hdev,
                               vdev,
                               hdev->vqs + i,
                               hdev->vq_index + i);
      }
+    if (hdev->vhost_ops->vhost_dev_start) {
+        hdev->vhost_ops->vhost_dev_start(hdev, false);
+    }

This first idea comes to me is just like this, but at last I don'tchoose this solution.

When we start a device, first we start the virtqueue thenvhost_ops->vhost_dev_start.

So in stop stage, in my opinion, we should just do the opposite, do asthe orignal code do. Change the sequential is a dangerous action.

I don't see any danger yet, would you please elaborate the specificproblem you see? I think this sequence is as expected:

1. suspend each individual vq i.e. stop processing upcoming request, andpossibly complete inflight requests -> get_vring_base()2. tear down virtio resources from VMM for e.g. unmap guest memorymappings, remove host notifiers, and et al

3. reset device -> vhost_vdpa_reset_device()

Regards,
-Siwei


Thanks,
Michael

      if (vhost_dev_has_iommu(hdev)) {
          if (hdev->vhost_ops->vhost_set_iotlb_callback) {

Regards,
-Siwei

          vhost_vdpa_add_status(dev, VIRTIO_CONFIG_S_ACKNOWLEDGE |
VIRTIO_CONFIG_S_DRIVER);
          memory_listener_unregister(&v->listener);
--
1.8.3.1

Re: [PATCH] vdpa: Avoid reset when stop device

Reply via email to