Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru> writes: > For now we only log the vhost device error, when virtqueue is actually > stopped. Let's add a QAPI event, which makes possible: > > - collect statistics of such errors > - make immediate actions: take coredums or do some other debugging
Core dumps, I presume. Is QMP the right tool for the job? Or could a trace point do? > The event could be reused for some other virtqueue problems (not only > for vhost devices) in future. For this it gets a generic name and > structure. > > Signed-off-by: Vladimir Sementsov-Ogievskiy <vsement...@yandex-team.ru> > --- > hw/virtio/vhost.c | 12 +++++++++--- > qapi/qdev.json | 25 +++++++++++++++++++++++++ > 2 files changed, 34 insertions(+), 3 deletions(-) > > diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c > index f758f177bb..caa81f2ace 100644 > --- a/hw/virtio/vhost.c > +++ b/hw/virtio/vhost.c > @@ -15,6 +15,7 @@ > > #include "qemu/osdep.h" > #include "qapi/error.h" > +#include "qapi/qapi-events-qdev.h" > #include "hw/virtio/vhost.h" > #include "qemu/atomic.h" > #include "qemu/range.h" Only tangentially related to this patch, but here goes anyway: /* enabled until disconnected backend stabilizes */ #define _VHOST_DEBUG 1 This is from 2016. Has it stabilized? #ifdef _VHOST_DEBUG #define VHOST_OPS_DEBUG(retval, fmt, ...) \ do { \ error_report(fmt ": %s (%d)", ## __VA_ARGS__, \ strerror(-retval), -retval); \ error_report() is for errors the user can do something about, not for debug messages. } while (0) #else #define VHOST_OPS_DEBUG(retval, fmt, ...) \ do { } while (0) #endif > @@ -1287,11 +1288,16 @@ static void > vhost_virtqueue_error_notifier(EventNotifier *n) > struct vhost_virtqueue *vq = container_of(n, struct vhost_virtqueue, > error_notifier); > struct vhost_dev *dev = vq->dev; > - int index = vq - dev->vqs; > > if (event_notifier_test_and_clear(n) && dev->vdev) { > - VHOST_OPS_DEBUG(-EINVAL, "vhost vring error in virtqueue %d", > - dev->vq_index + index); > + int ind = vq - dev->vqs + dev->vq_index; > + DeviceState *ds = &dev->vdev->parent_obj; > + > + VHOST_OPS_DEBUG(-EINVAL, "vhost vring error in virtqueue %d", ind); > + qapi_event_send_virtqueue_error(!!ds->id, ds->id, ds->canonical_path, > + ind, VIRTQUEUE_ERROR_VHOST_VRING_ERR, > + "vhost reported failure through > vring " > + "error fd"); Do we still need VHOST_OPS_DEBUG() here? > } > } > > diff --git a/qapi/qdev.json b/qapi/qdev.json > index 2708fb4e99..b7c2669c2c 100644 > --- a/qapi/qdev.json > +++ b/qapi/qdev.json > @@ -158,3 +158,28 @@ > ## > { 'event': 'DEVICE_UNPLUG_GUEST_ERROR', > 'data': { '*device': 'str', 'path': 'str' } } > + > +## > +# @VirtqueueError: > +# > +# Since: 7.2 > +## > +{ 'enum': 'VirtqueueError', > + 'data': [ 'vhost-vring-err' ] } > + > +## > +# @VIRTQUEUE_ERROR: > +# > +# Emitted when a device virtqueue fails in runtime. > +# > +# @device: the device's ID if it has one > +# @path: the device's QOM path > +# @virtqueue: virtqueue index > +# @error: error identifier > +# @description: human readable description > +# > +# Since: 7.2 > +## > +{ 'event': 'VIRTQUEUE_ERROR', > + 'data': { '*device': 'str', 'path': 'str', 'virtqueue': 'int', > + 'error': 'VirtqueueError', 'description': 'str'} } Can the guest trigger the event? If yes, it needs to be rate-limited.