On Mon, 22 Sept 2025 at 16:23, Markus Armbruster <[email protected]> wrote: > > Alessandro Ratti <[email protected]> writes: > > > On Mon, 22 Sept 2025 at 14:29, Markus Armbruster <[email protected]> wrote: > >> > >> Alex Bennée <[email protected]> writes: > >> > >> > Daniel P. Berrangé <[email protected]> writes: > >> > > >> >> On Mon, Sep 22, 2025 at 12:37:57PM +0200, Markus Armbruster wrote: > >> >>> Alessandro Ratti <[email protected]> writes: > >> >>> > >> >>> > Replace virtio_error() with a macro that automatically prepends the > >> >>> > calling function name to error messages. This provides better context > >> >>> > for debugging virtio issues by showing exactly which function > >> >>> > encountered the error. > >> >>> > > >> >>> > Before: "Invalid queue size: 1024" > >> >>> > After: "virtio_queue_set_num: Invalid queue size: 1024" > >> >>> > > >> >>> > The implementation uses a macro to insert __func__ at compile time, > >> >>> > avoiding any runtime overhead while providing more specific error > >> >>> > context than a generic "virtio:" prefix. > >> >>> > >> >>> A need for function names and such in error messages suggests the error > >> >>> messages are crap. > >> >> > >> >> I pretty much agree. If we take that view forwards, then I think our > >> >> coding guidelines should explicitly state something like > >> >> > >> >> "Function names must never be included in error messages. > >> >> > >> >> The messages need to be sufficiently descriptive in their > >> >> text, such that including function names is redundant" > >> > >> I'm in favor. > >> > >> > Ahh I missed the fact this ends up as an error_report. I think having > >> > function names in debug output is fine. > >> > >> No argument! > >> > >> > It does however miss important information like which VirtIO device is > >> > actually failing, despite having vdev passed down to the function. > >> > >> Yes, which device failed should definitely be reported. > >> > >> [...] > > > > Hi Markus, Alex, Daniel, > > > > Thanks again for the thoughtful feedback and for helping me see the bigger > > picture. I now fully agree that adding function names to error messages (via > > __func__) doesn't really address the core issue, and I appreciate the > > push to rethink how error reporting can better serve both users and > > developers. > > > > I've taken a first stab at improving one of the messages in > > virtio_init_region_cache(), following your suggestions. > > > > Here's the updated call: > > > > ---8<--- Example diff --8<--- > > --- a/hw/virtio/virtio.c > > +++ b/hw/virtio/virtio.c > > @@ -240,6 +240,7 @@ void virtio_init_region_cache(VirtIODevice *vdev, int n) > > VirtQueue *vq = &vdev->vq[n]; > > VRingMemoryRegionCaches *old = vq->vring.caches; > > VRingMemoryRegionCaches *new = NULL; > > + DeviceState *dev = DEVICE(vdev); > > hwaddr addr, size; > > int64_t len; > > bool packed; > > @@ -264,7 +265,10 @@ void virtio_init_region_cache(VirtIODevice *vdev, int > > n) > > len = address_space_cache_init(&new->used, vdev->dma_as, > > vq->vring.used, size, true); > > if (len < size) { > > - virtio_error(vdev, "Cannot map used"); > > + virtio_error(vdev, > > + "Failed to map used ring for device %s - " > > + "possible guest misconfiguration or insufficient > > memory", > > + qdev_get_dev_path(dev)); > > goto err_used; > > } > > > > With this change, the error output now reads: > > > > qemu-system-x86_64: Failed to map used ring for device > > 0000:00:04.0 - possible guest misconfiguration or insufficient memory > > > > This feels like a clear improvement — it gives context (what failed), > > identifies the device, and hints at likely causes. > > It's *much* better! > > Developers will appreciate "Failed to map used ring for device". By > itself it would still be gobbledygook for users, but together with the > "possible guest misconfiguration or insufficient memory" clue it's fine. > > Perhaps we can still improve on "device 0000:00:04.0". The device's ID > is a good way to identify it to the user, because it's chosen by the > user, and unique (among devices). Sadly, devices without ID exist. We > fall back to canonical QOM path in places. Have a look at > qdev_get_human_name() to see whether it works here.
I experimented with qdev_get_human_name(), but it usually returns paths like: /machine/peripheral-anon/device[0]/virtio-backend …which seems less user-friendly than the PCI address provided by qdev_get_dev_path(). For now, I'm sticking to using the device ID when set (e.g. via -device…,id=foo) and falling back to qdev_get_dev_path() otherwise — which provides predictable output for both PCI and non-PCI devices. > Should we spell out that the device is now broken (whatever vdev->broken > means)? That's a good point — I'll leave that for a possible follow-up so we can keep this patch focused on improving error clarity and device identification first. I'll submit a new patch shortly based on this discussion, with the debug scaffolding removed. I'll include a link to this thread in the cover letter for continuity. > > If this is more in line with what you'd expect, I'd be happy to submit a new > > patch focused solely on improving a few key virtio error messages in this > > direction, starting with the worst offenders in virtio_init_region_cache(). > > > > Thanks again for your time and guidance — I'm learning a lot from this > > process. > > You're welcome! And thank you for accepting my little rant as > constructive criticism :) :) Thanks again for your time and guidance — it's really helping me learn the ropes here. Best regards, Alessandro
