On Tue, Aug 5, 2025 at 7:49 PM Daniel P. Berrangé <berra...@redhat.com> wrote:
>
> On Tue, Aug 05, 2025 at 07:22:14PM +0300, Manos Pitsidianakis wrote:
> > On Tue, Aug 5, 2025 at 7:00 PM Daniel P. Berrangé <berra...@redhat.com> wrote:
> > >
> > > On Tue, Aug 05, 2025 at 12:19:26PM +0300, Manos Pitsidianakis wrote:
> > > > Add a backtrace_on_error meson feature (enabled with
> > > > --enable-backtrace-on-error) that compiles system binaries with
> > > > -rdynamic option and prints a function backtrace on error to stderr.
> > > >
> > > > Example output by adding an unconditional error_setg on error_abort in
> > > > hw/arm/boot.c:
> > > >
> > > > ./qemu-system-aarch64(+0x13b4a2c) [0x55d015406a2c]
> > > > ./qemu-system-aarch64(+0x13b4abd) [0x55d015406abd]
> > > > ./qemu-system-aarch64(+0x13b4d49) [0x55d015406d49]
> > > > ./qemu-system-aarch64(error_setg_internal+0xe7) [0x55d015406f62]
> > > > ./qemu-system-aarch64(arm_load_dtb+0xbf) [0x55d014d7686f]
> > > > ./qemu-system-aarch64(+0xd2f1d8) [0x55d014d811d8]
> > > > ./qemu-system-aarch64(notifier_list_notify+0x44) [0x55d01540a282]
> > > > ./qemu-system-aarch64(qdev_machine_creation_done+0xa0) [0x55d01476ae17]
> > > > ./qemu-system-aarch64(+0xaa691e) [0x55d014af891e]
> > > > ./qemu-system-aarch64(qmp_x_exit_preconfig+0x72) [0x55d014af8a5d]
> > > > ./qemu-system-aarch64(qemu_init+0x2a89) [0x55d014afb657]
> > > > ./qemu-system-aarch64(main+0x2f) [0x55d01521e836]
> > > > /lib/x86_64-linux-gnu/libc.so.6(+0x29ca8) [0x7f3033d67ca8]
> > > > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x85) [0x7f3033d67d65]
> > > > ./qemu-system-aarch64(_start+0x21) [0x55d0146814f1]
> > > >
> > > > Unexpected error in arm_load_dtb() at ../hw/arm/boot.c:529:
> > >
> > > From an end-user POV, IMHO the error messages need to be good enough
> > > that such backtraces aren't needed to understand the problem. For
> > > developers, GDB can give much better backtraces (file+line numbers,
> > > plus parameters plus local variables) in the ideally rare cases that
> > > the error message alone has insufficient info. So I'm not really
> > > convinced that programs (in general, not just QEMU) should try to
> > > create backtraces themselves.
> >
> > I don't think there's value in replacing gdb debugging with this, I
> > agree. I think it has value for "fire and forget" uses, when errors
> > happen unexpectedly and are hard to replicate and you only end up with
> > log entries and no easy way to debug it.
>
> If the log entry with the error message is useless for devs, then it
> is even worse for end users... who will be copying that message into
> bug reports anyway. This patch doesn't feel like something we could
> enable in formal builds in the distro, so we still need better error
> reporting without it, such that user bug reports are actionable.
>
> Was there a specific place where you found things hard to debug
> from the error message alone ? I'm sure we have plenty of examples
> of errors that can be improved, but wondering if there are some
> general patterns we're doing badly that would be a good win
> to improve ?
Some months ago I was debugging a MemoryRegion use-after-free and used
this code to figure out that the free was called from RCU context
instead of the main thread. For problems where the error can happen
from multiple contexts and places in the code-base, a backtrace can
provide additional insight that might be helpful in a few cases.

Again, the intended use case is not developers with gdb who can
reproduce a bug. People ask on IRC about bugs that happen rarely, over
the timespan of a few months, and they only have logs to go on.

Considering that this feature can be off by default (as it is in this
RFC), I don't think there's potential for distro end users to be
confused.

Thanks,

--
Manos Pitsidianakis
Emulation and Virtualization Engineer at Linaro Ltd
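
P.S. For readers unfamiliar with the mechanism, here is a minimal sketch of how a program can print its own backtrace to stderr. It assumes glibc's <execinfo.h> (backtrace() and backtrace_symbols_fd()). The helper name print_backtrace_to_stderr and the MAX_FRAMES limit are hypothetical and chosen only for illustration. This is not the code from the patch.

```c
#include <execinfo.h>
#include <unistd.h>

/* Hypothetical upper bound on the number of frames to capture. */
#define MAX_FRAMES 64

/*
 * Hypothetical helper, not taken from the QEMU patch: capture the
 * current call stack and write one line per frame to stderr.
 * Returns the number of frames captured.
 */
static int print_backtrace_to_stderr(void)
{
    void *frames[MAX_FRAMES];

    /* backtrace() fills the array with return addresses. */
    int n = backtrace(frames, MAX_FRAMES);

    /*
     * backtrace_symbols_fd() writes the symbolized frames straight to
     * the file descriptor without calling malloc(), which makes it
     * more suitable for error paths than backtrace_symbols().
     */
    backtrace_symbols_fd(frames, n, STDERR_FILENO);

    return n;
}
```

Function names are resolved only for symbols present in the dynamic symbol table, which is why the binary is linked with -rdynamic. Static functions still show up as bare offsets, as in the "+0x13b4a2c" entries in the example output quoted above.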