On Tue, Dec 02, 2025 at 06:31:19PM +0200, Andrey Drobyshev wrote:
> Commit 772f86839f ("scripts/qemu-gdb: Support coroutine dumps in
> coredumps") introduced coroutine traces in coredumps using raw stack
> unwinding.  While this works, this approach does not allow viewing the
> function arguments in the corresponding stack frames.
> 
> As an alternative, we can obtain saved registers from the coroutine's
> jmpbuf, patch them into the coredump's struct elf_prstatus in place, and
> execute another gdb subprocess to get backtrace from the patched temporary
> coredump.
> 
> While providing more detailed info, this alternative approach, however, is
> more invasive as it might potentially corrupt the coredump file. We do take
> precautions by saving the original register values into a separate binary
> blob /path/to/coredump.ptregs, so that they can be restored in the next
> GDB session.  Still, instead of making it the new default, let's keep raw unwind
> the default behaviour, but add the '--detailed' option for the 'qemu bt' and
> 'qemu coroutine' commands, which would enforce the new behaviour.
> 
> That's how this looks:
> 
>   (gdb) qemu coroutine 0x7fda9335a508
>   #0  0x5602bdb41c26 in qemu_coroutine_switch<+214> () at ../util/coroutine-ucontext.c:321
>   #1  0x5602bdb3e8fe in qemu_aio_coroutine_enter<+493> () at ../util/qemu-coroutine.c:293
>   #2  0x5602bdb3c4eb in co_schedule_bh_cb<+538> () at ../util/async.c:547
>   #3  0x5602bdb3b518 in aio_bh_call<+119> () at ../util/async.c:172
>   #4  0x5602bdb3b79a in aio_bh_poll<+457> () at ../util/async.c:219
>   #5  0x5602bdb10f22 in aio_poll<+1201> () at ../util/aio-posix.c:719
>   #6  0x5602bd8fb1ac in iothread_run<+123> () at ../iothread.c:63
>   #7  0x5602bdb18a24 in qemu_thread_start<+355> () at ../util/qemu-thread-posix.c:393
> 
>   (gdb) qemu coroutine 0x7fda9335a508 --detailed
>   patching core file /tmp/tmpq4hmk2qc
>   found "CORE" at 0x10c48
>   assume pt_regs at 0x10cbc
>   write r15 at 0x10cbc
>   write r14 at 0x10cc4
>   write r13 at 0x10ccc
>   write r12 at 0x10cd4
>   write rbp at 0x10cdc
>   write rbx at 0x10ce4
>   write rip at 0x10d3c
>   write rsp at 0x10d54
> 
>   #0  0x00005602bdb41c26 in qemu_coroutine_switch (from_=0x7fda9335a508, to_=0x7fda8400c280, action=COROUTINE_ENTER) at ../util/coroutine-ucontext.c:321
>   #1  0x00005602bdb3e8fe in qemu_aio_coroutine_enter (ctx=0x5602bf7147c0, co=0x7fda8400c280) at ../util/qemu-coroutine.c:293
>   #2  0x00005602bdb3c4eb in co_schedule_bh_cb (opaque=0x5602bf7147c0) at ../util/async.c:547
>   #3  0x00005602bdb3b518 in aio_bh_call (bh=0x5602bf714a40) at ../util/async.c:172
>   #4  0x00005602bdb3b79a in aio_bh_poll (ctx=0x5602bf7147c0) at ../util/async.c:219
>   #5  0x00005602bdb10f22 in aio_poll (ctx=0x5602bf7147c0, blocking=true) at ../util/aio-posix.c:719
>   #6  0x00005602bd8fb1ac in iothread_run (opaque=0x5602bf42b100) at ../iothread.c:63
>   #7  0x00005602bdb18a24 in qemu_thread_start (args=0x5602bf7164a0) at ../util/qemu-thread-posix.c:393
>   #8  0x00007fda9e89f7f2 in start_thread (arg=<optimized out>) at pthread_create.c:443
>   #9  0x00007fda9e83f450 in clone3 () at ../sysdeps/unix/sysv/linux/x86_64/clone3.S:81
> 
> CC: Vladimir Sementsov-Ogievskiy <[email protected]>
> CC: Peter Xu <[email protected]>
> Originally-by: Vladimir Sementsov-Ogievskiy <[email protected]>
> Signed-off-by: Andrey Drobyshev <[email protected]>
> ---
>  scripts/qemugdb/coroutine.py | 243 +++++++++++++++++++++++++++++++++--
>  1 file changed, 233 insertions(+), 10 deletions(-)
> 
> diff --git a/scripts/qemugdb/coroutine.py b/scripts/qemugdb/coroutine.py
> index e98fc48a4b..280c02c12d 100644
> --- a/scripts/qemugdb/coroutine.py
> +++ b/scripts/qemugdb/coroutine.py
> @@ -10,9 +10,116 @@
>  # or later.  See the COPYING file in the top-level directory.
>  
>  import gdb
> +import os
> +import pty
> +import re
> +import struct
> +import textwrap
> +
> +from collections import OrderedDict
> +from copy import deepcopy
>  
>  VOID_PTR = gdb.lookup_type('void').pointer()
>  
> +# Registers in the same order they're present in ELF coredump file.
> +# See asm/ptrace.h
> +PT_REGS = ['r15', 'r14', 'r13', 'r12', 'rbp', 'rbx', 'r11', 'r10', 'r9',
> +           'r8', 'rax', 'rcx', 'rdx', 'rsi', 'rdi', 'orig_rax', 'rip', 'cs',
> +           'eflags', 'rsp', 'ss']
> +
> +coredump = None
> +
> +
> +class Coredump:
> +    _ptregs_suff = '.ptregs'
> +
> +    def __init__(self, coredump, executable):
> +        gdb.events.exited.connect(self._cleanup)

It's not clear to me that this cleanup mechanism is reliable:

- The restore_regs() method is called from invoke(), but not in a
  `finally` block that would guarantee it runs even when an exception is
  thrown. Maybe _cleanup() can be called without a prior restore_regs()
  call. It would be inconvenient to lose the original register values.

- I'm not sure if gdb.events.exited (when GDB's inferior terminates) is
  the correct event to ensure cleanup. The worst case is that the
  temporary file is leaked, which is not a serious problem.
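
To illustrate the first point, here is a minimal sketch of the
try/finally shape I have in mind. PatchedCore, detailed_bt() and
run_backtrace are hypothetical stand-ins, not the patch's actual
Coredump class or invoke() method, and the patching is reduced to
overwriting bytes at a single offset:

```python
class PatchedCore:
    """Hypothetical stand-in for the patch's Coredump class."""

    def __init__(self, path):
        self.path = path
        self.saved = None  # (offset, original bytes) while patched

    def patch_regs(self, offset, data):
        # Remember the original bytes before overwriting them in place.
        with open(self.path, 'r+b') as f:
            f.seek(offset)
            self.saved = (offset, f.read(len(data)))
            f.seek(offset)
            f.write(data)

    def restore_regs(self):
        # Safe to call more than once: a no-op when nothing is patched.
        if self.saved is None:
            return
        offset, data = self.saved
        with open(self.path, 'r+b') as f:
            f.seek(offset)
            f.write(data)
        self.saved = None


def detailed_bt(core, offset, regs, run_backtrace):
    # run_backtrace is an assumed callable that spawns the second gdb
    # and returns its output; it may raise.
    core.patch_regs(offset, regs)
    try:
        return run_backtrace(core.path)
    finally:
        # Runs on success and on any exception, so the original
        # register values are always written back.
        core.restore_regs()
```

With this shape any later cleanup hook only needs to delete temporary
files, because the registers have already been restored by the time
detailed_bt() returns or raises.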

But then this is a debugging script and it's probably fine:

Reviewed-by: Stefan Hajnoczi <[email protected]>
