On Mon, Aug 26, 2024 at 11:55 AM <devel-requ...@lists.crash-utility.osci.io>
wrote:

> Date: Mon, 26 Aug 2024 15:52:26 +1200
> From: Tao Liu <l...@redhat.com>
> Subject: [Crash-utility] [PATCH v6 00/14] gdb stack unwinding support
>         for crash utility
> To: devel@lists.crash-utility.osci.io
> Cc: Tao Liu <l...@redhat.com>
> Message-ID: <20240826035240.14781-1-l...@redhat.com>
> Content-Type: text/plain; charset=UTF-8
>
> This patchset is a rebase/merged version of the following 3 patchsets:
>
> 1): [PATCH v10 0/5] Improve stack unwind on ppc64 [1]
> 2): [PATCH 0/5] x86_64 gdb stack unwinding support [2]
> 3): Clean up on top of one-thread-v2 [3]
>
> A complete description of gdb stack unwinding support for crash can be
> found in [1].
>
> This patchset can be divided into the following 2 parts:
>
> 1) part1: arch independent, mainly modify on the
>    crash_target.c/gdb_interface.c files, in preparation of the
>    gdb side.
> 2) part2: arch specific part, for implementing ppc64/x86_64/arm64/vmware
>    gdb stack unwinding support.
>
> === part 2
>
> - arm64:
> arm64: Add gdb stack unwinding support
>
> - vmware:
> vmware_guestdump: Various format versions support
> set_context(): check if context is already current
>
> - x86_64:
> x86_64: Fix invalid input "=>" for bt command
> Fix cpumask_t recursive dependence issue
> x86_64: Add gdb stack unwinding support
>
> - ppc64:
> ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
>
> === part 1
>
> Stop stack unwinding at non-kernel address
> Fix gdb_interface: restore gdb's output streams at end of gdb_interface
> Print task pid/command instead of CPU index
> Rename get_cpu_reg to get_current_task_reg
> Let crash change gdb context
> Leave only one gdb thread for crash
> Remove 'frame' from prohibited commands list
> ===
>
> v6 -> v5:
> 1) Refactor patch 4 & 9, which changed the function signature of struct
>    get_cpu_reg/get_current_task_reg, and let each patch compile with no
>    error when added on.
> 2) Rebased the patchset on top of latest upstream:
>    ("79b93ecb2e72ec Fix a "Bus error" issue caused by 'crash --osrelease'
> or
>    crash loading")
>
>
Thank you for the update, Tao.

I have a few more comments here:

[1] [PATCH v6 04/14]:

+       * Task is active, grab CPU's registers
+       */
+       if (is_task_active(tc->task) && VMSS_DUMPFILE())
+               return vmware_vmss_get_cpu_reg(tc->processor, regno, name,
size, value);

Can you help confirm that it only works for the active task? Is this
expected behavior?

[2] [PATCH v6 07/14]

#ifdef CRASH_MERGE
          CORE_ADDR pc = 0;
          get_frame_pc_if_available (fi, &pc);
          if (!is_kvaddr(pc)) {
            printf_filtered (_("Backtrace stopped due to non-kernel addr:
%lx\n"),pc);
            fi = NULL;
            break;
          }
#endif

I would suggest removing the above printf_filtered(...), otherwise it will
be always displayed.

[3] [PATCH v6 08/14]

a. warning
cc -c -g -DPPC64 -m64 -DLZO -DGDB_10_2  ppc64.c -Wall -O2
-Wstrict-prototypes -Wmissing-prototypes -fstack-protector
-Wformat-security
ppc64.c: In function ‘ppc64_get_current_task_reg’:
ppc64.c:2519:15: warning: unused variable ‘task’ [-Wunused-variable]
 2519 |         ulong task;
      |               ^~~~

b. There are different results between 'bt' and 'gdb bt' commands. Can you
help double check?
crash> bt
PID: 6298     TASK: c000000050bbcb00  CPU: 2    COMMAND: "bash"
 R0:  c0000000002be63c    R1:  c00000005102b710    R2:  c000000001bf8f00
 R3:  c00000005102b728    R4:  0000000000000000    R5:  0000000000000000
 R6:  c00000005102b8d8    R7:  c00000000f5d0000    R8:  0000000000000000
 R9:  c0000000557d3c00    R10: 0000000000000001    R11: 0000000000002000
 R12: c0000000027634a8    R13: c00000000f6cdf00    R14: 0000000000000000
 R15: 0000000000000000    R16: 0000000000000000    R17: 0000000000000000
 R18: 0000000000000000    R19: 0000000000000000    R20: 0000000000000000
 R21: 0000000000000000    R22: 0000000000000000    R23: 0000000000000000
 R24: 0000000000000007    R25: 0000000000000000    R26: 0000000000000000
 R27: c00000000274d898    R28: c000000002d0a8d0    R29: c000000002e23df0
 R30: 0000000000000000    R31: c000000002e23ddc
 NIP: c0000000002bdf64    MSR: 8000000000009033    OR3: 0000000000000000
 CTR: 0000000000000000    LR:  c0000000002be63c    XER: 0000000020040004
 CCR: 0000000024222280    MQ:  0000000000000001    DAR: 0000000000000000
 DSISR: 0000000000000000     Syscall Result: 0000000000000000
 [NIP  : crash_setup_regs+68]
 [LR   : __crash_kexec+156]
 #0 [c00000005102b710] crash_setup_regs at c0000000002bdf64
crash> gdb bt
#0  0xc0000000002bdf64 in crash_setup_regs (newregs=0xc00000005102b728,
oldregs=0x0) at ./arch/powerpc/include/asm/kexec.h:133
#1  0xc0000000002be658 in __crash_kexec (regs=0x0) at
kernel/crash_core.c:122
#2  0xc00000000016c284 in panic (fmt=0xc0000000015dd018 "sysrq triggered
crash\n") at kernel/panic.c:367
#3  0xc000000000a66e78 in sysrq_handle_crash (key=<optimized out>) at
drivers/tty/sysrq.c:154
#4  0xc000000000a67994 in __handle_sysrq (key=key@entry=99 'c',
check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:612
#5  0xc000000000a68454 in write_sysrq_trigger (file=<optimized out>,
buf=<optimized out>, count=2, ppos=<optimized out>) at
drivers/tty/sysrq.c:1181
#6  0xc00000000072868c in pde_write (pde=0xc000000003671f80,
file=<optimized out>, buf=<optimized out>, count=<optimized out>,
ppos=<optimized out>) at fs/proc/inode.c:334
#7  proc_reg_write (file=<optimized out>, buf=<optimized out>,
count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:346
#8  0xc00000000063bf20 in vfs_write (file=0xc000000009cb5b00,
buf=0x100123a71a0 <error: Cannot access memory at address 0x100123a71a0>,
count=2, pos=0xc00000005102bc00) at fs/read_write.c:588
#9  vfs_write (file=0xc000000009cb5b00, buf=0x100123a71a0 <error: Cannot
access memory at address 0x100123a71a0>, count=<optimized out>,
pos=0xc00000005102bc00) at fs/read_write.c:570
#10 0xc00000000063c4d0 in ksys_write (fd=<optimized out>, buf=0x100123a71a0
<error: Cannot access memory at address 0x100123a71a0>, count=2) at
fs/read_write.c:643
#11 0xc000000000031a28 in system_call_exception (regs=0xc00000005102be80,
r0=<optimized out>) at arch/powerpc/kernel/syscall.c:153
#12 0xc00000000000d05c in system_call_vectored_common () at
arch/powerpc/kernel/interrupt_64.S:198
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
crash>

[4] [PATCH v6 10/14]
diff --git a/xen_hyper.c b/xen_hyper.c
index 32e56fa..54c7f57 100644
--- a/xen_hyper.c
+++ b/xen_hyper.c
@@ -52,7 +52,7 @@ xen_hyper_init(void)
         */
        xht->xen_virt_start &= 0xffffffffc0000000;
 #endif
-
+       STRUCT_SIZE_INIT(cpumask_t, "cpumask_t");
        if (machine_type("X86_64") &&
            symbol_exists("xen_phys_start") && !xen_phys_start())
                error(WARNING,

Could you please explain why the above changes are needed?

[5] [PATCH v6 14/14]

a. warning
cc -c -g -DARM64 -DLZO -DGDB_10_2  arm64.c -Wall -O2 -Wstrict-prototypes
-Wmissing-prototypes -fstack-protector -Wformat-security
arm64.c: In function ‘arm64_get_stack_frame’:
arm64.c:4138:13: warning: variable ‘ret’ set but not used
[-Wunused-but-set-variable]
 4138 |         int ret;
      |             ^~~

b.  indentation issue
-                               bt->task);
+                ur_bitmap = (struct user_regs_bitmap_struct
*)GETBUF(sizeof(*ur_bitmap));
+                memset(ur_bitmap, 0, sizeof(*ur_bitmap));
+               ur_bitmap->ur.pc = stackframe.pc;

[6] The 'info threads' command only displays the current task, not all
known tasks. And this looks different behavior from the gdb
gdb> info threads
  Id   Target Id         Frame
* 1    6298 bash         0xc0000000002bdf64 in crash_setup_regs
(newregs=0xc00000005102b728, oldregs=0x0) at
./arch/powerpc/include/asm/kexec.h:133

[7] the 'thread' command can not switch to another task.
crash> ps
...
     6297    6287   0  c000000057517e80  IN   0.0    22784    11776
 sshd-session
>    6298    6297   2  c000000050bbcb00  RU   0.0     9664     6272  bash
...

gdb> thread 6297
gdb: gdb request failed: thread 6297
gdb>

Finally I have to do it with 'set' command as below:

crash> set 6297
    PID: 6297
COMMAND: "sshd-session"
   TASK: c000000057517e80  [THREAD_INFO: c000000057517e80]
    CPU: 0
  STATE: TASK_INTERRUPTIBLE
crash> set gdb on
gdb: on
gdb> bt
#0  0xc00000009cd57650 in ?? ()
gdb: gdb request failed: bt
gdb>



In addition, the code is changed, but its patch log is not updated
accordingly. Could you please double check? I won't list them here.

Thanks.
Lianbo

v5 -> v4:
> 1) Plenty of code refactoring based on Lianbo's comments on v4.
> 2) Removed the magic number when dealing with regs bitmap, see [6].
> 3) Rebased the patchset on top of latest upstream:
>    ("1c6da3eaff8207 arm64: Fix bt command show wrong stacktrace on ramdump
> source")
>
> v4 -> v3:
> Fixed the author issue in [PATCH v3 06/16] Fix gdb_interface: restore gdb's
> output streams at end of gdb_interface.
>
> v3 -> v2:
> 1) Updated CC list as pointed out in [4]
> 2) Compiling issues as in [5]
>
> v2 -> v1:
> 1) Added the patch: x86_64: Fix invalid input "=>" for bt command,
>    thanks for Kazu's testing.
> 2) Modify the patch: x86_64: Add gdb stack unwinding support, added the
>    pcp_save, spp_save and sp, for restoring the value in match of the
> original
>    code logic.
>
> [1]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00469.html
> [2]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00488.html
> [3]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00554.html
> [4]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00681.html
> [5]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00715.html
> [6]:
> https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00819.html
>
> Aditya Gupta (3):
>   Remove 'frame' from prohibited commands list
>   Fix gdb_interface: restore gdb's output streams at end of
>     gdb_interface
>   ppc64: correct gdb passthroughs by implementing machdep->get_cpu_reg
>
> Alexey Makhalov (2):
>   set_context(): check if context is already current
>   vmware_guestdump: Various format versions support
>
> Tao Liu (9):
>   Leave only one gdb thread for crash
>   Let crash change gdb context
>   Rename get_cpu_reg to get_current_task_reg
>   Print task pid/command instead of CPU index
>   Stop stack unwinding at non-kernel address
>   x86_64: Add gdb stack unwinding support
>   Fix cpumask_t recursive dependence issue
>   x86_64: Fix invalid input "=>" for bt command
>   arm64: Add gdb stack unwinding support
>
>  arm64.c            | 115 +++++++++++++++-
>  crash_target.c     |  71 ++++++----
>  defs.h             | 194 ++++++++++++++++++++++++++-
>  gdb-10.2.patch     |  82 ++++++++++++
>  gdb_interface.c    |  35 ++---
>  kernel.c           |  63 +++++++--
>  ppc64.c            | 175 +++++++++++++++++++++++-
>  symbols.c          |  15 +++
>  task.c             |  34 +++--
>  tools.c            |  10 +-
>  unwind_x86_64.h    |   4 -
>  vmware_guestdump.c | 321 +++++++++++++++++++++++++++++++-------------
>  x86_64.c           | 323 ++++++++++++++++++++++++++++++++++++++++-----
>  xen_hyper.c        |   2 +-
>  14 files changed, 1225 insertions(+), 219 deletions(-)
>
> --
> 2.40.1
>
--
Crash-utility mailing list -- devel@lists.crash-utility.osci.io
To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io
https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/
Contribution Guidelines: https://github.com/crash-utility/crash/wiki

Reply via email to