Glad to see it merged! Many colleagues were waiting for it. Thanks to everyone who contributed to this effort. --Alexey
On Wed, Nov 6, 2024 at 12:29 AM Aditya Gupta <adit...@linux.ibm.com> wrote:
> Hi all,
>
> Thanks Lianbo, Tao, Alexey and Daisuke for your reviews on this series.
>
> Feels amazing to finally see this merged!
>
> Thank you Tao for collaborating on this for so many months!
>
> Hope this helps many people. I have been pinged by multiple people in
> dev and support teams saying that this information can help them classify
> which subsystem an issue might be in.
>
> Thanks again,
>
> - Aditya Gupta
>
> On 04/11/24 13:39, lijiang wrote:
> > Thank you for working on this feature, Aditya, Tao and Alex. Great job!
> >
> > For the [PATCH v7 02/15 -15/15], rearranged them with minor changes:
> >
> > [1] https://github.com/crash-utility/crash/commit/21e0a345f97324b3472d573ed20ef098f0300fac
> > [2] https://github.com/crash-utility/crash/commit/c4db469af091edd1ea0897fbce41bc175375314b
> > [3] https://github.com/crash-utility/crash/commit/7c8a7dddda66b3d1043ba99516de57691033154a
> > [4] https://github.com/crash-utility/crash/commit/1fd80c623c205443fdd2a29b14c5230a09984147
> > [5] https://github.com/crash-utility/crash/commit/6dfda0d2235574cf80530ea92e0ddff270f9c039
> > [6] https://github.com/crash-utility/crash/commit/89ff1e45734457eb66905ef656775fcfd1b46aec
> > [7] https://github.com/crash-utility/crash/commit/968debd0d5979dd9ddca3af0766bad714dbd51e3
> >
> > BTW: there are still some known issues with this one, but they are not
> > critical, so they can be fixed later.
> >
> > Reminder: the current patchset has changed some function interfaces,
> > which may affect crash extensions.
> >
> > Thanks
> > Lianbo
> >
> > On Wed, Sep 11, 2024 at 10:25 AM lijiang <liji...@redhat.com> wrote:
> >
> > Hi, Tao
> >
> > Thank you for the update.
> >
> > The following patch is a regression issue, so I tend to discuss it
> > as a separate patch.
> > [PATCH v7 01/15] Fix the regression of cpumask_t for xen hyper
> >
> > In addition, I found another issue in my tests (on ppc64le): gdb bt
> > can display the back trace for the panic task, but when I switch to
> > another task, gdb bt cannot display the back trace:
> >
> > crash> gdb bt
> > #0  0xc0000000002bde04 in crash_setup_regs (newregs=0xc00000003264b858, oldregs=0x0) at ./arch/powerpc/include/asm/kexec.h:133
> > #1  0xc0000000002be4f8 in __crash_kexec (regs=0x0) at kernel/crash_core.c:122
> > #2  0xc00000000016c254 in panic (fmt=0xc0000000015eef20 "sysrq triggered crash\n") at kernel/panic.c:373
> > #3  0xc000000000a708b8 in sysrq_handle_crash (key=<optimized out>) at drivers/tty/sysrq.c:154
> > #4  0xc000000000a713d4 in __handle_sysrq (key=key@entry=99 'c', check_mask=check_mask@entry=false) at drivers/tty/sysrq.c:612
> > #5  0xc000000000a71e94 in write_sysrq_trigger (file=<optimized out>, buf=<optimized out>, count=2, ppos=<optimized out>) at drivers/tty/sysrq.c:1181
> > #6  0xc00000000073260c in pde_write (pde=0xc00000000af9cc00, file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:334
> > #7  proc_reg_write (file=<optimized out>, buf=<optimized out>, count=<optimized out>, ppos=<optimized out>) at fs/proc/inode.c:346
> > #8  0xc00000000063c0e0 in vfs_write (file=0xc0000000092d2900, buf=0x10012536f60 <error: Cannot access memory at address 0x10012536f60>, count=2, pos=0xc00000003264bd30) at fs/read_write.c:588
> > #9  vfs_write (file=0xc0000000092d2900, buf=0x10012536f60 <error: Cannot access memory at address 0x10012536f60>, count=<optimized out>, pos=0xc00000003264bd30) at fs/read_write.c:570
> > #10 0xc00000000063c690 in ksys_write (fd=<optimized out>, buf=0x10012536f60 <error: Cannot access memory at address 0x10012536f60>, count=2) at fs/read_write.c:643
> > #11 0xc000000000031a28 in system_call_exception (regs=0xc00000003264be80, r0=<optimized out>) at arch/powerpc/kernel/syscall.c:153
> > #12 0xc00000000000d05c in system_call_vectored_common () at arch/powerpc/kernel/interrupt_64.S:198
> >
> > crash> ps
> >       PID    PPID  CPU       TASK        ST  %MEM      VSZ      RSS  COMM
> >         0       0   0  c000000002bda980  RU   0.0        0        0  [swapper/0]
> > >       0       0   1  c000000003864c80  RU   0.0        0        0  [swapper/1]
> > ...
> >      8017     923   0  c000000043a20000  IN   0.2    22528    16256  sshd-session
> >      8025    8017   6  c000000032271880  IN   0.1    22784    11840  sshd-session
> > >    8026    8025   0  c000000043a26600  RU   0.1     9664     6208  bash
> > ...
> >     11645       2   3  c000000032264c80  ID   0.0        0        0  [kworker/u32:2]
> >     11738    6188   2  c00000003811b180  IN   0.1    43520     9408  pickup
> >     12326       2   0  c00000003226b280  ID   0.0        0        0  [kworker/0:1]
> >     13112    6089   2  c00000000c809900  IN   0.0     7232     3456  sleep
> >
> > Let's take the "pickup" task as an example:
> >
> > crash> set 11738
> >     PID: 11738
> > COMMAND: "pickup"
> >    TASK: c00000003811b180  [THREAD_INFO: c00000003811b180]
> >     CPU: 2
> >   STATE: TASK_INTERRUPTIBLE
> >
> > crash> gdb bt
> > #0  0xc0000000a7f876a0 in ?? ()
> > gdb: gdb request failed: bt
> > crash> set gdb on
> > gdb: on
> > gdb> bt
> > #0  0xc0000000a7f876a0 in ?? ()
> > gdb>
> >
> > Anyway, I ran the same test on x86_64 and aarch64, and it works as
> > expected. Can you help double-check on the ppc64 architecture?
> >
> > x86_64:
> > crash> set 14599
> >     PID: 14599
> > COMMAND: "pickup"
> >    TASK: ffff8f57a0d7c180  [THREAD_INFO: ffff8f57a0d7c180]
> >     CPU: 41
> >   STATE: TASK_INTERRUPTIBLE
> > crash> gdb bt
> > #0  0xffffffff8b3efe29 in context_switch (rq=0xffff8f6f1f835900, prev=0xffff8f57a0d7c180, next=0xffff8f5786720000, rf=0xffff9df22fea7b80) at kernel/sched/core.c:5208
> > #1  __schedule (sched_mode=sched_mode@entry=0) at kernel/sched/core.c:6549
> > #2  0xffffffff8b3f0217 in __schedule_loop (sched_mode=<optimized out>) at kernel/sched/core.c:6626
> > #3  schedule () at kernel/sched/core.c:6641
> > #4  0xffffffff8b3f6eef in schedule_hrtimeout_range_clock (expires=expires@entry=0xffff9df22fea7cb0, delta=<optimized out>, delta@entry=99999999, mode=mode@entry=HRTIMER_MODE_ABS, clock_id=clock_id@entry=1) at kernel/time/hrtimer.c:2293
> > #5  0xffffffff8b3f7003 in schedule_hrtimeout_range (expires=expires@entry=0xffff9df22fea7cb0, delta=delta@entry=99999999, mode=mode@entry=HRTIMER_MODE_ABS) at kernel/time/hrtimer.c:2340
> > #6  0xffffffff8aae301c in ep_poll (ep=0xffff8f5790d15d40, events=events@entry=0x7ffea91b6b90, maxevents=maxevents@entry=100, timeout=timeout@entry=0xffff9df22fea7d58) at fs/eventpoll.c:2062
> > #7  0xffffffff8aae3138 in do_epoll_wait (epfd=epfd@entry=8, events=events@entry=0x7ffea91b6b90, maxevents=maxevents@entry=100, to=0xffff9df22fea7d58) at fs/eventpoll.c:2464
> > #8  0xffffffff8aae44a1 in __do_sys_epoll_wait (epfd=<optimized out>, events=0x7ffea91b6b90, maxevents=<optimized out>, timeout=<optimized out>) at fs/eventpoll.c:2476
> > #9  __se_sys_epoll_wait (epfd=<optimized out>, events=<optimized out>, maxevents=<optimized out>, timeout=<optimized out>) at fs/eventpoll.c:2471
> > #10 __x64_sys_epoll_wait (regs=<optimized out>) at fs/eventpoll.c:2471
> > #11 0xffffffff8b3e293d in do_syscall_x64 (regs=0xffff9df22fea7f48, nr=232) at arch/x86/entry/common.c:52
> > #12 do_syscall_64 (regs=0xffff9df22fea7f48, nr=232) at arch/x86/entry/common.c:83
> > #13 0xffffffff8b40012f in entry_SYSCALL_64 () at arch/x86/entry/entry_64.S:121
> > crash>
> >
> > aarch64:
> > crash> set 9338
> >     PID: 9338
> > COMMAND: "pickup"
> >    TASK: ffff0000c7b05400  [THREAD_INFO: ffff0000c7b05400]
> >     CPU: 3
> >   STATE: TASK_INTERRUPTIBLE
> > crash> gdb bt
> > #0  __switch_to (prev=<unavailable>, prev@entry=0xffff0000c7b05400, next=next@entry=<unavailable>) at arch/arm64/kernel/process.c:555
> > #1  0xffffafc5b5ebd744 in context_switch (rq=0xffff00077bbd0ec0, prev=0xffff0000c7b05400, next=<unavailable>, rf=0xffff80008ac63a60) at kernel/sched/core.c:5208
> > #2  __schedule (sched_mode=sched_mode@entry=0) at kernel/sched/core.c:6549
> > #3  0xffffafc5b5ebdc2c in __schedule_loop (sched_mode=<optimized out>) at kernel/sched/core.c:6626
> > #4  schedule () at kernel/sched/core.c:6641
> > #5  0xffffafc5b5ec6030 in schedule_hrtimeout_range_clock (expires=expires@entry=0xffff80008ac63be8, delta=delta@entry=99999999, mode=mode@entry=HRTIMER_MODE_ABS, clock_id=clock_id@entry=1) at kernel/time/hrtimer.c:2293
> > #6  0xffffafc5b5ec618c in schedule_hrtimeout_range (expires=expires@entry=0xffff80008ac63be8, delta=delta@entry=99999999, mode=mode@entry=HRTIMER_MODE_ABS) at kernel/time/hrtimer.c:2340
> > #7  0xffffafc5b545d33c in ep_poll (ep=<unavailable>, events=events@entry=0xffffde5c3f68, maxevents=maxevents@entry=100, timeout=timeout@entry=0xffff80008ac63ce0) at fs/eventpoll.c:2062
> > #8  0xffffafc5b545d4e4 in do_epoll_wait (epfd=epfd@entry=8, events=events@entry=0xffffde5c3f68, maxevents=maxevents@entry=100, to=to@entry=0xffff80008ac63ce0) at fs/eventpoll.c:2464
> > #9  0xffffafc5b545d534 in do_epoll_pwait (epfd=epfd@entry=8, events=events@entry=0xffffde5c3f68, maxevents=maxevents@entry=100, to=to@entry=0xffff80008ac63ce0, sigsetsize=<optimized out>, sigmask=<optimized out>) at fs/eventpoll.c:2498
> > #10 0xffffafc5b545e7c8 in do_epoll_pwait (epfd=8, events=0xffffde5c3f68, maxevents=100, to=0xffff80008ac63ce0, sigmask=<optimized out>, sigsetsize=<optimized out>) at fs/eventpoll.c:2495
> > #11 __do_sys_epoll_pwait (epfd=8, events=0xffffde5c3f68, maxevents=100, timeout=<optimized out>, sigmask=<optimized out>, sigsetsize=<optimized out>) at fs/eventpoll.c:2511
> > #12 __se_sys_epoll_pwait (epfd=8, events=281474412330856, maxevents=100, timeout=<optimized out>, sigmask=<optimized out>, sigsetsize=<optimized out>) at fs/eventpoll.c:2505
> > #13 __arm64_sys_epoll_pwait (regs=<optimized out>) at fs/eventpoll.c:2505
> > #14 0xffffafc5b4fa99bc in __invoke_syscall (regs=0xffff80008ac63eb0, syscall_fn=<optimized out>) at arch/arm64/kernel/syscall.c:35
> > #15 invoke_syscall (regs=regs@entry=0xffff80008ac63eb0, scno=<optimized out>, sc_nr=sc_nr@entry=463, syscall_table=<optimized out>) at arch/arm64/kernel/syscall.c:49
> > #16 0xffffafc5b4fa9ac8 in el0_svc_common (sc_nr=463, syscall_table=<optimized out>, regs=0xffff80008ac63eb0, scno=<optimized out>) at arch/arm64/kernel/syscall.c:132
> > #17 do_el0_svc (regs=regs@entry=0xffff80008ac63eb0) at arch/arm64/kernel/syscall.c:151
> > #18 0xffffafc5b5eb6fa4 in el0_svc (regs=0xffff80008ac63eb0) at arch/arm64/kernel/entry-common.c:712
> > #19 0xffffafc5b5eb74c0 in el0t_64_sync_handler (regs=<optimized out>) at arch/arm64/kernel/entry-common.c:730
> > #20 0xffffafc5b4f91634 in el0t_64_sync () at arch/arm64/kernel/entry.S:598
> > crash>
> >
> > BTW: other changes are fine to me.
> >
> > Thanks
> > Lianbo
> >
> > On Wed, Sep 4, 2024 at 3:54 PM <devel-requ...@lists.crash-utility.osci.io> wrote:
> >
> > Date: Wed, 4 Sep 2024 19:49:25 +1200
> > From: Tao Liu <l...@redhat.com>
> > Subject: [Crash-utility] [PATCH v7 00/15] gdb stack unwinding support for crash utility
> > To: devel@lists.crash-utility.osci.io
> > Cc: Tao Liu <l...@redhat.com>
> > Message-ID: <20240904074940.21331-1-l...@redhat.com>
> > Content-Type: text/plain; charset=UTF-8
> >
> > This patchset is a rebase/merged version of the following 3 patchsets:
> >
> > 1): [PATCH v10 0/5] Improve stack unwind on ppc64 [1]
> > 2): [PATCH 0/5] x86_64 gdb stack unwinding support [2]
> > 3): Clean up on top of one-thread-v2 [3]
> >
> > A complete description of gdb stack unwinding support for crash can be
> > found in [1].
> >
> > This patchset can be divided into the following 3 parts:
> >
> > 1) part1: preparations before stack unwinding support, some
> >    bugs/regressions found when drafting this patchset.
> > 2) part2: common part for all CPU archs, mainly dealing with
> >    crash_target.c/gdb_interface.c files, in order to
> >    support different archs.
> > 3) part3: arch specific, for each ppc64/x86_64/arm64/vmware
> >    stack unwinding support.
> >
> > === part 3
> > arm64: Add gdb stack unwinding support
> > vmware_guestdump: Various format versions support
> > x86_64: Add gdb stack unwinding support
> > ppc64: correct gdb passthroughs by implementing machdep->get_current_task_reg
> >
> > === part 2
> > Conditionally output gdb stack unwinding stop reasons
> > Stop stack unwinding at non-kernel address
> > Print task pid/command instead of CPU index
> > Rename get_cpu_reg to get_current_task_reg
> > Let crash change gdb context
> > Leave only one gdb thread for crash
> > Remove 'frame' from prohibited commands list
> >
> > === part 1
> > Fix gdb_interface: restore gdb's output streams at end of gdb_interface
> > x86_64: Fix invalid input "=>" for bt command
> > Fix cpumask_t recursive dependence issue
> > Fix the regression of cpumask_t for xen hyper
> > ===
> >
> > v7 -> v6:
> > 1) Reorganised the patchset, dividing it into 3 parts instead of the
> >    previous 2 parts.
> > 2) Reworked the cpumask_t part, which resolves comment No.4
> >    pointed out by Lianbo in [4].
> > 3) Added conditional output for the failure message of gdb stack
> >    unwinding.
> >    See [PATCH 11/15] Conditionally output gdb stack unwinding stop reasons
> > 4) Redrafted the commit messages and updated some outdated info.
> > 5) Merged "Let crash change gdb context" and "set_context(): check if
> >    context is already current" into one.
> >
> > [4]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg01067.html
> >
> > v6 -> v5:
> > 1) Refactored patch 4 & 9, which changed the function signature of
> >    get_cpu_reg/get_current_task_reg, so that each patch compiles with
> >    no error when added on.
> > 2) Rebased the patchset on top of latest upstream:
> >    ("79b93ecb2e72ec Fix a "Bus error" issue caused by 'crash --osrelease' or
> >    crash loading")
> >
> > v5 -> v4:
> > 1) Plenty of code refactoring based on Lianbo's comments on v4.
> > 2) Removed the magic number when dealing with regs bitmap, see [6].
> > 3) Rebased the patchset on top of latest upstream:
> >    ("1c6da3eaff8207 arm64: Fix bt command show wrong stacktrace on ramdump source")
> >
> > v4 -> v3:
> > Fixed the author issue in [PATCH v3 06/16] Fix gdb_interface: restore gdb's
> > output streams at end of gdb_interface.
> >
> > v3 -> v2:
> > 1) Updated CC list as pointed out in [4]
> > 2) Compiling issues as in [5]
> >
> > v2 -> v1:
> > 1) Added the patch: x86_64: Fix invalid input "=>" for bt command,
> >    thanks to Kazu's testing.
> > 2) Modified the patch: x86_64: Add gdb stack unwinding support,
> >    adding pcp_save, spp_save and sp to restore the values in line with
> >    the original code logic.
> >
> > [1]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00469.html
> > [2]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00488.html
> > [3]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00554.html
> > [4]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00681.html
> > [5]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00715.html
> > [6]: https://www.mail-archive.com/devel@lists.crash-utility.osci.io/msg00819.html
> >
> > Aditya Gupta (3):
> >   Fix gdb_interface: restore gdb's output streams at end of gdb_interface
> >   Remove 'frame' from prohibited commands list
> >   ppc64: correct gdb passthroughs by implementing machdep->get_current_task_reg
> >
> > Alexey Makhalov (1):
> >   vmware_guestdump: Various format versions support
> >
> > Tao Liu (11):
> >   Fix the regression of cpumask_t for xen hyper
> >   Fix cpumask_t recursive dependence issue
> >   x86_64: Fix invalid input "=>" for bt command
> >   Leave only one gdb thread for crash
> >   Let crash change gdb context
> >   Rename get_cpu_reg to get_current_task_reg
> >   Print task pid/command instead of CPU index
> >   Stop stack unwinding at non-kernel address
> >   Conditionally output gdb stack unwinding stop reasons
> >   x86_64: Add gdb stack unwinding support
> >   arm64: Add gdb stack unwinding support
> >
> >  arm64.c            | 120 +++++++++++++++--
> >  crash_target.c     |  71 ++++++----
> >  defs.h             | 194 ++++++++++++++++++++++++++-
> >  gdb-10.2.patch     |  96 ++++++++++++++
> >  gdb_interface.c    |  39 ++----
> >  kernel.c           |  63 +++++++--
> >  ppc64.c            | 174 +++++++++++++++++++++++-
> >  symbols.c          |  15 +++
> >  task.c             |  34 +++--
> >  tools.c            |  16 ++-
> >  unwind_x86_64.h    |   4 -
> >  vmware_guestdump.c | 321 +++++++++++++++++++++++++++++++-------------
> >  x86_64.c           | 323 ++++++++++++++++++++++++++++++++++++++++-----
> >  13 files changed, 1247 insertions(+), 223 deletions(-)
> >
> > --
> > 2.40.1