Hi tao Thanks for the reply, but I don't have any other patch for now, and kernel_NR_CPUS seems to be used in many codes.
Thanks Guanyou Tao Liu <l...@redhat.com> 于2024年11月26日周二 13:14写道: > On Fri, Nov 22, 2024 at 11:08 PM Guanyou Chen <chenguanyou9...@gmail.com> > wrote: > > > > Hi tao > > > > > The reason is, kt->kernel_NR_CPUS might be large(5120 in this case), > > > without the filter of in_cpu_map(), it will exhaust the memory buffer. > > > > I hadn't thought about this before, why don't we choose kt->cpus but > kt->kernel_NR_CPUS ? > > There are differences of the 2. nt_prstatus_percpu[i] should be > iterated by kt->kernel_NR_CPUS rather than kt->cpus right? Because > kt->cpus deals only with online cpu, kt->kernel_NR_CPUS is for all > possible CPUs. If you have a better solution, please post the patch so > we can discuss it against the code itself. > > > > > Thanks, > > Gunayou > > > > Tao Liu <l...@redhat.com> 于2024年11月22日周五 17:15写道: > >> > >> Hi Guanyou, > >> > >> On Sat, Nov 2, 2024 at 1:35 AM Guanyou Chen <chenguanyou9...@gmail.com> > wrote: > >> > > >> > Hi Lianbo, Tao > >> > > >> > Remove offline status check, We can query the registers of > >> > each CPU at any time and obtain their stack. > >> > > >> > CPU 0: [OFFLINE] > >> > X0: 0000000000000000 X1: 0000000000000000 X2: 0000000000000000 > >> > X3: 000000000003fcbc X4: 0000000000000001 X5: 0000000000000000 > >> > X6: 0000000000000000 X7: 0000000000000000 X8: 00000000ffffffff > >> > X9: ffffffc009e6ae48 X10: ffffffc009e6ae20 X11: 0000000000000000 > >> > X12: 0000000000000002 X13: 0000000000000004 X14: 0000000000000000 > >> > X15: 0000000000004000 X16: 00000000f90f05f6 X17: 00000000f90f05f6 > >> > X18: 0000000000000000 X19: 0000000000000002 X20: ffffffc009e3b008 > >> > X21: ffffffc00a01d020 X22: ffffffc009f798f0 X23: 0000000060001000 > >> > X24: 0000000000000000 X25: 0000000000000000 X26: 0000000000000000 > >> > X27: 0000000000000000 X28: ffffff8111eecb00 X29: ffffffc008003f50 > >> > LR: ffffffc00802df88 SP: ffffffc008003f40 PC: ffffffc00802df94 > >> > PSTATE: 024003c5 FPVALID: 00000000 > >> > > >> > crash> bt -c 0 > >> > PID: 1842 TASK: ffffff8111eecb00 CPU: 0 COMMAND: "android.bg" > >> > 00 [ffffffc008003f50] ipi_handler at ffffffc00802df90 > >> > 01 [ffffffc008003f90] handle_percpu_devid_irq at ffffffc008146f50 > >> > 02 [ffffffc008003fd0] generic_handle_domain_irq at ffffffc00813f484 > >> > 03 [ffffffc008003fe0] gic_handle_irq at ffffffc008010140 > >> > --- <IRQ stack> --- > >> > 04 [ffffffc019c3be20] call_on_irq_stack at ffffffc008016ed4 > >> > 05 [ffffffc019c3be40] do_interrupt_handler at ffffffc008019cb4 > >> > 06 [ffffffc019c3be60] el0_interrupt at ffffffc008f7b848 > >> > 07 [ffffffc019c3be90] __el0_irq_handler_common at ffffffc008f7b368 > >> > 08 [ffffffc019c3bea0] el0t_64_irq_handler at ffffffc008f7b344 > >> > 09 [ffffffc019c3bfe0] el0t_64_irq at ffffffc008011720 > >> > PC: 0000000072415108 LR: 00000000724150d0 SP: > 0000007691d2bfa0 > >> > X29: 00000000734f60e0 X28: 000000001a2fa678 X27: > 0000000000000063 > >> > X26: 000000001a2fa678 X25: 000000001a2fa678 X24: > 000000001a7bb718 > >> > X23: 000000001a7ba198 X22: 000000001a7ba190 X21: > b4000076f9a828c8 > >> > X20: 0000000000000000 X19: b4000076f9a82800 X18: > 000000768d68a000 > >> > X17: 00000000708f89f8 X16: 00000000000000f0 X15: > 0000000000000000 > >> > X14: 0000007691d2bca0 X13: 0000000080100000 X12: > 0000000000000000 > >> > X11: 0000000000000000 X10: 0000000000000000 X9: > 9636716211228cd4 > >> > X8: 9636716211228cd4 X7: 0000000000000010 X6: > 000000001a7bb728 > >> > X5: 0000000070845200 X4: 0000000018a40d38 X3: > 00000000707e8f98 > >> > X2: 000000001a2fa678 X1: 000000001a7ba198 X0: > 0000000070847aa8 > >> > ORIG_X0: 00000000ffffff9c SYSCALLNO: ffffffff PSTATE: 60001000 > >> > > >> > Signed-off-by: Guanyou.Chen <chenguan...@xiaomi.com> > >> > --- > >> > netdump.c | 15 +++++---------- > >> > 1 file changed, 5 insertions(+), 10 deletions(-) > >> > > >> > diff --git a/netdump.c b/netdump.c > >> > index 435793b..455f90e 100644 > >> > --- a/netdump.c > >> > +++ b/netdump.c > >> > @@ -101,7 +101,7 @@ map_cpus_to_prstatus(void) > >> > nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS); > >> > > >> > for (i = 0; i < nrcpus; i++) { > >> > - if (in_cpu_map(ONLINE_MAP, i) && > machdep->is_cpu_prstatus_valid(i)) { > >> > + if (machdep->is_cpu_prstatus_valid(i)) { > >> > nd->nt_prstatus_percpu[i] = nt_ptr[i]; > >> > >> This patch has dependency on your previous "bugfix map cpus register" > >> patch. I'm not sure about the relations of the 2 patches, but they > >> don't seem to be independent. So please send them within one patchset > >> is preferred. > >> > >> However, for this patch, it will cause regressions after removing > >> in_cpu_map(ONLINE_MAP, i) check before > >> machdep->is_cpu_prstatus_valid(i), see the following stacktrace: > >> > >> ... > >> WARNING: cpu 2027: invalid NT_PRSTATUS note (n_type != NT_PRSTATUS) > >> WARNING: cpu 2028: invalid NT_PRSTATUS note (n_type != NT_PRSTATUS) > >> malloc_bp[1999]: 585a3c0 > >> smallest: 32 > >> largest: 65536 > >> embedded: 2032 > >> max_embedded: 2032 > >> mallocs: 2000 > >> frees: 0 > >> reqs/total: 2063/837500 > >> average size: 406 > >> > >> crash: cannot allocate any more memory! > >> ... > >> (gdb) bt > >> #0 getbuf (reqsize=368) at tools.c:6130 > >> #1 0x000000000065be0b in have_crash_notes (cpu=2029) at diskdump.c:123 > >> #2 0x000000000065bf57 in diskdump_is_cpu_prstatus_valid (cpu=2029) at > >> diskdump.c:155 > >> #3 0x000000000064b055 in map_cpus_to_prstatus () at netdump.c:104 > >> ... > >> > >> The reason is, kt->kernel_NR_CPUS might be large(5120 in this case), > >> without the filter of in_cpu_map(), it will exhaust the memory > >> buffer. > >> > >> Thanks, > >> Tao Liu > >> > >> > nd->num_prstatus_notes = > >> > MAX(nd->num_prstatus_notes, i+1); > >> > @@ -2998,15 +2998,10 @@ dump_registers_for_elf_dumpfiles(void) > >> > return; > >> > } > >> > > >> > - for (c = 0; c < kt->cpus; c++) { > >> > - if (check_offline_cpu(c)) { > >> > - fprintf(fp, "%sCPU %d: [OFFLINE]\n", c ? "\n" : "", c); > >> > - continue; > >> > - } > >> > - > >> > - fprintf(fp, "%sCPU %d:\n", c ? "\n" : "", c); > >> > - display_regs_from_elf_notes(c, fp); > >> > - } > >> > + for (c = 0; c < kt->cpus; c++) { > >> > + fprintf(fp, "%sCPU %d: %s\n", c ? "\n" : "", c, > check_offline_cpu(c) ? "[OFFLINE]" : "[ONLINE]"); > >> > + display_regs_from_elf_notes(c, fp); > >> > + } > >> > } > >> > > >> > struct x86_64_user_regs_struct { > >> > -- > >> > 2.34.1 > >> > > >> > Guanyou. > >> > Thanks. > >> > >
-- Crash-utility mailing list -- devel@lists.crash-utility.osci.io To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki