Hi, Aditya Thank you for the fix. On Fri, Sep 22, 2023 at 7:03 AM <crash-utility-requ...@redhat.com> wrote:
> Date: Thu, 21 Sep 2023 12:29:05 +0530 > From: Aditya Gupta <adit...@linux.ibm.com> > To: crash-utility@redhat.com > Cc: Hari Bathini <hbath...@linux.ibm.com>, Mahesh J Salgaonkar > <mah...@linux.ibm.com>, Sourabh Jain <sourabhj...@linux.ibm.com>, > d.hatay...@fujitsu.com > Subject: [Crash-utility] [PATCH v1] diskdump: add hook for additional > checks on prstatus notes validity > Message-ID: <20230921065905.1020839-1-adit...@linux.ibm.com> > Content-Type: text/plain; charset="US-ASCII"; x-default=true > > Upstream crash reports these warnings on PowerPC64: > > WARNING: cpu 0 invalid NT_PRSTATUS note (n_type != NT_PRSTATUS) > ... > > Apart from these warnings, register values are also invalid. > > This warning was found in the commit: > > commit db8c030857b4 ("diskdump/netdump: fix segmentation fault > caused by failure of stopping CPUs") > > With above commit, crash checks whether 'crash_notes' is initialised, > before mapping PRSTATUS notes. > > But some architectures such as PowerPC64, in fadump case > (firmware-assisted dump), don't populate 'crash_notes' since the > registers are already stored in the cpu notes in the vmcore. > > Instead of checking 'crash_notes' for all architectures, introduce > a machdep hook ('is_cpu_prstatus_valid'), for architectures to > decide validity checks for PRSTATUS notes > > A default hook ('diskdump_is_cpu_prstatus_valid') has also been provided > for all architectures other than PowerPC64, which checks if 'crash_notes' > for a given cpu is valid, maintaining the current behaviour > > PowerPC64 doesn't utilise 'crash_notes' to get register values, so no > additional checks are required > > Fixes: db8c030857b4 ("diskdump/netdump: fix segmentation fault caused by > failure of stopping CPUs") > Signed-off-by: Aditya Gupta <adit...@linux.ibm.com> > > --- > Testing > ======= > > NOTE: To test this on PowerPC64 with upstream kernel dump, AND on system > with Radix MMU, following patch will also be needed to be applied: > > Link: > https://listman.redhat.com/archives/crash-utility/2023-September/010961.html > > This is due to change in vmemmap address mapping for Radix MMU, since > following patch in the kernel: > > 368a0590d954: (powerpc/book3s64/vmemmap: switch radix to use a > different vmemmap handling function) > > More details about the change are in the linked patch. Basically what > changed is, the address mapping for vmemmap address is now in kernel > page table, in case of Radix MMU, instead of 'vmemmap_list' which is > currently > used in crash. > > Git Tree for Testing > ==================== > > 1. With this patch (diskdump: add hook for additional ...) applied: > > https://github.com/adi-g15-ibm/crash/tree/bugzilla-203256-list-v1 > > 2. With both this and the linked patch (ppc64: do page traversal ...) > applied: > > https://github.com/adi-g15-ibm/crash/tree/bugzilla-203256-withupstreamradix > > --- > --- > defs.h | 1 + > diskdump.c | 15 ++++++++++++--- > ppc64.c | 10 ++++++++++ > 3 files changed, 23 insertions(+), 3 deletions(-) > > diff --git a/defs.h b/defs.h > index 96a7a2a31471..f7f56947e5ac 100644 > --- a/defs.h > +++ b/defs.h > @@ -1073,6 +1073,7 @@ struct machdep_table { > int (*verify_line_number)(ulong, ulong, ulong); > void (*get_irq_affinity)(int); > void (*show_interrupts)(int, ulong *); > + int (*is_cpu_prstatus_valid)(int cpu); > I would suggest putting it at the end of this table. Although it may not break the compatibility of the extension module, just like the offset_table/size_table, I get used to doing that if there is no special reason. > int (*is_page_ptr)(ulong, physaddr_t *); > int (*get_cpu_reg)(int, int, const char *, int, void *); > }; > diff --git a/diskdump.c b/diskdump.c > index 2c284ff3f97f..ad9a00b08ce1 100644 > --- a/diskdump.c > +++ b/diskdump.c > @@ -142,13 +142,22 @@ int have_crash_notes(int cpu) > return TRUE; > } > > +int diskdump_is_cpu_prstatus_valid(int cpu) > +{ > + static int crash_notes_exists = -1; > + > + if (crash_notes_exists == -1) > + crash_notes_exists = kernel_symbol_exists("crash_notes"); > + > + return (!crash_notes_exists || have_crash_notes(cpu)); > +} > + > Got a warning as below: cc -c -g -DX86_64 -DLZO -DGDB_10_2 diskdump.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security diskdump.c:145:5: warning: no previous prototype for ‘diskdump_is_cpu_prstatus_valid’ [-Wmissing-prototypes] 145 | int diskdump_is_cpu_prstatus_valid(int cpu) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ > void > map_cpus_to_prstatus_kdump_cmprs(void) > { > void **nt_ptr; > int online, i, j, nrcpus; > size_t size; > - int crash_notes_exists; > > if (pc->flags2 & QEMU_MEM_DUMP_COMPRESSED) /* notes exist for all > cpus */ > goto resize_note_pointers; > @@ -171,10 +180,9 @@ map_cpus_to_prstatus_kdump_cmprs(void) > * Re-populate the array with the notes mapping to online cpus > */ > nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS); > - crash_notes_exists = kernel_symbol_exists("crash_notes"); > > for (i = 0, j = 0; i < nrcpus; i++) { > - if (in_cpu_map(ONLINE_MAP, i) && (!crash_notes_exists || > have_crash_notes(i))) { > + if (in_cpu_map(ONLINE_MAP, i) && > machdep->is_cpu_prstatus_valid(i)) { > dd->nt_prstatus_percpu[i] = nt_ptr[j++]; > dd->num_prstatus_notes = > MAX(dd->num_prstatus_notes, i+1); > @@ -1076,6 +1084,7 @@ diskdump_init(char *unused, FILE *fptr) > if (!DISKDUMP_VALID() && !KDUMP_CMPRS_VALID()) > return FALSE; > > + machdep->is_cpu_prstatus_valid = diskdump_is_cpu_prstatus_valid; > dd->ofp = fptr; > return TRUE; > } > diff --git a/ppc64.c b/ppc64.c > index fc34006f4863..1159b8c3a8e7 100644 > --- a/ppc64.c > +++ b/ppc64.c > @@ -298,6 +298,15 @@ struct machine_specific book3e_machine_specific = { > .is_vmaddr = book3e_is_vmaddr, > }; > > +/** > + * No additional checks are required on PPC64, for checking if PRSTATUS > notes > + * is valid > + */ > +int ppc64_is_cpu_prstatus_valid(int cpu) > +{ > + return TRUE; > +} > + > #define SKIBOOT_BASE 0x30000000 > > /* > @@ -418,6 +427,7 @@ ppc64_init(int when) > break; > > case POST_GDB: > + machdep->is_cpu_prstatus_valid = > ppc64_is_cpu_prstatus_valid; > The hook is set in the stage of POST_GDB, I'm wondering if the current warning is still shown in the crash minimal mode(with option --minimal). Can you help to confirm this one? And other changes are fine to me. Thanks. Lianbo ms = machdep->machspec; > > if (!(machdep->flags & BOOK3E)) { > -- > 2.41.0 >
-- Crash-utility mailing list Crash-utility@redhat.com https://listman.redhat.com/mailman/listinfo/crash-utility Contribution Guidelines: https://github.com/crash-utility/crash/wiki