Stephen Brennan <stephen.s.bren...@oracle.com> writes: > Tao Liu <l...@redhat.com> writes: >> Hi Stephen, >> >> Thanks for reporting the issue and patch. Please check if I understood >> you correctly. The correct output you are expecting in your case is: >> >> crash> sym ata_dummy_port_ops >> ffffffffc0a71580 (?) ata_dummy_port_ops [libata] >> crash> mod -s libata >> MODULE NAME TEXT_BASE SIZE >> OBJECT FILE >> ffffffffc0a7b640 libata ffffffffc0a47000 520192 >> >> /usr/lib/debug/lib/modules/6.12.0-0.11.8.el9uek.x86_64/kernel/drivers/ata/libata.ko.debug >> crash> sym ata_dummy_port_ops >> ffffffffc0a71580 (B) ata_dummy_port_ops [libata] >> crash> p/x &ata_dummy_port_ops >> $1 = 0xffffffffc0a6fe80 <<-------- should be 0xffffffffc0a71580, >> same as sym's output, right? > > Yes, that's correct. > >> If that is the case, then after applied your patch, the issue still >> exists on my machine: >> >> crash> sym fuse_miscdevice >> ffffffffc05f7fe0 (?) fuse_miscdevice [fuse] >> crash> mod -s fuse >> MODULE NAME TEXT_BASE >> SIZE OBJECT FILE >> ffffffffc05f8dc0 fuse ffffffffc05da000 >> 233472 >> /usr/lib/debug/lib/modules/6.11.5-300.fc41.x86_64/kernel/fs/fuse/fuse.ko.debug >> crash> sym fuse_miscdevice >> ffffffffc05f7fe0 (b) fuse_miscdevice [fuse] >> crash> p/x &fuse_miscdevice >> $1 = 0xffffffffc05daf20 << ------ unmatch with the previous value. > > Indeed, I checked with fuse on Fedora (6.11.4-301.fc41.x86_64, so almost > the same version). And I see the same behavior with fuse_miscdevice. > > In your case and mine, the offset is 0x1d0c0, which could be a clue to > how this mismatch happens. > >> The .data section of fuse.ko.debug have non-zero address: >> >> $ readelf -S >> /usr/lib/debug/lib/modules/6.11.5-300.fc41.x86_64/kernel/fs/fuse/fuse.ko.debug >> Section Headers: >> [Nr] Name Type Address Offset >> Size EntSize Flags Link Info Align >> ... >> [50] .data NOBITS 0000000000000e20 00000100 >> 000000000000080f 0000000000000000 WA 0 0 32 >> >> Could you please re-check your patch for this, or is there something >> I'm missing? > > From what I can tell, the important difference here may be that the > symbol you selected is a static variable (which becomes a local ELF > symbol). Apparently, my patch only resolves the issue for *global > variables*, but not for local variables. > > $ eu-readelf -s /usr/lib/debug/lib/modules/$(uname > -r)/kernel/fs/fuse/fuse.ko.debug | grep -P "fuse_(miscdevice|mutex)" > 85: 0000000000000100 80 OBJECT LOCAL DEFAULT 50 fuse_miscdevice > 1041: 00000000000005c0 32 OBJECT GLOBAL DEFAULT 50 fuse_mutex > > Here is the behavior of crash's master branch on these two symbols: > > crash> sym fuse_miscdevice > ffffffffc02fafe0 (?) fuse_miscdevice [fuse] > crash> sym fuse_mutex > ffffffffc02fb4a0 (?) fuse_mutex [fuse] > crash> mod -s fuse > MODULE NAME TEXT_BASE > SIZE OBJECT FILE > ffffffffc02fbdc0 fuse ffffffffc02dd000 > 233472 > /usr/lib/debug/lib/modules/6.11.4-301.fc41.x86_64/kernel/fs/fuse/fuse.ko.debug > crash> sym fuse_miscdevice > ffffffffc02fafe0 (b) fuse_miscdevice [fuse] > crash> sym fuse_mutex > ffffffffc02fb4a0 (B) fuse_mutex [fuse] > crash> p/x &fuse_miscdevice > $1 = 0xffffffffc02ddf20 > crash> p/x &fuse_mutex > $2 = 0xffffffffc02fa680 > > Both fuse_miscdevice, and fuse_mutex are mismatched (with different > offsets). And now, here's the behavior of crash with my patch: > > crash> sym fuse_miscdevice > ffffffffc02fafe0 (?) fuse_miscdevice [fuse] > crash> sym fuse_mutex > ffffffffc02fb4a0 (?) fuse_mutex [fuse] > crash> mod -s fuse > MODULE NAME TEXT_BASE > SIZE OBJECT FILE > ffffffffc02fbdc0 fuse ffffffffc02dd000 > 233472 > /usr/lib/debug/lib/modules/6.11.4-301.fc41.x86_64/kernel/fs/fuse/fuse.ko.debug > crash> sym fuse_miscdevice > ffffffffc02fafe0 (b) fuse_miscdevice [fuse] > crash> sym fuse_mutex > ffffffffc02fb4a0 (B) fuse_mutex [fuse] > crash> p/x &fuse_miscdevice > $1 = 0xffffffffc02ddf20 > crash> p/x &fuse_mutex > $2 = 0xffffffffc02fb4a0 > > Here the global fuse_mutex is correct, but the static/local > fuse_miscdevice is still incorrect. This indicates that there's more yet > to fix. (But at least my patch does part of the work!) > > I'm wondering whether this has anything to do with ELF relocations.
Hello Tao, I've continued debugging and verified that the root cause for this was GDB performing relocations to the DWARF info incorrectly. At the bottom of this email I've attached my patch, which you can apply to the master branch to test out my proposed fix. The patch file also contains a longer explanation of what I believe the issue is. However, since it updates the gdb-16.2.patch file, it's quite difficult to understand. Thus, I'll also provide a diff here of the relevant GDB source file. I'm not certain how the GDB patches are normally updated & reviewed. I also recognize that this change is a bit messy, so I'd appreciate your feedback as well. Thank you, Stephen --- tmp/gdb-16.2/gdb/symfile.c 2025-04-03 15:38:26.093760270 -0700 +++ gdb-16.2/gdb/symfile.c 2025-04-03 15:07:43.239691104 -0700 @@ -341,8 +341,13 @@ place_section (bfd *abfd, asection *sect return; /* If the user specified an offset, honor it. */ - if (offsets[gdb_bfd_section_index (abfd, sect)] != 0) + if (offsets[gdb_bfd_section_index (abfd, sect)] != 0) { + /* addr_info_make_relative() subtracts out the section VMA. But if the user + * specified an offset, they have already taken this into account. Add it + * back in */ + offsets[gdb_bfd_section_index (abfd, sect)] += bfd_section_vma(sect); return; + } /* Otherwise, let's try to find a place for the section. */ start_addr = (lowest + align - 1) & -align; @@ -628,35 +633,8 @@ default_symfile_offsets (struct objfile if ((bfd_get_file_flags (objfile->obfd.get ()) & (EXEC_P | DYNAMIC)) == 0) { bfd *abfd = objfile->obfd.get (); - asection *cur_sec; + asection *cur_sec = NULL; - for (cur_sec = abfd->sections; cur_sec != NULL; cur_sec = cur_sec->next) - /* We do not expect this to happen; just skip this step if the - relocatable file has a section with an assigned VMA. */ - if (bfd_section_vma (cur_sec) != 0 - /* - * Kernel modules may have some non-zero VMAs, i.e., like the - * __ksymtab and __ksymtab_gpl sections in this example: - * - * Section Headers: - * [Nr] Name Type Address Offset - * Size EntSize Flags Link Info Align - * ... - * [ 8] __ksymtab PROGBITS 0000000000000060 0000ad90 - * 0000000000000010 0000000000000000 A 0 0 16 - * [ 9] .rela__ksymtab RELA 0000000000000000 0000ada0 - * 0000000000000030 0000000000000018 43 8 8 - * [10] __ksymtab_gpl PROGBITS 0000000000000070 0000add0 - * 00000000000001a0 0000000000000000 A 0 0 16 - * ... - * - * but they should be treated as if they are NULL. - */ - && strncmp (bfd_section_name (cur_sec), "__k", 3) != 0) - break; - - if (cur_sec == NULL) - { section_offsets &offsets = objfile->section_offsets; /* Pick non-overlapping offsets for sections the user did not @@ -704,7 +682,6 @@ default_symfile_offsets (struct objfile offsets[cur_sec->index]); offsets[cur_sec->index] = 0; } - } } /* Remember the bfd indexes for the .text, .data, .bss and
>From 729b89edabc1c67c0b6272c22a7ab9e4f4c4f957 Mon Sep 17 00:00:00 2001 From: Stephen Brennan <stephen.s.bren...@oracle.com> Date: Thu, 3 Apr 2025 15:17:42 -0700 Subject: [PATCH] Fix bad relocations when module sh_addr is nonzero In the previous commit, we fixed crash's logic which determines the module load addresses in the presence of sections with nonzero sh_addr. This allowed GDB to correctly locate public variables (i.e. GLOBAL symbols) via the ELF symbol table (msymbols). However, LOCAL symbols still had incorrect addresses. The root cause for those issues is not in crash. Instead, GDB does not expect sections with nonzero sh_addr, and so it slipped up in multiple places: 1. In default_symfile_offsets(), GDB detects the nonzero sh_addr field and fails to apply the user-supplied section offsets. The result is that later, in symfile_relocate_debug_section(), relocations are applied to the DWARF info using the wrong section addresses, which results in invalid addresses for variables. Clearly, this has happened before, because crash has special-cased the "__ksymtab*" section names to avoid this condition. To resolve this, we simply drop the check for nonzero sh_addr altogether: the only ET_REL files we encounter should be kernel modules, so there's no real reason to be picky. 2. Even with that fixed, the user-supplied section addresses are clobbered by the addr_info_make_relative(), which subtracts out section offsets during its operation. To resolve this, undo the operation for ET_REL files where a section address was provided by the user (i.e. crash). With these two fixes, both local and global variables from a module section with nonzero sh_addr are correctly reported. Behavior is unchanged for modules with a zero sh_addr. Signed-off-by: Stephen Brennan <stephen.s.bren...@oracle.com> --- gdb-16.2.patch | 87 +++++++++++++++++++++++++++++--------------------- 1 file changed, 50 insertions(+), 37 deletions(-) diff --git a/gdb-16.2.patch b/gdb-16.2.patch index ee9bd2b..a6edba2 100644 --- a/gdb-16.2.patch +++ b/gdb-16.2.patch @@ -623,37 +623,50 @@ gdb_printf (_("Backtrace stopped: %s\n"), frame_stop_reason_string (trailing)); } ---- gdb-16.2/gdb/symfile.c.orig -+++ gdb-16.2/gdb/symfile.c -@@ -633,7 +633,26 @@ default_symfile_offsets (struct objfile *objfile, - for (cur_sec = abfd->sections; cur_sec != NULL; cur_sec = cur_sec->next) - /* We do not expect this to happen; just skip this step if the - relocatable file has a section with an assigned VMA. */ +--- gdb-16.2/gdb/symfile.c.orig 2025-04-03 14:56:36.640330084 -0700 ++++ gdb-16.2/gdb/symfile.c 2025-04-03 15:07:43.239691104 -0700 +@@ -341,8 +341,13 @@ place_section (bfd *abfd, asection *sect + return; + + /* If the user specified an offset, honor it. */ +- if (offsets[gdb_bfd_section_index (abfd, sect)] != 0) ++ if (offsets[gdb_bfd_section_index (abfd, sect)] != 0) { ++ /* addr_info_make_relative() subtracts out the section VMA. But if the user ++ * specified an offset, they have already taken this into account. Add it ++ * back in */ ++ offsets[gdb_bfd_section_index (abfd, sect)] += bfd_section_vma(sect); + return; ++ } + + /* Otherwise, let's try to find a place for the section. */ + start_addr = (lowest + align - 1) & -align; +@@ -628,16 +633,8 @@ default_symfile_offsets (struct objfile + if ((bfd_get_file_flags (objfile->obfd.get ()) & (EXEC_P | DYNAMIC)) == 0) + { + bfd *abfd = objfile->obfd.get (); +- asection *cur_sec; +- +- for (cur_sec = abfd->sections; cur_sec != NULL; cur_sec = cur_sec->next) +- /* We do not expect this to happen; just skip this step if the +- relocatable file has a section with an assigned VMA. */ - if (bfd_section_vma (cur_sec) != 0) -+ if (bfd_section_vma (cur_sec) != 0 -+ /* -+ * Kernel modules may have some non-zero VMAs, i.e., like the -+ * __ksymtab and __ksymtab_gpl sections in this example: -+ * -+ * Section Headers: -+ * [Nr] Name Type Address Offset -+ * Size EntSize Flags Link Info Align -+ * ... -+ * [ 8] __ksymtab PROGBITS 0000000000000060 0000ad90 -+ * 0000000000000010 0000000000000000 A 0 0 16 -+ * [ 9] .rela__ksymtab RELA 0000000000000000 0000ada0 -+ * 0000000000000030 0000000000000018 43 8 8 -+ * [10] __ksymtab_gpl PROGBITS 0000000000000070 0000add0 -+ * 00000000000001a0 0000000000000000 A 0 0 16 -+ * ... -+ * -+ * but they should be treated as if they are NULL. -+ */ -+ && strncmp (bfd_section_name (cur_sec), "__k", 3) != 0) - break; - - if (cur_sec == NULL) -@@ -1069,6 +1088,12 @@ symbol_file_add_with_addrs (const gdb_bfd_ref_ptr &abfd, const char *name, +- break; ++ asection *cur_sec = NULL; + +- if (cur_sec == NULL) +- { + section_offsets &offsets = objfile->section_offsets; + + /* Pick non-overlapping offsets for sections the user did not +@@ -685,7 +682,6 @@ default_symfile_offsets (struct objfile + offsets[cur_sec->index]); + offsets[cur_sec->index] = 0; + } +- } + } + + /* Remember the bfd indexes for the .text, .data, .bss and +@@ -1069,6 +1065,12 @@ symbol_file_add_with_addrs (const gdb_bf objfile *objfile = objfile::make (abfd, current_program_space, name, flags, parent); @@ -666,7 +679,7 @@ /* We either created a new mapped symbol table, mapped an existing symbol table file which has not had initial symbol reading -@@ -1095,6 +1120,7 @@ symbol_file_add_with_addrs (const gdb_bfd_ref_ptr &abfd, const char *name, +@@ -1095,6 +1097,7 @@ symbol_file_add_with_addrs (const gdb_bf styled_string (file_name_style.style (), name)); objfile->expand_all_symtabs (); @@ -674,7 +687,7 @@ } /* Note that we only print a message if we have no symbols and have -@@ -1352,6 +1378,10 @@ show_debug_file_directory (struct ui_file *file, int from_tty, +@@ -1352,6 +1355,10 @@ show_debug_file_directory (struct ui_fil #if ! defined (DEBUG_SUBDIRECTORY) #define DEBUG_SUBDIRECTORY ".debug" #endif @@ -685,7 +698,7 @@ /* Find a separate debuginfo file for OBJFILE, using DIR as the directory where the original file resides (may not be the same as -@@ -1390,6 +1420,15 @@ find_separate_debug_file (const char *dir, +@@ -1390,6 +1397,15 @@ find_separate_debug_file (const char *di if (separate_debug_file_exists (debugfile, crc32, objfile, warnings)) return debugfile; @@ -701,7 +714,7 @@ /* Then try in the global debugfile directories. Keep backward compatibility so that DEBUG_FILE_DIRECTORY being "" will -@@ -1545,6 +1584,14 @@ find_separate_debug_file_by_debuglink +@@ -1545,6 +1561,14 @@ find_separate_debug_file_by_debuglink } } @@ -716,7 +729,7 @@ return debugfile; } -@@ -2318,8 +2365,10 @@ add_symbol_file_command (const char *args, int from_tty) +@@ -2318,8 +2342,10 @@ add_symbol_file_command (const char *arg else if (section_addrs.empty ()) gdb_printf ("\n"); @@ -727,7 +740,7 @@ objf = symbol_file_add (filename.get (), add_flags, §ion_addrs, flags); -@@ -2660,6 +2709,7 @@ reread_symbols (int from_tty) +@@ -2660,6 +2686,7 @@ reread_symbols (int from_tty) objfile_name (objfile))); objfile->expand_all_symtabs (); @@ -735,7 +748,7 @@ } if (!objfile_has_symbols (objfile)) -@@ -3638,6 +3688,15 @@ bfd_byte * +@@ -3638,6 +3665,15 @@ bfd_byte * symfile_relocate_debug_section (struct objfile *objfile, asection *sectp, bfd_byte *buf) { -- 2.43.5
-- Crash-utility mailing list -- devel@lists.crash-utility.osci.io To unsubscribe send an email to devel-le...@lists.crash-utility.osci.io https://${domain_name}/admin/lists/devel.lists.crash-utility.osci.io/ Contribution Guidelines: https://github.com/crash-utility/crash/wiki