Re: patch: nm(1): add support for symbols created with -ffunction-sections
On Sat, 6 Nov 2021 15:20:03 +0100
Sebastien Marie wrote:

> When an object is compiled using -ffunction-sections...

> The following diff makes nm(1) properly mark the function 'T', by
> recognizing ".text.*" sections:

ok gkoehler@

nm(1) has more problems.  If one compiles with -fdata-sections, then
the mark on read-only symbols (like "const int var;") changes from 'R'
to 'D', because nm knows ".rodata" but not ".rodata.*".

If an object has >= 65280 sections, then nm fails.  Example is from
clang 11.1.0 on my macppc,

$ perl -E 'say "void f$_(void) {}" for 1..3'>exam.c
$ time cc -O2 -ffunction-sections -c exam.c
    7m40.20s real     7m33.96s user     0m03.29s system
$ nm exam.o
nm: exam.o: no section header table

This is because nm(1) can't read a section count >= 65280.
https://stackoverflow.com/a/30428833/3614563 quotes the ELF spec,

| If the number of sections is greater than or equal to SHN_LORESERVE
| (0xff00), e_shnum has the value SHN_UNDEF (0) and the actual number
| of section header table entries is contained in the sh_size field of
| the section header at index 0 ...

I found a large count in real C++ code, when I built llvm-project
from git on amd64.  Then nm(1) failed on some *.o file (and ar(1)
made a wrongly indexed *.a, so I used llvm-ar to unbreak my build).
I tried to fix nm(1), but then I got '?' and not 'T' on my function
symbols, because I didn't have your diff.  My sections were named
".text.*" and not ".text".  I switched to llvm-nm or another tool.

I didn't know that libtool(1) runs nm.  We might have a problem if
something uses libtool with -ffunction-sections or -fdata-sections
and has >= 65280 sections.

I don't know whether nm should check section names like ".text*", or
whether it should ignore section names and check bits like
SHF_EXECINSTR.  Because nm now checks section names, your diff
improves nm.
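To make the second option concrete, here is a minimal sketch of classifying a section from its header bits instead of its name. It is not taken from nm(1): struct sec_bits and section_class() are made-up names, the constants are copied from the ELF spec so that the snippet compiles without system headers, and returning '?' for every non-allocated section is a simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Constants from the ELF specification. */
#define SHT_NOBITS	8
#define SHF_WRITE	0x1
#define SHF_ALLOC	0x2
#define SHF_EXECINSTR	0x4

/* Hypothetical stand-in holding only the two Elf_Shdr fields used here. */
struct sec_bits {
	uint32_t sh_type;
	uint64_t sh_flags;
};

/*
 * Pick an nm-style letter from the section type and flags alone,
 * ignoring the section name.
 */
char
section_class(const struct sec_bits *sh)
{
	assert(sh != NULL);

	if (!(sh->sh_flags & SHF_ALLOC))
		return '?';	/* simplification: section is not loaded */
	if (sh->sh_flags & SHF_EXECINSTR)
		return 'T';	/* executable, e.g. .text.test_fn (AX) */
	if (sh->sh_type == SHT_NOBITS)
		return 'B';	/* occupies no file space, e.g. .bss */
	if (sh->sh_flags & SHF_WRITE)
		return 'D';	/* writable data */
	return 'R';		/* allocated, read-only */
}
```

With the AX flags that readelf shows for .text.test_fn, this returns 'T' whatever the section is called, and an allocated read-only section such as .rodata.* comes out as 'R'.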
--George

> diff cecccd4b3c548875286ca2b010c95cbce6c0e359 /home/semarie/repos/openbsd/src
> blob - 5aeef7a01a7cbff029299cfc5562cfcec085347f
> file + usr.bin/nm/elf.c
> --- usr.bin/nm/elf.c
> +++ usr.bin/nm/elf.c
> @@ -274,6 +274,8 @@ elf_shn2type(Elf_Ehdr *eh, u_int shn, const char *sn)
>  		return (-1);
>  	else if (!strcmp(sn, ELF_TEXT))
>  		return (N_TEXT);
> +	else if (!strncmp(sn, ".text.", 6))
> +		return (N_TEXT);
>  	else if (!strcmp(sn, ELF_RODATA))
>  		return (N_SIZE);
>  	else if (!strcmp(sn, ELF_OPENBSDRANDOMDATA))
> @@ -355,6 +357,7 @@ elf2nlist(Elf_Sym *sym, Elf_Ehdr *eh, Elf_Shdr *shdr,
>  			} else if (sn != NULL && *sn != 0 &&
>  			    strcmp(sn, ELF_INIT) &&
>  			    strcmp(sn, ELF_TEXT) &&
> +			    strncmp(sn, ".text.", 6) &&
>  			    strcmp(sn, ELF_FINI))	/* XXX GNU compat */
>  				np->nl.n_other = '?';
>  			break;
>
> The change on elf_shn2type() isn't strictly necessary for my use-case,
> but it should make .text.* support better (recognize N_TEXT for
> STT_NOTYPE, STT_OBJECT, STT_TLS).
>
> Afterwards, nm(1) properly recognizes the symbol:
>
> $ /usr/obj/usr.bin/nm/nm test.o
> d .L.str
> W __llvm_retpoline_r11
> W __retguard_759
> U printf
> F test.c
> T test_fn
>
> and it makes libtool(1) happy (LT/Archive.pm: get_symbollist
> function), and it makes the librsvg build happy (which is playing with
> symbols at build time), and it should make aja@ happy too.
>
> Comments or OK ?
> --
> Sebastien Marie
>

--
George Koehler
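The ELF rule quoted in the message above, about objects with 65280 or more sections, can be sketched in a few lines. This is only an illustration of the rule and not a patch for nm(1): real_shnum() and the two reduced structs are hypothetical names that carry only the fields involved, where the real Elf_Ehdr and Elf_Shdr have many more members.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values from the ELF specification. */
#define SHN_UNDEF	0
#define SHN_LORESERVE	0xff00

/* Hypothetical reduced views of Elf_Ehdr and of the section header at index 0. */
struct mini_ehdr {
	uint16_t e_shnum;
};
struct mini_shdr0 {
	uint64_t sh_size;
};

/*
 * Return the real number of section headers.  When e_shnum is
 * SHN_UNDEF (0), the count is stored in sh_size of section 0.
 * shdr0 may be NULL when the file has no section header table.
 */
uint64_t
real_shnum(const struct mini_ehdr *eh, const struct mini_shdr0 *shdr0)
{
	assert(eh != NULL);

	if (eh->e_shnum != SHN_UNDEF)
		return eh->e_shnum;	/* ordinary case, below SHN_LORESERVE */
	if (shdr0 == NULL)
		return 0;		/* really no sections */
	return shdr0->sh_size;		/* extended section count */
}
```

A reader that treats e_shnum == 0 as "no section header table" without looking at section 0 would fail in the same way as the nm exam.o run shown above.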
Re: patch: nm(1): add support for symbols created with -ffunction-sections
On Sat, Nov 06, 2021 at 03:20:03PM +0100, Sebastien Marie wrote:
> Comments or OK ?

You made me happy :-)
--
Antoine
patch: nm(1): add support for symbols created with -ffunction-sections
Hi,

aja@ showed me some problems with the x11/gnome/librsvg update (the
port is Rust based), and I finally tracked the problem down to nm(1).

I will not speak of Rust anymore, and will use only C for the example.

When an object is compiled using -ffunction-sections, the
compiler/linker will use one section per function (if I correctly
understood the usual purpose, it is to be able to easily discard
unused sections/functions at linking time).

$ cat test.c
#include <stdio.h>

void
test_fn(void)
{
	printf("test_fn()\n");
}

$ cc -Wall -c test.c -ffunction-sections
$ readelf --sections test.o | grep -A1 test_fn
  [ 3] .text.test_fn     PROGBITS         0040
       0040  AX       0     0     16
$ readelf -s test.o

Symbol table '.symtab' contains 8 entries:
   Num: Value  Size Type    Bind   Vis      Ndx Name
     0:           0 NOTYPE  LOCAL  DEFAULT  UND
     1:           0 FILE    LOCAL  DEFAULT  ABS test.c
     2:          11 OBJECT  LOCAL  DEFAULT    7 .L.str
     3:           0 SECTION LOCAL  DEFAULT    3
     4:          24 FUNC    WEAK   HIDDEN     6 __llvm_retpoline_r11
     5:           8 OBJECT  WEAK   HIDDEN     9 __retguard_759
     6:           0 NOTYPE  GLOBAL DEFAULT  UND printf
     7:          64 FUNC    GLOBAL DEFAULT    3 test_fn

The problem is that nm(1) doesn't recognize the test_fn type as a TEXT
function:

$ nm test.o
d .L.str
W __llvm_retpoline_r11
W __retguard_759
U printf
F test.c
? test_fn

The test_fn symbol should be 'T', but it is reported as '?'.
llvm-nm(1) is working correctly (but we don't have it in base):

$ llvm-nm test.o
r .L.str
W __llvm_retpoline_r11
V __retguard_759
U printf
T test_fn

The following diff makes nm(1) properly mark the function 'T', by
recognizing ".text.*" sections:

diff cecccd4b3c548875286ca2b010c95cbce6c0e359 /home/semarie/repos/openbsd/src
blob - 5aeef7a01a7cbff029299cfc5562cfcec085347f
file + usr.bin/nm/elf.c
--- usr.bin/nm/elf.c
+++ usr.bin/nm/elf.c
@@ -274,6 +274,8 @@ elf_shn2type(Elf_Ehdr *eh, u_int shn, const char *sn)
 		return (-1);
 	else if (!strcmp(sn, ELF_TEXT))
 		return (N_TEXT);
+	else if (!strncmp(sn, ".text.", 6))
+		return (N_TEXT);
 	else if (!strcmp(sn, ELF_RODATA))
 		return (N_SIZE);
 	else if (!strcmp(sn, ELF_OPENBSDRANDOMDATA))
@@ -355,6 +357,7 @@ elf2nlist(Elf_Sym *sym, Elf_Ehdr *eh, Elf_Shdr *shdr,
 			} else if (sn != NULL && *sn != 0 &&
 			    strcmp(sn, ELF_INIT) &&
 			    strcmp(sn, ELF_TEXT) &&
+			    strncmp(sn, ".text.", 6) &&
 			    strcmp(sn, ELF_FINI))	/* XXX GNU compat */
 				np->nl.n_other = '?';
 			break;

The change on elf_shn2type() isn't strictly necessary for my use-case,
but it should make .text.* support better (recognize N_TEXT for
STT_NOTYPE, STT_OBJECT, STT_TLS).

Afterwards, nm(1) properly recognizes the symbol:

$ /usr/obj/usr.bin/nm/nm test.o
d .L.str
W __llvm_retpoline_r11
W __retguard_759
U printf
F test.c
T test_fn

and it makes libtool(1) happy (LT/Archive.pm: get_symbollist
function), and it makes the librsvg build happy (which is playing with
symbols at build time), and it should make aja@ happy too.

Comments or OK ?
--
Sebastien Marie
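For anyone who wants to try the name test outside of nm(1), the check added by the diff can be written as a small standalone helper. This is a sketch only: section_matches() is a made-up name that does not exist in elf.c, and unlike the diff it takes the base name as a parameter, so the same test could also be applied to other per-symbol sections.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return 1 if the section name sn is exactly base (".text") or starts
 * with base followed by a dot (".text.test_fn"), and 0 otherwise.
 */
int
section_matches(const char *sn, const char *base)
{
	size_t len;

	assert(sn != NULL && base != NULL);

	len = strlen(base);
	if (strncmp(sn, base, len) != 0)
		return 0;	/* the first len characters differ */
	/* sn[len] is in bounds: the first len characters matched base. */
	return sn[len] == '\0' || sn[len] == '.';
}
```

Checking the character after the prefix keeps a name like ".textual" from matching ".text", which is the same effect the diff gets by including the trailing dot in ".text." and comparing 6 characters.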