Re: fix ld -Z on powerpc
> Date: Tue, 18 Aug 2015 19:29:58 +0000
> From: Miod Vallat m...@online.fr
>
> 2. I believe that some light smarts could be added to bfd to make
>    .ctors and .dtors read-only when linking a true static binary,
>    which would alleviate crt0 (.got being already RO in that case).
>    So I will come up with a binutils diff shortly, so that you will
>    be able to put your crt0 diff in.

Doesn't look like .got is already RO in that case. But we can make it
so. The diff below forces .got, .ctors and .dtors to be read-only if
there is no .dynamic section, i.e. for truly static executables. This
makes the segment that contains those sections read-only for
executables built with -static -nopie:

$ gcc -static -nopie -o hello hello.c
$ readelf -a hello.o
...
Program Headers:
  Type           Offset             VirtAddr           PhysAddr
                 FileSiz            MemSiz              Flags  Align
  LOAD           0x0000000000000000 0x0000000000100000 0x0000000000100000
                 0x0000000000016b74 0x0000000000016b74  R E    100000
  LOAD           0x0000000000016b80 0x0000000000216b80 0x0000000000216b80
                 0x00000000000017cc 0x00000000000017cc  R      100000
  LOAD           0x0000000000018350 0x0000000000318350 0x0000000000318350
                 0x0000000000000590 0x0000000000000590  RW     100000
  LOAD           0x00000000000188e0 0x00000000004188e0 0x00000000004188e0
                 0x00000000000006d8 0x00000000000006d8  R      100000
  LOAD           0x0000000000018fb8 0x0000000000718fb8 0x0000000000718fb8
                 0x0000000000001088 0x000000000000c150  RW     100000
  NOTE           0x0000000000000238 0x0000000000100238 0x0000000000100238
                 0x0000000000000018 0x0000000000000018  R      4
  OPENBSD_RANDOM 0x0000000000018350 0x0000000000318350 0x0000000000318350
                 0x0000000000000008 0x0000000000000008  RW     8

 Section to Segment mapping:
  Segment Sections...
   00     .note.openbsd.ident .init .text .fini
   01     .rodata .eh_frame
   02     .openbsd.randomdata .jcr .data.rel.ro
   03     .got .ctors .dtors
   04     .data .bss
   05     .note.openbsd.ident
   06     .openbsd.randomdata

All other types retain a writable segment, which is made read-only by
either ld.so or rcrt0.o.

Tested on macppc and sparc64.

ok?

Index: ld/emultempl/elf32.em
===================================================================
RCS file: /cvs/src/gnu/usr.bin/binutils-2.17/ld/emultempl/elf32.em,v
retrieving revision 1.3
diff -u -p -r1.3 elf32.em
--- ld/emultempl/elf32.em	4 Jul 2011 23:58:26 -0000	1.3
+++ ld/emultempl/elf32.em	21 Aug 2015 11:20:44 -0000
@@ -1109,6 +1109,22 @@ EOF
 if test x"$LDEMUL_AFTER_OPEN" != xgld"$EMULATION_NAME"_after_open; then
 cat >>e${EMULATION_NAME}.c <<EOF
 
+static void
+gld${EMULATION_NAME}_force_readonly(lang_input_statement_type *s)
+{
+  asection *sec;
+
+  if (s->the_bfd == NULL)
+    return;
+
+  sec = bfd_get_section_by_name (s->the_bfd, ".ctors");
+  if (sec)
+    sec->flags |= SEC_READONLY;
+  sec = bfd_get_section_by_name (s->the_bfd, ".dtors");
+  if (sec)
+    sec->flags |= SEC_READONLY;
+}
+
 /* This is called after all the input files have been opened.  */
 
 static void
@@ -1119,6 +1135,27 @@ gld${EMULATION_NAME}_after_open (void)
   /* We only need to worry about this when doing a final link.  */
   if (link_info.relocatable || !link_info.executable)
     return;
+
+  /* If we don't have a .dynamic section, we have no relocations, and
+     we can make .got, .ctors and .dtors read-only.  This will make
+     the segment containing those sections to be read-only in static
+     executables.  */
+  if (link_info.hash->type == bfd_link_elf_hash_table
+      && !elf_hash_table (&link_info)->dynamic_sections_created)
+    {
+      bfd *dynobj = elf_hash_table (&link_info)->dynobj;
+
+      if (dynobj != NULL)
+        {
+          asection *sec;
+
+          sec = bfd_get_section_by_name (dynobj, ".got");
+          if (sec)
+            sec->flags |= SEC_READONLY;
+        }
+
+      lang_for_each_input_file (gld${EMULATION_NAME}_force_readonly);
+    }
 
   /* Get the list of files which appear in DT_NEEDED entries in
      dynamic objects included in the link (often there will be none).
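The effect described in this message can be checked from the program headers alone: an address is protected only if the LOAD segment covering it lacks the write flag. The following is a minimal sketch of that check in C, not code from the thread. The `sk_` names and the trimmed-down header struct are hypothetical; the flag values are the ones fixed by the ELF specification, and the three table entries are copied from the readelf output in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values fixed by the ELF specification; prefixed to avoid clashing with <elf.h>. */
#define SK_PT_LOAD 1u
#define SK_PF_X    0x1u
#define SK_PF_W    0x2u
#define SK_PF_R    0x4u

/* Hypothetical, trimmed-down program header with only the fields used here. */
struct sk_phdr {
	uint32_t type;   /* PT_* value */
	uint32_t flags;  /* PF_* bits */
	uint64_t vaddr;  /* start of the segment in memory */
	uint64_t memsz;  /* size of the segment in memory */
};

/* Segments 02, 03 and 04 from the readelf output in the message. */
const struct sk_phdr sk_hello_phdrs[] = {
	{ SK_PT_LOAD, SK_PF_R | SK_PF_W, 0x318350, 0x590 },  /* 02: .data.rel.ro ... */
	{ SK_PT_LOAD, SK_PF_R,           0x4188e0, 0x6d8 },  /* 03: .got .ctors .dtors */
	{ SK_PT_LOAD, SK_PF_R | SK_PF_W, 0x718fb8, 0xc150 }, /* 04: .data .bss */
};

/* Return the LOAD segment covering addr, or NULL if none does. */
const struct sk_phdr *
sk_find_load(const struct sk_phdr *ph, size_t n, uint64_t addr)
{
	assert(ph != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (ph[i].type != SK_PT_LOAD)
			continue;
		if (addr >= ph[i].vaddr && addr - ph[i].vaddr < ph[i].memsz)
			return &ph[i];
	}
	return NULL;
}

/* True if addr lies in a LOAD segment that is mapped without write permission. */
bool
sk_mapped_readonly(const struct sk_phdr *ph, size_t n, uint64_t addr)
{
	const struct sk_phdr *seg = sk_find_load(ph, n, addr);

	return seg != NULL && (seg->flags & SK_PF_W) == 0;
}
```

Calling `sk_mapped_readonly(sk_hello_phdrs, 3, 0x4188e0)` asks whether the start of segment 03 is write-protected. A real checker would read the headers from the file instead of a hard-coded table.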
Re: fix ld -Z on powerpc
> Date: Tue, 18 Aug 2015 05:03:19 +0000
> From: Miod Vallat m...@online.fr
>
> Wouldn't something like that address the problem better?

Unfortunately not. Using PROVIDE() makes the __got_start/end symbols
disappear from shared libraries. If I use:

  GOTSTART="__got_start = .;"

I end up with the same absolute symbols as before. You really have to
tie the symbols to a specific section that isn't discarded to make them
end up as normal symbols. And that is difficult since we try to make
__got_start/end cover multiple sections, some of which may or may not
be present.

I still think my diff to remove code is the way to go. As I explained,
it isn't really clear which set of symbols the references in crt0.o
will actually resolve to. And the secure-plt work will require changes
to the code anyway to make sure we don't end up with an executable GOT
after all.
Re: fix ld -Z on powerpc
On Tue, Aug 18, 2015 at 6:22 AM, Mark Kettenis mark.kette...@xs4all.nl wrote:
> Unfortunately not. Using PROVIDE() makes the __got_start/end symbols
> disappear from shared libraries. If I use:
>
>   GOTSTART="__got_start = .;"
>
> I end up with the same absolute symbols as before. You really have to
> tie the symbols to a specific section that isn't discarded to make
> them end up as normal symbols. And that is difficult since we try to
> make __got_start/end cover multiple sections, some of which may or
> may not be present.
>
> I still think my diff to remove code is the way to go. As I
> explained, it isn't really clear which set of symbols the references
> in crt0.o will actually resolve to. And the secure-plt work will
> require changes to the code anyway to make sure we don't end up with
> an executable GOT after all.

Would KEEP statements in a linker script be a suitable workaround?

https://sourceware.org/binutils/docs/ld/Input-Section-Keep.html#Input-Section-Keep

--david
Re: fix ld -Z on powerpc
> Date: Tue, 18 Aug 2015 06:58:43 -0400
> From: David Higgs hig...@gmail.com
>
> Would KEEP statements in a linker script be a suitable workaround?
>
> https://sourceware.org/binutils/docs/ld/Input-Section-Keep.html#Input-Section-Keep

No, that doesn't work, presumably because .gotpad0 and .gotpad1 aren't
really input sections.
Re: fix ld -Z on powerpc
> Unfortunately not. Using PROVIDE() makes the __got_start/end symbols
> disappear from shared libraries. If I use:
>
>   GOTSTART="__got_start = .;"
>
> I end up with the same absolute symbols as before. You really have to
> tie the symbols to a specific section that isn't discarded to make
> them end up as normal symbols. And that is difficult since we try to
> make __got_start/end cover multiple sections, some of which may or
> may not be present.
>
> I still think my diff to remove code is the way to go. As I
> explained, it isn't really clear which set of symbols the references
> in crt0.o will actually resolve to. And the secure-plt work will
> require changes to the code anyway to make sure we don't end up with
> an executable GOT after all.

This sucks. I'd love to see a libelf-based linker appear eventually,
but to be a drop-in replacement for GNU ld, it needs linker script
support, and then the nightmare starts.

That said, I only have two things to say:

1. Going the opposite way of strict memory permissions, even in a
   diminishing use case (true static binaries), does not fit with the
   ``no compromises'' OpenBSD way of doing things.

2. I believe that some light smarts could be added to bfd to make
   .ctors and .dtors read-only when linking a true static binary,
   which would alleviate crt0 (.got being already RO in that case).
   So I will come up with a binutils diff shortly, so that you will
   be able to put your crt0 diff in.
Re: fix ld -Z on powerpc
> I spent some time today figuring out why the binutils update broke
> ld -Z on powerpc. Turns out to be a fairly thorny issue.
>
> The new binutils discard empty sections. As a result the .gotpad0
> and .gotpad1 sections have disappeared. As a consequence, the
> __got_start and __got_end symbols are now absolute symbols, as the
> section they referred to is no longer there. For example, an older
> libc has:
>
>    845: 000eeb68     0 NOTYPE  GLOBAL DEFAULT   17 __got_start
>
> whereas -current has:
>
>    810: 000eeb58     0 NOTYPE  GLOBAL DEFAULT  ABS __got_start
>
> On powerpc, crt0.o has weak references to __got_start and __got_end.
> When building a binary with ld -Z, these are resolved to the absolute
> symbols from libc. At runtime we then use these values, relocated as
> if they were addresses within the binary itself, to change
> protections and flush the instruction cache. This is very likely to
> result in a segmentation fault.
>
> There is probably a linker bug here, as it doesn't make any sense for
> the linker to pick the address of these symbols from libc and stick
> it into the binary. But I'm not sure about this. And it isn't all
> that obvious what the fix would be. Is the bug that the symbols end
> up as absolute? Or is the problem that it subsequently resolves these
> to the values from libc.so?

Wouldn't something like that address the problem better?

Index: elf.sc
===================================================================
RCS file: /OpenBSD/src/gnu/usr.bin/binutils-2.17/ld/scripttempl/elf.sc,v
retrieving revision 1.8
diff -u -p -r1.8 elf.sc
--- elf.sc	9 Aug 2014 04:49:47 -0000	1.8
+++ elf.sc	18 Aug 2015 05:02:41 -0000
@@ -195,10 +195,12 @@ if test "$NO_PAD" = "y" ; then
   PAD_RO0="${RELOCATING+${RODATA_ALIGN} + ${RODATA_ALIGN_ADD_VAL};}"
   PAD_PLT0="${RELOCATING+. = ALIGN(${MAXPAGESIZE}) + (. & (${MAXPAGESIZE} - 1));} .pltpad0 ${RELOCATING-0} : { ${RELOCATING+__plt_start = .;} }"
   PAD_PLT1=".pltpad1 ${RELOCATING-0} : { ${RELOCATING+__plt_end = .;}} ${RELOCATING+. = ALIGN(${MAXPAGESIZE}) + (. & (${MAXPAGESIZE} - 1));}"
-  PAD_GOT0="${RELOCATING+. = ALIGN(${MAXPAGESIZE}) + (. & (${MAXPAGESIZE} - 1));} .gotpad0 ${RELOCATING-0} : { ${RELOCATING+__got_start = .;} }"
-  PAD_GOT1=".gotpad1 ${RELOCATING-0} : { ${RELOCATING+__got_end = .;}} ${RELOCATING+. = ALIGN(${MAXPAGESIZE}) + (. & (${MAXPAGESIZE} - 1));}"
+  PAD_GOT0="${RELOCATING+. = ALIGN(${MAXPAGESIZE}) + (. & (${MAXPAGESIZE} - 1));}"
+  PAD_GOT1="${RELOCATING+. = ALIGN(${MAXPAGESIZE}) + (. & (${MAXPAGESIZE} - 1));}"
   test "$NO_PAD_CDTOR" = "y" || PAD_CDTOR=
 fi
+GOTSTART="PROVIDE (__got_start = .);"
+GOTEND="PROVIDE (__got_end = .);"
 CTOR=".ctors        ${CONSTRUCTING-0} :
   {
@@ -420,9 +422,11 @@ cat <<EOF
   ${OTHER_RELRO_SECTIONS}
   ${TEXT_DYNAMIC-${DYNAMIC}}
   ${DATA_GOT+${PAD_GOT+${PAD_GOT0}}}
+  ${DATA_GOT+${GOTSTART}}
   ${DATA_GOT+${DATA_NONEXEC_PLT+${PLT}}}
   ${DATA_GOT+${RELRO_NOW+${GOT}}}
   ${DATA_GOT+${RELRO_NOW+${GOTPLT}}}
+  ${DATA_GOT+${RELRO_NOW+${GOTEND}}}
   ${DATA_GOT+${RELRO_NOW+${PAD_GOT+${PAD_GOT1}}}}
   ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT+${GOT}}}}
   /* If PAD_CDTOR, and separate .got and .got.plt sections, CTOR and DTOR
@@ -430,11 +434,13 @@ cat <<EOF
   ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT+${PAD_CDTOR+${RELOCATING+${CTOR}}}}}}
   ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT+${PAD_CDTOR+${RELOCATING+${DTOR}}}}}}
   ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT+${PAD_GOT+${PAD_GOT1}}}}}
+  ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT+${GOTEND}}}}
   ${RELOCATING+${DATA_SEGMENT_RELRO_END}}
   ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT-${GOT}}}}
   ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT-${PAD_CDTOR+${RELOCATING+${CTOR}}}}}}
   ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT-${PAD_CDTOR+${RELOCATING+${DTOR}}}}}}
   ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT-${PAD_GOT+${PAD_GOT1}}}}}
+  ${DATA_GOT+${RELRO_NOW-${SEPARATE_GOTPLT-${GOTEND}}}}
   ${DATA_GOT+${RELRO_NOW-${GOTPLT}}}
   ${DATA_NONEXEC_PLT-${DATA_PLT+${PLT_BEFORE_GOT-${PAD_PLT+${PAD_PLT0}}}}}
@@ -458,6 +464,7 @@ cat <<EOF
   ${DATA_NONEXEC_PLT-${DATA_PLT+${PLT_BEFORE_GOT+${PLT}}}}
   ${DATA_NONEXEC_PLT-${DATA_PLT+${PLT_BEFORE_GOT+${PAD_PLT+${PAD_PLT1}}}}}
   ${SDATA_GOT+${PAD_GOT+${PAD_GOT0}}}
+  ${SDATA_GOT+${GOTSTART}}
   ${SDATA_GOT+${DATA_NONEXEC_PLT+${PLT}}}
   ${SDATA_GOT+${RELOCATING+${OTHER_GOT_SYMBOLS}}}
   ${SDATA_GOT+${GOT}}
@@ -468,6 +475,7 @@ cat <<EOF
   ${DATA_GOT-${PAD_CDTOR+${RELOCATING+${DTOR}}}}
   ${SDATA_GOT+${OTHER_GOT_SECTIONS}}
+  ${SDATA_GOT+${GOTEND}}
   ${SDATA_GOT+${PAD_GOT+${PAD_GOT1}}}
   ${SDATA}
Re: fix ld -Z on powerpc
> Date: Wed, 12 Aug 2015 15:48:57 +0200 (CEST)
> From: Mark Kettenis mark.kette...@xs4all.nl
>
> I spent some time today figuring out why the binutils update broke
> ld -Z on powerpc. Turns out to be a fairly thorny issue.
>
> The new binutils discard empty sections. As a result the .gotpad0
> and .gotpad1 sections have disappeared. As a consequence, the
> __got_start and __got_end symbols are now absolute symbols, as the
> section they referred to is no longer there. For example, an older
> libc has:
>
>    845: 000eeb68     0 NOTYPE  GLOBAL DEFAULT   17 __got_start
>
> whereas -current has:
>
>    810: 000eeb58     0 NOTYPE  GLOBAL DEFAULT  ABS __got_start
>
> On powerpc, crt0.o has weak references to __got_start and __got_end.
> When building a binary with ld -Z, these are resolved to the absolute
> symbols from libc. At runtime we then use these values, relocated as
> if they were addresses within the binary itself, to change
> protections and flush the instruction cache. This is very likely to
> result in a segmentation fault.
>
> There is probably a linker bug here, as it doesn't make any sense for
> the linker to pick the address of these symbols from libc and stick
> it into the binary. But I'm not sure about this. And it isn't all
> that obvious what the fix would be. Is the bug that the symbols end
> up as absolute? Or is the problem that it subsequently resolves these
> to the values from libc.so?
>
> It does point out a fundamental weakness about the approach we've
> taken with the __plt_start/end and __got_start/end symbols. They work
> fine if you use something like dlsym(3) to look up the value for a
> specific object, but if you rely on the default symbol resolution, it
> isn't clear if you get the right version. Therefore I think that the
> powerpc crt0.o code shouldn't be doing what it is doing. The diff
> below removes that code.
>
> This diff has a downside though. The GOT on -static -nopie binaries
> will no longer be read-only. I don't think that is a big loss, as
> -static -pie binaries are the default now and those do get a
> read-only GOT. If we think a read-only GOT for -static binaries is
> still important, there are a few other potential options to achieve
> this that don't need to rely on __got_start/end.

ping?

This will also make my life easier with the secure-plt changes that are
in the pipeline.
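The difference between a per-object lookup (the dlsym(3) case) and default symbol resolution can be illustrated with a toy model. This is only a sketch under simplifying assumptions: it is not how ld or ld.so are implemented, it ignores weak versus strong binding and symbol versioning, and every `sk_` name is hypothetical. It assumes a "first definition in search order wins" rule for the default lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, minimal symbol and object records for illustration only. */
struct sk_sym {
	const char *name;
	uint32_t value;
};

struct sk_obj {
	const char *name;
	const struct sk_sym *syms;
	size_t nsyms;
};

/* dlsym-style lookup: consult the symbols of one specific object only. */
const struct sk_sym *
sk_lookup_in(const struct sk_obj *obj, const char *name)
{
	assert(obj != NULL && name != NULL);
	for (size_t i = 0; i < obj->nsyms; i++) {
		if (strcmp(obj->syms[i].name, name) == 0)
			return &obj->syms[i];
	}
	return NULL;
}

/*
 * Default-style lookup: walk the objects in search order and return the
 * first definition found.  If owner is not NULL it receives the object
 * that supplied the definition.
 */
const struct sk_sym *
sk_lookup_default(const struct sk_obj *objs, size_t nobjs, const char *name,
    const struct sk_obj **owner)
{
	for (size_t i = 0; i < nobjs; i++) {
		const struct sk_sym *s = sk_lookup_in(&objs[i], name);

		if (s != NULL) {
			if (owner != NULL)
				*owner = &objs[i];
			return s;
		}
	}
	return NULL;
}
```

With a binary that no longer defines `__got_start` itself and a libc that does, `sk_lookup_default` returns libc's definition, so the caller ends up with a value that describes a different object than the one it meant to inspect. `sk_lookup_in` on the binary reports that the symbol is missing instead.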
fix ld -Z on powerpc
I spent some time today figuring out why the binutils update broke
ld -Z on powerpc. Turns out to be a fairly thorny issue.

The new binutils discard empty sections. As a result the .gotpad0 and
.gotpad1 sections have disappeared. As a consequence, the __got_start
and __got_end symbols are now absolute symbols, as the section they
referred to is no longer there. For example, an older libc has:

   845: 000eeb68     0 NOTYPE  GLOBAL DEFAULT   17 __got_start

whereas -current has:

   810: 000eeb58     0 NOTYPE  GLOBAL DEFAULT  ABS __got_start

On powerpc, crt0.o has weak references to __got_start and __got_end.
When building a binary with ld -Z, these are resolved to the absolute
symbols from libc. At runtime we then use these values, relocated as if
they were addresses within the binary itself, to change protections and
flush the instruction cache. This is very likely to result in a
segmentation fault.

There is probably a linker bug here, as it doesn't make any sense for
the linker to pick the address of these symbols from libc and stick it
into the binary. But I'm not sure about this. And it isn't all that
obvious what the fix would be. Is the bug that the symbols end up as
absolute? Or is the problem that it subsequently resolves these to the
values from libc.so?

It does point out a fundamental weakness about the approach we've taken
with the __plt_start/end and __got_start/end symbols. They work fine if
you use something like dlsym(3) to look up the value for a specific
object, but if you rely on the default symbol resolution, it isn't
clear if you get the right version. Therefore I think that the powerpc
crt0.o code shouldn't be doing what it is doing. The diff below removes
that code.

This diff has a downside though. The GOT on -static -nopie binaries
will no longer be read-only. I don't think that is a big loss, as
-static -pie binaries are the default now and those do get a read-only
GOT. If we think a read-only GOT for -static binaries is still
important, there are a few other potential options to achieve this that
don't need to rely on __got_start/end.

Index: powerpc/md_init.h
===================================================================
RCS file: /cvs/src/lib/csu/powerpc/md_init.h,v
retrieving revision 1.3
diff -u -p -r1.3 md_init.h
--- powerpc/md_init.h	26 Dec 2014 13:52:01 -0000	1.3
+++ powerpc/md_init.h	12 Aug 2015 13:08:28 -0000
@@ -64,88 +64,20 @@
 #include <sys/syscall.h>	/* for SYS_mprotect */
 #define STR(x) __STRING(x)
+
 #define	MD_CRT0_START	\
 __asm(	\
 "	.text	\n" \
 "	.section	\".text\"	\n" \
 "	.align 2	\n" \
-"	.size __got_start, 0	\n" \
-"	.type __got_start, @object	\n" \
-"	.size __got_end, 0	\n" \
-"	.type __got_end, @object	\n" \
-"	.weak __got_start	\n" \
-"	.weak __got_end	\n" \
 "	.globl _start	\n" \
 "	.type _start, @function	\n" \
 "	.globl __start	\n" \
 "	.type __start, @function	\n" \
 "_start:	\n" \
 "__start:	\n" \
-"	# move argument registers to saved registers for startup flush \n" \
-"	# ...except r6 (auxv) as ___start() doesn't need it	\n" \
-"	mr %r25, %r3	\n" \
-"	mr %r24, %r4	\n" \
-"	mr %r23, %r5	\n" \
-"	mr %r22, %r7	\n" \
-"	mflr	%r27	/* save off old link register */	\n" \
-"	bl 1f	\n" \
-"	# this instruction never gets executed but can be used	\n" \
-"	# to find the virtual address where the page is loaded.	\n" \
-"	bl _GLOBAL_OFFSET_TABLE_@local-4	\n" \
-"1:	\n" \
-"	mflr	%r6	# this stores where we are (+4)	\n" \
-"	lwz %r18, 0(%r6)	# load the instruction at offset_sym	\n" \
-"	# it contains an offset to the location	\n" \
-"	# of
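For readers who do not read PowerPC assembly, the "change protections" step that the removed startup code performs on the range between `__got_start` and `__got_end` can be sketched in C. This is an illustration under assumptions, not a translation of the crt0 code: the `sk_` helpers are hypothetical, the instruction cache flush is left out, and the range is passed in as two pointers instead of being read from weak symbols. Only `mprotect(2)` and `sysconf(3)` are real interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Round the half-open range [start, end) outward to page boundaries.
 * pagesz must be a power of two.
 */
void
sk_page_span(uintptr_t start, uintptr_t end, uintptr_t pagesz,
    uintptr_t *base, size_t *len)
{
	assert(pagesz != 0 && (pagesz & (pagesz - 1)) == 0);
	assert(start <= end);
	*base = start & ~(pagesz - 1);
	*len = ((end + pagesz - 1) & ~(pagesz - 1)) - *base;
}

/*
 * Make every page touching [start, end) read-only.  Returns 0 without
 * doing anything if the range is empty or either bound is NULL (as an
 * unresolved weak reference would be); otherwise returns the result of
 * mprotect(2).
 */
int
sk_protect_readonly(void *start, void *end)
{
	uintptr_t base;
	size_t len;

	if (start == NULL || end == NULL ||
	    (uintptr_t)start >= (uintptr_t)end)
		return 0;
	sk_page_span((uintptr_t)start, (uintptr_t)end,
	    (uintptr_t)sysconf(_SC_PAGESIZE), &base, &len);
	return mprotect((void *)base, len, PROT_READ);
}
```

The sketch also shows why wrong bounds are dangerous. If the two values are absolute numbers taken from libc instead of addresses inside the running binary, the computed range has nothing to do with the binary's own GOT. The call then either fails or changes the protection of unrelated pages.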