Re: TLS with static non-PIE binaries
On Thu, Nov 9, 2017 at 12:07 AM, Charles Collicutt wrote:
> On Wed, Nov 08, 2017 at 10:58:09PM -0800, Philip Guenther wrote:
> > @nobits means the section won't have filespace (where the initialization
> > data is) included in the loaded data. It should work with @progbits
> > instead.
>
> Wow, I suck at assembly. However, once again binutils saved me:
>
> $ readelf -SW test.o | grep .tdata
> [ 4] .tdata PROGBITS 4c 04 00 WAT 0 0 4

Dang it, binutils keeps breaking all my pat answers! Totally unfair.

> I'm afraid changing it to @progbits in the source doesn't make any
> difference to the result. It still returns the correct thing when
> dynamically linked but zero when statically linked.

More alignment fun. The problem will go away if your data has an alignment of 8 or a size that's a multiple of 8. I need to think about the right way to represent the necessary bits at some time when I'm not falling asleep.

Philip Guenther
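The thread does not spell out why an alignment of 8 or a size that is a multiple of 8 avoids the problem. One reading that is consistent with that remark, offered here only as an assumption, is that the linker and the startup code disagree about how far below the thread pointer the TLS block starts. The sketch below compares the two distances. All names in it are hypothetical, and the rounding rules are taken from the examples in the comment of the patch, not from the actual linker or libc sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Round x up to a multiple of a, where a is a power of two. */
static size_t
sketch_round_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/*
 * Assumed linker view (variant II): the TLS block starts this many
 * bytes below the thread pointer, i.e. p_memsz rounded to p_align.
 */
static size_t
sketch_linker_distance(size_t p_memsz, size_t p_align)
{
	return sketch_round_up(p_memsz, p_align);
}

/*
 * Assumed startup view after the patch: the reserved size is p_memsz
 * rounded to the larger of p_align and the TIB's own alignment.
 */
static size_t
sketch_startup_size(size_t p_memsz, size_t p_align, size_t tib_align)
{
	size_t a = p_align > tib_align ? p_align : tib_align;

	return sketch_round_up(p_memsz, a);
}

/* True when both views place the start of the TLS block identically. */
static bool
sketch_layouts_agree(size_t p_memsz, size_t p_align, size_t tib_align)
{
	return sketch_linker_distance(p_memsz, p_align) ==
	    sketch_startup_size(p_memsz, p_align, tib_align);
}
```

Under these assumptions, the 4-byte, 4-aligned foo from test.s gives distances of 4 and 8, so the two views disagree. Raising the alignment to 8, or the size to a multiple of 8, makes them equal.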
Re: TLS with static non-PIE binaries
On Wed, Nov 08, 2017 at 10:58:09PM -0800, Philip Guenther wrote:
> @nobits means the section won't have filespace (where the initialization
> data is) included in the loaded data. It should work with @progbits
> instead.

Wow, I suck at assembly. However, once again binutils saved me:

$ readelf -SW test.o | grep .tdata
[ 4] .tdata PROGBITS 4c 04 00 WAT 0 0 4

I'm afraid changing it to @progbits in the source doesn't make any difference to the result. It still returns the correct thing when dynamically linked but zero when statically linked.

--
Charles
Re: TLS with static non-PIE binaries
On Wed, Nov 8, 2017 at 10:43 PM, Charles Collicutt wrote:
> On Sun, Nov 05, 2017 at 01:02:36PM -0800, Philip Guenther wrote:
> > Well, ld.so and libc _should_ currently support startup-time TLS using the
> > initial-exec and local-exec modules.
>
> I can't see support for R_x_TPOFF64 relocations in ld.so(1) so I
> don't think initial-exec will work. But local-exec doesn't require any
> relocations by the dynamic linker so that should work.

Hmm, yeah, that makes sense.

> > The diff below fixes that, at least on amd64, by checking whether no
> > AUX_phdr value was found and, if so, trying to instead find them via the
> > ELF header referenced via the linker-provided __executable_start symbol.
>
> The patch fixed the segfaults I was seeing. However, initialized TLS
> data doesn't work in static executables: it appears as zero.
>
> $ cat test.s
> .globl foo
> .section .tdata,"awT",@nobits

@nobits means the section won't have filespace (where the initialization data is) included in the loaded data. It should work with @progbits instead.

Philip Guenther
Re: TLS with static non-PIE binaries
On Sun, Nov 05, 2017 at 01:02:36PM -0800, Philip Guenther wrote:
> Well, ld.so and libc _should_ currently support startup-time TLS using the
> initial-exec and local-exec modules.

I can't see support for R_x_TPOFF64 relocations in ld.so(1) so I don't think initial-exec will work. But local-exec doesn't require any relocations by the dynamic linker so that should work.

> The diff below fixes that, at least on amd64, by checking whether no
> AUX_phdr value was found and, if so, trying to instead find them via the
> ELF header referenced via the linker-provided __executable_start symbol.

The patch fixed the segfaults I was seeing. However, initialized TLS data doesn't work in static executables: it appears as zero.

$ cat test.s
	.globl foo
	.section .tdata,"awT",@nobits
	.align 4
	.size foo, 4
foo:
	.int 42
	.text
	.globl get_foo
	.type get_foo, @function
get_foo:
	movl %fs:foo@tpoff, %eax
	ret
$ cc -o test.o -c test.s
$ cc -o test test.o main.c
$ ./test
42
$ cc -static -o test test.o main.c
$ ./test
0

PIE/no-PIE doesn't seem to make any difference to this.

I don't actually need initialized TLS data for my use, which is statically linking Go binaries, so you've fixed my problem: thank you!

I've run the complete Go test-suite with your patch applied and everything that I'd expect to pass does pass. This is on AMD64 only; I don't have access to any other architecture for testing.

--
Charles
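For readers who prefer C to assembly, the following is a rough C counterpart of test.s, written as a sketch rather than taken from the thread. It assumes a GCC- or Clang-compatible compiler that accepts the `__thread` storage class. Whether the compiler emits a native local-exec access like the `%fs:foo@tpoff` load above, or falls back to emulated TLS, depends on the platform and compiler flags.

```c
/*
 * Sketch only: an initialized thread-local int and an accessor,
 * mirroring foo and get_foo from test.s. __thread is a compiler
 * extension, not ISO C.
 */
__thread int foo = 42;

int
get_foo(void)
{
	/* Each thread reads its own copy, initialized from .tdata. */
	return foo;
}
```

The thread never shows main.c. A hypothetical caller would only need to print the value returned by get_foo(), which is what the 42 and 0 outputs above reflect.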
Re: TLS with static non-PIE binaries
On Sun, Nov 05, 2017 at 04:17:50PM -0800, Philip Guenther wrote:
> BTW, in the .s file in your original message you had this line:
> .type foo, @object
>
> Since foo is in an SHF_TLS section, it has to be of type STT_TLS.

You're right. But...

> Indeed, binutils is silently overriding what you wrote.

Yes, see: http://www.cygwin.com/ml/binutils/2002-11/msg00409.html

That assembly was actually generated by GCC. Clang generates the same. The reason is that not all assemblers understand @tls_object. For example, Sun as(1) wants @tls_obj instead. So GCC relies on the fact that it will be fixed later. See: https://gcc.gnu.org/ml/gcc-patches/2015-07/msg00284.html

Anyway, that was just a simple example to demonstrate the problem. I don't expect that code to outlive the e-mail. (At least, I hope not...)

--
Charles
Re: TLS with static non-PIE binaries
On Mon, 6 Nov 2017, Charles Collicutt wrote:
> On Sun, Nov 05, 2017 at 01:02:36PM -0800, Philip Guenther wrote:
> > The problem with static non-PIE executables is that we don't pass an AUX
> > vector to such processes, so the startup code can't find the TLS segment
> > and doesn't leave any space for it.
>
> Ah, thank you.

BTW, in the .s file in your original message you had this line:
	.type foo, @object

Since foo is in an SHF_TLS section, it has to be of type STT_TLS. Indeed, binutils is silently overriding what you wrote. I suggest you either update it to match the reality:
	.type foo, @tls_object
or
	.type foo, STT_TLS

...or just delete the .type declaration entirely.

Philip Guenther
Re: TLS with static non-PIE binaries
On Sun, Nov 05, 2017 at 01:02:36PM -0800, Philip Guenther wrote:
> The problem with static non-PIE executables is that we don't pass an AUX
> vector to such processes, so the startup code can't find the TLS segment
> and doesn't leave any space for it.

Ah, thank you.

> The diff below fixes that, at least on amd64 [...]

I will test it. Thanks for your help and explanation.

--
Charles
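The fallback quoted above, locating the program headers from the ELF header when no AUX vector is available, can be sketched in isolation. This is an illustration under simplifying assumptions, not the libc code. The struct types are hypothetical cut-down stand-ins for Elf_Ehdr and Elf_Phdr that carry only the fields used here. The value 7 for PT_TLS comes from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PT_TLS	7	/* PT_TLS in the ELF specification */

/* Hypothetical, reduced stand-in for Elf_Ehdr. */
struct sketch_ehdr {
	uint64_t e_phoff;	/* byte offset of the program header table */
	uint16_t e_phnum;	/* number of program headers */
};

/* Hypothetical, reduced stand-in for Elf_Phdr. */
struct sketch_phdr {
	uint32_t p_type;
	uint64_t p_vaddr;
	uint64_t p_filesz;
	uint64_t p_memsz;
	uint64_t p_align;
};

/* Find the program header table by adding e_phoff to the image base. */
static const struct sketch_phdr *
sketch_find_phdrs(const void *image_base, int *phnum)
{
	const struct sketch_ehdr *eh = image_base;

	*phnum = eh->e_phnum;
	return (const struct sketch_phdr *)
	    ((const char *)image_base + eh->e_phoff);
}

/* Return the PT_TLS program header, or NULL if there is none. */
static const struct sketch_phdr *
sketch_find_tls(const struct sketch_phdr *phdr, int phnum)
{
	for (int i = 0; i < phnum; i++) {
		if (phdr[i].p_type == SKETCH_PT_TLS)
			return &phdr[i];
	}
	return NULL;
}
```

In the real patch the image base is the linker-provided __executable_start symbol. Here any buffer laid out as a header followed by program headers will do.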
Re: TLS with static non-PIE binaries
On Sun, 5 Nov 2017, Stuart Henderson wrote:
> On 2017/11/05 21:08, Charles Collicutt wrote:
> > Hello,
> >
> > I have a program that uses Thread-Local Storage (TLS) with the 'Local Exec'
> > access model [1] on AMD64. This looks like it should work on OpenBSD and,
> > indeed, it mostly does.
>
> OpenBSD doesn't have real thread-local storage yet.

Well, ld.so and libc _should_ currently support startup-time TLS using the initial-exec and local-exec modules.

The problem with static non-PIE executables is that we don't pass an AUX vector to such processes, so the startup code can't find the TLS segment and doesn't leave any space for it. On a variant II arch like amd64, the negative offset of a TLS variable results in accessing before the beginning of the page the TCB allocation is on, thus the fault.

The diff below fixes that, at least on amd64, by checking whether no AUX_phdr value was found and, if so, trying to instead find them via the ELF header referenced via the linker-provided __executable_start symbol. That's the second and third chunks below. The first and fourth chunks correct the sizing of the allocation when the requested alignment was less than the natural alignment of the TIB itself, resulting in a misaligned TIB. The last chunk correctly relocates the pointer to the TLS initialization data (the .tdata segment), so that initialized TLS data works in static executables.

This will obviously require some testing...

Philip

Index: dlfcn/init.c
===
RCS file: /data/src/openbsd/src/lib/libc/dlfcn/init.c,v
retrieving revision 1.5
diff -u -p -r1.5 init.c
--- dlfcn/init.c	6 Sep 2016 18:49:34 -	1.5
+++ dlfcn/init.c	5 Nov 2017 21:02:13 -
@@ -34,6 +34,8 @@
 
 #include "init.h"
 
+#define MAX(a,b)	(((a)>(b))?(a):(b))
+
 /* XXX should be in an include file shared with csu */
 char	***_csu_finish(char **_argv, char **_envp, void (*_cleanup)(void));
 
@@ -53,8 +55,10 @@ struct dl_phdr_info	_static_phdr_info =
 
 static inline void early_static_init(char **_argv, char **_envp);
 static inline void setup_static_tib(Elf_Phdr *_phdr, int _phnum);
-#endif /* PIC */
 
+/* provided by the linker */
+extern Elf_Ehdr __executable_start[] __attribute__((weak));
+#endif /* PIC */
 
 /*
  * extract useful bits from the auxiliary vector and either
@@ -99,6 +103,15 @@ _csu_finish(char **argv, char **envp, vo
 	}
 
 #ifndef PIC
+	if (cleanup == NULL && phdr == NULL && __executable_start != NULL) {
+		/*
+		 * Static non-PIE processes don't get an AUX vector,
+		 * so find the phdrs through the ELF header
+		 */
+		phdr = (void *)((char *)__executable_start +
+		    __executable_start->e_phoff);
+		phnum = __executable_start->e_phnum;
+	}
 	/* static libc in a static link? */
 	if (cleanup == NULL)
 		setup_static_tib(phdr, phnum);
@@ -169,13 +182,22 @@ setup_static_tib(Elf_Phdr *phdr, int phn
 #elif TLS_VARIANT == 2
 			/*
 			 * variant 2 places the data before the TIB
-			 * so we need to round up to the alignment
+			 * so we need to round up the size to the
+			 * larger of the TLS data alignment and the
+			 * TIB's alignment.
+			 * Example A: p_memsz=24 p_align=16 align(TIB)=8
+			 * - need to allocate 32 bytes for TLS as compiler
+			 * - will give the first TLS symbol an offset of -32
+			 * Example B: p_memsz=4 p_align=4 align(TIB)=8
+			 * - need to allocate 8 bytes so that the TIB is
+			 * - properly aligned
 			 */
 			_static_tls_size = ELF_ROUND(phdr[i].p_memsz,
-			    phdr[i].p_align);
+			    MAX(__alignof__(struct tib), phdr[i].p_align));
 #endif
 			if (phdr[i].p_vaddr != 0 && phdr[i].p_filesz != 0) {
-				static_tls = (void *)phdr[i].p_vaddr;
+				static_tls = (void *)phdr[i].p_vaddr +
+				    _static_phdr_info.dlpi_addr;
 				static_tls_fsize = phdr[i].p_filesz;
 			}
 			break;
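The sizing change in the last hunk can be checked by hand with a small sketch. The helpers below are hypothetical and assume that ELF_ROUND rounds its first argument up to a multiple of a power-of-two alignment. They reproduce Example A and Example B from the comment and show why the old sizing could leave the TIB misaligned. With variant 2 the TIB follows the TLS data directly, so its offset from the start of the allocation equals the reserved TLS size.

```c
#include <stddef.h>

#define SKETCH_MAX(a, b)	(((a) > (b)) ? (a) : (b))

/* Round x up to a multiple of a, where a is a power of two. */
static size_t
sketch_round_up(size_t x, size_t a)
{
	return (x + a - 1) & ~(a - 1);
}

/* Sizing before the patch: only the TLS segment's alignment counts. */
static size_t
sketch_tls_size_old(size_t p_memsz, size_t p_align)
{
	return sketch_round_up(p_memsz, p_align);
}

/* Sizing after the patch: use the larger of the two alignments. */
static size_t
sketch_tls_size_new(size_t p_memsz, size_t p_align, size_t tib_align)
{
	return sketch_round_up(p_memsz, SKETCH_MAX(tib_align, p_align));
}

/*
 * The TIB starts tls_size bytes into the allocation, so it is aligned
 * only if tls_size is a multiple of tib_align (this assumes the
 * allocation itself starts on a suitably aligned address).
 */
static int
sketch_tib_is_aligned(size_t tls_size, size_t tib_align)
{
	return (tls_size & (tib_align - 1)) == 0;
}
```

For Example B the old sizing reserves 4 bytes, which puts the TIB at a misaligned offset. The new sizing reserves 8. Example A yields 32 either way.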
Re: TLS with static non-PIE binaries
The only "tls" support we have that's currently working is emulated tls, as exemplified by both clang and the gcc 4.9 port. See /usr/src/lib/libcompiler_rt/emutls.c for how it works. It's not incredibly fast, but it does the job... Fortunately, the gcc and clang people managed to make some kind of standard, so both emutls flavors are actually compatible.
Re: TLS with static non-PIE binaries
On Mon, Nov 06, 2017 at 12:58:18AM +0200, Paul Irofti wrote:
> We are missing dlopen-time TLS, dynamic TLS relocations and
> thread storage-specifier support.

Thanks. None of those things are needed for the Local Exec access model though, so that should be OK.

Do you have any idea what might cause the segfault I saw? If it's expected behaviour at this stage then that's fine. I just didn't think it looked right.

--
Charles
Re: TLS with static non-PIE binaries
On Sun, Nov 05, 2017 at 10:51:41PM +, Charles Collicutt wrote:
> On Sun, Nov 05, 2017 at 10:15:57PM +, Stuart Henderson wrote:
> > On 2017/11/05 21:08, Charles Collicutt wrote:
> > > I have a program that uses Thread-Local Storage (TLS) with the 'Local Exec'
> > > access model [1] on AMD64. This looks like it should work on OpenBSD and,
> > > indeed, it mostly does.
> >
> > OpenBSD doesn't have real thread-local storage yet.
>
> What does 'real' mean here? OpenBSD looks like it has everything needed for
> 'Local Exec' thread-local storage.
>
> You've got TLS block setup in ld.so(1), _csu_finish, and pthread_create(3).
> You've got the binutils linker, ld(1), which can handle R_x_TPOFF32
> relocations.
> You don't need anything else for local exec TLS.
>
> This is mainly interesting to me because the Go runtime uses local exec TLS to
> store a pointer to the current goroutine. I know people use Go on OpenBSD: if
> you're saying that isn't actually expected to work, that would be good to
> know.

We are missing dlopen-time TLS, dynamic TLS relocations and thread storage-specifier support.
Re: TLS with static non-PIE binaries
On Sun, Nov 05, 2017 at 10:15:57PM +, Stuart Henderson wrote:
> On 2017/11/05 21:08, Charles Collicutt wrote:
> > I have a program that uses Thread-Local Storage (TLS) with the 'Local Exec'
> > access model [1] on AMD64. This looks like it should work on OpenBSD and,
> > indeed, it mostly does.
>
> OpenBSD doesn't have real thread-local storage yet.

What does 'real' mean here? OpenBSD looks like it has everything needed for 'Local Exec' thread-local storage.

You've got TLS block setup in ld.so(1), _csu_finish, and pthread_create(3).
You've got the binutils linker, ld(1), which can handle R_x_TPOFF32 relocations.
You don't need anything else for local exec TLS.

This is mainly interesting to me because the Go runtime uses local exec TLS to store a pointer to the current goroutine. I know people use Go on OpenBSD: if you're saying that isn't actually expected to work, that would be good to know.

--
Charles
Re: TLS with static non-PIE binaries
On 2017/11/05 21:08, Charles Collicutt wrote:
> Hello,
>
> I have a program that uses Thread-Local Storage (TLS) with the 'Local Exec'
> access model [1] on AMD64. This looks like it should work on OpenBSD and,
> indeed, it mostly does.

OpenBSD doesn't have real thread-local storage yet.