> Date: Sat, 8 May 2021 13:42:40 +0200
> From: Sebastien Marie <sema...@online.fr>
> 
> On Thu, May 06, 2021 at 09:32:28AM +0200, Sebastien Marie wrote:
> > Hi,
> > 
> > Anindya did a good analysis of the problem with mpv using the gpu video
> > output backend (it uses EGL and mesa, if I followed correctly).
> > 
> > 
> > For people not reading ports@, here is a summary: the destructor function
> > passed to pthread_key_create() needs to stay present in memory until
> > _rthread_tls_destructors() is called.
> > 
> > In the case of mesa, the eglInitialize() function could load, via
> > dlopen(), code which will use pthread_key_create() with a destructor.
> > 
> > Once dlclose() is called, the object is unloaded from memory, but a
> > reference to the destructor is kept, leading to a segfault when
> > _rthread_tls_destructors() runs and calls the destructor (because it
> > points to unloaded code).
> >
> 
> I went deeper in the analysis.
> 
> At first, I thought that the pthread_key_create() call was coming from
> the mesa driver (radeonsi_dri.so on my machine), as pinning the DSO in
> memory (using LD_PRELOAD) avoided the segfault.
> 
> In fact, it isn't directly radeonsi_dri.so but another library it
> depends on: libLLVM.so.5.0 in this case (with
> LD_PRELOAD=.../libLLVM.so.5.0, the crash disappears).
> 
> Searching for where the pthread_key_create() call is located, I found that
> it comes from the emutls implementation (which uses
> pthread_key_create + destructor), which is statically linked in from
> compiler-rt.a
> 
> By instrumenting pthread_key_create, I have the following backtrace
> (the abort(3) is mine):
> 
> (gdb) bt
> #0  thrkill () at /tmp/-:3
> #1  0x000005188f550abe in _libc_abort () at /usr/src/lib/libc/stdlib/abort.c:51
> #2  0x000005191c7e8c2b in pthread_key_create () from /home/semarie/Documents/devel/libhijacking/libthread.so
> #3  0x00000519399e6a87 in emutls_init () at /usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/emutls.c:118
> #4  0x000005188f55b4f7 in pthread_once (once_control=0x51939d00b30 <emutls_init_once.once>, init_routine=0x27240efb23d627ef) at /usr/src/lib/libc/thread/rthread_once.c:26
> #5  0x00000519399e68dd in emutls_init_once () at /usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/emutls.c:125
> #6  emutls_get_index (control=0x51939cae5c8 <__emutls_v._ZL25TimeTraceProfilerInstance>) at /usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/emutls.c:316
> #7  __emutls_get_address (control=0x51939cae5c8 <__emutls_v._ZL25TimeTraceProfilerInstance>) at /usr/src/gnu/lib/libcompiler_rt/../../llvm/compiler-rt/lib/builtins/emutls.c:379
> #8  0x00000519387f296e in llvm::getTimeTraceProfilerInstance() () from /usr/lib/libLLVM.so.5.0
> #9  0x0000051938ec2bf2 in llvm::legacy::PassManagerImpl::run(llvm::Module&) () from /usr/lib/libLLVM.so.5.0
> #10 0x000005193974eb67 in LLVMRunPassManager () from /usr/lib/libLLVM.so.5.0
> #11 0x00000518d11276d8 in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
> #12 0x00000518d1082761 in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
> #13 0x00000518d110b1ea in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
> #14 0x00000518d0c7939c in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
> #15 0x00000518d0c794ed in ?? () from /usr/X11R6/lib/modules/dri/radeonsi_dri.so
> #16 0x00000518d1cfbec1 in _rthread_start (v=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/lib/librthread/rthread.c:96
> #17 0x000005188f52bd2a in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84
> 
> It means that the emutls implementation we are using can't be used
> safely by code that is loaded with dlopen(3) and later unloaded.
> 
> 
> I made the following PoC using __thread :
> 
> $ cat lib.c
> #include <stdio.h>
> 
> __thread int value = 0;
> 
> void
> fn()
> {
>       printf("entering:  %s\n", __func__);
>       value = 1;
>       printf("returning: %s\n", __func__);
> }
> 
> $ cat main.c
> #include <err.h>
> #include <dlfcn.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <pthread.h>
> 
> void *
> loadcode(void *arg)
> {
>       void *lib;
>       void (*fn)();
> 
>       printf("thread: entering\n");
>       
>       printf("dlopen(3)\n");
>       if ((lib = dlopen("./lib.so", 0)) == NULL)
>               errx(EXIT_FAILURE, "dlopen: %s", dlerror());
>       
>       if ((fn = dlsym(lib, "fn")) == NULL)
>               errx(EXIT_FAILURE, "dlsym: %s", dlerror());
> 
>       fn();
> 
>       printf("dlclose(3)\n");
>       if (dlclose(lib) != 0)
>               errx(EXIT_FAILURE, "dlclose: %s", dlerror());
> 
>       printf("thread: returning\n");
>       return arg;
> }
> 
> int
> main(int argc, char *argv[])
> {
>       int error;
>       pthread_t th;
> 
>       if ((error = pthread_create(&th, NULL, &loadcode, NULL)) != 0)
>               errc(error, EXIT_FAILURE, "pthread_create");
>       
>       if ((error = pthread_join(th, NULL)) != 0)
>               errc(error, EXIT_FAILURE, "pthread_join");
>               
>       return EXIT_SUCCESS;
> }
> 
> $ cc lib.c -Wall -lpthread -shared -fPIC -o lib.so
> $ cc main.c -Wall -lpthread
> $ ./a.out
> thread: entering
> dlopen(3)
> entering:  fn
> returning: fn
> dlclose(3)
> thread: returning
> Segmentation fault (core dumped) 
> 
> (gdb) bt
> #0  0x000004eb3011aec0 in ?? ()
> #1  0x000004eb7fc41b75 in _rthread_tls_destructors (thread=0x4eaccaa5c40) at /usr/src/lib/libc/thread/rthread_tls.c:182
> #2  0x000004eb7fbd9cd3 in _libc_pthread_exit (retval=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/lib/libc/thread/rthread.c:150
> #3  0x000004eb06814ec9 in _rthread_start (v=<error reading variable: Unhandled dwarf expression opcode 0xa3>) at /usr/src/lib/librthread/rthread.c:97
> #4  0x000004eb7fbedd2a in __tfork_thread () at /usr/src/lib/libc/arch/amd64/sys/tfork_thread.S:84

I'm not surprised.  I fear the only solution is for us to implement proper TLS 
support in ld.so.
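
For readers who want to see the mechanism in isolation, here is a minimal
sketch of what emutls relies on, as described in the quoted analysis:
pthread_key_create() saves a pointer to the destructor, and libc calls through
that pointer when the thread exits. Everything stays in one object here, so
nothing is unloaded and nothing crashes. The names destructor(), worker() and
count_destructor_runs() are made up for this illustration and are not taken
from emutls.c or libc.

```c
#include <pthread.h>
#include <stddef.h>

static pthread_key_t key;
static int destructor_runs;	/* only read after pthread_join() */

/*
 * Called by libc at thread exit, through the function pointer that
 * pthread_key_create() stored. If this code had been unmapped by
 * dlclose(3) in the meantime, that call would jump to unmapped memory.
 */
static void
destructor(void *value)
{
	(void)value;
	destructor_runs++;
}

static void *
worker(void *arg)
{
	/* A non-NULL value is what makes the destructor run at thread exit. */
	if (pthread_setspecific(key, &key) != 0)
		return NULL;
	return arg;
}

/* Returns how many times the destructor ran, or -1 on error. */
int
count_destructor_runs(void)
{
	pthread_t th;

	destructor_runs = 0;
	if (pthread_key_create(&key, destructor) != 0)
		return -1;
	if (pthread_create(&th, NULL, worker, NULL) != 0)
		return -1;
	if (pthread_join(th, NULL) != 0)
		return -1;
	pthread_key_delete(key);
	return destructor_runs;
}
```

Calling count_destructor_runs() from a main() (linked with -lpthread) shows
the destructor being invoked on behalf of the exiting thread. In the PoC
above, the equivalent destructor lives in lib.so, which is gone by the time
the thread exits.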
