[Bug sanitizer/55354] [asan] by default, the asan run-time should be linked statically, not dynamically
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354 --- Comment #24 from Jakub Jelinek jakub at gcc dot gnu.org 2012-11-23 08:13:11 UTC --- When I rebuild libtsan with -fPIE instead of -fPIC in the Makefile, and g++ -shared -Wl,--whole-archive -o libtsanx.so libtsan.a -ldl -lpthread (note that .libs/*.o are still built with -fPIC because libtool overrides those flags and adds its -fPIC at the end), then this link fails with: /usr/bin/ld: libtsan.a(tsan_rtl.o): relocation R_X86_64_TPOFF32 against `_ZN6__tsan22cur_thread_placeholderE' can not be used when making a shared object; recompile with -fPIC libtsan.a(tsan_rtl.o): could not read symbols: Bad value collect2: error: ld returned 1 exit status (obviously, local-exec model would need to patch the insn at dynamic linking time, making the library DT_TEXTREL (not acceptable for SELinux, not allowed for x86_64 at all)). Yes, I see the code generation differences, but the functions are huge anyway, is it really so crucial that you'd want to make another libtsan_pie.a for it? BTW, the tsan that was added to GCC yesterday doesn't have -ftls-model=initial-exec even, so it is even slower. -ftls-model=initial-exec can only be done for *-*-linux* targets btw, I don't think other dynamic linkers support dlopening IE model shared libraries. E.g. libgomp has in its configure.tgt if test $gcc_cv_have_tls = yes ; then case ${target} in *-*-linux*) XCFLAGS=${XCFLAGS} -ftls-model=initial-exec ;; esac fi so libsanitizer/configure.tgt would need to add it to say TSAN_CXXFLAGS var that would be substituted by configure. Would be nice if you could check the numerous warnings from tsan build, e.g. it seems the ALWAYS_INLINE macro doesn't include the inline keyword, and you are using it on functions that don't have inline keyword, which gives a warning and if it is inlined, it is by pure luck. Either you should add inline keywords manually, or put inline keyword into ALWAYS_INLINE macro. There are other warnings...
[Bug sanitizer/55354] [asan] by default, the asan run-time should be linked statically, not dynamically
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354 --- Comment #25 from Jakub Jelinek jakub at gcc dot gnu.org 2012-11-23 08:20:40 UTC --- Have you considered using __builtin_expect to help the compiler in the performance sensitive code? Or even better would be profile-feedback, build libtsan once, run some benchmark, build libtsan again.
[Bug sanitizer/55354] [asan] by default, the asan run-time should be linked statically, not dynamically
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354 --- Comment #26 from Dmitry Vyukov dvyukov at google dot com 2012-11-23 08:34:52 UTC --- (In reply to comment #25) Have you considered using __builtin_expect to help the compiler in the performance sensitive code? Or even better would be profile-feedback, build libtsan once, run some benchmark, build libtsan again. There are few of them in the code in the obvious cases. The performance is very shaky and depends on compiler, compiler version and options. I remember I get worse results when I added some other likily/unlikely. I didn't have time for more comprehensive investigation. FDO may be a good idea for prebuild libraries.
[Bug sanitizer/55354] [asan] by default, the asan run-time should be linked statically, not dynamically
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354 --- Comment #27 from Konstantin Serebryany konstantin.s.serebryany at gmail dot com 2012-11-23 10:47:14 UTC --- is it really so crucial that you'd want to make another libtsan_pie.a for it? I would actually prefer to have *only* libtsan_pie.a, and not bother with .so version at all. This is what we have in clang land.
[Bug sanitizer/55354] [asan] by default, the asan run-time should be linked statically, not dynamically
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354 --- Comment #28 from Konstantin Serebryany konstantin.s.serebryany at gmail dot com 2012-11-23 11:14:58 UTC --- Note that today's upstream tsan sources don't link with -fPIC at all because our assembly blobs are not PIC-friendly. /usr/bin/ld: .libs/tsan_rtl_thread.o: relocation R_X86_64_PC32 against undefined symbol `__tsan_trace_switch_thunk' can not be used when making a shared objec Apparently, Wei hacked this around by enabling pure-C++ code, which is slower.
[Bug sanitizer/55354] [asan] by default, the asan run-time should be linked statically, not dynamically
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354 --- Comment #22 from Dmitry Vyukov dvyukov at google dot com 2012-11-23 07:16:14 UTC --- For dynamic libraries that are loaded into a non-instrumented executable (e.g. swig so preloaded into python process), we statically link the tsan runtime into the so. And you don't get linker errors from that? That must be by pure luck. Why must I get errors in this case?
[Bug sanitizer/55354] [asan] by default, the asan run-time should be linked statically, not dynamically
http://gcc.gnu.org/bugzilla/show_bug.cgi?id=55354 --- Comment #23 from Dmitry Vyukov dvyukov at google dot com 2012-11-23 07:27:27 UTC --- (In reply to comment #21) (In reply to comment #20) What I see is that it also affect code generation (register allocation). Do we need to file a bug on that? If you see a code generation difference even with -ftls-model=local-exec -fPIC vs. -fPIE, then it must mean you don't have visibility attributes on the symbols used in the fast path. For initial-exec, the RA effects should be minimal, the TLS offset load from got is usually very close to the actual TLS memory load (or lea), and thus it will just pick up some short lived scratch register. Generally in GCC, -fPIE sets flag_pic and not flag_shlib, while -fPIC sets flag_pic and flag_shlib. flag_pic is about whether position independent code needs to be generated, flag_shlib is about whether locally defined symbols can be interposed (plus it affects TLS model default choice). When I compile with -fvisibility=hidden, it does not affect generated code. It's not that we access a lot of symbols in the function, there is one thread-local and one static global var. That minimal RA effects do have effect in our case. We don't have a reserve to squeeze another register for tls access: // -fPIE 0009ca30 __tsan_write2: 9ca30: 64 48 8b 04 25 40 1fmov%fs:0xffeb1f40,%rax 9ca37: eb ff 9ca39: 48 8b 0c 24 mov(%rsp),%rcx 9ca3d: a8 01 test $0x1,%al 9ca3f: 0f 85 d3 00 00 00 jne9cb18 __tsan_write2+0xe8 9ca45: 48 83 e8 80 sub$0xff80,%rax 9ca49: 48 89 femov%rdi,%rsi 9ca4c: 48 89 c2mov%rax,%rdx 9ca4f: 64 48 89 04 25 40 1fmov%rax,%fs:0xffeb1f40 9ca56: eb ff // -fPIC -ftls-model=initial-exec 000969f0 __tsan_write2: 969f0: 48 c7 c2 40 1f eb ffmov$0xffeb1f40,%rdx 969f7: 53 push %rbx 969f8: 48 8b 4c 24 08 mov0x8(%rsp),%rcx 969fd: 64 48 8b 02 mov%fs:(%rdx),%rax 96a01: a8 01 test $0x1,%al 96a03: 0f 85 c7 00 00 00 jne96ad0 __tsan_write2+0xe0