https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88509
            Bug ID: 88509
           Summary: Missing optimization of tls initialization
           Product: gcc
           Version: 8.2.1
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: rafael at espindo dot la
  Target Milestone: ---

Given

struct foo {
  foo();
};
static thread_local foo bar;
foo *f() { return &bar; }
foo *g() {
  static thread_local foo *bar_ptr;
  if (bar_ptr == nullptr) {
    [&]() { bar_ptr = &bar; }();
  }
  return bar_ptr;
}

GCC has to make sure bar is only initialized once. For the function f it
produces

        pushq   %rbx
        cmpb    $0, %fs:__tls_guard@tpoff
        movq    %fs:0, %rbx
        je      .L6
        leaq    _ZL3bar@tpoff(%rbx), %rax
        popq    %rbx
        ret
.L6:
        <do_the_initialization>

For g, the common code path is somewhat simpler:

        movq    %fs:_ZZ1gvE7bar_ptr@tpoff, %rax
        testq   %rax, %rax
        je      .L15
        ret
.L15:
        <do_the_initialization>

The optimization is to use a pointer to the object as the guard instead of
a boolean: a null pointer means "not yet initialized", and a non-null
pointer is already the address to return. As far as I can tell this can be
applied to any static TLS variable (where the ABI is not a problem).
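
For anyone who wants to try the pattern locally, here is a minimal self-contained sketch of the two access styles from the report. It is a source-level emulation of the idea, not a description of how GCC would implement it. The constructor body, the `constructed` counter, and the names `via_object`, `via_cached_pointer`, and `self_check` are hypothetical additions for illustration. The report only declares `foo()`.

```cpp
#include <cassert>

// Hypothetical stand-in for the report's foo: the constructor is non-trivial,
// so the compiler must guard the dynamic initialization of `bar`.
struct foo {
    static thread_local int constructed;  // per-thread construction count (illustrative)
    foo() { ++constructed; }
};
thread_local int foo::constructed = 0;    // constant-initialized, needs no guard

static thread_local foo bar;

// Corresponds to f() in the report: each call checks the TLS guard,
// then computes the address of bar.
foo *via_object() { return &bar; }

// Corresponds to g() in the report: the cached pointer doubles as the guard.
// bar_ptr is constant-initialized to nullptr, so reading it needs no guard;
// the lambda keeps the slow path (first touch of bar) out of the fast path.
foo *via_cached_pointer() {
    static thread_local foo *bar_ptr;
    if (bar_ptr == nullptr) {
        [&]() { bar_ptr = &bar; }();
    }
    return bar_ptr;
}

// Both accessors must return the same per-thread object, on every call.
void self_check() {
    foo *a = via_object();
    foo *b = via_cached_pointer();
    assert(a == b);
    assert(b == via_cached_pointer());
}
```

Compiling this with optimization enabled and comparing the assembly of the two accessors should show whether a given GCC version still produces the difference described above. The exact output depends on the compiler version, target, and TLS model.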