Hello,

We've got a problem with -reduce-relocations.

tl;dr: it's a broken concept, and we must either add a permanent workaround or stop using it. The permanent workaround is to compile all executables in PIC/PIE mode.
Long story:

The -reduce-relocations option in configure checks that the compiler supports the linker flag -Bsymbolic-functions. That flag was added to binutils in 2006 at our urging, so that we could use it where the -Bsymbolic option presented problems. It turns out that -Bsymbolic-functions has the same problems as -Bsymbolic and is no fix.

Those two options cause the linker to bind some symbols "symbolically" into the binary it's producing. That is, if a symbol X is used and is also defined inside this ELF module, then the option tells the linker that it may safely assume that the symbol will always be found inside this module. The linker will then use cheaper types of relocation, or none at all. This is a huge performance improvement both at load time and at run time. -Bsymbolic does it for everything, whereas -Bsymbolic-functions does it for functions only.

The reason why we needed -Bsymbolic-functions in the first place is that ELF has a weird feature that causes data variables to move between modules. Functions weren't affected because they aren't moved.

It turns out that there is one situation in which a function is treated as data: when you take its address. For two pointers to the same function to compare equal, the dynamic linker must resolve the function address to exactly one place, and unfortunately for us, the choice isn't to our liking: the "canonical" address may be moved out of the library.

We haven't hit this problem before because we hadn't been doing function pointer comparisons. Now, with Olivier's "new connection syntax" patch, we are.

The possible workaround is to tell the compiler and linker that even executables are position-independent. This causes the linker to stop using copy/move relocations because it doesn't need them. However, the use of PIC may have a non-trivial performance impact on applications, due to indirect variable accesses and the loss of one register.
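To make the failure mode concrete, here is a minimal sketch in C of the kind of code that depends on function addresses comparing equal. The names connect_slot, disconnect_slot, on_clicked and on_closed are hypothetical and are not Qt API; the sketch only shows the comparison itself.

```c
#include <stddef.h>

typedef void (*slot_fn)(void);

#define MAX_SLOTS 8
static slot_fn slots[MAX_SLOTS];
static size_t slot_count;

/* Register a callback; returns 0 on success, -1 if the table is full. */
int connect_slot(slot_fn fn)
{
    if (slot_count == MAX_SLOTS)
        return -1;
    slots[slot_count++] = fn;
    return 0;
}

/* Remove a callback by comparing function addresses; returns 1 if found.
 * This only works if every module sees the same address for a function. */
int disconnect_slot(slot_fn fn)
{
    for (size_t i = 0; i < slot_count; ++i) {
        if (slots[i] == fn) {
            slots[i] = slots[--slot_count];  /* fill the hole with the last entry */
            return 1;
        }
    }
    return 0;
}

/* Hypothetical callbacks used only for illustration. */
void on_clicked(void) {}
void on_closed(void) {}
```

Inside a single module this behaves as expected. The problem described above would show up if, say, the function lived in a library linked with -Bsymbolic-functions and its address were taken both inside that library and in a non-PIC executable: the two sides could then hold different addresses for the same function, and the comparison in disconnect_slot would fail.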
Regardless of whether I manage to convince the linker people to improve the situation, we need to figure out a solution for existing systems. What shall we do?

Even longer story (background):

In code that isn't position-independent (i.e., the executable), a data access is done as:

    movl variable, %eax

And a function call as:

    call function

And the loading of a function address as:

    movl $function, %edi

When linking this program, the linker needs to write the address of the variable "variable" and of the function "function" into the instructions (one is absolute and the other relative, but that's irrelevant). If both symbols are found in a shared library, then the linker will "patch them up" differently.

For the function, it will make the "call" instruction call a stub in the Procedure Linkage Table (PLT), which loads the proper address from somewhere and then jumps to it. That somewhere is another structure called the Global Offset Table (GOT), which the dynamic linker will fill with the actual function address once the library has been loaded.

For the variable, things get complicated. There's no way to do the PLT trick, so what the linker does instead is add a "copy relocation". It records the name of the variable and its expected size, and reserves that much space in the executable. The dynamic linker will then, at load time, find the variable in the shared library, copy its contents into the executable, and tell the library that it should from now on find the variable in the executable's memory.

When using the position-independent code options (-fPIC and -fPIE), things change. For the function call, the compiler will write:

    call function@PLT

The loading of a function address is:

    movq function@GOTPCREL(%rip), %rdi

As for the variable, it produces:

    movq variable@GOTPCREL(%rip), %rax
    movl (%rax), %eax

All accesses are position-independent and indirect.
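For reference, the three kinds of access in the listings above correspond to C source along the following lines. This is only a sketch: the wrapper names read_variable, call_function and take_address are made up for illustration, and "variable" and "function" are defined in the same file here just to keep the example self-contained. To reproduce the cross-module behaviour, they would have to live in a separate shared library with only extern declarations left here.

```c
/* Stand-ins for symbols that would normally come from a shared library. */
int variable = 42;
int function(void) { return variable; }

typedef int (*fn_ptr)(void);

/* Data access: "movl variable, %eax" in the non-PIC listing. */
int read_variable(void) { return variable; }

/* Function call: "call function" (non-PIC) or "call function@PLT" (PIC). */
int call_function(void) { return function(); }

/* Loading a function address: "movl $function, %edi" (non-PIC)
 * or a load from the GOT (PIC). */
fn_ptr take_address(void) { return function; }
```

Compiling this with gcc -S, once with -fPIC or -fPIE and once without (on toolchains that default to PIE, -fno-pie may be needed), should show the difference between the two styles. The exact instructions will vary with compiler version, optimisation level and target.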
The call is placed via the PLT, addresses are loaded from the GOT, and values are loaded only after their actual address has been loaded from the GOT. This is suitable for accessing symbols defined in other ELF modules. It's also necessary for library code.

Unfortunately, the side-effect is that access to symbols defined in the current ELF module is also done indirectly. Two options help change this: -fvisibility=hidden and the symbolic options.

The -fvisibility=hidden option has been enabled by default in Qt since 4.0 and corresponds to the configure option -reduce-exports. It does not change the code above, so all accesses to variables not defined in the same compilation unit are still indirect. Fortunately for the function call, the linker realises that the target is inside the library and cannot be anywhere else, so the call now goes directly to the function. The loading of the address is still via the GOT, which means a run-time relocation is still necessary, when the most efficient solution would be to use the "load effective address" instruction with no relocation.

The -Bsymbolic and -Bsymbolic-functions options produce the same effect, with the difference that the symbol is left in the ELF export table (i.e., "default" visibility).

The consequences of all of this are:

1) there's absolutely no way to get the most efficient code in libraries, period. ELF is optimised for executable code, not library code.

2) -Bsymbolic is a broken concept so long as copy relocations remain in use.

3) -Bsymbolic-functions is either the same broken concept or a broken implementation. It might be possible to salvage the option by making the linker optimise the PLT calls like it does today, but keep the GOT references as public.

4) calling a function via a function pointer is inefficient because of an indirect jump. If that function's address was taken in the executable, it's doubly inefficient: the indirect jump you make resolves to another indirect jump.

The only architecture not affected by this is IA-64.
One reason is that the IA-64 ABI mandates that executables also be PIC, so the original problem is gone: there are no copy relocations. What's more, Intel engineers realised the problem of the indirect loading of data and invented a special relocation that the linker is allowed to relax into simpler code. If the symbol is found, at link time, to be in the same ELF module, the linker relaxes the "load" generated by the compiler into a "move" between registers.

It's possible to apply the same lessons learned to other platforms, but it hasn't been done.

-- 
Thiago Macieira - thiago.macieira (AT) intel.com
  Software Architect - Intel Open Source Technology Center
     Intel Sweden AB - Registration Number: 556189-6027
     Knarrarnäsgatan 15, 164 40 Kista, Stockholm, Sweden
_______________________________________________
Development mailing list
Development@qt-project.org
http://lists.qt-project.org/mailman/listinfo/development