On 2014-03-24 23:53, Trevor Saunders wrote:
> On Mon, Mar 24, 2014 at 10:00:33AM -0400, Benjamin Smedberg wrote:
>> We would not like to use -fvisibility=hidden everywhere!
>> -fvisibility=hidden is strictly worse than using the pragmas because
>> it only affects symbol generation and doesn't affect symbol
>> resolution. This means that we still have to use dynamic relocations
>> against internal symbols when we should be using hidden relocations
>> (on ELF platforms). It doesn't matter nearly as much or at all on
>> Mach-O/PE which use startup relocations instead of a GOT/PLT.
>
> The documentation doesn't make it clear that's the case, and it seems
> like a very odd way of doing things. In addition if it is the case then
> I'm amazed linux builds I've done with -fvisibility=hidden only had 7k
> more relocations than ones using the wrappers for all of libxul.


Well, actually the documentation at http://gcc.gnu.org/wiki/Visibility states:

#pragma GCC visibility is stronger than -fvisibility; it affects extern declarations as well. -fvisibility only affects definitions, so that existing code can be recompiled with minimal changes.

I made a simple example to figure out what this means:

$ cat a.c
extern int x;

int f() {
    return x;
}
$ cat b.c
int x = 42;
$ gcc -fPIC -shared -fvisibility=hidden a.c b.c -olibab.so
$ objdump -d libab.so
[...]
000000000000059c <f>:
 59c:   55                      push   %rbp
 59d:   48 89 e5                mov    %rsp,%rbp
 5a0:   48 8b 05 39 1a 00 00    mov    0x1a39(%rip),%rax        # 1fe0 <_DYNAMIC+0x1c8>
 5a7:   8b 00                   mov    (%rax),%eax
 5a9:   5d                      pop    %rbp
 5aa:   c3                      retq
$ readelf -r libab.so
[...]
Relocation section '.rela.dyn' at offset 0x400 contains 5 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000001fe0  000000000008 R_X86_64_RELATIVE                    0000000000002010
000000002008  000000000008 R_X86_64_RELATIVE                    0000000000002008
000000001fc8  000100000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000001fd0  000200000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0
000000001fd8  000300000006 R_X86_64_GLOB_DAT 0000000000000000 _Jv_RegisterClasses + 0
[...]

When the code for a.c is generated, the compiler doesn't know that 'x' is hidden, so it generates an access through the GOT. The resulting .so contains a relocation for 'x' (the R_X86_64_RELATIVE entry at offset 1fe0) that tells the dynamic loader to store the address of 'x' in the GOT. With #pragma the situation is different:

$ cat a.c
#pragma GCC visibility push(hidden)
extern int x;

int f() {
    return x;
}
$ gcc -fPIC -shared a.c b.c -olibab.so
$ objdump -d libab.so
[...]
000000000000057c <f>:
 57c:   55                      push   %rbp
 57d:   48 89 e5                mov    %rsp,%rbp
 580:   8b 05 8a 1a 00 00       mov    0x1a8a(%rip),%eax        # 2010 <x>
 586:   5d                      pop    %rbp
[...]
$ readelf -r libab.so
[...]
Relocation section '.rela.dyn' at offset 0x400 contains 4 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000002008  000000000008 R_X86_64_RELATIVE                    0000000000002008
000000001fd0  000100000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000001fd8  000200000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0
000000001fe0  000300000006 R_X86_64_GLOB_DAT 0000000000000000 _Jv_RegisterClasses + 0
[...]

Now the declaration of 'x' is hidden, and the pc-relative access without GOT is generated. There is no relocation for 'x' in .so. IIUC the situation above is what Benjamin means. However, external function calls are another story -- again IIUC, the code on the caller's side doesn't depend on visibility at all before linking since even when PLT is used, the dereferencing of an address from GOT is done inside the PLT stub, not at the callsite. Given that internal functions are not called via PLT, a .so should never contain relocations for them.
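
To check the function-call claim in the same way, the experiment can be repeated with a function instead of a variable. The sketch below is hypothetical (the function 'g' is not part of the builds discussed here) and is shown as one file only for brevity. Splitting it into a.c and b.c as before, building with -fPIC -shared, and comparing the objdump and readelf output with the pragma versus with -fvisibility=hidden should show whether any relocation for 'g' remains.

```c
/* a.c: the caller. With the pragma, the declaration of g is hidden.
 * To compare, drop the push/pop lines and pass -fvisibility=hidden. */
#pragma GCC visibility push(hidden)
extern int g(void);
#pragma GCC visibility pop

int f(void) {
    return g();
}

/* b.c: the definition (kept in the same file here only for brevity). */
int g(void) {
    return 42;
}
```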

So, in theory, the choice between '-fvisibility=hidden' and system wrappers should affect only relocations for data, which would explain why the difference is only ~7K for firefox. I've tried to test the hypothesis, and here are the results of my firefox builds (elfhack disabled, -O3):
* Clang 3.4.1 with LTO and system wrappers:
Relocation section '.rela.dyn' at offset 0x99da0 contains 324292 entries:
Relocation section '.rela.plt' at offset 0x806000 contains 4817 entries:
* Clang 3.4.1 with LTO and -fvisibility=hidden:
Relocation section '.rela.dyn' at offset 0x9cc70 contains 324405 entries:
Relocation section '.rela.plt' at offset 0x809968 contains 4929 entries:
* Clang 3.4.1 with system wrappers:
Relocation section '.rela.dyn' at offset 0xa3d50 contains 338023 entries:
Relocation section '.rela.plt' at offset 0x8606f8 contains 6167 entries:
* Clang 3.4.1 with -fvisibility=hidden:
Relocation section '.rela.dyn' at offset 0xa7098 contains 342740 entries:
Relocation section '.rela.plt' at offset 0x87f478 contains 6323 entries:
* GCC 4.6.3 with system wrappers:
Relocation section '.rela.dyn' at offset 0xc0430 contains 361124 entries:
Relocation section '.rela.plt' at offset 0x904390 contains 7544 entries:
* GCC 4.6.3 with -fvisibility=hidden:
Relocation section '.rela.dyn' at offset 0xc39b8 contains 366444 entries:
Relocation section '.rela.plt' at offset 0x926bd8 contains 7693 entries:

There is always a small difference in .rela.plt, which seems to contradict the hypothesis. I've investigated a group of opus_* functions from libopus that have relocations in the -fvisibility=hidden builds. It turned out that libopus uses OPUS_EXPORT, which sets visibility to default, but only if the OPUS_BUILD macro is defined. Since libopus is built with -DOPUS_BUILD, the opus_* functions always have default visibility in its object files. However, a user of libopus (somewhere in webrtc) is not built with -DOPUS_BUILD, so OPUS_EXPORT expands to nothing on the declarations it sees.

Therefore, with #pragma the declarations in webrtc become 'hidden', and when libxul.so is linked, the hidden symbols from webrtc are merged with the default ones from libopus and the result is hidden, so no relocations are needed. With -fvisibility=hidden the declarations are left alone, so the symbols have default visibility in every object file and thus in libxul.so, and the relocations must be used.
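
The macro pattern behind this can be sketched as follows. This is an illustration only, not copied from libopus: MYLIB_EXPORT, MYLIB_BUILD and mylib_answer are hypothetical stand-ins for OPUS_EXPORT, OPUS_BUILD and the opus_* functions.

```c
/* mylib.h (hypothetical): the export macro only does something
 * while the library itself is being built. */
#if defined(__GNUC__) && defined(MYLIB_BUILD)
#  define MYLIB_EXPORT __attribute__((visibility("default")))
#else
#  define MYLIB_EXPORT
#endif

/* Library objects see visibility("default") on this declaration.
 * Users built without -DMYLIB_BUILD see a plain declaration, which an
 * enclosing "#pragma GCC visibility push(hidden)" makes hidden and
 * which -fvisibility=hidden does not change. */
MYLIB_EXPORT int mylib_answer(void);

/* mylib.c (hypothetical), compiled with -DMYLIB_BUILD. */
MYLIB_EXPORT int mylib_answer(void) {
    return 42;
}
```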

All in all, it seems that system wrappers are superior (but not dramatically) unless all internal data used across units is explicitly marked with 'hidden' attribute and issues like with libopus are resolved (by either changing such libs so they don't export symbols when they are built statically or explicitly marking function declarations as hidden at use sites).
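
For reference, explicit marking at a use site might look like the sketch below. MY_HIDDEN and helper are hypothetical names chosen for illustration, not macros or functions from the Mozilla tree, and the definitions are placed in the same file only to keep the example self-contained.

```c
/* Hypothetical shorthand for the GCC/Clang visibility attribute. */
#define MY_HIDDEN __attribute__((visibility("hidden")))

/* Use site: hidden declarations of data and of a function, so the
 * compiler may use pc-relative access instead of going via the GOT. */
MY_HIDDEN extern int x;
MY_HIDDEN int helper(void);

int f(void) {
    return x + helper();
}

/* Definitions, normally in another unit of the same .so. */
int x = 42;
int helper(void) {
    return 1;
}
```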

--
Alexey Izbyshev

_______________________________________________
dev-builds mailing list
[email protected]
https://lists.mozilla.org/listinfo/dev-builds
