> Absolute symbol references such as?
> References to fixed (constant)
> addresses?
Pointers stored in the .data section.  For example, if you have an array of 
const char*.
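
As a rough illustration of that case, here is a minimal C sketch. The names `messages` and `get_message` are made up for this example and do not come from edk2. Each array slot holds the absolute address of a string, so the object file typically carries one absolute 64-bit relocation (R_X86_64_64 on AMD64 ELF) per slot, which has to be fixed up when the image is loaded somewhere other than its link address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical table: every element stores the absolute address of a
 * string literal.  The array itself is data, so each slot needs an
 * absolute relocation that is applied at load time. */
static const char *messages[] = { "alpha", "beta", "gamma" };

/* The code that indexes the array can use PC-relative addressing to
 * find it; only the pointers stored inside the array need fixups. */
const char *get_message(size_t index)
{
    assert(index < sizeof(messages) / sizeof(messages[0]));
    return messages[index];
}
```

Calling `get_message(1)` returns the pointer stored in the second slot, i.e. the relocated address of "beta".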

>  Why is that approach optimal? As few
> relocations records are required as
> possible?

small pic model is optimal for AMD64 executables or shared libraries that are < 
2GB in size, but need to be relocatable to any address in the 64-bit address 
space.  It generates the most compact code due to use of PC-relative jumps, 
calls and effective address calculations.

Technically, the small model is potentially more compact, but the sysv AMD64 
ABI requires small model programs to fit in the lowest 2GB of the address 
space.  EFI binaries load in the lower 4GB but not necessarily lower 2GB.

>  Why don't preemptible symbols make
> sense for PIE?
> (My apologies if I'm disturbingly
> ignorant about this and the question
> doesn't even make sense.)
They do of course.  The small pie model is a GCC extension not documented in 
the sysv AMD64 ABI, and it has a weird characteristic: it assumes all external 
symbols are reachable directly and not via the GOT (i.e. they are not subject 
to being dynamically linked to).

small pic model - formalized in sysv AMD64 ABI and mandates access to extern 
symbols via the GOT or PLT.
small pie model - a GCC extension that permits the code generator to elide the 
GOT, but does not mandate that the code generator elide the GOT.

Contrary to conventional wisdom - using the GOT can reduce code size when doing 
pointer arithmetic on the address of an external symbol, or pushing the address 
of an external symbol on the stack to be passed as a function argument.  See my 
response here to Andrew Fish.


As a result, GCC sometimes emits GOT loads for external symbols in the small 
pie model on AMD64.
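
A minimal sketch of the kind of C code in question follows. The symbol `ext_table`, its size and the function `entry_at` are hypothetical names for illustration only. In the problematic scenario the symbol's definition would live in a separate (for example assembly) file; it is defined at the bottom here only so the sketch links on its own, which also means a compiler may treat it as local and skip the GOT for this exact file.

```c
#include <assert.h>
#include <stddef.h>

#define EXT_TABLE_SIZE 16   /* assumed size, for illustration only */

/* Hypothetical external symbol; imagine its definition is not visible
 * to the compiler when this function is compiled. */
extern unsigned char ext_table[];

/* Pointer arithmetic on the address of an external symbol.  When
 * optimizing for size under -fpie/-fPIC, GCC may fold the address
 * into a single add through the GOT slot, roughly
 *     addq ext_table@GOTPCREL(%rip), %rax
 * instead of a lea followed by an add. */
unsigned char *entry_at(size_t offset)
{
    assert(offset <= EXT_TABLE_SIZE);
    return ext_table + offset;
}

/* Stand-in definition so the example is self-contained. */
unsigned char ext_table[EXT_TABLE_SIZE];
```

Whether the GOT form is actually chosen depends on the compiler version, flags and surrounding code; inspecting the object with `objdump -dr` shows which relocation was emitted.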

There is an attribute __attribute__((visibility("hidden"))) that can be 
attached to external symbol declarations to tell the code generator "do not 
assume this symbol has a GOT entry" - effectively eliminating GOT loads.

The pragma mentioned by Ard Biesheuvel turns the attribute on wholesale to all 
symbols in sections of source files affected by it.
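
The two forms can be sketched as follows. The names `counter_a`, `counter_b` and `scaled_sum` are hypothetical, and the definitions at the bottom stand in for definitions that would normally live in another file.

```c
#include <assert.h>

/* Per-declaration form: only this symbol is marked hidden. */
extern int counter_a __attribute__((visibility("hidden")));

/* Wholesale form: every declaration between push and pop gets hidden
 * visibility without being annotated individually. */
#pragma GCC visibility push(hidden)
extern int counter_b;
#pragma GCC visibility pop

/* With both symbols hidden, the code generator has no reason to route
 * these accesses through the GOT. */
int scaled_sum(int factor)
{
    assert(factor > 0);
    return factor * (counter_a + counter_b);
}

/* Stand-in definitions so the example is self-contained. */
int counter_a = 2;
int counter_b = 40;
```

The attribute is the more surgical choice when only a few assembly-defined symbols are affected; the pragma is easier to apply from a common header.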

> So... Given this behavior, why is it a
> problem for us? What are the bad
> symptoms? What is currently broken?
Ard Biesheuvel CCed a lot of people who didn't get the private communication 
about this.  As a continuation to the message above, I sent out an email 
detailing what happens in the GCC5 toolchain with LTO enabled, along with a 
standalone Shell App that demonstrates how the GCC5 toolchain on X64 can still 
emit GOT loads into the ELF executable that are not handled by GenFw.  Below 
is my email.  The standalone test case can be downloaded from here


===== [quoted email]
> I figured out what's going on with LTO build in GCC5 that is compiled with 
> -Os -flto -DUSING_LTO and does not use visibility #pragma.
> When compiling with LTO enabled, what happens is that all C source files are 
> transformed during compilation stage to LTO intermediate bytecode (gimple in 
> GCC).
> Then when static link (ld) takes place, all LTO intermediate bytecode is sent 
> back to compiler code-generation backend to have machine code generated for 
> it as if all the source code is one big C source file ("whole program 
> optimization").
> As a result of this, all the extern symbols become local symbols !  like 
> file-level static.  Because it's as if all the code is in one big source 
> file.  Since there is no dynamic linking, there are no more "extern", and all 
> symbols are like file-level static and treated the same.
> This is why the LTO build stops emitting GOT loads for size-optimization 
> purposes.  GCC doesn't emit GOT loads for file-level static, and in LTO build 
> they're all like that - so no GOT loads.
> But there is still something that fouls this up...
> If an extern symbol is defined in assembly source file.
> Because assembly source files don't participate in LTO.  They are transformed 
> by assembler into X64 machine code.  During ld, any extern symbol that is 
> defined in an assembly source file and declared and used by C source file is 
> treated as before like external symbol.  Which means code generator can go 
> back to its practice of emitting GOT loads if they reduce code size.
> I'm attaching a standalone example of this coded as a UEFI shell application.
> - Unpack it to edk2/GccGOTEmitter.
> - Add it to ShellPkg/ShellPkg.dsc so it can be built.
> diff --git a/ShellPkg/ShellPkg.dsc b/ShellPkg/ShellPkg.dsc
> --- a/ShellPkg/ShellPkg.dsc
> +++ b/ShellPkg/ShellPkg.dsc
> @@ -134,6 +134,7 @@
>     <LibraryClasses>
>   }
> +  GccGOTEmitter/GccGOTEmitter.inf
> [BuildOptions]
> - Build with
> build -a X64 -b RELEASE -m GccGOTEmitter/GccGOTEmitter.inf -p 
> ShellPkg/ShellPkg.dsc -t GCC5
> - Result:
> /media/Dev/edk2/Build/Shell/RELEASE_GCC5/X64/GccGOTEmitter/GccGOTEmitter/DEBUG/GccGOTEmitter.efi
> /media/Dev/edk2/Build/Shell/RELEASE_GCC5/X64/GccGOTEmitter/GccGOTEmitter/DEBUG/GccGOTEmitter.dll
> make: *** [GNUmakefile:367: 
>  Error 2
> GenFw: ERROR 3000: Invalid
> unsupported ELF EM_X86_64 relocation 0x2a.
> GenFw: ERROR 3000: Invalid
> unsupported ELF EM_X86_64 relocation 0x2a.
> relocation 0x2a is R_X86_64_REX_GOTPCRELX which is emitted as part of addq 
> instruction into the GOT in order to implement the pointer arithmetic with 
> slightly smaller code.
> There are 2 possible resolutions to this.
> - One is to add the X64 GOTPCREL support to GenFw.
> - The other is to document somewhere that if
>   -- An external symbol is defined in assembly code.
>   -- The symbol is declared and used in C code.
>   -- The C code uses pointer arithmetic on the external symbol or passes it 
>      as a function argument.
>   -- Then the external symbol should be declared as 
>      "__attribute__((visibility("hidden")))" in the C code.
> Note that the 2nd resolution also works in the sample - if the attribute is 
> put on ThunksBase declaration.
edk2-devel mailing list