https://gcc.gnu.org/bugzilla/show_bug.cgi?id=98112
--- Comment #7 from Fangrui Song <i at maskray dot me> ---
(In reply to Segher Boessenkool from comment #6)
> (In reply to Fangrui Song from comment #5)
> > Please read my first comment why copy relocs is a bad name.
>
> Since I reply to some of that (namely, your argument 1)), you could assume I
> have read your comment already ;-)
>
> > The compiler
> > behavior is whether the external data symbol is accessed
> > directly/indirectly.
>
> Not really, no. It isn't clear at all what "directly" even means!
>
> > Copy relocs is just the inferred ELF linker behavior
> > (in -no-pie/-pie link mode) when the symbol is external. The option name
> > should mention the direct behavior, instead of the inferred behavior at the
> > linking stage.
>
> Yes. But your proposed solution just makes this worse :-(

I try to use one term to describe absolute/PC-relative relocation types (e.g.
R_X86_64_64, R_X86_64_PC32)...
"Indirect" means GOT-generating relocation types and (PowerPC64) TOC-generating
relocation types.

"direct/indirect" are more descriptive and more accurate than "copy relocs"
(which is not the case if the symbol turns out to be defined locally; this term
does not apply to other binary formats).

> > -fdirect-access-external-data makes sense on other binary formats, though I
> > won't ask GCC to
> > implement relevant behaviors for other binary formats.
>
> But what does that *mean*? "direct access"? (And, "external data", for that
> matter! This isn't as obvious as it was thirty years ago.)

In PowerPC64 ELF v2, the term "GOT-indirect addressing" is used. In the x86-64
psABI, there is a section "Indirect Call via the GOT Slot".
Indirect calls/jumps are pretty common, so it is understood that GOT relocation
types generally mean "indirect".

"external data" is the best term I can find for things like `extern int var;`.
It means the data symbol is undefined in the current translation unit but may be
defined in another translation unit or another linked unit.
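To illustrate the direct/indirect distinction described above, here is a minimal C sketch that models the two access styles at source level. It is a model only, not compiler output: `got_slot_var`, `read_direct`, `read_indirect` and `check_model` are hypothetical names, and `var` is defined in the same file purely so the sketch links on its own (in the real case it would be `extern int var;` with the definition in another translation unit or linked unit). The instruction sequences in the comments are what an x86-64 compiler typically emits.

```c
#include <assert.h>

/* Stand-in for `extern int var;`. Defined here only so the sketch links on
   its own; in the real case the definition lives elsewhere. */
int var = 42;

/* Hypothetical model of a GOT slot: a pointer-sized cell holding the
   address of var, which the dynamic loader would fill in at load time. */
static int *got_slot_var = &var;

/* Direct access: the address of var is encoded in the instruction through
   an absolute or PC-relative relocation, e.g. movl var(%rip), %eax. */
int read_direct(void) { return var; }

/* Indirect access: load the address from the GOT slot, then dereference,
   e.g. movq var@GOTPCREL(%rip), %rax; movl (%rax), %eax. */
int read_indirect(void) { return *got_slot_var; }

/* Both styles must observe the same object. */
void check_model(void) { assert(read_direct() == read_indirect()); }
```

The indirect form costs one extra load, but the instruction itself no longer needs to know where `var` ends up, which is why it works when the symbol turns out to be defined in a shared object.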

> > * For example, on COFF, the behavior is like always
> > -fdirect-access-external-data. __declspec(dllimport) is needed to use
> > indirect access.
>
> I don't know what "declspec" is. Something something mswindows?

Yes. `extern int var; int foo() { return var; }` compiles to `movl var(%rip),
%eax` (a "direct access" (PC-relative) relocation type). Its behavior is like
always -fdirect-access-external-data.
A __declspec(dllimport) annotation can override the command line option.

> > * On Mach-O, the behavior is like -fdirect-access-external-data for -fno-pic
> > (only available on arm) and the opposite for -fpic.
>
> So what you want is that object that are globally visible will be implemented
> as-is? For if you do not do whole-program optimisation, for example? So that
> a) those objects will actually *exist*, and b) they will be laid out in the way
> the program expects?

Undefined global objects and address-taken functions in the current translation
unit are affected. An address-taken function is very much like a data symbol:

```
// gcc -fno-pic generates an absolute relocation type. If foo is defined in a DSO,
// it will require a "canonical PLT entry" (st_shndx=0, st_value!=0) - a hack
// agreed upon by the linker and ld.so
extern void foo(void);
void *addr(void) { return (void *)foo; }
```

The default ELF behavior on most architectures is: -fno-pic uses an absolute
relocation type, (non-x86-64) -fpie uses a GOT-generating relocation type, and
(x86-64) -fpie uses a PC-relative relocation type.

If -fno-direct-access-external-data is specified, -fno-pic/-fpie will use
GOT-generating relocation types to prevent

* copy relocations if the symbol turns out to be undefined in the module.
* a canonical PLT entry for an address-taken function.

The proposed option is local to a translation unit (like most options).
However, if this information is recorded in LTO IR files, the optimizer can
assume the variable can be referenced via a direct relocation type in the
combined IR file.
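
The default ELF behavior and the effect of -fno-direct-access-external-data described above can be condensed into a small decision function. This is a sketch of the rules as stated in this comment, not code from GCC or LLVM: `external_data_reloc`, the two enums and `check_table` are hypothetical names, and the function deliberately covers only -fno-pic and -fpie on "most architectures" as worded above.

```c
#include <assert.h>
#include <stdbool.h>

enum codegen_mode { MODE_NO_PIC, MODE_PIE };   /* -fno-pic, -fpie */
enum reloc_kind { RELOC_ABSOLUTE, RELOC_PC_RELATIVE, RELOC_GOT };

/* Hypothetical helper: which kind of relocation references an undefined
   data symbol (or an address-taken function).
   no_direct_access is true when -fno-direct-access-external-data is given;
   false selects the default behavior described in the comment. */
enum reloc_kind external_data_reloc(enum codegen_mode mode, bool is_x86_64,
                                    bool no_direct_access)
{
    /* With the option, always go through the GOT, so the linker needs
       neither a copy relocation nor a canonical PLT entry. */
    if (no_direct_access)
        return RELOC_GOT;
    /* Default for -fno-pic: absolute relocation type. */
    if (mode == MODE_NO_PIC)
        return RELOC_ABSOLUTE;
    /* Default for -fpie: PC-relative on x86-64, GOT-generating elsewhere. */
    return is_x86_64 ? RELOC_PC_RELATIVE : RELOC_GOT;
}

/* The rows of the table, written out as checks. */
void check_table(void)
{
    assert(external_data_reloc(MODE_NO_PIC, false, false) == RELOC_ABSOLUTE);
    assert(external_data_reloc(MODE_PIE, false, false) == RELOC_GOT);
    assert(external_data_reloc(MODE_PIE, true, false) == RELOC_PC_RELATIVE);
    assert(external_data_reloc(MODE_NO_PIC, false, true) == RELOC_GOT);
    assert(external_data_reloc(MODE_PIE, true, true) == RELOC_GOT);
}
```

Only the first two default rows can lead to a copy relocation or a canonical PLT entry at link time; every row that yields `RELOC_GOT` avoids both.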

> > If you don't want to think of non-ELF, feel free to make the option specific
> > to ELF.
>
> The problem is not that I don't want to think about it, but that the way it
> seems to be defined only applies to ELF (and to some specific (sub-)targets
> using ELF, even).

As I mentioned earlier, this applies to other binary formats. I'll just show
you evidence by pointing you directly to the code ;-)

In LLVM, generally speaking, a dso_local undefined global object is accessed
directly while a non-dso_local undefined global object is accessed via GOT
indirection.

In Clang, dso_local annotation is added in
https://github.com/llvm/llvm-project/blob/main/clang/lib/CodeGen/CodeGenModule.cpp#L913-L988

(The internal abstraction is currently a bit unfortunate. LLVM IR has another
set of rules (many are duplicated)
https://github.com/llvm/llvm-project/blob/main/llvm/lib/Target/TargetMachine.cpp#L94-L178
I intend to eventually clean up the LLVM IR side rules)

(Attributes generally supersede the proposed command line option.)
The few `return true;` places can be refined to check
-f[no-]direct-access-external-data.

Two options are similar to -f[no-]direct-access-external-data.

* -fno-plt: it only applies to external function calls (not taking address)
* -fno-semantic-interposition: it only applies to defined function/variable
  symbols

I've thought about -f[no-]semantic-interposition-external-data, but I don't
find it more suitable than -f[no-]direct-access-external-data.

> > > You want to have this a generic option, while it is
> > > not clear at all what it would mean, what it would *do*, which is
> > > especially important if you want this to be an option used by multiple
> > > compilers: if it is not clear to every user what simple, sensible thing a
> > > flag is the knob for, that flag simply cannot be used at all -- or worse,
> > > some users *will* use it, but then their intentions are not clear to
> > > humans, and different compilers can (and will!)
> > > think the user wanted something else!
> >
> > To be clear, GCC botched things with the inappropriate HAVE_LD_PIE_COPYRELOC
>
> Huh? That isn't a user-visible thing at all, it's an implementation detail.
> It is a quite straight-forward autoxxxx thing, defined to true if the loader
> passes some specific test.
>
> - o - o -
>
> So, what you want is to attach the attribute ((used)) variable attribute to
> all data (or at least the data not explicitly made static) automatically?

No. The option is very different from __attribute__((used)).