On 04/12/2018 07:59 PM, Jakub Jelinek wrote: > On Thu, Apr 12, 2018 at 07:53:35PM +0200, Jakub Jelinek wrote: >> On Thu, Apr 12, 2018 at 03:46:26PM +0200, Jan Hubicka wrote: >>> If you make C++ inline and get the idea to use target cloning attribute on >>> this, >>> this will likely lead to link error if you compile multiple files because >>> you >>> turn comdat to non-comdat. >>> >>> For comdats this woudl effectivly need to become C++ abi extension and we >>> would >>> need to define comdat sections for these. Perhaps easiest way is to simply >>> reject the attribute on comdats and probaby also extern functions? >> >> I'm not really sure we can do that, various packages in the wild are already >> using this. >> What is the problem with comdats and multi-versioning? >> The question is what comdat groups we should use for the comdat resolver and >> the versioned functions, shall the ifunc symbol be the original mangling of >> the method (or other comdat) and the other entrypoints just be .local >> non-weak symbols inside of the same section? > > Ah, but we emit the resolver only if we see a use of it. That sounds quite > broken, resolver in each TU that uses it? Better to have one at each > definition... > > Jakub >
So after quite some time I would need some brainstorming as I'm not sure how to fix that properly. Let's start how 'target' attribute works and I believe it should behave same as target_clones attribute: $ cat mv.h int foo (void); void test (); $ cat mv.cpp #include "mv.h" __attribute__ ((target ("default"))) int foo () { return 0; } __attribute__ ((target ("sse4.2"))) int foo () { return 1; } int main () { __builtin_printf ("in main: %d\n", foo ()); test (); return 0; } $ cat mv2.cpp #include "mv.h" void test() { int (*f) (void) = &foo; __builtin_printf ("value: %d\n", foo ()); } $ gcc -fdump-ipa-cgraph=/dev/stdout mv.cpp -c _Z13_Z3foov.ifuncv/2 (int _Z3foov.ifunc()) @0x7ffff67182e0 Type: function definition analyzed alias cpp_implicit_alias Visibility: externally_visible asm_written public comdat comdat_group:_Z3foov.resolver one_only section:.text._Z3foov.resolver (implicit_section) artificial Same comdat group as: _Z3foov.resolver/6 References: _Z3foov.resolver/6 (alias) Referring: Availability: overwritable First run: 0 Version info: next: _Z3foov/0 dispatcher: _Z3foov.resolver Function flags: Called by: Calls: _Z3foov.sse4.2/1 (int foo()) @0x7ffff6718170 Type: function definition analyzed Visibility: force_output externally_visible asm_written public Address is taken. References: Referring: Availability: available First run: 0 Version info: prev: _Z3foov/0 dispatcher: int _Z3foov.ifunc() Function flags: Called by: Calls: _Z3foov/0 (int foo()) @0x7ffff6718000 Type: function definition analyzed Visibility: force_output externally_visible asm_written public Address is taken. References: Referring: Availability: available First run: 0 Version info: next: _Z3foov.sse4.2/1 dispatcher: int _Z3foov.ifunc() Function flags: Called by: Calls: _Z3foov.resolver/6 (_Z3foov.resolver) @0x7ffff67188a0 Type: function definition analyzed Visibility: externally_visible asm_written public weak comdat comdat_group:_Z3foov.resolver one_only section:.text._Z3foov.resolver (implicit_section) artificial Same comdat group as: _Z13_Z3foov.ifuncv/2 References: Referring: _Z13_Z3foov.ifuncv/2 (alias) Availability: available First run: 0 Function flags: Called by: Calls: So note that resolver and ifunc live in a unique comdat_group. And as opposed to target_clones, default implementation has not appended '.default' suffix. That's problem because an address taken returns default implementation as seen here: $ gcc mv.cpp mv2.cpp && ./a.out in main: 1 value: 0 Which is the same problem is fixed for target_clones in this release cycle. So it's broken. Apart from that, following is broken for target attribute: cat Crypto-target.ii struct aaa { static __attribute__((target("avx512f"))) void foo() { __builtin_printf ("avx512f\n"); } static __attribute__((target("sse"))) void foo() {__builtin_printf ("sse\n");} static __attribute__((target("default"))) void foo() {__builtin_printf ("default\n");} }; void (*initializer) (void) = { &aaa::foo }; int main() { initializer (); // aaa::foo (); } $ g++ Crypto-target.ii && ./a.out /tmp/ccJMMaDc.o:(.data+0x0): undefined reference to `_ZN3aaa3fooEv.ifunc()' collect2: error: ld returned 1 exit status For target_clones, I have a patch candidate that works for the test-case in PR85329: $ cat ~/Programming/testcases/Crypto.ii struct a { __attribute__((target_clones("sse", "default"))) void operator^=(a) {} } * b; class c { public: a *d(); }; class f { void g(); c e; }; void f::g() { *e.d() ^= b[0]; } $ g++ ~/Programming/testcases/Crypto.ii -c -fdump-ipa-cgraph=/dev/stdout _ZN1aeOES_.resolver/6 (_ZN1aeOES_.resolver) @0x7ffff690d730 Type: function definition analyzed Visibility: force_output externally_visible public weak comdat comdat_group:_ZN1aeOES_ one_only artificial Same comdat group as: _ZN1aeOES_/5 References: _ZN1aeOES_.sse.0/4 (addr)_ZN1aeOES_.default.1/0 (addr) Referring: _ZN1aeOES_/5 (alias) Availability: available First run: 0 Function flags: body Called by: Calls: int __builtin_cpu_supports(const char*)/8 (can throw external) int __builtin_cpu_init()/7 (can throw external) _ZN1aeOES_/5 (_ZN1aeOES_) @0x7ffff690d5c0 Type: function definition analyzed alias cpp_implicit_alias target:_ZN1aeOES_.resolver Visibility: externally_visible public comdat comdat_group:_ZN1aeOES_ one_only artificial Same comdat group as: _ZN1aeOES_.resolver/6 References: _ZN1aeOES_.resolver/6 (alias) Referring: Availability: overwritable First run: 0 Version info: next: _ZN1aeOES_.default.1/0 dispatcher: _ZN1aeOES_.resolver Function flags: Called by: void f::g()/2 Calls: _ZN1aeOES_.sse.0/4 (void a::_ZN1aeOES_.sse.0(a)) @0x7ffff690d450 Type: function definition analyzed Visibility: force_output no_reorder prevailing_def_ironly artificial Address is taken. References: Referring: _ZN1aeOES_.resolver/6 (addr) Availability: available First run: 0 Version info: prev: _ZN1aeOES_.default.1/0 dispatcher: _ZN1aeOES_ Function flags: body Called by: Calls: _ZN1aeOES_.default.1/0 (void a::operator^=(a)) @0x7ffff690d000 Type: function definition analyzed Visibility: force_output no_reorder prevailing_def_ironly artificial Address is taken. References: Referring: _ZN1aeOES_.resolver/6 (addr) Availability: available First run: 0 Version info: next: _ZN1aeOES_.sse.0/4 dispatcher: _ZN1aeOES_ Function flags: body Called by: Calls: It's not ideal because _ZN1aeOES_.sse.0/4 and _ZN1aeOES_.default.1/0 don't share the same comdat group. But it should not break anything now, am I right? Honza? I'll carry on during weekend after I'll get some feedback. Thanks, Martin