On 04/12/2018 07:59 PM, Jakub Jelinek wrote:
> On Thu, Apr 12, 2018 at 07:53:35PM +0200, Jakub Jelinek wrote:
>> On Thu, Apr 12, 2018 at 03:46:26PM +0200, Jan Hubicka wrote:
>>> If you make C++ inline and get the idea to use target cloning attribute on 
>>> this,
>>> this will likely lead to link error if you compile multiple files because 
>>> you
>>> turn comdat to non-comdat.
>>>
>>> For comdats this woudl effectivly need to become C++ abi extension and we 
>>> would
>>> need to define comdat sections for these.  Perhaps easiest way is to simply
>>> reject the attribute on comdats and probaby also extern functions?
>>
>> I'm not really sure we can do that, various packages in the wild are already
>> using this.
>> What is the problem with comdats and multi-versioning?
>> The question is what comdat groups we should use for the comdat resolver and
>> the versioned functions, shall the ifunc symbol be the original mangling of
>> the method (or other comdat) and the other entrypoints just be .local
>> non-weak symbols inside of the same section?
> 
> Ah, but we emit the resolver only if we see a use of it.  That sounds quite
> broken, resolver in each TU that uses it?  Better to have one at each
> definition...
> 
>       Jakub
> 

So after quite some time I would need some brainstorming as I'm not sure how to
fix that properly. Let's start how 'target' attribute works and I believe it 
should
behave same as target_clones attribute:

$ cat mv.h
int foo (void);

void test ();

$ cat mv.cpp 
#include "mv.h"
  
  __attribute__ ((target ("default")))
int foo ()
{
  return 0;
}

__attribute__ ((target ("sse4.2")))
int foo ()
{
  return 1;
}

int main ()
{
  __builtin_printf ("in main: %d\n", foo ());
  test ();

  return 0;
}

$ cat mv2.cpp 
#include "mv.h"

void test()
{
  int (*f) (void) = &foo;
  __builtin_printf ("value: %d\n", foo ());
}

$ gcc -fdump-ipa-cgraph=/dev/stdout mv.cpp -c

_Z13_Z3foov.ifuncv/2 (int _Z3foov.ifunc()) @0x7ffff67182e0
  Type: function definition analyzed alias cpp_implicit_alias
  Visibility: externally_visible asm_written public comdat 
comdat_group:_Z3foov.resolver one_only section:.text._Z3foov.resolver 
(implicit_section) artificial
  Same comdat group as: _Z3foov.resolver/6
  References: _Z3foov.resolver/6 (alias)
  Referring: 
  Availability: overwritable
  First run: 0
  Version info: next: _Z3foov/0 dispatcher: _Z3foov.resolver
  Function flags:
  Called by: 
  Calls: 
_Z3foov.sse4.2/1 (int foo()) @0x7ffff6718170
  Type: function definition analyzed
  Visibility: force_output externally_visible asm_written public
  Address is taken.
  References: 
  Referring: 
  Availability: available
  First run: 0
  Version info: prev: _Z3foov/0 dispatcher: int _Z3foov.ifunc()
  Function flags:
  Called by: 
  Calls: 
_Z3foov/0 (int foo()) @0x7ffff6718000
  Type: function definition analyzed
  Visibility: force_output externally_visible asm_written public
  Address is taken.
  References: 
  Referring: 
  Availability: available
  First run: 0
  Version info: next: _Z3foov.sse4.2/1 dispatcher: int _Z3foov.ifunc()
  Function flags:
  Called by: 
  Calls: 
_Z3foov.resolver/6 (_Z3foov.resolver) @0x7ffff67188a0
  Type: function definition analyzed
  Visibility: externally_visible asm_written public weak comdat 
comdat_group:_Z3foov.resolver one_only section:.text._Z3foov.resolver 
(implicit_section) artificial
  Same comdat group as: _Z13_Z3foov.ifuncv/2
  References: 
  Referring: _Z13_Z3foov.ifuncv/2 (alias)
  Availability: available
  First run: 0
  Function flags:
  Called by: 
  Calls: 

So note that resolver and ifunc live in a unique comdat_group. And as opposed 
to target_clones, default implementation has not appended '.default' suffix. 
That's
problem because an address taken returns default implementation as seen here:

$  gcc mv.cpp mv2.cpp && ./a.out 
in main: 1
value: 0

Which is the same problem is fixed for target_clones in this release cycle. So 
it's broken.

Apart from that, following is broken for target attribute:
cat Crypto-target.ii
struct aaa
{
  static __attribute__((target("avx512f"))) void foo() { __builtin_printf 
("avx512f\n"); }
  static __attribute__((target("sse"))) void foo() {__builtin_printf ("sse\n");}
  static __attribute__((target("default"))) void foo() {__builtin_printf 
("default\n");}
};

void (*initializer) (void) = { &aaa::foo };

int main()
{
  initializer ();
//  aaa::foo ();
}

$ g++ Crypto-target.ii && ./a.out
/tmp/ccJMMaDc.o:(.data+0x0): undefined reference to `_ZN3aaa3fooEv.ifunc()'
collect2: error: ld returned 1 exit status

For target_clones, I have a patch candidate that works for the test-case in 
PR85329:

$ cat ~/Programming/testcases/Crypto.ii
struct a
{
  __attribute__((target_clones("sse", "default"))) void operator^=(a) {}
} * b;

class c {
public:
  a *d();
};

class f {
  void g();
  c e;
};

void f::g() { *e.d() ^= b[0]; }

$ g++ ~/Programming/testcases/Crypto.ii -c -fdump-ipa-cgraph=/dev/stdout

_ZN1aeOES_.resolver/6 (_ZN1aeOES_.resolver) @0x7ffff690d730
  Type: function definition analyzed
  Visibility: force_output externally_visible public weak comdat 
comdat_group:_ZN1aeOES_ one_only artificial
  Same comdat group as: _ZN1aeOES_/5
  References: _ZN1aeOES_.sse.0/4 (addr)_ZN1aeOES_.default.1/0 (addr)
  Referring: _ZN1aeOES_/5 (alias)
  Availability: available
  First run: 0
  Function flags: body
  Called by: 
  Calls: int __builtin_cpu_supports(const char*)/8 (can throw external) int 
__builtin_cpu_init()/7 (can throw external) 
_ZN1aeOES_/5 (_ZN1aeOES_) @0x7ffff690d5c0
  Type: function definition analyzed alias cpp_implicit_alias 
target:_ZN1aeOES_.resolver
  Visibility: externally_visible public comdat comdat_group:_ZN1aeOES_ one_only 
artificial
  Same comdat group as: _ZN1aeOES_.resolver/6
  References: _ZN1aeOES_.resolver/6 (alias)
  Referring: 
  Availability: overwritable
  First run: 0
  Version info: next: _ZN1aeOES_.default.1/0 dispatcher: _ZN1aeOES_.resolver
  Function flags:
  Called by: void f::g()/2 
  Calls: 
_ZN1aeOES_.sse.0/4 (void a::_ZN1aeOES_.sse.0(a)) @0x7ffff690d450
  Type: function definition analyzed
  Visibility: force_output no_reorder prevailing_def_ironly artificial
  Address is taken.
  References: 
  Referring: _ZN1aeOES_.resolver/6 (addr)
  Availability: available
  First run: 0
  Version info: prev: _ZN1aeOES_.default.1/0 dispatcher: _ZN1aeOES_
  Function flags: body
  Called by: 
  Calls: 
_ZN1aeOES_.default.1/0 (void a::operator^=(a)) @0x7ffff690d000
  Type: function definition analyzed
  Visibility: force_output no_reorder prevailing_def_ironly artificial
  Address is taken.
  References: 
  Referring: _ZN1aeOES_.resolver/6 (addr)
  Availability: available
  First run: 0
  Version info: next: _ZN1aeOES_.sse.0/4 dispatcher: _ZN1aeOES_
  Function flags: body
  Called by: 
  Calls: 

It's not ideal because _ZN1aeOES_.sse.0/4 and _ZN1aeOES_.default.1/0 don't 
share the same comdat group.
But it should not break anything now, am I right? Honza?

I'll carry on during weekend after I'll get some feedback.

Thanks,
Martin

Reply via email to