ABataev added a comment.

In D71241#1782157 <https://reviews.llvm.org/D71241#1782157>, @jdoerfert wrote:

> In D71241#1781955 <https://reviews.llvm.org/D71241#1781955>, @ABataev wrote:
>
> > In D71241#1780715 <https://reviews.llvm.org/D71241#1780715>, @jdoerfert 
> > wrote:
> >
> > > In D71241#1779097 <https://reviews.llvm.org/D71241#1779097>, @ABataev 
> > > wrote:
> > >
> > > > In D71241#1778963 <https://reviews.llvm.org/D71241#1778963>, @jdoerfert 
> > > > wrote:
> > > >
> > > > > In D71241#1778736 <https://reviews.llvm.org/D71241#1778736>, @ABataev 
> > > > > wrote:
> > > > >
> > > > > > In D71241#1778717 <https://reviews.llvm.org/D71241#1778717>, 
> > > > > > @jdoerfert wrote:
> > > > > >
> > > > > > > >> There is no evidence that this is more complicated. By all 
> > > > > > > >> measures, this is less complicated (see also below). It is 
> > > > > > > >> also actually doing the right thing when it comes to code 
> > > > > > > >> emission. Take https://godbolt.org/z/sJiP3B for example. The 
> > > > > > > >> calls are wrong and the definition of base is missing.
> > > > > > > > 
> > > > > > > > How did you measure it? I have a completely different opinion. 
> > > > > > > > Also, tried to reproduce the problem locally, could not 
> > > > > > > > reproduce. It seems to me, the output of the test misses 
> > > > > > > > several important things. You can check it yourself. 
> > > > > > > > `tgt_target_teams()` call uses `@.offload_maptypes` global var 
> > > > > > > > but it is not defined.
> > > > > > >
> > > > > > > Here is the link with the globals not hidden: 
> > > > > > > https://godbolt.org/z/5etB5S
> > > > > > >  The base function is called both times but should not be called 
> > > > > > > at all. What is your local output and why does it differ?
> > > > > >
> > > > > >
> > > > > > On the host `base` is an alias for the `hst` function. On the 
> > > > > > device `base` has the body of `dev` function because NVPTX does nit 
> > > > > > support function aliases (10+ suppprts it, but LLVM does not 
> > > > > > support it yet). Try to change the bodies of dev and hst and you 
> > > > > > will see.
> > > > > >
> > > > > > I tried to keep original function names to improve debugging and 
> > > > > > make users less wonder why instead of `base` something else is 
> > > > > > called.
> > > > >
> > > > >
> > > > > How does that work with linking? Another translation unit can call 
> > > > > both the base and hst/dev function, right? I mean they both need to 
> > > > > be present.
> > > >
> > > >
> > > > If hst or dev are needed, they are emitted too, independently. On the 
> > > > host, the variamt function is emitted and base function is set to be an 
> > > > alias of the variant function. On the device we just inherit the body 
> > > > of the variant function, but this variant function also can be emitted, 
> > > > if used independently.
> > >
> > >
> > > This is confusing:
> > >
> > > 1. Especially for debugging we should do the same on host and device.
> >
> >
> > We do the same on the host and on the device. The user will see that he is 
> > the context of originally called function `base`. It is the same behavior 
> > just like the behavior of the GNU gcc `alias` attribute.
>
>
> No we do not. You said it yourself just in the last comment: one time you 
> emit an alias (host), one time you "emit code into the base function" 
> (device).
>  That is not the same thing, especially if you look at debug information, 
> stack frames, ...


Yes, formally it is not the same. But only because NVPTX does not support 
function aliasing.

> 
> 
>>> 2. The device function now exists twice, that is bad.
>> 
>> It is not so bad, actually. In your solution, you have almost the same 
>> situation. Plus, it is not a big problem taking into account that most of 
>> the function calls will be inlined and function will be eliminated. As soon 
>> as we have support for global aliases in LLVM/NVPTX, we can improve it.
> 
> It is bad to duplicate code for no reason. The SemaOverload solution has not 
> "almost the same situation", as it does not replace the base function body 
> with the variant function body.

Again, just a workaround for NVPTX problem with function aliases.

> 
> 
>>> 3. How is this supposed to work with type corrections? I mean the variant 
>>> needs to be compatible but not necessarily of the same type, right? 
>>> https://godbolt.org/z/QAXCuv we just produce a cryptic error (locally it 
>>> crashes for me afterwards).
>> 
>> Actually, these functions are not compatible. We're missing some extra 
>> checks in Sema, will add them later.
> 
> That is not necessarily clear to me. The standard does not say the types need 
> to be equal but compatible.

Compatible in terms of the base language. In C, these functions are not 
compatible. Try to declare the function `base` as `int base(float)` and then 
try to redeclare it as `int base(int)`. You will get the error because the 
types are not compatible. For example, `int ()` and `int(int)` are compatible 
though are not equal.

> 
> 
>>> 4. On the host the expression `&base == &hst` will evaluate to true with 
>>> the alias.
>> 
>> And what is the problem with this? I believe, with your solution the result 
>> will be the same because you will just replace `base` with the `hst` upon 
>> overloading resolution.
> 
> But I can do it for calls only. Aliases are not as fine granular. That is 
> also why the current code cannot handle the `construct` trait but an overload 
> solution can.

I see no problem here.

> 
> 
>>> In D71241#1779779 <https://reviews.llvm.org/D71241#1779779>, @ABataev wrote:
>>> 
>>>> Here is the example that does not work with the proposed solution but 
>>>> works with the existing one:
>>>>
>>>>   static void cpu() { asm("nop"); }
>>>>  
>>>>   #pragma omp declare variant(cpu) match(device = {kind(cpu)})
>>>>   static __attribute__((used)) void wrong_asm() {
>>>>     asm ("xxx");
>>>>   }
>>>>
>>>>
>>>> The existing solution has some problems with the delayed error messages 
>>>> too, but they are very easy to fix.
>>> 
>>> 
>>> I don't understand this example. What is the expected outcome here (I get 
>>> the error below from ToT clang). Why is that not achievable by a 
>>> SemaOverload solution?
>>>  `asm.c:4:35: error: function with '#pragma omp declare variant' must have 
>>> a prototype`
>> 
>> Try to compile as C++:
>> 
>>   clang++ -с -fopenmp repro.cpp
> 
> You did not answer any of the questions here. If you ignore my comments this 
> is not working.
>  What is the expected outcome here?  Why is that not achievable by a 
> SemaOverload solution?

Expected result - the code is compiled successfully. Your solution will produce 
the error message for incorrect asm. SemaOverload won't help you with this. 
Assume, you have a base function with the target-specific code for the host and 
a variant with the target-specific code on the device.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71241/new/

https://reviews.llvm.org/D71241



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to