================
@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -fsycl-is-device -O0 -triple spirv64-unknown-unknown \
+// RUN: -emit-llvm  %s -o - | FileCheck %s --check-prefix=DEVICE
+
+// RUN: %clang_cc1 -fsycl-is-host -O0 -triple spirv64-unknown-unknown \
+// RUN: -emit-llvm  %s -o - | FileCheck %s --check-prefix=HOST
+
+// RUN: %clang_cc1 -fsycl-is-device -emit-llvm \
+// RUN: -aux-triple x86_64-pc-windows-msvc -triple spir-unknown--unknown \
+// RUN: %s -o - | FileCheck %s --check-prefix=MSVC
+
+namespace QL {
+  auto dg1 = [] { return 1; };
+  inline auto dg_inline1 = [] { return 1; };
+}
+
+namespace QL {
+  auto dg2 = [] { return 2; };
+  template<int N>
+  auto dg_template = [] { return N; };
+}
+
+using namespace QL;
+template<typename T>
+[[clang::sycl_kernel_entry_point(T)]] void f(T t) {
+  t();
+}
+
+void g() {
+  f(dg1);
+  f(dg2);
+  f(dg_inline1);
+  f(dg_template<3>);
+}
+
+// HOST: @_ZN2QL3dg1E = internal global %class.anon undef, align 1
+// HOST: @_ZN2QL3dg2E = internal global %class.anon.0 undef, align 1
+// HOST: @_ZN2QL10dg_inline1E = linkonce_odr global %class.anon.2 undef, 
comdat, align 1
+// HOST: @_ZN2QL11dg_templateILi3EEE = linkonce_odr global %class.anon.4 
undef, comdat, align 1
+
+// DEVICE: define spir_kernel void @_ZTSN2QL3dg1MUlvE_E
+// DEVICE: call spir_func noundef i32 @_ZNK2QL3dg1MUlvE_clEv
+// DEVICE: define internal spir_func noundef i32 @_ZNK2QL3dg1MUlvE_clEv
+// DEVICE: define spir_kernel void @_ZTSN2QL3dg2MUlvE_E
+// DEVICE: call spir_func noundef i32 @_ZNK2QL3dg2MUlvE_clEv
+// DEVICE: define internal spir_func noundef i32 @_ZNK2QL3dg2MUlvE_clEv
+// DEVICE: define spir_kernel void @_ZTSN2QL10dg_inline1MUlvE_E
+// DEVICE: call spir_func noundef i32 @_ZNK2QL10dg_inline1MUlvE_clEv
+// DEVICE: define linkonce_odr spir_func noundef i32 
@_ZNK2QL10dg_inline1MUlvE_clEv
+// DEVICE: define spir_kernel void @_ZTSN2QL11dg_templateILi3EEMUlvE_E
+// DEVICE: call spir_func noundef i32 @_ZNK2QL11dg_templateILi3EEMUlvE_clEv
+// DEVICE: define linkonce_odr spir_func noundef i32 
@_ZNK2QL11dg_templateILi3EEMUlvE_clEv
+
+// HOST: define spir_func void @_Z1gv
+// HOST: call spir_func void @_Z1fIN2QL3dg1MUlvE_EEvT_
+// HOST: call spir_func void @_Z1fIN2QL3dg2MUlvE_EEvT_
+// HOST: call spir_func void @_Z1fIN2QL10dg_inline1MUlvE_EEvT_
+// HOST: call spir_func void @_Z1fIN2QL11dg_templateILi3EEMUlvE_EEvT_
+// HOST: define internal spir_func void @_Z1fIN2QL3dg1MUlvE_EEvT
+// HOST: define internal spir_func void @_Z1fIN2QL3dg2MUlvE_EEvT_
+// HOST: define linkonce_odr spir_func void @_Z1fIN2QL10dg_inline1MUlvE_EEvT_
+// HOST: define linkonce_odr spir_func void 
@_Z1fIN2QL11dg_templateILi3EEMUlvE_EEvT_
+
+// MSVC: define dso_local spir_kernel void @_ZTSN2QL3dg1MUlvE_E
+// MSVC: call spir_func noundef i32 @_ZNK2QL3dg1MUlvE_clEv
+// MSVC: define internal spir_func noundef i32 @_ZNK2QL3dg1MUlvE_clEv
+// MSVC: define dso_local spir_kernel void @_ZTSN2QL3dg2MUlvE_E
+// MSVC: call spir_func noundef i32 @_ZNK2QL3dg2MUlvE_clEv
+// MSVC: define internal spir_func noundef i32 @_ZNK2QL3dg2MUlvE_clEv
+// MSVC: define dso_local spir_kernel void @_ZTSN2QL10dg_inline1MUlvE_E
+// MSVC: call spir_func noundef i32 @_ZNK2QL10dg_inline1MUlvE_clEv
+// MSVC: define linkonce_odr spir_func noundef i32 
@_ZNK2QL10dg_inline1MUlvE_clEv
+// MSVC: define dso_local spir_kernel void @_ZTSN2QL11dg_templateILi3EEMUlvE_E
+// MSVC: call spir_func noundef i32 @_ZNK2QL11dg_templateILi3EEMUlvE_clEv
+// MSVC: define linkonce_odr spir_func noundef i32 
@_ZNK2QL11dg_templateILi3EEMUlvE_clEv
----------------
zygoloid wrote:

> Assuming I'm reading things correctly, 
> [[basic.def.odr]p(16.11)](https://eel.is/c++draft/basic.def.odr#16.11) also 
> makes it clear that uses of such default template arguments result in 
> distinct closure types in the contexts where those uses occur.

Yes. I suspect we probably don't get that right just yet, and there's some ABI 
subtleties here that I think the Itanium ABI doesn't precisely describe yet 
(but fortunately there's not a lot of choice unless an implementation gets 
"clever"). For example:

```c++
template<auto a = [] { static int n; return &n; }, auto b = [] { static int n; 
return &n; }>
int *f(bool x) { return x ? a() : b(); }
inline int *g(bool x) { return f(x); }
```

Here, the two lambdas should be mangled within the context of function `g`, and 
the order in which they're instantiated (and when that happens relative to 
other lambdas that get numbered in `g`) is ABI-relevant. I think the obvious 
thing here is that they get numbered immediately *after* the function call is 
processed (at the point where overload resolution is performed) -- so we'd 
number lambdas within the argument to `x` first for example -- and that the 
default template arguments are instantiated left-to-right (which is mostly 
forced by the possibility of a later default template argument using an earlier 
template parameter). It's also ABI-relevant whether we instantiate default 
template arguments before or after default function arguments, but the language 
rules strongly imply we have to do default template arguments first, so there's 
not much of a choice there.

> The example in [[basic.def.odr]p19](https://eel.is/c++draft/basic.def.odr#19) 
> makes it clear that correspondence across TUs is required for default 
> arguments of function parameters in member functions. The preceding text 
> appears to treat default template arguments and function default arguments 
> the same, so I presume the example is applicable to default template 
> arguments as well. I think the example is more subtle than the description 
> suggests though. Where it states, "The definition of `X` can appear in 
> multiple translation units of a valid program; the 
> [lambda-expression](https://eel.is/c++draft/expr.prim.lambda.general#nt:lambda-expression)s
>  defined within the default argument of `X​::​h` within the definition of `X` 
> denote the same closure type in each translation unit.", I think this is 
> referring only to the use of that lambda within the body of `X::h` as opposed 
> to other hypothetical uses elsewhere outside the class definition. Am I 
> reading that correctly?

Broadly, the rule is that the whole program must behave as if the class 
definition is only defined in one place, and somehow the textual repeats of it 
in other places act as if they were importing / referencing that one definition 
rather than adding another definition. So the behavior must be as if there is 
only one lambda-expression in the whole program -- regardless of how you reach 
the closure type, it's the same type. In practice this means we need to give it 
external (linkonce_odr) linkage and a mangling :)

So, for example, if we had:

```
struct X {
  static auto f(int* (*p)() = [] { static int n; return &n; }) { return p; }
};
```

... then a call to `X::f()()` in different translation units must return a 
pointer to the same `int`. (A call to `X::f()` in different translation units 
doesn't actually have to return the same pointer, not because it's allowed to 
be a different lambda, but because the conversion from lambda to function 
pointer isn't required to return the same pointer each time it's called, just 
like a string literal isn't required to always evaluate to the same pointer.)

> Back to the ABI, my take away from this is that discriminators allocated from 
> namespace and class contexts need not correspond across TUs; when 
> correspondence is required, discriminators are allocated from more local 
> declaration contexts (except in cases like those reported in 
> [itanium-cxx-abi/cxx-abi#165](https://github.com/itanium-cxx-abi/cxx-abi/issues/165)).

For namespace contexts: yes. The Itanium ABI never allocates discriminators for 
namespace context, and there's no way they could align across translation units 
anyway. Lambdas at namespace context should never be emitted with external 
linkage -- except that maybe SYCL wants some special rule here for things that 
are notionally internal but still need to link between host and device.

For class contexts, we need cross-TU correspondence. (@rjmccall and I had a 
discussion a while back about trying to use a context more specific than the 
class where possible, to avoid the ABI relying on declaration order within a 
class definition, but as I recall we didn't come up with anything that we liked 
substantially more than just numbering lexically within the class for the long 
tail of cases where the lambda isn't within, say, a function declaration.)

> Another example to check what I'm learning; https://godbolt.org/z/M4fY4e3Px:
> 
> ```
> extern int x;
> 
> struct S {
>   template<auto F = [] { return x; }>
>   static inline auto vt = F;
> 
>   // GCC:   _ZNK1SUlvE0_clEv
>   // Clang: _ZNK1SUlvE_clEv
>   // SYCL:  _ZNK1SUlvE0_clEv
>   static inline auto sdm = vt<>;
> };
> ```

Looks like everyone gets this wrong.

Each template-id is supposed to perform a fresh substitution into the default 
template arguments, meaning that `sdm` has its own lambda distinct from any 
other use of `vt`. The lambda instantiated for the default argument should have 
`sdm` as its context declaration, to satisfy 
[[basic.def.odr]/16.11](https://eel.is/c++draft/basic.def.odr#16.11) and the 
ABI rule: "If the context of a closure type is an initializer for a class 
member (static or nonstatic), inline variable, or variable template, it is 
encoded in a qualified name"

> ```
> // GCC:   _ZZ6usevt0vENKUlvE1_clEv
> // Clang: _ZNK1SUlvE0_clEv
> // SYCL:  _ZNK1SUlvE1_clEv
> inline int usevt0() { return S::vt<>(); }
> ```

I think GCC is getting this one approximately right. The context declaration 
here should be `usevt0`, because the instantiated lambda is distinct from the 
one produced for any other use of `S::vt<>`.

> ``` 
> // GCC:   _ZZ6usevt1vENKUlvE2_clEv
> // Clang: _ZNK1SUlvE1_clEv
> // SYCL:  _ZNK1SUlvE2_clEv
> inline int usevt1() { return S::vt<>(); }
> ```

The discriminator numbering used by GCC is obviously wrong, though. (The 
`...E1_...` vs `...E2_...`.) Reordering the definitions of `usevt1` and 
`usevt2` causes them to swap which gets the number `1` and which gets `2`! I 
share your suspicion that GCC might be numbering within the class still, but 
using the function when mangling. The lambdas here should have no discriminator 
because they're each the first lambda within their context declaration with 
that lambda-sig.

> Per your reference to 
> [[basic.def.odr]p18](https://eel.is/c++draft/basic.def.odr#18), if `usevt0()` 
> and/or `usevt1()` are defined in multiple TUs, the program is IFNDR because 
> the uses of the lambda in the default template argument do not correspond.

I don't think that's right, but I'm not entirely certain, and I think there 
might be a wording issue here. The ODR says the behavior is as if the token 
sequences of the lambdas appeared within the inline function, and the lambdas 
would be equivalent under those terms.

However, it also says in /18 that we should be "excluding entities defined 
within default arguments or default template arguments of either D or an entity 
not defined within D". But what does that mean for entities formed by 
*instantiating* a default template argument, especially one where the templated 
lambda appears within a class (or similar entity where we'd "normally" 
deduplicate the lambda across translation units? When the lambda is formed by 
instantiating a default [template] argument, we in general need for it to have 
the place where instantiation was triggered as its context declaration, and it 
seems fine for lambdas to be given an "identity" on that basis. I think the 
most reasonable and certainly the most useful interpretation is that the ODR 
does apply in those cases still, and therefore we need to give these 
instantiated lambdas a mangling in the context in which the instantiation 
happens. That'd mean:

```c++
// ODR violation if this appears across TUs, each g would have a different 
lambda
void f(void (*arg)() = []{});
inline void g() { f(); }

// OK! Lambda is instantiated in the call to `f()`, so `g()` has its own 
lambda, and
// it's the same one each time, so no ODR violation.
template<typename = void>
void f(void (*arg)() = []{});
inline void g() { f(); }
```

We probably need some CWG feedback on this.

https://github.com/llvm/llvm-project/pull/159115
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to