[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-11 Thread Dan Liew via Phabricator via cfe-commits
delcypher added inline comments.



Comment at: clang/lib/Basic/IdentifierTable.cpp:111
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,

yaxunl wrote:
> yaxunl wrote:
> > delcypher wrote:
> > > yaxunl wrote:
> > > > delcypher wrote:
> > > > > @yaxunl Is it intentional that you didn't update `KEYALL` here? That 
> > > > > means `KEYALL` doesn't include the bit for `KEYCUDA`.
> > > > > 
> > > > > If that was your intention then this will break if someone adds a new 
> > > > > key. E.g.
> > > > > 
> > > > > ```
> > > > > KEYCUDA = 0x200,
> > > > > KEYSOMENEWTHING = 0x400,
> > > > > // ...
> > > > > // KEYALL now includes `KEYCUDA`, whereas it didn't before.
> > > > > // KEYALL includes KEYSOMENEWTHING 
> > > > > KEYALL = (0x7ff & ~KEYNOMS18 &
> > > > >   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to 
> > > > > exclude.
> > > > > ...
> > > > > ```
> > > > > 
> > > > > 
> > > > > 1. Updating the `0x1ff` constant to `0x3ff` so that `KEYALL` 
> > > > > includes `KEYCUDA`
> > > > > 2. If your intention **is** to not have `KEYCUDA`  set in `KEYALL` 
> > > > > then amend `KEYALL` to be.
> > > > > 
> > > > > ```
> > > > > KEYALL = (0x7ff & ~KEYNOMS18 &
> > > > >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL 
> > > > > are used to exclude.
> > > > > // KEYCUDA is not included in KEYALL
> > > > > ```
> > > > My intention is not to include KEYCUDA in KEYALL.
> > > > 
> > > > Should I change KEYALL to
> > > > 
> > > > 
> > > > ```
> > > > KEYALL = (0x3ff & ~KEYNOMS18 &
> > > >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL 
> > > > are used to exclude.
> > > > // KEYCUDA is not included in KEYALL
> > > > ```
> > > > 
> > > > instead of 
> > > > 
> > > > 
> > > > ```
> > > > KEYALL = (0x7ff & ~KEYNOMS18 &
> > > >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL 
> > > > are used to exclude.
> > > > // KEYCUDA is not included in KEYALL
> > > > ```
> > > > 
> > > > since the current maximum mask is 0x3ff instead of 0x7ff
> > > Oops, you're right it would be `0x3ff`. I wonder though if we should 
> > > clean this up so we don't need to manually update the bit mask every 
> > > time... what if it was written like this?
> > > 
> > > ```lang=c++
> > >  enum {
> > > KEYC99= 0x1,
> > > KEYCXX= 0x2,
> > > KEYCXX11  = 0x4,
> > > 
> > > KEYSYCL   = 0x100,
> > > KEYCUDA   = 0x200,
> > > KEYMAX = KEYCUDA, // Must be set to the largest KEY enum value
> > > KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
> > > 
> > > // KEYNOMS18 and KEYNOOPENCL are used to exclude.
> > > // KEYCUDA is not included in KEYALL because 
> > > KEYALL = (((KEYMAX & (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & 
> > > ~KEYCUDA)
> > > };
> > > ```
> > On second thought, KEYALL does not need to exclude KEYCUDA.
> > 
> > However, it would be good to set KEYALL in a generic approach. I will open 
> > a separate review.
> opened https://reviews.llvm.org/D125396 to fix KEYALL
Oops that should say

```
KEYALL = (((KEYMAX | (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA)
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 2 inline comments as done.
yaxunl added inline comments.



Comment at: clang/lib/Basic/IdentifierTable.cpp:111
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,

yaxunl wrote:
> delcypher wrote:
> > yaxunl wrote:
> > > delcypher wrote:
> > > > @yaxunl Is it intentional that you didn't update `KEYALL` here? That 
> > > > means `KEYALL` doesn't include the bit for `KEYCUDA`.
> > > > 
> > > > If that was your intention then this will break if someone adds a new 
> > > > key. E.g.
> > > > 
> > > > ```
> > > > KEYCUDA = 0x200,
> > > > KEYSOMENEWTHING = 0x400,
> > > > // ...
> > > > // KEYALL now includes `KEYCUDA`, whereas it didn't before.
> > > > // KEYALL includes KEYSOMENEWTHING 
> > > > KEYALL = (0x7ff & ~KEYNOMS18 &
> > > >   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to 
> > > > exclude.
> > > > ...
> > > > ```
> > > > 
> > > > 
> > > > 1. Updating the `0x1ff` constant to `0x3ff` so that `KEYALL` 
> > > > includes `KEYCUDA`
> > > > 2. If your intention **is** to not have `KEYCUDA`  set in `KEYALL` then 
> > > > amend `KEYALL` to be.
> > > > 
> > > > ```
> > > > KEYALL = (0x7ff & ~KEYNOMS18 &
> > > >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL 
> > > > are used to exclude.
> > > > // KEYCUDA is not included in KEYALL
> > > > ```
> > > My intention is not to include KEYCUDA in KEYALL.
> > > 
> > > Should I change KEYALL to
> > > 
> > > 
> > > ```
> > > KEYALL = (0x3ff & ~KEYNOMS18 &
> > >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are 
> > > used to exclude.
> > > // KEYCUDA is not included in KEYALL
> > > ```
> > > 
> > > instead of 
> > > 
> > > 
> > > ```
> > > KEYALL = (0x7ff & ~KEYNOMS18 &
> > >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are 
> > > used to exclude.
> > > // KEYCUDA is not included in KEYALL
> > > ```
> > > 
> > > since the current maximum mask is 0x3ff instead of 0x7ff
> > Oops, you're right it would be `0x3ff`. I wonder though if we should 
> > clean this up so we don't need to manually update the bit mask every 
> > time... what if it was written like this?
> > 
> > ```lang=c++
> >  enum {
> > KEYC99= 0x1,
> > KEYCXX= 0x2,
> > KEYCXX11  = 0x4,
> > 
> > KEYSYCL   = 0x100,
> > KEYCUDA   = 0x200,
> > KEYMAX = KEYCUDA, // Must be set to the largest KEY enum value
> > KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
> > 
> > // KEYNOMS18 and KEYNOOPENCL are used to exclude.
> > // KEYCUDA is not included in KEYALL because 
> > KEYALL = (((KEYMAX & (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA)
> > };
> > ```
> On second thought, KEYALL does not need to exclude KEYCUDA.
> 
> However, it would be good to set KEYALL in a generic approach. I will open a 
> separate review.
opened https://reviews.llvm.org/D125396 to fix KEYALL


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-11 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments.



Comment at: clang/lib/Basic/IdentifierTable.cpp:111
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,

delcypher wrote:
> yaxunl wrote:
> > delcypher wrote:
> > > @yaxunl Is it intentional that you didn't update `KEYALL` here? That 
> > > means `KEYALL` doesn't include the bit for `KEYCUDA`.
> > > 
> > > If that was your intention then this will break if someone adds a new 
> > > key. E.g.
> > > 
> > > ```
> > > KEYCUDA = 0x200,
> > > KEYSOMENEWTHING = 0x400,
> > > // ...
> > > // KEYALL now includes `KEYCUDA`, whereas it didn't before.
> > > // KEYALL includes KEYSOMENEWTHING 
> > > KEYALL = (0x7ff & ~KEYNOMS18 &
> > >   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to 
> > > exclude.
> > > ...
> > > ```
> > > 
> > > 
> > > 1. Updating the `0x1ff` constant to `0x3ff` so that `KEYALL` 
> > > includes `KEYCUDA`
> > > 2. If your intention **is** to not have `KEYCUDA`  set in `KEYALL` then 
> > > amend `KEYALL` to be.
> > > 
> > > ```
> > > KEYALL = (0x7ff & ~KEYNOMS18 &
> > >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are 
> > > used to exclude.
> > > // KEYCUDA is not included in KEYALL
> > > ```
> > My intention is not to include KEYCUDA in KEYALL.
> > 
> > Should I change KEYALL to
> > 
> > 
> > ```
> > KEYALL = (0x3ff & ~KEYNOMS18 &
> >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are 
> > used to exclude.
> > // KEYCUDA is not included in KEYALL
> > ```
> > 
> > instead of 
> > 
> > 
> > ```
> > KEYALL = (0x7ff & ~KEYNOMS18 &
> >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are 
> > used to exclude.
> > // KEYCUDA is not included in KEYALL
> > ```
> > 
> > since the current maximum mask is 0x3ff instead of 0x7ff
> Oops, you're right it would be `0x3ff`. I wonder though if we should 
> clean this up so we don't need to manually update the bit mask every time... 
> what if it was written like this?
> 
> ```lang=c++
>  enum {
> KEYC99= 0x1,
> KEYCXX= 0x2,
> KEYCXX11  = 0x4,
> 
> KEYSYCL   = 0x100,
> KEYCUDA   = 0x200,
> KEYMAX = KEYCUDA, // Must be set to the largest KEY enum value
> KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
> 
> // KEYNOMS18 and KEYNOOPENCL are used to exclude.
> // KEYCUDA is not included in KEYALL because 
> KEYALL = (((KEYMAX & (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA)
> };
> ```
On second thought, KEYALL does not need to exclude KEYCUDA.

However, it would be good to set KEYALL in a generic approach. I will open a 
separate review.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Dan Liew via Phabricator via cfe-commits
delcypher added inline comments.



Comment at: clang/lib/Basic/IdentifierTable.cpp:111
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,

yaxunl wrote:
> delcypher wrote:
> > @yaxunl Is it intentional that you didn't update `KEYALL` here? That means 
> > `KEYALL` doesn't include the bit for `KEYCUDA`.
> > 
> > If that was your intention then this will break if someone adds a new key. 
> > E.g.
> > 
> > ```
> > KEYCUDA = 0x200,
> > KEYSOMENEWTHING = 0x400,
> > // ...
> > // KEYALL now includes `KEYCUDA`, whereas it didn't before.
> > // KEYALL includes KEYSOMENEWTHING 
> > KEYALL = (0x7ff & ~KEYNOMS18 &
> >   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to 
> > exclude.
> > ...
> > ```
> > 
> > 
> > 1. Updating the `0x1ff` constant to `0x3ff` so that `KEYALL` 
> > includes `KEYCUDA`
> > 2. If your intention **is** to not have `KEYCUDA`  set in `KEYALL` then 
> > amend `KEYALL` to be.
> > 
> > ```
> > KEYALL = (0x7ff & ~KEYNOMS18 &
> >   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are 
> > used to exclude.
> > // KEYCUDA is not included in KEYALL
> > ```
> My intention is not to include KEYCUDA in KEYALL.
> 
> Should I change KEYALL to
> 
> 
> ```
> KEYALL = (0x3ff & ~KEYNOMS18 &
>   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used 
> to exclude.
> // KEYCUDA is not included in KEYALL
> ```
> 
> instead of 
> 
> 
> ```
> KEYALL = (0x7ff & ~KEYNOMS18 &
>   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used 
> to exclude.
> // KEYCUDA is not included in KEYALL
> ```
> 
> since the current maximum mask is 0x3ff instead of 0x7ff
Oops, you're right it would be `0x3ff`. I wonder though if we should clean 
this up so we don't need to manually update the bit mask every time... what if 
it was written like this?

```lang=c++
 enum {
KEYC99= 0x1,
KEYCXX= 0x2,
KEYCXX11  = 0x4,

KEYSYCL   = 0x100,
KEYCUDA   = 0x200,
KEYMAX = KEYCUDA, // Must be set to the largest KEY enum value
KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,

// KEYNOMS18 and KEYNOOPENCL are used to exclude.
// KEYCUDA is not included in KEYALL because 
KEYALL = (((KEYMAX & (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA)
};
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments.



Comment at: clang/lib/Basic/IdentifierTable.cpp:111
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,

delcypher wrote:
> @yaxunl Is it intentional that you didn't update `KEYALL` here? That means 
> `KEYALL` doesn't include the bit for `KEYCUDA`.
> 
> If that was your intention then this will break if someone adds a new key. 
> E.g.
> 
> ```
> KEYCUDA = 0x200,
> KEYSOMENEWTHING = 0x400,
> // ...
> // KEYALL now includes `KEYCUDA`, whereas it didn't before.
> // KEYALL includes KEYSOMENEWTHING 
> KEYALL = (0x7ff & ~KEYNOMS18 &
>   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
> ...
> ```
> 
> 
> 1. Updating the `0x1ff` constant to `0x3ff` so that `KEYALL` includes 
> `KEYCUDA`
> 2. If your intention **is** to not have `KEYCUDA`  set in `KEYALL` then amend 
> `KEYALL` to be.
> 
> ```
> KEYALL = (0x7ff & ~KEYNOMS18 &
>   ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used 
> to exclude.
> // KEYCUDA is not included in KEYALL
> ```
My intention is not to include KEYCUDA in KEYALL.

Should I change KEYALL to


```
KEYALL = (0x3ff & ~KEYNOMS18 &
  ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used 
to exclude.
// KEYCUDA is not included in KEYALL
```

instead of 


```
KEYALL = (0x7ff & ~KEYNOMS18 &
  ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used 
to exclude.
// KEYCUDA is not included in KEYALL
```

since the current maximum mask is 0x3ff instead of 0x7ff


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Dan Liew via Phabricator via cfe-commits
delcypher added inline comments.



Comment at: clang/lib/Basic/IdentifierTable.cpp:111
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,

@yaxunl Is it intentional that you didn't update `KEYALL` here? That means 
`KEYALL` doesn't include the bit for `KEYCUDA`.

If that was your intention then this will break if someone adds a new key. E.g.

```
KEYCUDA = 0x200,
KEYSOMENEWTHING = 0x400,
// ...
// KEYALL now includes `KEYCUDA`, whereas it didn't before.
// KEYALL includes KEYSOMENEWTHING 
KEYALL = (0x7ff & ~KEYNOMS18 &
  ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
...
```


1. Updating the `0x1ff` constant to `0x3ff` so that `KEYALL` includes 
`KEYCUDA`
2. If your intention **is** to not have `KEYCUDA`  set in `KEYALL` then amend 
`KEYALL` to be.

```
KEYALL = (0x7ff & ~KEYNOMS18 &
  ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used 
to exclude.
// KEYCUDA is not included in KEYALL
```


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
yaxunl marked an inline comment as done.
Closed by commit rGafc9d674fe5a: [CUDA][HIP] support __noinline__ as keyword 
(authored by yaxunl).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/AttrDocs.td
  clang/include/clang/Basic/Features.def
  clang/include/clang/Basic/TokenKinds.def
  clang/include/clang/Parse/Parser.h
  clang/lib/Basic/IdentifierTable.cpp
  clang/lib/Parse/ParseDecl.cpp
  clang/test/CodeGenCUDA/noinline.cu
  clang/test/Lexer/has_feature.cu
  clang/test/SemaCUDA/noinline.cu

Index: clang/test/SemaCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/noinline.cu
@@ -0,0 +1,19 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda -pedantic %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cpp -x c++ %s
+
+// cuda-no-diagnostics
+
+__noinline__ void fun1() { } // cpp-error {{unknown type name '__noinline__'}}
+
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }
+[[gnu::__noinline__]] void fun4() { }
+
+#define __noinline__ __attribute__((__noinline__))
+__noinline__ void fun5() {}
+
+#undef __noinline__
+#10 "cuda.h" 3
+#define __noinline__ __attribute__((__noinline__))
+__noinline__ void fun6() {}
Index: clang/test/Lexer/has_feature.cu
===
--- /dev/null
+++ clang/test/Lexer/has_feature.cu
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -E -triple x86_64-linux-gnu %s -o - | FileCheck %s
+
+// CHECK: has_noinline_keyword
+#if __has_feature(cuda_noinline_keyword)
+int has_noinline_keyword();
+#else
+int no_noinine_keyword();
+#endif
Index: clang/test/CodeGenCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/noinline.cu
@@ -0,0 +1,34 @@
+// Uses -O2 since the defalt -O0 option adds noinline to all functions.
+
+// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - -x hip %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-gnu-linux \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+__noinline__ __device__ __host__ void fun1() {}
+
+__attribute__((noinline)) __device__ __host__ void fun2() {}
+
+__attribute__((__noinline__)) __device__ __host__ void fun3() {}
+
+[[gnu::__noinline__]] __device__ __host__ void fun4() {}
+
+#define __noinline__ __attribute__((__noinline__))
+__noinline__ __device__ __host__ void fun5() {}
+
+__device__ __host__ void fun6() {}
+
+// CHECK: define{{.*}}@_Z4fun1v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun2v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun3v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun4v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun5v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun6v{{.*}}#[[ATTR2:[0-9]*]]
+// CHECK: attributes #[[ATTR1]] = {{.*}}noinline
+// CHECK-NOT: attributes #[[ATTR2]] = {{.*}}noinline
Index: clang/lib/Parse/ParseDecl.cpp
===
--- clang/lib/Parse/ParseDecl.cpp
+++ clang/lib/Parse/ParseDecl.cpp
@@ -897,6 +897,15 @@
   }
 }
 
+void Parser::ParseCUDAFunctionAttributes(ParsedAttributes ) {
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();
+attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
+ ParsedAttr::AS_Keyword);
+  }
+}
+
 void Parser::ParseOpenCLQualifiers(ParsedAttributes ) {
   IdentifierInfo *AttrName = Tok.getIdentifierInfo();
   SourceLocation AttrNameLoc = Tok.getLocation();
@@ -3690,6 +3699,11 @@
   ParseOpenCLKernelAttributes(DS.getAttributes());
   continue;
 
+// CUDA/HIP single token adornments.
+case tok::kw___noinline__:
+  ParseCUDAFunctionAttributes(DS.getAttributes());
+  continue;
+
 // Nullability type specifiers.
 case tok::kw__Nonnull:
 case tok::kw__Nullable:
Index: clang/lib/Basic/IdentifierTable.cpp
===
--- clang/lib/Basic/IdentifierTable.cpp
+++ clang/lib/Basic/IdentifierTable.cpp
@@ -108,6 +108,7 @@
 KEYOPENCLCXX  = 0x40,
 KEYMSCOMPAT   = 0x80,
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
 KEYALL = 

[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments.



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

aaron.ballman wrote:
> yaxunl wrote:
> > aaron.ballman wrote:
> > > yaxunl wrote:
> > > > aaron.ballman wrote:
> > > > > yaxunl wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > yaxunl wrote:
> > > > > > > > aaron.ballman wrote:
> > > > > > > > > I think there should also be a test like:
> > > > > > > > > ```
> > > > > > > > > [[gnu::__noinline__]] void fun4() {}
> > > > > > > > > ```
> > > > > > > > > to verify that the double square bracket syntax also 
> > > > > > > > > correctly handles this being a keyword now (I expect the test 
> > > > > > > > > to pass).
> > > > > > > > will do
> > > > > > > Ah, I just noticed we also have no tests for the behavior of the 
> > > > > > > keyword in the presence of the macro being defined. e.g.,
> > > > > > > ```
> > > > > > > #define __noinline__ __attribute__((__noinline__))
> > > > > > > __noinline__ void fun5() {}
> > > > > > > ```
> > > > > > will do
> > > > > I missed an important detail -- I think this is now going to generate 
> > > > > a warning in `-pedantic` mode (through `-Wkeyword-macro`) when 
> > > > > compiling for CUDA; is that going to be a problem for CUDA headers, 
> > > > > or are those always included as a system header (and so the 
> > > > > diagnostics will be suppressed)?
> > > > I could not find how clang driver adds CUDA include path
> > > > 
> > > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L284
> > > > 
> > > > @tra do you know how CUDA include path is added? is it done by CMake? 
> > > > 
> > > > For HIP the HIP include path is added as a system include path by clang 
> > > > driver.
> > > Whatever we find out, we can emulate its behavior here in the test file 
> > > to see what the diagnostic behavior will be (you can use GNU linemarkers 
> > > to convince the compiler parts of the source are in a system header).
> > will add tests for that.
> > 
> > It seems no matter it is system header or normal header, no warnings are 
> > emitted even with -pedantic.
> Excellent, thank you!
CUDA includes are added via `-internal-isystem` here: 
https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L892


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 428423.
yaxunl marked an inline comment as done.
yaxunl added a comment.

make it a feature, add tests for pedantic, fix release notes and doecumentation


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/AttrDocs.td
  clang/include/clang/Basic/Features.def
  clang/include/clang/Basic/TokenKinds.def
  clang/include/clang/Parse/Parser.h
  clang/lib/Basic/IdentifierTable.cpp
  clang/lib/Parse/ParseDecl.cpp
  clang/test/CodeGenCUDA/noinline.cu
  clang/test/Lexer/has_feature.cu
  clang/test/SemaCUDA/noinline.cu

Index: clang/test/SemaCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/noinline.cu
@@ -0,0 +1,19 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda -pedantic %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cpp -x c++ %s
+
+// cuda-no-diagnostics
+
+__noinline__ void fun1() { } // cpp-error {{unknown type name '__noinline__'}}
+
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }
+[[gnu::__noinline__]] void fun4() { }
+
+#define __noinline__ __attribute__((__noinline__))
+__noinline__ void fun5() {}
+
+#undef __noinline__
+#10 "cuda.h" 3
+#define __noinline__ __attribute__((__noinline__))
+__noinline__ void fun6() {}
Index: clang/test/Lexer/has_feature.cu
===
--- /dev/null
+++ clang/test/Lexer/has_feature.cu
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -E -triple x86_64-linux-gnu %s -o - | FileCheck %s
+
+// CHECK: has_noinline_keyword
+#if __has_feature(cuda_noinline_keyword)
+int has_noinline_keyword();
+#else
+int no_noinine_keyword();
+#endif
Index: clang/test/CodeGenCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/noinline.cu
@@ -0,0 +1,34 @@
+// Uses -O2 since the defalt -O0 option adds noinline to all functions.
+
+// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - -x hip %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-gnu-linux \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+__noinline__ __device__ __host__ void fun1() {}
+
+__attribute__((noinline)) __device__ __host__ void fun2() {}
+
+__attribute__((__noinline__)) __device__ __host__ void fun3() {}
+
+[[gnu::__noinline__]] __device__ __host__ void fun4() {}
+
+#define __noinline__ __attribute__((__noinline__))
+__noinline__ __device__ __host__ void fun5() {}
+
+__device__ __host__ void fun6() {}
+
+// CHECK: define{{.*}}@_Z4fun1v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun2v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun3v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun4v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun5v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun6v{{.*}}#[[ATTR2:[0-9]*]]
+// CHECK: attributes #[[ATTR1]] = {{.*}}noinline
+// CHECK-NOT: attributes #[[ATTR2]] = {{.*}}noinline
Index: clang/lib/Parse/ParseDecl.cpp
===
--- clang/lib/Parse/ParseDecl.cpp
+++ clang/lib/Parse/ParseDecl.cpp
@@ -897,6 +897,15 @@
   }
 }
 
+void Parser::ParseCUDAFunctionAttributes(ParsedAttributes ) {
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();
+attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
+ ParsedAttr::AS_Keyword);
+  }
+}
+
 void Parser::ParseOpenCLQualifiers(ParsedAttributes ) {
   IdentifierInfo *AttrName = Tok.getIdentifierInfo();
   SourceLocation AttrNameLoc = Tok.getLocation();
@@ -3690,6 +3699,11 @@
   ParseOpenCLKernelAttributes(DS.getAttributes());
   continue;
 
+// CUDA/HIP single token adornments.
+case tok::kw___noinline__:
+  ParseCUDAFunctionAttributes(DS.getAttributes());
+  continue;
+
 // Nullability type specifiers.
 case tok::kw__Nonnull:
 case tok::kw__Nullable:
Index: clang/lib/Basic/IdentifierTable.cpp
===
--- clang/lib/Basic/IdentifierTable.cpp
+++ clang/lib/Basic/IdentifierTable.cpp
@@ -108,6 +108,7 @@
 KEYOPENCLCXX  = 0x40,
 KEYMSCOMPAT   = 0x80,
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
 KEYALL = (0x1ff & ~KEYNOMS18 &
   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
@@ -158,6 +159,8 @@
 return KS_Future;
   if 

[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

yaxunl wrote:
> aaron.ballman wrote:
> > yaxunl wrote:
> > > aaron.ballman wrote:
> > > > yaxunl wrote:
> > > > > aaron.ballman wrote:
> > > > > > yaxunl wrote:
> > > > > > > aaron.ballman wrote:
> > > > > > > > I think there should also be a test like:
> > > > > > > > ```
> > > > > > > > [[gnu::__noinline__]] void fun4() {}
> > > > > > > > ```
> > > > > > > > to verify that the double square bracket syntax also correctly 
> > > > > > > > handles this being a keyword now (I expect the test to pass).
> > > > > > > will do
> > > > > > Ah, I just noticed we also have no tests for the behavior of the 
> > > > > > keyword in the presence of the macro being defined. e.g.,
> > > > > > ```
> > > > > > #define __noinline__ __attribute__((__noinline__))
> > > > > > __noinline__ void fun5() {}
> > > > > > ```
> > > > > will do
> > > > I missed an important detail -- I think this is now going to generate a 
> > > > warning in `-pedantic` mode (through `-Wkeyword-macro`) when compiling 
> > > > for CUDA; is that going to be a problem for CUDA headers, or are those 
> > > > always included as a system header (and so the diagnostics will be 
> > > > suppressed)?
> > > I could not find how clang driver adds CUDA include path
> > > 
> > > https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L284
> > > 
> > > @tra do you know how CUDA include path is added? is it done by CMake? 
> > > 
> > > For HIP the HIP include path is added as a system include path by clang 
> > > driver.
> > Whatever we find out, we can emulate its behavior here in the test file to 
> > see what the diagnostic behavior will be (you can use GNU linemarkers to 
> > convince the compiler parts of the source are in a system header).
> will add tests for that.
> 
> It seems no matter it is system header or normal header, no warnings are 
> emitted even with -pedantic.
Excellent, thank you!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done.
yaxunl added inline comments.



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

aaron.ballman wrote:
> yaxunl wrote:
> > aaron.ballman wrote:
> > > yaxunl wrote:
> > > > aaron.ballman wrote:
> > > > > yaxunl wrote:
> > > > > > aaron.ballman wrote:
> > > > > > > I think there should also be a test like:
> > > > > > > ```
> > > > > > > [[gnu::__noinline__]] void fun4() {}
> > > > > > > ```
> > > > > > > to verify that the double square bracket syntax also correctly 
> > > > > > > handles this being a keyword now (I expect the test to pass).
> > > > > > will do
> > > > > Ah, I just noticed we also have no tests for the behavior of the 
> > > > > keyword in the presence of the macro being defined. e.g.,
> > > > > ```
> > > > > #define __noinline__ __attribute__((__noinline__))
> > > > > __noinline__ void fun5() {}
> > > > > ```
> > > > will do
> > > I missed an important detail -- I think this is now going to generate a 
> > > warning in `-pedantic` mode (through `-Wkeyword-macro`) when compiling 
> > > for CUDA; is that going to be a problem for CUDA headers, or are those 
> > > always included as a system header (and so the diagnostics will be 
> > > suppressed)?
> > I could not find how clang driver adds CUDA include path
> > 
> > https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L284
> > 
> > @tra do you know how CUDA include path is added? is it done by CMake? 
> > 
> > For HIP the HIP include path is added as a system include path by clang 
> > driver.
> Whatever we find out, we can emulate its behavior here in the test file to 
> see what the diagnostic behavior will be (you can use GNU linemarkers to 
> convince the compiler parts of the source are in a system header).
will add tests for that.

It seems no matter it is system header or normal header, no warnings are 
emitted even with -pedantic.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

yaxunl wrote:
> aaron.ballman wrote:
> > yaxunl wrote:
> > > aaron.ballman wrote:
> > > > yaxunl wrote:
> > > > > aaron.ballman wrote:
> > > > > > I think there should also be a test like:
> > > > > > ```
> > > > > > [[gnu::__noinline__]] void fun4() {}
> > > > > > ```
> > > > > > to verify that the double square bracket syntax also correctly 
> > > > > > handles this being a keyword now (I expect the test to pass).
> > > > > will do
> > > > Ah, I just noticed we also have no tests for the behavior of the 
> > > > keyword in the presence of the macro being defined. e.g.,
> > > > ```
> > > > #define __noinline__ __attribute__((__noinline__))
> > > > __noinline__ void fun5() {}
> > > > ```
> > > will do
> > I missed an important detail -- I think this is now going to generate a 
> > warning in `-pedantic` mode (through `-Wkeyword-macro`) when compiling for 
> > CUDA; is that going to be a problem for CUDA headers, or are those always 
> > included as a system header (and so the diagnostics will be suppressed)?
> I could not find how clang driver adds CUDA include path
> 
> https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L284
> 
> @tra do you know how CUDA include path is added? is it done by CMake? 
> 
> For HIP the HIP include path is added as a system include path by clang 
> driver.
Whatever we find out, we can emulate its behavior here in the test file to see 
what the diagnostic behavior will be (you can use GNU linemarkers to convince 
the compiler parts of the source are in a system header).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done.
yaxunl added inline comments.



Comment at: clang/docs/ReleaseNotes.rst:344-345
 
-CUDA Language Changes in Clang
+CUDA/HIP Language Changes in Clang
 --
 

aaron.ballman wrote:
> 
will fix



Comment at: clang/include/clang/Basic/AttrDocs.td:543
+avoid diagnostics due to usage of ``__attribute__((__noinline__))``
+with ``__noinline__`` defined as a macro as ``__attribute__((noinline))`.
+

aaron.ballman wrote:
> 
will fix.



Comment at: clang/lib/Parse/ParseDecl.cpp:902
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();

tra wrote:
> yaxunl wrote:
> > tra wrote:
> > > aaron.ballman wrote:
> > > > yaxunl wrote:
> > > > > aaron.ballman wrote:
> > > > > > I think we should we be issuing a pedantic "this is a clang 
> > > > > > extension" warning here, WDYT?
> > > > > will do
> > > > I'm questioning whether my advice here was good or not -- now that I 
> > > > see the CUDA spec already calls these function qualifiers... it's 
> > > > debatable whether this is a Clang extension or just the way in which 
> > > > Clang implements the CUDA function qualifiers. @tra -- do you have 
> > > > opinions?
> > > > 
> > > > I'm sort of leaning towards dropping the extension warning, but the 
> > > > only reason I can think of for keeping it is if Clang is the only CUDA 
> > > > compiler that doesn't require you to include a header before using the 
> > > > function qualifiers. If that's the case, there is a portability concern.
> > > I'm not sure if such a warning would be useful. 
> > > 
> > > > the only reason I can think of for keeping it is if Clang is the only 
> > > > CUDA compiler that doesn't require you to include a header before using 
> > > > the function qualifiers. If that's the case, there is a portability 
> > > > concern.
> > > 
> > > I don't think it's an issue.
> > > 
> > > We already have similar divergence between nvcc/clang. E.g. built-in 
> > > variables like `threadIdx`. Clang implements them in a header, but NVCC 
> > > provides them by compiler itself. 
> > > With both compilers the variables are available by the time we get to 
> > > compile user code. Virtually all CUDA compilations are done with tons of 
> > > CUDA headers pre-included by compiler. Those that do not do that are 
> > > already on their own and have to provide many other 'standard' CUDA 
> > > things like target attributes. I don't think we need to worry about that.
> > > 
> > I can remove the diagnostics since it seems unnecessary.
> > 
> > I tend to treat it as an extension since nvcc is the de facto standard 
> > implementation, which does not implement it as a keyword. Compared to that, 
> > this is like an extension.
> I'd argue that NVCC does implement it (as in "documents and makes it 
> available"). Providing the documented functionality using a different 
> implementation does not reach the point of being an extension, IMO. While 
> there are observable differences between implementations, depending on them 
> would be a portability error for the user.
> 
that makes sense. will change the extension to feature



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

aaron.ballman wrote:
> yaxunl wrote:
> > aaron.ballman wrote:
> > > yaxunl wrote:
> > > > aaron.ballman wrote:
> > > > > I think there should also be a test like:
> > > > > ```
> > > > > [[gnu::__noinline__]] void fun4() {}
> > > > > ```
> > > > > to verify that the double square bracket syntax also correctly 
> > > > > handles this being a keyword now (I expect the test to pass).
> > > > will do
> > > Ah, I just noticed we also have no tests for the behavior of the keyword 
> > > in the presence of the macro being defined. e.g.,
> > > ```
> > > #define __noinline__ __attribute__((__noinline__))
> > > __noinline__ void fun5() {}
> > > ```
> > will do
> I missed an important detail -- I think this is now going to generate a 
> warning in `-pedantic` mode (through `-Wkeyword-macro`) when compiling for 
> CUDA; is that going to be a problem for CUDA headers, or are those always 
> included as a system header (and so the diagnostics will be suppressed)?
I could not find how clang driver adds CUDA include path

https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L284

@tra do you know how CUDA include path is added? is it done by CMake? 

For HIP the HIP include path is added as a system include path by clang driver.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org

[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-10 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

In D124866#3501641 , @yaxunl wrote:

> If we are to add `__forceinline__` as a keyword, I feel it better be a 
> separate patch to be cleaner.

I'm fine with that.

A few nits and a question about the test recently added.




Comment at: clang/docs/ReleaseNotes.rst:344-345
 
-CUDA Language Changes in Clang
+CUDA/HIP Language Changes in Clang
 --
 





Comment at: clang/include/clang/Basic/AttrDocs.td:543
+avoid diagnostics due to usage of ``__attribute__((__noinline__))``
+with ``__noinline__`` defined as a macro as ``__attribute__((noinline))`.
+





Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

yaxunl wrote:
> aaron.ballman wrote:
> > yaxunl wrote:
> > > aaron.ballman wrote:
> > > > I think there should also be a test like:
> > > > ```
> > > > [[gnu::__noinline__]] void fun4() {}
> > > > ```
> > > > to verify that the double square bracket syntax also correctly handles 
> > > > this being a keyword now (I expect the test to pass).
> > > will do
> > Ah, I just noticed we also have no tests for the behavior of the keyword in 
> > the presence of the macro being defined. e.g.,
> > ```
> > #define __noinline__ __attribute__((__noinline__))
> > __noinline__ void fun5() {}
> > ```
> will do
I missed an important detail -- I think this is now going to generate a warning 
in `-pedantic` mode (through `-Wkeyword-macro`) when compiling for CUDA; is 
that going to be a problem for CUDA headers, or are those always included as a 
system header (and so the diagnostics will be suppressed)?


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-09 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision.
tra added a comment.
This revision is now accepted and ready to land.

In D124866#3501641 , @yaxunl wrote:

> If we are to add `__forceinline__` as a keyword, I feel it better be a 
> separate patch to be cleaner.

Fine with me.




Comment at: clang/lib/Parse/ParseDecl.cpp:902
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();

yaxunl wrote:
> tra wrote:
> > aaron.ballman wrote:
> > > yaxunl wrote:
> > > > aaron.ballman wrote:
> > > > > I think we should we be issuing a pedantic "this is a clang 
> > > > > extension" warning here, WDYT?
> > > > will do
> > > I'm questioning whether my advice here was good or not -- now that I see 
> > > the CUDA spec already calls these function qualifiers... it's debatable 
> > > whether this is a Clang extension or just the way in which Clang 
> > > implements the CUDA function qualifiers. @tra -- do you have opinions?
> > > 
> > > I'm sort of leaning towards dropping the extension warning, but the only 
> > > reason I can think of for keeping it is if Clang is the only CUDA 
> > > compiler that doesn't require you to include a header before using the 
> > > function qualifiers. If that's the case, there is a portability concern.
> > I'm not sure if such a warning would be useful. 
> > 
> > > the only reason I can think of for keeping it is if Clang is the only 
> > > CUDA compiler that doesn't require you to include a header before using 
> > > the function qualifiers. If that's the case, there is a portability 
> > > concern.
> > 
> > I don't think it's an issue.
> > 
> > We already have similar divergence between nvcc/clang. E.g. built-in 
> > variables like `threadIdx`. Clang implements them in a header, but NVCC 
> > provides them by compiler itself. 
> > With both compilers the variables are available by the time we get to 
> > compile user code. Virtually all CUDA compilations are done with tons of 
> > CUDA headers pre-included by compiler. Those that do not do that are 
> > already on their own and have to provide many other 'standard' CUDA things 
> > like target attributes. I don't think we need to worry about that.
> > 
> I can remove the diagnostics since it seems unnecessary.
> 
> I tend to treat it as an extension since nvcc is the de facto standard 
> implementation, which does not implement it as a keyword. Compared to that, 
> this is like an extension.
I'd argue that NVCC does implement it (as in "documents and makes it 
available"). Providing the documented functionality using a different 
implementation does not reach the point of being an extension, IMO. While there 
are observable differences between implementations, depending on them would be 
a portability error for the user.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 428167.
yaxunl marked 2 inline comments as done.
yaxunl added a comment.

removed diagnostics and added more tests


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/AttrDocs.td
  clang/include/clang/Basic/Features.def
  clang/include/clang/Basic/TokenKinds.def
  clang/include/clang/Parse/Parser.h
  clang/lib/Basic/IdentifierTable.cpp
  clang/lib/Parse/ParseDecl.cpp
  clang/test/CodeGenCUDA/noinline.cu
  clang/test/Lexer/has_extension.cu
  clang/test/SemaCUDA/noinline.cu

Index: clang/test/SemaCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/noinline.cu
@@ -0,0 +1,13 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cpp -x c++ %s
+
+// cuda-no-diagnostics
+
+__noinline__ void fun1() { } // cpp-error {{unknown type name '__noinline__'}}
+
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }
+[[gnu::__noinline__]] void fun4() { }
+
+#define __noinline__ __attribute__((__noinline__))
+__noinline__ void fun5() {}
Index: clang/test/Lexer/has_extension.cu
===
--- /dev/null
+++ clang/test/Lexer/has_extension.cu
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -E -triple x86_64-linux-gnu %s -o - | FileCheck %s
+
+// CHECK: has_noinline_keyword
+#if __has_extension(cuda_noinline_keyword)
+int has_noinline_keyword();
+#else
+int no_noinine_keyword();
+#endif
Index: clang/test/CodeGenCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/noinline.cu
@@ -0,0 +1,34 @@
+// Uses -O2 since the defalt -O0 option adds noinline to all functions.
+
+// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - -x hip %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-gnu-linux \
+// RUN: -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+__noinline__ __device__ __host__ void fun1() {}
+
+__attribute__((noinline)) __device__ __host__ void fun2() {}
+
+__attribute__((__noinline__)) __device__ __host__ void fun3() {}
+
+[[gnu::__noinline__]] __device__ __host__ void fun4() {}
+
+#define __noinline__ __attribute__((__noinline__))
+__noinline__ __device__ __host__ void fun5() {}
+
+__device__ __host__ void fun6() {}
+
+// CHECK: define{{.*}}@_Z4fun1v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun2v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun3v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun4v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun5v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun6v{{.*}}#[[ATTR2:[0-9]*]]
+// CHECK: attributes #[[ATTR1]] = {{.*}}noinline
+// CHECK-NOT: attributes #[[ATTR2]] = {{.*}}noinline
Index: clang/lib/Parse/ParseDecl.cpp
===
--- clang/lib/Parse/ParseDecl.cpp
+++ clang/lib/Parse/ParseDecl.cpp
@@ -897,6 +897,15 @@
   }
 }
 
+void Parser::ParseCUDAFunctionAttributes(ParsedAttributes ) {
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();
+attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
+ ParsedAttr::AS_Keyword);
+  }
+}
+
 void Parser::ParseOpenCLQualifiers(ParsedAttributes ) {
   IdentifierInfo *AttrName = Tok.getIdentifierInfo();
   SourceLocation AttrNameLoc = Tok.getLocation();
@@ -3690,6 +3699,11 @@
   ParseOpenCLKernelAttributes(DS.getAttributes());
   continue;
 
+// CUDA/HIP single token adornments.
+case tok::kw___noinline__:
+  ParseCUDAFunctionAttributes(DS.getAttributes());
+  continue;
+
 // Nullability type specifiers.
 case tok::kw__Nonnull:
 case tok::kw__Nullable:
Index: clang/lib/Basic/IdentifierTable.cpp
===
--- clang/lib/Basic/IdentifierTable.cpp
+++ clang/lib/Basic/IdentifierTable.cpp
@@ -108,6 +108,7 @@
 KEYOPENCLCXX  = 0x40,
 KEYMSCOMPAT   = 0x80,
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
 KEYALL = (0x1ff & ~KEYNOMS18 &
   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
@@ -158,6 +159,8 @@
 return KS_Future;
   if (LangOpts.isSYCL() && (Flags & KEYSYCL))
 return KS_Enabled;
+  if (LangOpts.CUDA && (Flags & KEYCUDA))
+return KS_Enabled;
   return KS_Disabled;
 }
 
Index: clang/include/clang/Parse/Parser.h

[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done.
yaxunl added a comment.

In D124866#3501203 , @aaron.ballman 
wrote:

>> `__forceinline__` does not have the issue as `__noinline__` has since it is 
>> not a GCC attribute. The current CUDA/HIP implementation of 
>> `__forceinline__` in header files is sufficient. I do not see the benefit of 
>> implementing `__forceinline__` as a keyword.
>
> Primarily to reduce user confusion. It's kind of weird for `__noinline__` to 
> be a keyword and `__forceinline__` to not be a keyword when they're both 
> defined the same way by the CUDA spec. This means you can #undef one of them 
> but not the other, that sort of thing.

If we are to add `__forceinline__` as a keyword, I feel it better be a separate 
patch to be cleaner.




Comment at: clang/lib/Parse/ParseDecl.cpp:902
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();

tra wrote:
> aaron.ballman wrote:
> > yaxunl wrote:
> > > aaron.ballman wrote:
> > > > I think we should we be issuing a pedantic "this is a clang extension" 
> > > > warning here, WDYT?
> > > will do
> > I'm questioning whether my advice here was good or not -- now that I see 
> > the CUDA spec already calls these function qualifiers... it's debatable 
> > whether this is a Clang extension or just the way in which Clang implements 
> > the CUDA function qualifiers. @tra -- do you have opinions?
> > 
> > I'm sort of leaning towards dropping the extension warning, but the only 
> > reason I can think of for keeping it is if Clang is the only CUDA compiler 
> > that doesn't require you to include a header before using the function 
> > qualifiers. If that's the case, there is a portability concern.
> I'm not sure if such a warning would be useful. 
> 
> > the only reason I can think of for keeping it is if Clang is the only CUDA 
> > compiler that doesn't require you to include a header before using the 
> > function qualifiers. If that's the case, there is a portability concern.
> 
> I don't think it's an issue.
> 
> We already have similar divergence between nvcc/clang. E.g. built-in 
> variables like `threadIdx`. Clang implements them in a header, but NVCC 
> provides them by compiler itself. 
> With both compilers the variables are available by the time we get to compile 
> user code. Virtually all CUDA compilations are done with tons of CUDA headers 
> pre-included by compiler. Those that do not do that are already on their own 
> and have to provide many other 'standard' CUDA things like target attributes. 
> I don't think we need to worry about that.
> 
I can remove the diagnostics since it seems unnecessary.

I tend to treat it as an extension since nvcc is the de facto standard 
implementation, which does not implement it as a keyword. Compared to that, 
this is like an extension.



Comment at: clang/test/CodeGenCUDA/noinline.cu:1
+// optimization is needed, otherwise by default all functions have noinline.
+

erichkeane wrote:
> aaron.ballman wrote:
> > I've asked @erichkeane to weigh in on whether there's a better approach 
> > here than specifying an optimization level.
> You don't need to do this, it looks like all you're trying to do is keep 
> 'clang' out of `O0` mode.  However, what you do  NOT want is the 
> optimizations to run.  The common way to do that is to combine `O1`/`O2`/etc 
> like: `-O2 -disable-llvm-passes`
> 
> This will keep clang in `O2` mode, but will keep the optimizer from running 
> anything, which might mess with the test later on.
will use -O2 -disable-llvm-passes



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

aaron.ballman wrote:
> yaxunl wrote:
> > aaron.ballman wrote:
> > > I think there should also be a test like:
> > > ```
> > > [[gnu::__noinline__]] void fun4() {}
> > > ```
> > > to verify that the double square bracket syntax also correctly handles 
> > > this being a keyword now (I expect the test to pass).
> > will do
> Ah, I just noticed we also have no tests for the behavior of the keyword in 
> the presence of the macro being defined. e.g.,
> ```
> #define __noinline__ __attribute__((__noinline__))
> __noinline__ void fun5() {}
> ```
will do


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

In D124866#3501203 , @aaron.ballman 
wrote:

>>> Should we do `__forceinline__` at the same time so that there's consistency?
>
> Primarily to reduce user confusion. It's kind of weird for `__noinline__` to 
> be a keyword and `__forceinline__` to not be a keyword when they're both 
> defined the same way by the CUDA spec. This means you can #undef one of them 
> but not the other, that sort of thing.

I'm slightly biased towards making them both a keyword. That said, I may be 
convinced otherwise if we discover that it may break some assumptions in 
existing C++ code. I just don't know enough.




Comment at: clang/lib/Parse/ParseDecl.cpp:902
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();

aaron.ballman wrote:
> yaxunl wrote:
> > aaron.ballman wrote:
> > > I think we should we be issuing a pedantic "this is a clang extension" 
> > > warning here, WDYT?
> > will do
> I'm questioning whether my advice here was good or not -- now that I see the 
> CUDA spec already calls these function qualifiers... it's debatable whether 
> this is a Clang extension or just the way in which Clang implements the CUDA 
> function qualifiers. @tra -- do you have opinions?
> 
> I'm sort of leaning towards dropping the extension warning, but the only 
> reason I can think of for keeping it is if Clang is the only CUDA compiler 
> that doesn't require you to include a header before using the function 
> qualifiers. If that's the case, there is a portability concern.
I'm not sure if such a warning would be useful. 

> the only reason I can think of for keeping it is if Clang is the only CUDA 
> compiler that doesn't require you to include a header before using the 
> function qualifiers. If that's the case, there is a portability concern.

I don't think it's an issue.

We already have similar divergence between nvcc/clang. E.g. built-in variables 
like `threadIdx`. Clang implements them in a header, but NVCC provides them by 
compiler itself. 
With both compilers the variables are available by the time we get to compile 
user code. Virtually all CUDA compilations are done with tons of CUDA headers 
pre-included by compiler. Those that do not do that are already on their own 
and have to provide many other 'standard' CUDA things like target attributes. I 
don't think we need to worry about that.



CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-09 Thread Erich Keane via Phabricator via cfe-commits
erichkeane added inline comments.



Comment at: clang/test/CodeGenCUDA/noinline.cu:1
+// optimization is needed, otherwise by default all functions have noinline.
+

aaron.ballman wrote:
> I've asked @erichkeane to weigh in on whether there's a better approach here 
> than specifying an optimization level.
You don't need to do this, it looks like all you're trying to do is keep 
'clang' out of `O0` mode.  However, what you do  NOT want is the optimizations 
to run.  The common way to do that is to combine `O1`/`O2`/etc like: `-O2 
-disable-llvm-passes`

This will keep clang in `O2` mode, but will keep the optimizer from running 
anything, which might mess with the test later on.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-09 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a subscriber: erichkeane.
aaron.ballman added a comment.

In D124866#3501181 , @yaxunl wrote:

> In D124866#3500761 , @aaron.ballman 
> wrote:
>
>> Should we do `__forceinline__` at the same time so that there's consistency?
>
> `__forceinline__` does not have the issue as `__noinline__` has since it is 
> not a GCC attribute. The current CUDA/HIP implementation of `__forceinline__` 
> in header files is sufficient. I do not see the benefit of implementing 
> `__forceinline__` as a keyword.

Primarily to reduce user confusion. It's kind of weird for `__noinline__` to be 
a keyword and `__forceinline__` to not be a keyword when they're both defined 
the same way by the CUDA spec. This means you can #undef one of them but not 
the other, that sort of thing.




Comment at: clang/test/CodeGenCUDA/noinline.cu:1
+// optimization is needed, otherwise by default all functions have noinline.
+

I've asked @erichkeane to weigh in on whether there's a better approach here 
than specifying an optimization level.



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

yaxunl wrote:
> aaron.ballman wrote:
> > I think there should also be a test like:
> > ```
> > [[gnu::__noinline__]] void fun4() {}
> > ```
> > to verify that the double square bracket syntax also correctly handles this 
> > being a keyword now (I expect the test to pass).
> will do
Ah, I just noticed we also have no tests for the behavior of the keyword in the 
presence of the macro being defined. e.g.,
```
#define __noinline__ __attribute__((__noinline__))
__noinline__ void fun5() {}
```


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D124866#3500761 , @aaron.ballman 
wrote:

> Should we do `__forceinline__` at the same time so that there's consistency?

`__forceinline__` does not have the issue as `__noinline__` has since it is not 
a GCC attribute. The current CUDA/HIP implementation of `__forceinline__` in 
header files is sufficient. I do not see the benefit of implementing 
`__forceinline__` as a keyword.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-09 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

In D124866#3497439 , @tra wrote:

>> CUDA/HIP do not have language spec.
>
> Well. It's not completely true. CUDA programming guide does serve as the 
> de-facto spec for CUDA. It's far from perfect, but it does mention 
> `__noinline__` and `__forceinline__` as function qualifiers: 
> https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#noinline-and-forceinline

Thank you, that's the magic words I was hoping for -- because they're described 
as function qualifiers, I think it's justifiable to add them as a keyword 
implementation in Clang and not worry about stepping on the toes of the CUDA 
spec (it's adhering to what the current spec requires).

Should we do `__forceinline__` at the same time so that there's consistency?




Comment at: clang/lib/Parse/ParseDecl.cpp:902
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();

yaxunl wrote:
> aaron.ballman wrote:
> > I think we should we be issuing a pedantic "this is a clang extension" 
> > warning here, WDYT?
> will do
I'm questioning whether my advice here was good or not -- now that I see the 
CUDA spec already calls these function qualifiers... it's debatable whether 
this is a Clang extension or just the way in which Clang implements the CUDA 
function qualifiers. @tra -- do you have opinions?

I'm sort of leaning towards dropping the extension warning, but the only reason 
I can think of for keeping it is if Clang is the only CUDA compiler that 
doesn't require you to include a header before using the function qualifiers. 
If that's the case, there is a portability concern.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D124866#3497439 , @tra wrote:

>> CUDA/HIP do not have language spec.
>
> Well. It's not completely true. CUDA programming guide does serve as the 
> de-facto spec for CUDA. It's far from perfect, but it does mention 
> `__noinline__` and `__forceinline__` as function qualifiers: 
> https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#noinline-and-forceinline

Thanks for the pointer. I missed that part.

CUDA SDK implements `__noinline__` as attribute `__attribute__((noinline))` 
though. Some requirements may not have diagnostics.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment.

> CUDA/HIP do not have language spec.

Well. It's not completely true. CUDA programming guide does serve as the 
de-facto spec for CUDA. It's far from perfect, but it does mention 
`__noinline__` and `__forceinline__` as function qualifiers: 
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#noinline-and-forceinline


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 427700.
yaxunl marked an inline comment as done.
yaxunl added a comment.

added release note and documentation


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

Files:
  clang/docs/ReleaseNotes.rst
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/AttrDocs.td
  clang/include/clang/Basic/DiagnosticParseKinds.td
  clang/include/clang/Basic/Features.def
  clang/include/clang/Basic/TokenKinds.def
  clang/include/clang/Parse/Parser.h
  clang/lib/Basic/IdentifierTable.cpp
  clang/lib/Parse/ParseDecl.cpp
  clang/test/CodeGenCUDA/noinline.cu
  clang/test/Lexer/has_extension.cu
  clang/test/SemaCUDA/noinline.cu

Index: clang/test/SemaCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/noinline.cu
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda %s
+// RUN: %clang_cc1 -fsyntax-only -verify=pedantic -pedantic %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cpp -x c++ %s
+
+// cuda-no-diagnostics
+
+__noinline__ void fun1() { } // cpp-error {{unknown type name '__noinline__'}}
+// pedantic-warning@-1 {{__noinline__ keyword is a Clang extension for CUDA/HIP}}
+
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }
+[[gnu::__noinline__]] void fun4() { }
Index: clang/test/Lexer/has_extension.cu
===
--- /dev/null
+++ clang/test/Lexer/has_extension.cu
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -E -triple x86_64-linux-gnu %s -o - | FileCheck %s
+
+// CHECK: has_noinline_keyword
+#if __has_extension(cuda_noinline_keyword)
+int has_noinline_keyword();
+#else
+int no_noinine_keyword();
+#endif
Index: clang/test/CodeGenCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/noinline.cu
@@ -0,0 +1,30 @@
+// optimization is needed, otherwise by default all functions have noinline.
+
+// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
+// RUN: -O2 -emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device \
+// RUN: -O2 -emit-llvm -o - -x hip %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-gnu-linux \
+// RUN: -O2 -emit-llvm -o - %s | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+__noinline__ __device__ __host__ void fun1() {}
+
+__attribute__((noinline)) __device__ __host__ void fun2() {}
+
+__attribute__((__noinline__)) __device__ __host__ void fun3() {}
+
+[[gnu::__noinline__]] __device__ __host__ void fun4() {}
+
+__device__ __host__ void fun5() {}
+
+// CHECK: define{{.*}}@_Z4fun1v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun2v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun3v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun4v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun5v{{.*}}#[[ATTR2:[0-9]*]]
+// CHECK: attributes #[[ATTR1]] = {{.*}}noinline
+// CHECK-NOT: attributes #[[ATTR2]] = {{.*}}noinline
Index: clang/lib/Parse/ParseDecl.cpp
===
--- clang/lib/Parse/ParseDecl.cpp
+++ clang/lib/Parse/ParseDecl.cpp
@@ -897,6 +897,18 @@
   }
 }
 
+void Parser::ParseCUDAFunctionAttributes(ParsedAttributes ) {
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();
+attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
+ ParsedAttr::AS_Keyword);
+if (getLangOpts().CUDA) {
+  Diag(Tok, diag::ext_cuda_noinline_keyword);
+}
+  }
+}
+
 void Parser::ParseOpenCLQualifiers(ParsedAttributes ) {
   IdentifierInfo *AttrName = Tok.getIdentifierInfo();
   SourceLocation AttrNameLoc = Tok.getLocation();
@@ -3690,6 +3702,11 @@
   ParseOpenCLKernelAttributes(DS.getAttributes());
   continue;
 
+// CUDA/HIP single token adornments.
+case tok::kw___noinline__:
+  ParseCUDAFunctionAttributes(DS.getAttributes());
+  continue;
+
 // Nullability type specifiers.
 case tok::kw__Nonnull:
 case tok::kw__Nullable:
Index: clang/lib/Basic/IdentifierTable.cpp
===
--- clang/lib/Basic/IdentifierTable.cpp
+++ clang/lib/Basic/IdentifierTable.cpp
@@ -108,6 +108,7 @@
 KEYOPENCLCXX  = 0x40,
 KEYMSCOMPAT   = 0x80,
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
 KEYALL = (0x1ff & ~KEYNOMS18 &
   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
@@ -158,6 +159,8 @@
 return KS_Future;
   if (LangOpts.isSYCL() && (Flags & KEYSYCL))
 return KS_Enabled;
+  if (LangOpts.CUDA && (Flags & KEYCUDA))
+return KS_Enabled;
   return KS_Disabled;
 }
 
Index: clang/include/clang/Parse/Parser.h

[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done.
yaxunl added inline comments.



Comment at: clang/include/clang/Basic/Features.def:274
+// CUDA/HIP Features
+FEATURE(cuda_noinline_keyword, true)
+

aaron.ballman wrote:
> yaxunl wrote:
> > aaron.ballman wrote:
> > > Do the CUDA or HIP specs define `__noinline__` as a keyword specifically? 
> > > If not, this isn't a `FEATURE`, it's an `EXTENSION` because it's specific 
> > > to Clang, not the language standard.
> > CUDA/HIP do not have language spec. In their programming guide, they do not 
> > define `__noinline__` as a keyword.
> > 
> > Will make it an extension.
> > CUDA/HIP do not have language spec. 
> 
> Then what body of people governs changes to the language? Basically, I'm 
> trying to understand whether this patch meets the community requirements for 
> adding an extension: https://clang.llvm.org/get_involved.html#criteria, 
> specifically #4 (though the rest of the points are worth keeping in mind). I 
> don't want to Clang ending up stepping on toes by defining this extension 
> only to accidentally frustrate the CUDA community.
specific to `__noinline__`, it is largely determined by the existing behaviour 
of CUDA SDK.

The CUDA SDK defines `__noinline__` as a macro `__attribute__((noinline))`. 
However, it is not compatible with some C++ headers which use 
`__attribute__((__noinline__))`.

This patch will not change the usage pattern of `__noinline__`. It is 
equivalent to the original behaviour with the benefit of being compatible with 
C++ headers.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added inline comments.



Comment at: clang/include/clang/Basic/Features.def:274
+// CUDA/HIP Features
+FEATURE(cuda_noinline_keyword, true)
+

yaxunl wrote:
> aaron.ballman wrote:
> > Do the CUDA or HIP specs define `__noinline__` as a keyword specifically? 
> > If not, this isn't a `FEATURE`, it's an `EXTENSION` because it's specific 
> > to Clang, not the language standard.
> CUDA/HIP do not have language spec. In their programming guide, they do not 
> define `__noinline__` as a keyword.
> 
> Will make it an extension.
> CUDA/HIP do not have language spec. 

Then what body of people governs changes to the language? Basically, I'm trying 
to understand whether this patch meets the community requirements for adding an 
extension: https://clang.llvm.org/get_involved.html#criteria, specifically #4 
(though the rest of the points are worth keeping in mind). I don't want to 
Clang ending up stepping on toes by defining this extension only to 
accidentally frustrate the CUDA community.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 427685.
yaxunl marked 4 inline comments as done.
yaxunl added a comment.

revised by Aaron's comments


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

Files:
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/DiagnosticParseKinds.td
  clang/include/clang/Basic/Features.def
  clang/include/clang/Basic/TokenKinds.def
  clang/include/clang/Parse/Parser.h
  clang/lib/Basic/IdentifierTable.cpp
  clang/lib/Parse/ParseDecl.cpp
  clang/test/CodeGenCUDA/noinline.cu
  clang/test/Lexer/has_extension.cu
  clang/test/SemaCUDA/noinline.cu

Index: clang/test/SemaCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/noinline.cu
@@ -0,0 +1,12 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda %s
+// RUN: %clang_cc1 -fsyntax-only -verify=pedantic -pedantic %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cpp -x c++ %s
+
+// cuda-no-diagnostics
+
+__noinline__ void fun1() { } // cpp-error {{unknown type name '__noinline__'}}
+// pedantic-warning@-1 {{__noinline__ keyword is a Clang extension for CUDA/HIP}}
+
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }
+[[gnu::__noinline__]] void fun4() { }
Index: clang/test/Lexer/has_extension.cu
===
--- /dev/null
+++ clang/test/Lexer/has_extension.cu
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -E -triple x86_64-linux-gnu %s -o - | FileCheck %s
+
+// CHECK: has_noinline_keyword
+#if __has_extension(cuda_noinline_keyword)
+int has_noinline_keyword();
+#else
+int no_noinine_keyword();
+#endif
Index: clang/test/CodeGenCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/noinline.cu
@@ -0,0 +1,30 @@
+// optimization is needed, otherwise by default all functions have noinline.
+
+// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
+// RUN: -O2 -emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device \
+// RUN: -O2 -emit-llvm -o - -x hip %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-gnu-linux \
+// RUN: -O2 -emit-llvm -o - %s | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+__noinline__ __device__ __host__ void fun1() {}
+
+__attribute__((noinline)) __device__ __host__ void fun2() {}
+
+__attribute__((__noinline__)) __device__ __host__ void fun3() {}
+
+[[gnu::__noinline__]] __device__ __host__ void fun4() {}
+
+__device__ __host__ void fun5() {}
+
+// CHECK: define{{.*}}@_Z4fun1v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun2v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun3v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun4v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun5v{{.*}}#[[ATTR2:[0-9]*]]
+// CHECK: attributes #[[ATTR1]] = {{.*}}noinline
+// CHECK-NOT: attributes #[[ATTR2]] = {{.*}}noinline
Index: clang/lib/Parse/ParseDecl.cpp
===
--- clang/lib/Parse/ParseDecl.cpp
+++ clang/lib/Parse/ParseDecl.cpp
@@ -897,6 +897,18 @@
   }
 }
 
+void Parser::ParseCUDAFunctionAttributes(ParsedAttributes ) {
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();
+attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
+ ParsedAttr::AS_Keyword);
+if (getLangOpts().CUDA) {
+  Diag(Tok, diag::ext_cuda_noinline_keyword);
+}
+  }
+}
+
 void Parser::ParseOpenCLQualifiers(ParsedAttributes ) {
   IdentifierInfo *AttrName = Tok.getIdentifierInfo();
   SourceLocation AttrNameLoc = Tok.getLocation();
@@ -3690,6 +3702,11 @@
   ParseOpenCLKernelAttributes(DS.getAttributes());
   continue;
 
+// CUDA/HIP single token adornments.
+case tok::kw___noinline__:
+  ParseCUDAFunctionAttributes(DS.getAttributes());
+  continue;
+
 // Nullability type specifiers.
 case tok::kw__Nonnull:
 case tok::kw__Nullable:
Index: clang/lib/Basic/IdentifierTable.cpp
===
--- clang/lib/Basic/IdentifierTable.cpp
+++ clang/lib/Basic/IdentifierTable.cpp
@@ -108,6 +108,7 @@
 KEYOPENCLCXX  = 0x40,
 KEYMSCOMPAT   = 0x80,
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
 KEYALL = (0x1ff & ~KEYNOMS18 &
   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
@@ -158,6 +159,8 @@
 return KS_Future;
   if (LangOpts.isSYCL() && (Flags & KEYSYCL))
 return KS_Enabled;
+  if (LangOpts.CUDA && (Flags & KEYCUDA))
+return KS_Enabled;
   return KS_Disabled;
 }
 
Index: clang/include/clang/Parse/Parser.h
===
--- 

[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked 4 inline comments as done.
yaxunl added inline comments.



Comment at: clang/include/clang/Basic/Attr.td:1778-1779
 def NoInline : DeclOrStmtAttr {
-  let Spellings = [GCC<"noinline">, CXX11<"clang", "noinline">,
+  let Spellings = [Keyword<"__noinline__">, GCC<"noinline">, CXX11<"clang", 
"noinline">,
C2x<"clang", "noinline">, Declspec<"noinline">];
   let Accessors = [Accessor<"isClangNoInline", [CXX11<"clang", "noinline">,

aaron.ballman wrote:
> 
will do



Comment at: clang/include/clang/Basic/Features.def:274
+// CUDA/HIP Features
+FEATURE(cuda_noinline_keyword, true)
+

aaron.ballman wrote:
> Do the CUDA or HIP specs define `__noinline__` as a keyword specifically? If 
> not, this isn't a `FEATURE`, it's an `EXTENSION` because it's specific to 
> Clang, not the language standard.
CUDA/HIP do not have language spec. In their programming guide, they do not 
define `__noinline__` as a keyword.

Will make it an extension.



Comment at: clang/lib/Parse/ParseDecl.cpp:902
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();

aaron.ballman wrote:
> I think we should we be issuing a pedantic "this is a clang extension" 
> warning here, WDYT?
will do



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

aaron.ballman wrote:
> I think there should also be a test like:
> ```
> [[gnu::__noinline__]] void fun4() {}
> ```
> to verify that the double square bracket syntax also correctly handles this 
> being a keyword now (I expect the test to pass).
will do


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added a reviewer: rsmith.
tra added a subscriber: rsmith.
tra added a comment.

> I don't know how language extensions come about in CUDA or HIP -- is there an 
> appropriate standards body (or something similar) that's aware of this 
> extension and supports it?

Summoning @rsmith for his language lawyer expertise.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-06 Thread Aaron Ballman via Phabricator via cfe-commits
aaron.ballman added a comment.

I don't know how language extensions come about in CUDA or HIP -- is there an 
appropriate standards body (or something similar) that's aware of this 
extension and supports it?

The changes should likely come with a release note entry about the new 
functionality, and some documentation changes as well.




Comment at: clang/include/clang/Basic/Attr.td:1778-1779
 def NoInline : DeclOrStmtAttr {
-  let Spellings = [GCC<"noinline">, CXX11<"clang", "noinline">,
+  let Spellings = [Keyword<"__noinline__">, GCC<"noinline">, CXX11<"clang", 
"noinline">,
C2x<"clang", "noinline">, Declspec<"noinline">];
   let Accessors = [Accessor<"isClangNoInline", [CXX11<"clang", "noinline">,





Comment at: clang/include/clang/Basic/Features.def:274
+// CUDA/HIP Features
+FEATURE(cuda_noinline_keyword, true)
+

Do the CUDA or HIP specs define `__noinline__` as a keyword specifically? If 
not, this isn't a `FEATURE`, it's an `EXTENSION` because it's specific to 
Clang, not the language standard.



Comment at: clang/lib/Parse/ParseDecl.cpp:902
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();

I think we should we be issuing a pedantic "this is a clang extension" warning 
here, WDYT?



Comment at: clang/test/SemaCUDA/noinline.cu:8
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }

I think there should also be a test like:
```
[[gnu::__noinline__]] void fun4() {}
```
to verify that the double square bracket syntax also correctly handles this 
being a keyword now (I expect the test to pass).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl updated this revision to Diff 426794.
yaxunl added a comment.

add feature cuda_noinline_keyword to facilitate CUDA/HIP headers removing 
__noinline__ macro


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124866/new/

https://reviews.llvm.org/D124866

Files:
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/Features.def
  clang/include/clang/Basic/TokenKinds.def
  clang/include/clang/Parse/Parser.h
  clang/lib/Basic/IdentifierTable.cpp
  clang/lib/Parse/ParseDecl.cpp
  clang/test/CodeGenCUDA/noinline.cu
  clang/test/Lexer/has_feature.cu
  clang/test/SemaCUDA/noinline.cu

Index: clang/test/SemaCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/noinline.cu
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cpp -x c++ %s
+
+// cuda-no-diagnostics
+
+__noinline__ void fun1() { } // cpp-error {{unknown type name '__noinline__'}}
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }
Index: clang/test/Lexer/has_feature.cu
===
--- /dev/null
+++ clang/test/Lexer/has_feature.cu
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -E -triple x86_64-linux-gnu %s -o - | FileCheck %s
+
+// CHECK: has_noinline_keyword
+#if __has_feature(cuda_noinline_keyword)
+int has_noinline_keyword();
+#else
+int no_noinine_keyword();
+#endif
Index: clang/test/CodeGenCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/noinline.cu
@@ -0,0 +1,27 @@
+// optimization is needed, otherwise by default all functions have noinline.
+
+// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
+// RUN: -O2 -emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device \
+// RUN: -O2 -emit-llvm -o - -x hip %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-gnu-linux \
+// RUN: -O2 -emit-llvm -o - %s | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+__noinline__ __device__ __host__ void fun1() {}
+
+__attribute__((noinline)) __device__ __host__ void fun2() {}
+
+__attribute__((__noinline__)) __device__ __host__ void fun3() {}
+
+__device__ __host__ void fun4() {}
+
+// CHECK: define{{.*}}@_Z4fun1v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun2v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun3v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun4v{{.*}}#[[ATTR2:[0-9]*]]
+// CHECK: attributes #[[ATTR1]] = {{.*}}noinline
+// CHECK-NOT: attributes #[[ATTR2]] = {{.*}}noinline
Index: clang/lib/Parse/ParseDecl.cpp
===
--- clang/lib/Parse/ParseDecl.cpp
+++ clang/lib/Parse/ParseDecl.cpp
@@ -897,6 +897,15 @@
   }
 }
 
+void Parser::ParseCUDAFunctionAttributes(ParsedAttributes ) {
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();
+attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
+ ParsedAttr::AS_Keyword);
+  }
+}
+
 void Parser::ParseOpenCLQualifiers(ParsedAttributes ) {
   IdentifierInfo *AttrName = Tok.getIdentifierInfo();
   SourceLocation AttrNameLoc = Tok.getLocation();
@@ -3690,6 +3699,11 @@
   ParseOpenCLKernelAttributes(DS.getAttributes());
   continue;
 
+// CUDA/HIP single token adornments.
+case tok::kw___noinline__:
+  ParseCUDAFunctionAttributes(DS.getAttributes());
+  continue;
+
 // Nullability type specifiers.
 case tok::kw__Nonnull:
 case tok::kw__Nullable:
Index: clang/lib/Basic/IdentifierTable.cpp
===
--- clang/lib/Basic/IdentifierTable.cpp
+++ clang/lib/Basic/IdentifierTable.cpp
@@ -108,6 +108,7 @@
 KEYOPENCLCXX  = 0x40,
 KEYMSCOMPAT   = 0x80,
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
 KEYALL = (0x1ff & ~KEYNOMS18 &
   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
@@ -158,6 +159,8 @@
 return KS_Future;
   if (LangOpts.isSYCL() && (Flags & KEYSYCL))
 return KS_Enabled;
+  if (LangOpts.CUDA && (Flags & KEYCUDA))
+return KS_Enabled;
   return KS_Disabled;
 }
 
Index: clang/include/clang/Parse/Parser.h
===
--- clang/include/clang/Parse/Parser.h
+++ clang/include/clang/Parse/Parser.h
@@ -2824,6 +2824,7 @@
   void ParseOpenCLKernelAttributes(ParsedAttributes );
   void ParseOpenCLQualifiers(ParsedAttributes );
   void ParseNullabilityTypeSpecifiers(ParsedAttributes );
+  void ParseCUDAFunctionAttributes(ParsedAttributes );
 
   VersionTuple ParseVersionTuple(SourceRange );
   void ParseAvailabilityAttribute(IdentifierInfo ,
Index: 

[PATCH] D124866: [CUDA][HIP] support __noinline__ as keyword

2022-05-03 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision.
yaxunl added a reviewer: tra.
Herald added a reviewer: aaron.ballman.
Herald added subscribers: mattd, carlosgalvezp, dexonsmith.
Herald added a project: All.
yaxunl requested review of this revision.

CUDA/HIP programs use `__noinline__` like a keyword e.g.
`__noinline__ void foo() {}` since `__noinline__` is defined
as a macro `__attribute__((noinline))` in CUDA/HIP runtime
header files.

However, gcc and clang supports `__attribute__((__noinline__))`
the same as `__attribute__((noinline))`. Some C++ libraries
use `__attribute__((__noinline__))` in their header files.
When CUDA/HIP programs include such header files,
clang will emit error about invalid attributes.

This patch fixes this issue by supporting `__noinline__` as
a keyword, so that CUDA/HIP runtime could remove
the macro definition.


https://reviews.llvm.org/D124866

Files:
  clang/include/clang/Basic/Attr.td
  clang/include/clang/Basic/TokenKinds.def
  clang/include/clang/Parse/Parser.h
  clang/lib/Basic/IdentifierTable.cpp
  clang/lib/Parse/ParseDecl.cpp
  clang/test/CodeGenCUDA/noinline.cu
  clang/test/SemaCUDA/noinline.cu

Index: clang/test/SemaCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/SemaCUDA/noinline.cu
@@ -0,0 +1,8 @@
+// RUN: %clang_cc1 -fsyntax-only -verify=cuda %s
+// RUN: %clang_cc1 -fsyntax-only -verify=cpp -x c++ %s
+
+// cuda-no-diagnostics
+
+__noinline__ void fun1() { } // cpp-error {{unknown type name '__noinline__'}}
+__attribute__((noinline)) void fun2() { }
+__attribute__((__noinline__)) void fun3() { }
Index: clang/test/CodeGenCUDA/noinline.cu
===
--- /dev/null
+++ clang/test/CodeGenCUDA/noinline.cu
@@ -0,0 +1,27 @@
+// optimization is needed, otherwise by default all functions have noinline.
+
+// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
+// RUN: -O2 -emit-llvm -o - %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device \
+// RUN: -O2 -emit-llvm -o - -x hip %s | FileCheck %s
+
+// RUN: %clang_cc1 -triple x86_64-unknown-gnu-linux \
+// RUN: -O2 -emit-llvm -o - %s | FileCheck %s
+
+#include "Inputs/cuda.h"
+
+__noinline__ __device__ __host__ void fun1() {}
+
+__attribute__((noinline)) __device__ __host__ void fun2() {}
+
+__attribute__((__noinline__)) __device__ __host__ void fun3() {}
+
+__device__ __host__ void fun4() {}
+
+// CHECK: define{{.*}}@_Z4fun1v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun2v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun3v{{.*}}#[[ATTR1:[0-9]*]]
+// CHECK: define{{.*}}@_Z4fun4v{{.*}}#[[ATTR2:[0-9]*]]
+// CHECK: attributes #[[ATTR1]] = {{.*}}noinline
+// CHECK-NOT: attributes #[[ATTR2]] = {{.*}}noinline
Index: clang/lib/Parse/ParseDecl.cpp
===
--- clang/lib/Parse/ParseDecl.cpp
+++ clang/lib/Parse/ParseDecl.cpp
@@ -897,6 +897,15 @@
   }
 }
 
+void Parser::ParseCUDAFunctionAttributes(ParsedAttributes ) {
+  while (Tok.is(tok::kw___noinline__)) {
+IdentifierInfo *AttrName = Tok.getIdentifierInfo();
+SourceLocation AttrNameLoc = ConsumeToken();
+attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
+ ParsedAttr::AS_Keyword);
+  }
+}
+
 void Parser::ParseOpenCLQualifiers(ParsedAttributes ) {
   IdentifierInfo *AttrName = Tok.getIdentifierInfo();
   SourceLocation AttrNameLoc = Tok.getLocation();
@@ -3690,6 +3699,11 @@
   ParseOpenCLKernelAttributes(DS.getAttributes());
   continue;
 
+// CUDA/HIP single token adornments.
+case tok::kw___noinline__:
+  ParseCUDAFunctionAttributes(DS.getAttributes());
+  continue;
+
 // Nullability type specifiers.
 case tok::kw__Nonnull:
 case tok::kw__Nullable:
Index: clang/lib/Basic/IdentifierTable.cpp
===
--- clang/lib/Basic/IdentifierTable.cpp
+++ clang/lib/Basic/IdentifierTable.cpp
@@ -108,6 +108,7 @@
 KEYOPENCLCXX  = 0x40,
 KEYMSCOMPAT   = 0x80,
 KEYSYCL   = 0x100,
+KEYCUDA   = 0x200,
 KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,
 KEYALL = (0x1ff & ~KEYNOMS18 &
   ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
@@ -158,6 +159,8 @@
 return KS_Future;
   if (LangOpts.isSYCL() && (Flags & KEYSYCL))
 return KS_Enabled;
+  if (LangOpts.CUDA && (Flags & KEYCUDA))
+return KS_Enabled;
   return KS_Disabled;
 }
 
Index: clang/include/clang/Parse/Parser.h
===
--- clang/include/clang/Parse/Parser.h
+++ clang/include/clang/Parse/Parser.h
@@ -2824,6 +2824,7 @@
   void ParseOpenCLKernelAttributes(ParsedAttributes );
   void ParseOpenCLQualifiers(ParsedAttributes );
   void ParseNullabilityTypeSpecifiers(ParsedAttributes );
+  void ParseCUDAFunctionAttributes(ParsedAttributes );