[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-20 Thread Kazushi Marukawa via Phabricator via cfe-commits
kaz7 added a comment.

In D152054#4648318, @sandeepkosuri wrote:

> @kaz7, it seems that the thread_limit is being set properly, but the 
> `omp_get_thread_limit()` is giving a wrong output when you enable anything 
> more than `-O1`. I will fix it as soon as I can. Meanwhile, if you absolutely 
> want the test case to work right now, remove the printf causing the issue or 
> do not run that test case with a higher optimization level than `-O1`.

Thank you for investigating that.  I'm fine with the `-O0` result.  I just came 
across this issue and wanted to report it.  Thank you for your efforts!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-19 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added a comment.

@kaz7, it seems that the thread_limit is being set properly, but the 
`omp_get_thread_limit()` is giving a wrong output when you enable anything more 
than `-O1`. I will fix it as soon as I can. Meanwhile, if you absolutely want 
the test case to work right now, remove the printf causing the issue or do not 
run that test case with a higher optimization level than `-O1`.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-11 Thread Kazushi Marukawa via Phabricator via cfe-commits
kaz7 added a comment.

I figured out the reason for the strange behavior I mentioned last night.  
However, I'm still not sure what a good way to solve this problem would be.  
Can someone check the behavior with the `-O2` option, please?  Thanks!




Comment at: openmp/runtime/test/target/target_thread_limit.cpp:73
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel

I could not post inline comments last night, but now I can, so I'm writing my 
comments here.

The problem I mentioned last night is caused by optimization.  The test works 
at the default optimization level, but it fails with `-O2`, even on x86_64.  I 
use `-O2` for tests, which is how I came across this problem.  The command 
lines to reproduce it are below.  As you can see, there is a warning message 
at `-O2`, so this behavior may be expected.  I used f8efa65 to produce the 
following results.

```
$ cd build
$ ninja
...
$ ./bin/clang++ -fopenmp \
    -I runtimes/runtimes-x86_64-unknown-linux-gnu-bins/openmp/runtime/src \
    -I ../llvm-project/openmp/runtime/test \
    -L runtimes/runtimes-x86_64-unknown-linux-gnu-bins/openmp/runtime/src \
    -fno-omit-frame-pointer -I ../llvm-project/openmp/runtime/test/ompt -std=c++17 \
    ../llvm-project/openmp/runtime/test/target/target_thread_limit.cpp -o ok -lm \
    -latomic -fopenmp-version=51
$ LD_LIBRARY_PATH=runtimes/runtimes-x86_64-unknown-linux-gnu-bins/openmp/runtime/src \
    ./ok | grep "second target: thread_limit"
second target: thread_limit = 3
$ ./bin/clang++ -fopenmp \
    -I runtimes/runtimes-x86_64-unknown-linux-gnu-bins/openmp/runtime/src \
    -I ../llvm-project/openmp/runtime/test \
    -L runtimes/runtimes-x86_64-unknown-linux-gnu-bins/openmp/runtime/src \
    -fno-omit-frame-pointer -I ../llvm-project/openmp/runtime/test/ompt -std=c++17 \
    ../llvm-project/openmp/runtime/test/target/target_thread_limit.cpp -o bad -lm \
    -latomic -fopenmp-version=51 -O2
warning: loop not vectorized: the optimizer was unable to perform the requested 
transformation; the transformation might be disabled or specified as part of an 
unsupported transformation ordering [-Wpass-failed=transform-warning]
1 warning generated.
$ LD_LIBRARY_PATH=runtimes/runtimes-x86_64-unknown-linux-gnu-bins/openmp/runtime/src \
    ./bad | grep "second target: thread_limit"
second target: thread_limit = 2147483647
```




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-10 Thread Kazushi Marukawa via Phabricator via cfe-commits
kaz7 added a comment.

In D152054#4642793, @tianshilei1992 wrote:

> In D152054#4642725, @kaz7 wrote:
>
>> When I run check-openmp on our machine, this omp_get_thread_limit() returns 
>> the default thread limit, 2147483647 (= 0x7fffffff).
>
> Something is wrong there, because this patch is about the host rather than 
> offloading.

Thank you for your comment.  I will try to investigate this.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-10 Thread Shilei Tian via Phabricator via cfe-commits
tianshilei1992 added a comment.

In D152054#4642725, @kaz7 wrote:

> When I run check-openmp on our machine, this omp_get_thread_limit() returns 
> the default thread limit, 2147483647 (= 0x7fffffff).

Something is wrong there, because this patch is about the host rather than 
offloading.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-10 Thread Kazushi Marukawa via Phabricator via cfe-commits
kaz7 added a comment.

I cannot post an inline comment, so I'm leaving a message here.

openmp/runtime/test/target/target_thread_limit.cpp:

> // checking consecutive target regions with different thread_limits
> #pragma omp target thread_limit(3)
>   {
> printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
> // OMP51: second target: thread_limit = 3

Our VE architecture supports only the OpenMP runtime; it doesn't support 
libomptarget.  When I run check-openmp on our machine, this 
omp_get_thread_limit() returns the default thread limit, 2147483647 
(= 0x7fffffff).  I guess that is OK, because this pragma specifies only target 
and VE doesn't support target.  The other pragmas contain not only target but 
also other keywords such as parallel, so I guess those run correctly.  I'm not 
familiar with omptarget, though, so my assumptions here may be wrong.
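
For reference, the value reported here is simply the largest 32-bit signed 
integer.  The sketch below checks that arithmetic at compile time.  It assumes 
a 32-bit `int`, and `is_unrestricted_limit` is a hypothetical helper for 
illustration, not a libomp API.

```cpp
#include <cassert>
#include <climits>

// Assumption for this sketch: int is 32 bits wide, as on common platforms.
static_assert(sizeof(int) * CHAR_BIT == 32, "sketch assumes a 32-bit int");
// 2^31 - 1 written in decimal and in hexadecimal (a 7 followed by seven f's).
static_assert(INT_MAX == 2147483647, "decimal form of INT_MAX");
static_assert(INT_MAX == 0x7fffffff, "hexadecimal form of INT_MAX");

// Hypothetical helper (not a libomp API): true when a value reported by
// omp_get_thread_limit() equals INT_MAX, which suggests that no smaller
// limit took effect.
bool is_unrestricted_limit(int reported) {
  assert(reported > 0 && "thread limits are positive");
  return reported == INT_MAX;
}
```

With this helper, a reported value of 3 would count as a limit that took 
effect, while 2147483647 would not.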

My question is: what is the best way to correct the behavior of this test?  I 
would appreciate any comments or suggestions.  Thank you!




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-07 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added inline comments.



Comment at: openmp/runtime/test/target/target_thread_limit.cpp:28
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel

sandeepkosuri wrote:
> mstorsjo wrote:
> > mstorsjo wrote:
> > > This test fails when running (on Windows) on GitHub Actions runners - see 
> > > https://github.com/mstorsjo/llvm-mingw/actions/runs/6019088705/job/16342540379.
> > > 
> > > I believe that this bit of the test has got a hidden assumption that it 
> > > is running in an environment with 4 or more cores. By setting `#pragma 
> > > omp target thread_limit(tl)` (with `tl=4`) and running a line in parallel 
> > > with `#pragma omp parallel`, it expects that we'll get 4 printouts - 
> > > while in practice, we'll get anywhere between 1 and 4 printouts depending 
> > > on the number of cores.
> > > 
> > > Is there something that can be done to make this test work in such an 
> > > environment too?
> > Can someone involved in this patch take on fixing it so that it works on 
> > machines with fewer than 4 cores? I'm not sure what's the most appropriate 
> > path forward here, as it breaks clearly in such configs (even if it might 
> > not be hit by one of the official llvm buildbots, but it shows up as 
> > breakage in my nightly builds every day now) - reverting seems a bit harsh. 
> > I guess I could just rip out this part of the test?
> @mstorsjo, I noticed that you have committed this 
> https://github.com/llvm/llvm-project/commit/c2019c416c8d7ec50aec6ac6b82c9aa4e99b0f6f
> 
> Does this solve your problem?
Yes, that commit fixed the issue.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-09-07 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added inline comments.



Comment at: openmp/runtime/test/target/target_thread_limit.cpp:28
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel

mstorsjo wrote:
> mstorsjo wrote:
> > This test fails when running (on Windows) on GitHub Actions runners - see 
> > https://github.com/mstorsjo/llvm-mingw/actions/runs/6019088705/job/16342540379.
> > 
> > I believe that this bit of the test has got a hidden assumption that it is 
> > running in an environment with 4 or more cores. By setting `#pragma omp 
> > target thread_limit(tl)` (with `tl=4`) and running a line in parallel with 
> > `#pragma omp parallel`, it expects that we'll get 4 printouts - while in 
> > practice, we'll get anywhere between 1 and 4 printouts depending on the 
> > number of cores.
> > 
> > Is there something that can be done to make this test work in such an 
> > environment too?
> Can someone involved in this patch take on fixing it so that it works on 
> machines with fewer than 4 cores? I'm not sure what's the most appropriate 
> path forward here, as it breaks clearly in such configs (even if it might not 
> be hit by one of the official llvm buildbots, but it shows up as breakage in 
> my nightly builds every day now) - reverting seems a bit harsh. I guess I 
> could just rip out this part of the test?
@mstorsjo, I noticed that you have committed this 
https://github.com/llvm/llvm-project/commit/c2019c416c8d7ec50aec6ac6b82c9aa4e99b0f6f

Does this solve your problem?




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-31 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added inline comments.



Comment at: openmp/runtime/test/target/target_thread_limit.cpp:28
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel

mstorsjo wrote:
> This test fails when running (on Windows) on GitHub Actions runners - see 
> https://github.com/mstorsjo/llvm-mingw/actions/runs/6019088705/job/16342540379.
> 
> I believe that this bit of the test has got a hidden assumption that it is 
> running in an environment with 4 or more cores. By setting `#pragma omp 
> target thread_limit(tl)` (with `tl=4`) and running a line in parallel with 
> `#pragma omp parallel`, it expects that we'll get 4 printouts - while in 
> practice, we'll get anywhere between 1 and 4 printouts depending on the 
> number of cores.
> 
> Is there something that can be done to make this test work in such an 
> environment too?
Can someone involved in this patch take on fixing it so that it works on 
machines with fewer than 4 cores? I'm not sure what's the most appropriate path 
forward here, as it breaks clearly in such configs (even if it might not be hit 
by one of the official llvm buildbots, but it shows up as breakage in my 
nightly builds every day now) - reverting seems a bit harsh. I guess I could 
just rip out this part of the test?




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-30 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added inline comments.



Comment at: openmp/runtime/test/target/target_thread_limit.cpp:28
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel

This test fails when running (on Windows) on GitHub Actions runners - see 
https://github.com/mstorsjo/llvm-mingw/actions/runs/6019088705/job/16342540379.

I believe that this bit of the test has got a hidden assumption that it is 
running in an environment with 4 or more cores. By setting `#pragma omp target 
thread_limit(tl)` (with `tl=4`) and running a line in parallel with `#pragma 
omp parallel`, it expects that we'll get 4 printouts - while in practice, we'll 
get anywhere between 1 and 4 printouts depending on the number of cores.

Is there something that can be done to make this test work in such an 
environment too?
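
One way to avoid the core-count assumption is to treat the requested team size 
as an upper bound rather than an exact count.  The following is only a sketch 
of that idea, not the fix that was eventually committed.  It uses 
`num_threads` as a stand-in for `target thread_limit` so that it builds without 
OpenMP 5.1 support, and `count_team_members` and `team_size_is_within_bound` 
are hypothetical helper names.  When compiled without `-fopenmp`, the pragmas 
are ignored and the region runs on a single thread.

```cpp
#include <cassert>
#include <cstdio>

// Hypothetical helper (not part of the patch): runs a parallel region that
// requests up to `requested` threads and returns how many threads actually
// executed the region body.
int count_team_members(int requested) {
  assert(requested > 0 && "num_threads needs a positive value");
  int count = 0;
#pragma omp parallel num_threads(requested)
  {
    // Each thread in the team increments the shared counter exactly once.
#pragma omp atomic
    count++;
  }
  return count;
}

// The runtime may provide fewer threads than requested, so a portable check
// accepts any team size between 1 and the requested value.
bool team_size_is_within_bound(int requested) {
  int actual = count_team_members(requested);
  std::printf("requested up to %d threads, got %d\n", requested, actual);
  return actual >= 1 && actual <= requested;
}
```

A test written this way would call `team_size_is_within_bound(4)` and print a 
single pass/fail line for FileCheck to match, instead of matching one line per 
thread.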




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-28 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added a comment.

In D152054#4622642, @vadikp-intel wrote:

> Windows importing is now done by name, and new exports do not need to have an 
> ordinal specified for them, i.e. you can add a line with just the API name to 
> dllexports.

Oh, right, thanks. Will do.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-28 Thread Vadim Paretsky via Phabricator via cfe-commits
vadikp-intel added a comment.

Windows importing is now done by name, and new exports do not need to have an 
ordinal specified for them, i.e. you can add a line with just the API name to 
dllexports.




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-28 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added subscribers: vadikp-intel, natgla.
mstorsjo added a comment.

In D152054#4620927, @mstorsjo wrote:

> This new test is failing on Windows, due to `__kmpc_set_thread_limit` not 
> being exported - see e.g. 
> https://github.com/mstorsjo/llvm-mingw/actions/runs/5994183421/job/16264501555.
> Can someone add it to `openmp/runtime/src/dllexports`?

CC @vadikp-intel @natgla about this. What's the procedure for allocating new 
ordinal numbers for new exported functions here?




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-28 Thread Martin Storsjö via Phabricator via cfe-commits
mstorsjo added a comment.

This new test is failing on Windows, due to `__kmpc_set_thread_limit` not being 
exported - see e.g. 
https://github.com/mstorsjo/llvm-mingw/actions/runs/5994183421/job/16264501555. 
Can someone add it to `openmp/runtime/src/dllexports`?




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-26 Thread Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG08bbff4aad57: [OpenMP] Codegen support for thread_limit on 
target directive for host (authored by sandeepkosuri, committed by Sandeep 
Kosuri kosu...@pe28vega.us.cray.com).

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-25 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev accepted this revision.
ABataev added a comment.

LG




[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-25 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 553487.
sandeepkosuri added a comment.

Made the new LIT test cases target-specific to Linux.


Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51-NOT: main: parallel num_threads(10)
+
+// check combined target directives which 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-25 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 553405.
sandeepkosuri added a comment.

Made the LIT test cases more robust against check-line ordering problems.


Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51-NOT: main: parallel num_threads(10)
+
+// check combined target 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 553358.
sandeepkosuri added a comment.

Used `CHECK-DAG` directives to avoid LIT test failures on Windows systems.
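
For readers unfamiliar with the difference, here is a rough, line-granular model of why a `CHECK-DAG` group tolerates reordered output while plain `CHECK` lines do not. It is only a sketch: the function names `matchesInOrder` and `matchesAnyOrder` are hypothetical, matching is by plain substring rather than FileCheck patterns, and the greedy any-order matcher is simpler than what FileCheck actually does.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Simplified model of consecutive plain CHECK lines: each pattern must be
// found on a line that comes after the line matched by the previous pattern.
bool matchesInOrder(const std::vector<std::string> &output,
                    const std::vector<std::string> &patterns) {
  std::size_t line = 0;
  for (const std::string &pattern : patterns) {
    while (line < output.size() &&
           output[line].find(pattern) == std::string::npos)
      ++line;
    if (line == output.size())
      return false; // ran out of output before finding this pattern
    ++line; // the next pattern has to match a later line
  }
  return true;
}

// Simplified model of one CHECK-DAG group: each pattern must be found on
// some line not already claimed by another pattern, in any order.
bool matchesAnyOrder(const std::vector<std::string> &output,
                     const std::vector<std::string> &patterns) {
  std::vector<bool> used(output.size(), false);
  for (const std::string &pattern : patterns) {
    bool found = false;
    for (std::size_t i = 0; i < output.size() && !found; ++i) {
      if (!used[i] && output[i].find(pattern) != std::string::npos) {
        used[i] = true;
        found = true;
      }
    }
    if (!found)
      return false;
  }
  return true;
}
```

Under this model, two allocas emitted in swapped order fail the in-order match but pass the any-order match, which is the kind of difference the change above is meant to absorb.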


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51-NOT: main: parallel num_threads(10)
+
+// check combined target 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev accepted this revision.
ABataev added a comment.

LG


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 553139.
sandeepkosuri added a comment.

Edited the LIT test cases to use more script-generated check lines.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp


[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp:30
+// OMP51-NEXT:  entry:
+// OMP51-NEXT:[[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
+// OMP51-NEXT:[[DOTPART_ID__ADDR_I:%.*]] = alloca ptr, align 8

sandeepkosuri wrote:
> ABataev wrote:
> > sandeepkosuri wrote:
> > > ABataev wrote:
> > > > Why removed these checks?
> > > I did not remove any check lines in this function.
> > > But I removed checks in `omp_task_entry` function that were not related 
> > > to my changes, to avoid failures. I only wanted to check whether 
> > > `__kmpc_set_thread_limit()` is called.
> > > 
> > > Same for all the other test cases.
> > Better to restore it to be able to use the script in future without many 
> > changes
> But a few check lines are failing on windows, while passing on debian.
This must be investigated; it is a bad idea to just remove these checks. Most 
probably it is related to the order in which the expressions are emitted.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri marked an inline comment as done.
sandeepkosuri added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp:30
+// OMP51-NEXT:  entry:
+// OMP51-NEXT:[[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
+// OMP51-NEXT:[[DOTPART_ID__ADDR_I:%.*]] = alloca ptr, align 8

ABataev wrote:
> sandeepkosuri wrote:
> > ABataev wrote:
> > > Why removed these checks?
> > I did not remove any check lines in this function.
> > But I removed checks in `omp_task_entry` function that were not related to 
> > my changes, to avoid failures. I only wanted to check whether 
> > `__kmpc_set_thread_limit()` is called.
> > 
> > Same for all the other test cases.
> Better to restore it to be able to use the script in future without many 
> changes
But a few check lines are failing on Windows, while passing on Debian.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-24 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp:30
+// OMP51-NEXT:  entry:
+// OMP51-NEXT:[[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
+// OMP51-NEXT:[[DOTPART_ID__ADDR_I:%.*]] = alloca ptr, align 8

sandeepkosuri wrote:
> ABataev wrote:
> > Why removed these checks?
> I did not remove any check lines in this function.
> But I removed checks in `omp_task_entry` function that were not related to my 
> changes, to avoid failures. I only wanted to check whether 
> `__kmpc_set_thread_limit()` is called.
> 
> Same for all the other test cases.
Better to restore them, so that the script can be used in the future without many changes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-23 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri marked an inline comment as done.
sandeepkosuri added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp:30
+// OMP51-NEXT:  entry:
+// OMP51-NEXT:[[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
+// OMP51-NEXT:[[DOTPART_ID__ADDR_I:%.*]] = alloca ptr, align 8

ABataev wrote:
> Why removed these checks?
I did not remove any check lines in this function.
But I removed checks in the `omp_task_entry` function that were not related to my 
changes, to avoid failures. I only wanted to check whether 
`__kmpc_set_thread_limit()` is called.

Same for all the other test cases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-23 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp:30
+// OMP51-NEXT:  entry:
+// OMP51-NEXT:[[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
+// OMP51-NEXT:[[DOTPART_ID__ADDR_I:%.*]] = alloca ptr, align 8

Why were these checks removed?



Comment at: clang/test/OpenMP/target_parallel_for_tl_codegen.cpp:30
+// OMP51-NEXT:  entry:
+// OMP51-NEXT:[[DOTGLOBAL_TID__ADDR_I:%.*]] = alloca i32, align 4
+// OMP51-NEXT:[[DOTPART_ID__ADDR_I:%.*]] = alloca ptr, align 8

Same here: why were these checks removed?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-23 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 552725.
sandeepkosuri added a comment.

Added PCH options to the RUN lines in LIT tests


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp


[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-09 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added inline comments.



Comment at: clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp:4
+
+// RUN: %clang_cc1 -fopenmp -fopenmp-version=51 -emit-llvm %s -o - | FileCheck --check-prefix=OMP51 %s
+

Add PCH serialization/deserialization checks in your tests.
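
For context, a PCH round-trip in a clang codegen test usually means emitting the file as a precompiled header and then compiling it again with that header included. The following is a minimal sketch of that shape, not the test from this patch: the function `thread_limit_target`, the triple, and the loose `OMP51` pattern are illustrative assumptions, and the only thing taken from the thread is that a call to `__kmpc_set_thread_limit` is expected.

```cpp
// RUN: %clang_cc1 -verify -fopenmp -fopenmp-version=51 -x c++ -std=c++11 -triple x86_64-unknown-unknown -emit-llvm %s -o - | FileCheck --check-prefix=OMP51 %s
// RUN: %clang_cc1 -fopenmp -fopenmp-version=51 -x c++ -std=c++11 -triple x86_64-unknown-unknown -emit-pch -o %t %s
// RUN: %clang_cc1 -fopenmp -fopenmp-version=51 -x c++ -std=c++11 -triple x86_64-unknown-unknown -include-pch %t -verify %s -emit-llvm -o - | FileCheck --check-prefix=OMP51 %s
// expected-no-diagnostics

// The include guard lets the same file act as both the precompiled header
// (second RUN line) and the translation unit that includes it (third RUN line).
#ifndef HEADER
#define HEADER

// Hypothetical function under test: a target region carrying thread_limit.
int thread_limit_target(int n) {
  int sum = 0;
#pragma omp target thread_limit(2) map(tofrom : sum)
  { sum += n; }
  return sum;
}
// Deliberately loose pattern; the exact call signature is not assumed here.
// OMP51: call {{.*}}__kmpc_set_thread_limit

#endif
```

The second and third RUN lines together exercise serialization and deserialization of the clause, since the AST is written to `%t` and read back before codegen.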


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-09 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 548518.
sandeepkosuri added a comment.

Used the python script `update_cc_test_checks.py` to generate the checks for 
the newly added tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp


[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-08 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added a comment.

Please use the script to generate the checks for the newly added tests.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-08 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 548082.
sandeepkosuri added a comment.

- Updated `SemaOpenMP.cpp` to support `thread_limit` clause on the newly 
allowed directives.

- This update fixes the failures of the newly added LIT tests (which 
occurred only on debug builds).


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/lib/Sema/SemaOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp


[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-07 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev accepted this revision.
ABataev added a comment.
This revision is now accepted and ready to land.

LG


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-06 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added a comment.

In D152054#4560353, @tianshilei1992 wrote:

> Is this patch to support `thread_limit` on `target` directive on the host?

Yes, @tianshilei1992, it is for the host only.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-04 Thread Shilei Tian via Phabricator via cfe-commits
tianshilei1992 added a comment.

Is this patch to support `thread_limit` on `target` directive on the host?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-08-04 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added inline comments.



Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:9866
+  (CGM.getLangOpts().OpenMP >= 51 &&
+   needsTaskBasedThreadLimit(D.getDirectiveKind()) &&
+   D.hasClausesOfKind());

ABataev wrote:
> I think you don't need the needsTaskBasedThreadLimit call here; the 
> emitTargetCall function itself can be called only for target-based directives.
`emitTargetCall()` is called for all target-based directives, including the 
target teams-based directives, which already have a thread limit implementation 
in place. So I need `needsTaskBasedThreadLimit` to select only the applicable 
directives.
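
The selection described above can be pictured with a small standalone sketch. The enum is a hypothetical stand-in for Clang's `OpenMPDirectiveKind`, the function name carries a `Sketch` suffix to mark it as illustrative, and the set of accepted directives is an assumption inferred from the `*_tl_codegen.cpp` test names in this revision rather than a copy of the patch:

```cpp
// Hypothetical stand-in for clang's OpenMPDirectiveKind; only the directive
// kinds needed for this illustration are listed.
enum class DirectiveKind {
  Target,
  TargetParallel,
  TargetParallelFor,
  TargetParallelForSimd,
  TargetParallelLoop,
  TargetSimd,
  TargetTeams,
  TargetTeamsDistribute,
};

// Sketch of a predicate in the spirit of needsTaskBasedThreadLimit():
// returns true for target directives without a teams part (assumed to need
// the outer-task mechanism) and false for target teams-based directives,
// which already have a thread limit implementation.
bool needsTaskBasedThreadLimitSketch(DirectiveKind Kind) {
  switch (Kind) {
  case DirectiveKind::Target:
  case DirectiveKind::TargetParallel:
  case DirectiveKind::TargetParallelFor:
  case DirectiveKind::TargetParallelForSimd:
  case DirectiveKind::TargetParallelLoop:
  case DirectiveKind::TargetSimd:
    return true;
  case DirectiveKind::TargetTeams:
  case DirectiveKind::TargetTeamsDistribute:
    return false;
  }
  return false; // unreachable for the kinds listed above
}
```

With such a predicate, a caller that handles every target-based directive can still skip the task-based path for the teams variants.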



Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:5143
+if (CGF.CGM.getLangOpts().OpenMP >= 51 &&
+needsTaskBasedThreadLimit(S.getDirectiveKind()) && TL) {
+  // Emit __kmpc_set_thread_limit() to set the thread_limit for the task

ABataev wrote:
> The same applies to needsTaskBasedThreadLimit(S.getDirectiveKind()): the function 
> EmitOMPTargetTaskBasedDirective is called only for target-based directives.
The same reasoning applies here.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-07-26 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added inline comments.



Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:9866
+  (CGM.getLangOpts().OpenMP >= 51 &&
+   needsTaskBasedThreadLimit(D.getDirectiveKind()) &&
+   D.hasClausesOfKind());

I think you don't need the needsTaskBasedThreadLimit call here; the emitTargetCall 
function itself can be called only for target-based directives.



Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:5143
+if (CGF.CGM.getLangOpts().OpenMP >= 51 &&
+needsTaskBasedThreadLimit(S.getDirectiveKind()) && TL) {
+  // Emit __kmpc_set_thread_limit() to set the thread_limit for the task

The same applies to needsTaskBasedThreadLimit(S.getDirectiveKind()): the function 
EmitOMPTargetTaskBasedDirective is called only for target-based directives.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-07-25 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 544201.
sandeepkosuri added a comment.

Explicitly mentioned `-fopenmp-version=51` in LIT test cases


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51-NOT: main: parallel num_threads(10)
+
+// check combined target directives which support thread_limit
+// 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-07-25 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 543889.
sandeepkosuri added a comment.

- Added support for `thread_limit` clause on relevant combined directives which 
begin with `target` as per @ABataev 's comments.
- Added additional LIT test cases to check codegen of the `thread_limit` on the 
newly supported directives.
- Updated the runtime LIT as per @jdoerfert 's comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/include/clang/Basic/OpenMPKinds.h
  clang/lib/Basic/OpenMPKinds.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  clang/test/OpenMP/target_parallel_for_simd_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_for_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_generic_loop_tl_codegen.cpp
  clang/test/OpenMP/target_parallel_tl_codegen.cpp
  clang/test/OpenMP/target_simd_tl_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMP.td
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,168 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51-NOT: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+// OMP51-NOT: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51-NOT: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51-NOT: target: foo(): parallel num_threads(10)
+
+// check if user can set num_threads at runtime
+omp_set_num_threads(2);
+#pragma omp parallel
+{ printf("\ntarget: parallel with omp_set_num_thread(2)"); }
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51: target: parallel with omp_set_num_thread(2)
+// OMP51-NOT: target: parallel with omp_set_num_thread(2)
+
+// make sure thread_limit is unaffected by omp_set_num_threads
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51-NOT: second target: parallel
+  }
+
+  // confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-07-01 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added inline comments.



Comment at: openmp/runtime/test/target/target_thread_limit.cpp:28
+// OMP51: target: parallel
+// OMP51: target: parallel
+

This doesn't check much. You need to verify that a 5th line is not printed, or 
print the team size. The same goes for the rest.



Comment at: openmp/runtime/test/target/target_thread_limit.cpp:35
+// OMP51: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting

Verify that the user can use the omp_set_ functions and see consistent behavior.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-06-30 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri added inline comments.



Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:9818
+  D.hasClausesOfKind() ||
+  (CGM.getLangOpts().OpenMP >= 51 && D.getDirectiveKind() == OMPD_target &&
+   D.hasClausesOfKind());

ABataev wrote:
> What if D is a combined target directive, i.e. D.getDirectiveKind() is 
> something like OMPD_target_teams, etc.?
I will fix that, thanks for noticing.



Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:5143-5148
+S.getSingleClause()) {
+  // Emit __kmpc_set_thread_limit() to set the thread_limit for the task
+  // enclosing this target region. This will indirectly set the thread_limit
+  // for every applicable construct within target region.
+  CGF.CGM.getOpenMPRuntime().emitThreadLimitClause(
+  CGF, S.getSingleClause()->getThreadLimit(),

ABataev wrote:
> Avoid the double call of S.getSingleClause(); store the call result in a 
> local variable.
sure.



Comment at: clang/test/OpenMP/target_codegen.cpp:849
 // OMP51: [[CE:%.*]] = load {{.*}} [[CEA]]
-// OMP51: call i32 @__tgt_target_kernel({{.*}}, i64 -1, i32 -1, i32 [[CE]],
+// OMP51: call ptr @__kmpc_omp_task_alloc({{.*@.omp_task_entry.*}})
+// OMP51: call i32 [[OMP_TASK_ENTRY]]

ABataev wrote:
> This requires extra resource consumption; can you avoid creating the outer 
> task, if possible?
I tried different ideas for making `thread_limit` work on `target`.

I tried to reuse the existing implementation by replacing the directive with 
`target teams(1) thread_limit(x)` at the parsing, sema and IR stages, but I 
couldn't successfully implement any of them. So I tried adding `num_threads` to 
all the parallel directives within `target`, but there were corner cases, such 
as parallel directives in a function that is called in the target region, which 
became tedious to handle.

This method seems to encompass the idea of a thread limit on `target` pretty 
well and also works, so I proceeded with this idea.



Comment at: openmp/runtime/src/kmp_ftn_entry.h:809
+return thread_limit;
+  else
+return thread->th.th_current_task->td_icvs.thread_limit;

ABataev wrote:
> No need for else here
oops, I will fix that
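
For illustration, the early-return shape requested here might look like the sketch below. `TaskIcvs` and `effectiveThreadLimit` are hypothetical names, the struct is a simplified stand-in for the runtime's per-task ICV storage, and the condition (a non-zero `task_thread_limit` takes precedence) is an assumption based on the field names visible in this patch, not the actual body of the function in `kmp_ftn_entry.h`:

```cpp
// Hypothetical, simplified stand-in for the per-task ICV storage. The field
// names follow td_icvs.thread_limit and td_icvs.task_thread_limit.
struct TaskIcvs {
  int thread_limit;      // regular thread-limit ICV
  int task_thread_limit; // limit set for the target task; 0 means "not set"
};

// Assumption: a non-zero task-level limit takes precedence over the regular
// thread_limit ICV.
int effectiveThreadLimit(const TaskIcvs &Icvs) {
  if (int Limit = Icvs.task_thread_limit)
    return Limit;
  // No else branch is needed after the early return above.
  return Icvs.thread_limit;
}
```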


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-06-28 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added inline comments.



Comment at: clang/lib/CodeGen/CGOpenMPRuntime.cpp:9818
+  D.hasClausesOfKind() ||
+  (CGM.getLangOpts().OpenMP >= 51 && D.getDirectiveKind() == OMPD_target &&
+   D.hasClausesOfKind());

What if D is a combined target directive, i.e. D.getDirectiveKind() is something 
like OMPD_target_teams, etc.?



Comment at: clang/lib/CodeGen/CGStmtOpenMP.cpp:5143-5148
+S.getSingleClause()) {
+  // Emit __kmpc_set_thread_limit() to set the thread_limit for the task
+  // enclosing this target region. This will indirectly set the thread_limit
+  // for every applicable construct within target region.
+  CGF.CGM.getOpenMPRuntime().emitThreadLimitClause(
+  CGF, S.getSingleClause()->getThreadLimit(),

Avoid the double call of S.getSingleClause(); store the call result in a local 
variable.



Comment at: clang/test/OpenMP/target_codegen.cpp:849
 // OMP51: [[CE:%.*]] = load {{.*}} [[CEA]]
-// OMP51: call i32 @__tgt_target_kernel({{.*}}, i64 -1, i32 -1, i32 [[CE]],
+// OMP51: call ptr @__kmpc_omp_task_alloc({{.*@.omp_task_entry.*}})
+// OMP51: call i32 [[OMP_TASK_ENTRY]]

This requires extra resource consumption; can you avoid creating the outer 
task, if possible?



Comment at: openmp/runtime/src/kmp_ftn_entry.h:809
+return thread_limit;
+  else
+return thread->th.th_current_task->td_icvs.thread_limit;

No need for else here


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054



[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-06-12 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri updated this revision to Diff 530591.
sandeepkosuri added a comment.

Updated `target_codegen.cpp` test case to incorporate my changes


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152054/new/

https://reviews.llvm.org/D152054

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,81 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+  }
+
+// confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+// OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  return 0;
+}
Index: openmp/runtime/src/kmp_runtime.cpp
===
--- openmp/runtime/src/kmp_runtime.cpp
+++ openmp/runtime/src/kmp_runtime.cpp
@@ -1867,6 +1867,7 @@
   int nthreads;
   int master_active;
   int master_set_numthreads;
+  int task_thread_limit = 0;
   int level;
   int active_level;
   int teams_level;
@@ -1905,6 +1906,8 @@
 root = master_th->th.th_root;
 master_active = root->r.r_active;
 master_set_numthreads = master_th->th.th_set_nproc;
+task_thread_limit =
+master_th->th.th_current_task->td_icvs.task_thread_limit;
 
 #if OMPT_SUPPORT
 ompt_data_t ompt_parallel_data = ompt_data_none;
@@ -1995,6 +1998,11 @@
  ? master_set_numthreads
  // TODO: get nproc directly from current task
  : get__nproc_2(parent_team, master_tid);
+  // Use the thread_limit set for the current target task if exists, else go
+  // with the deduced nthreads
+  nthreads = task_thread_limit > 0 && task_thread_limit < nthreads
+ ? task_thread_limit
+ : nthreads;
   // Check if we need to take forkjoin lock? (no need for serialized
   // parallel out of teams 

[PATCH] D152054: [OpenMP] Codegen support for thread_limit on target directive

2023-06-03 Thread Sandeep via Phabricator via cfe-commits
sandeepkosuri created this revision.
sandeepkosuri added reviewers: ABataev, soumitra, koops, RitanyaB, dreachem.
Herald added subscribers: sunshaoce, guansong, yaxunl.
Herald added a project: All.
sandeepkosuri requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: llvm-commits, openmp-commits, cfe-commits, jplehr, 
sstefan1.
Herald added projects: clang, OpenMP, LLVM.

- This patch adds support for the thread_limit clause on the target directive 
according to OpenMP 5.1 [2.14.5]
- The idea is to create an outer task for the target region when there is a 
thread_limit clause, and manipulate the thread_limit of that task instead. This 
way, thread_limit is applied to all the relevant constructs enclosed by the 
target region.
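
As a rough illustration of the effect on team size, the clamp that the `kmp_runtime.cpp` hunk of this patch adds can be written as a standalone function. The function name `clampToTaskThreadLimit` is hypothetical; the expression follows the hunk, where `task_thread_limit` is read from the current task's ICVs and 0 is the initial value:

```cpp
// Standalone sketch of the team-size clamp. nthreads is the team size the
// runtime deduced (for example from num_threads or omp_set_num_threads);
// task_thread_limit is the limit stored for the enclosing target task,
// with 0 meaning no limit was set.
int clampToTaskThreadLimit(int nthreads, int task_thread_limit) {
  return (task_thread_limit > 0 && task_thread_limit < nthreads)
             ? task_thread_limit
             : nthreads;
}
```

Under this sketch, a request for 10 threads inside a region with a limit of 4 yields 4, while a request for 2 threads stays at 2, which matches the counts the runtime test expects.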




Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D152054

Files:
  clang/lib/CodeGen/CGOpenMPRuntime.cpp
  clang/lib/CodeGen/CGOpenMPRuntime.h
  clang/lib/CodeGen/CGStmtOpenMP.cpp
  clang/test/OpenMP/target_codegen.cpp
  llvm/include/llvm/Frontend/OpenMP/OMPKinds.def
  openmp/runtime/src/kmp.h
  openmp/runtime/src/kmp_csupport.cpp
  openmp/runtime/src/kmp_ftn_entry.h
  openmp/runtime/src/kmp_global.cpp
  openmp/runtime/src/kmp_runtime.cpp
  openmp/runtime/test/target/target_thread_limit.cpp

Index: openmp/runtime/test/target/target_thread_limit.cpp
===
--- /dev/null
+++ openmp/runtime/test/target/target_thread_limit.cpp
@@ -0,0 +1,81 @@
+// RUN: %libomp-cxx-compile -fopenmp-version=51
+// RUN: %libomp-run | FileCheck %s --check-prefix OMP51
+
+#include <stdio.h>
+#include <omp.h>
+
+void foo() {
+#pragma omp parallel num_threads(10)
+  { printf("\ntarget: foo(): parallel num_threads(10)"); }
+}
+
+int main(void) {
+
+  int tl = 4;
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+  // OMP51: main: thread_limit = {{[0-9]+}}
+
+#pragma omp target thread_limit(tl)
+  {
+printf("\ntarget: thread_limit = %d", omp_get_thread_limit());
+// OMP51: target: thread_limit = 4
+// check whether thread_limit is honoured
+#pragma omp parallel
+{ printf("\ntarget: parallel"); }
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+// OMP51: target: parallel
+
+// check whether num_threads is honoured
+#pragma omp parallel num_threads(2)
+{ printf("\ntarget: parallel num_threads(2)"); }
+// OMP51: target: parallel num_threads(2)
+// OMP51: target: parallel num_threads(2)
+
+// check whether thread_limit is honoured when there is a conflicting
+// num_threads
+#pragma omp parallel num_threads(10)
+{ printf("\ntarget: parallel num_threads(10)"); }
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+// OMP51: target: parallel num_threads(10)
+
+// check whether threads are limited across functions
+foo();
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+// OMP51: target: foo(): parallel num_threads(10)
+  }
+
+// checking consecutive target regions with different thread_limits
+#pragma omp target thread_limit(3)
+  {
+printf("\nsecond target: thread_limit = %d", omp_get_thread_limit());
+// OMP51: second target: thread_limit = 3
+#pragma omp parallel
+{ printf("\nsecond target: parallel"); }
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+// OMP51: second target: parallel
+  }
+
+// confirm that thread_limit's effects are limited to target region
+  printf("\nmain: thread_limit = %d", omp_get_thread_limit());
+// OMP51: main: thread_limit = {{[0-9]+}}
+#pragma omp parallel num_threads(10)
+  { printf("\nmain: parallel num_threads(10)"); }
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  // OMP51: main: parallel num_threads(10)
+  return 0;
+}
Index: openmp/runtime/src/kmp_runtime.cpp
===
--- openmp/runtime/src/kmp_runtime.cpp
+++ openmp/runtime/src/kmp_runtime.cpp
@@ -1867,6 +1867,7 @@
   int nthreads;
   int master_active;
   int master_set_numthreads;
+  int task_thread_limit = 0;
   int level;
   int active_level;
   int teams_level;
@@ -1905,6 +1906,8 @@
 root = master_th->th.th_root;
 master_active = root->r.r_active;
 master_set_numthreads = master_th->th.th_set_nproc;
+task_thread_limit =
+master_th->th.th_current_task->td_icvs.task_thread_limit;
 
 #if OMPT_SUPPORT
 ompt_data_t ompt_parallel_data = ompt_data_none;
@@ -1995,6 +1998,11 @@