[PATCH] D141350: Fix runtime problem for base class member data used in target region.

2023-01-09 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Does this patch work when there are more than one level of inheritance? For 
example `class B: public A; class C: public B`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D141350/new/

https://reviews.llvm.org/D141350

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D139287: [OpenMP] Introduce basic JIT support to OpenMP target offloading

2022-12-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Got tons of runtime failure

  target doesn't support jit
  UNREACHABLE executed at 
/gpfs/jlse-fs0/users/yeluo/opt/llvm-clang/llvm-project-nightly/openmp/libomptarget/plugins-nextgen/common/PluginInterface/PluginInterface.h:543!

on AMD GPU gfx908 when running miniqmc ctest


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D139287/new/

https://reviews.llvm.org/D139287

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14
+{
+#pragma omp target data use_device_addr(x)
+{

dreachem wrote:
> ye-luo wrote:
> > doru1004 wrote:
> > > doru1004 wrote:
> > > > ye-luo wrote:
> > > > > doru1004 wrote:
> > > > > > ye-luo wrote:
> > > > > > > doru1004 wrote:
> > > > > > > > ye-luo wrote:
> > > > > > > > > doru1004 wrote:
> > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > In my understanding of the spec.
> > > > > > > > > > > `map(tofrom:x[0:256])` only maps the memory segment that 
> > > > > > > > > > > x points to. x itself as a pointer scalar is not mapped.
> > > > > > > > > > > use_device_addr(x) should fail to find the map of x 
> > > > > > > > > > > scalar.
> > > > > > > > > > > 5.2 spec.
> > > > > > > > > > > If the list item is not a mapped list item, it is assumed 
> > > > > > > > > > > to be accessible on the target device.
> > > > > > > > > > > To me, it seems just keep  as it was, in this case  
> > > > > > > > > > > remains a host address.
> > > > > > > > > > > 
> > > > > > > > > > > But in your patch description, it seems treating x 
> > > > > > > > > > > differently from a scalar.
> > > > > > > > > > > 
> > > > > > > > > > > I also applied your patch on main and got segfault 
> > > > > > > > > > > because the x has a value of device address and x[0] 
> > > > > > > > > > > fails. This should be the behavior of use_device_ptr 
> > > > > > > > > > > instead of use_device_addr.
> > > > > > > > > > > To me, it seems just keep  as it was, in this case  
> > > > > > > > > > > remains a host address.
> > > > > > > > > > 
> > > > > > > > > > So does this mean that if I do something like this in the 
> > > > > > > > > > target data I should get different addresses for x:
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > #pragma omp target data use_device_ptr(x)
> > > > > > > > > > {
> > > > > > > > > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > > > > > > > > }
> > > > > > > > > > 
> > > > > > > > > > #pragma omp target data use_device_addr(x)
> > > > > > > > > > {
> > > > > > > > > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > > > > > > > > }
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > > I also applied your patch on main and got segfault 
> > > > > > > > > > > because the x has a value of device address and x[0] 
> > > > > > > > > > > fails.
> > > > > > > > > > 
> > > > > > > > > > That's my fault x[0] was the wrong thing to use actually.
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > When you have an outer `target data map(x)`, then two printf 
> > > > > > > > > differ. If there is no outer `map(x)`, two printf should be 
> > > > > > > > > identical.
> > > > > > > > > When you have an outer `target data map(x)`, then two printf 
> > > > > > > > > differ. If there is no outer `map(x)`, two printf should be 
> > > > > > > > > identical.
> > > > > > > > 
> > > > > > > > This is super helpful thank you! I'll make sure that happens.
> > > > > > > > 
> > > > > > > > In the case when an outer target data exists, the print of the 
> > > > > > > > x which is under use_device_addr should print the same address 
> > > > > > > > as printing x on the host? 
> > > > > > > I need a correction. When outer map(x) exists, actually the 
> > > > > > > address(not value) of x should be a device address, and the code 
> > > > > > > cannot even print x. Printing  should be fine.
> > > > > > In the context of the above comment, should  on the device be an 
> > > > > > address I can verify, somehow, to make sure that it's correct or is 
> > > > > > it a completely new device address?
> > > > > > 
> > > > > > So for example, should it be the same as when I do a use_device_ptr 
> > > > > > but print the  in that case? (With the current master those two 
> > > > > > addresses are not the same.)
> > > > > > 
> > > > > > I guess what I need is an example of using use_device_addr that 
> > > > > > actually does something meaningful because with the current main 
> > > > > > branch printing the  of a use_device_addr(x) is nil.
> > > > > When an outer map(x) is placed,  does print something meaningful.
> > > > > I tried to access `omp_get_mapped_ptr(, omp_get_default_device())` 
> > > > > but got link time error about missing omp_get_mapped_ptr definition. 
> > > > > It seems missing an implementation of this OpenMP API.
> > > > > 
> > > > > When there is no map(x), I also got nil, I think this is a bug,  
> > > > > should keep the host value.
> > > > > 
> > > > > I cannot think of a useful example with use_device_addr(x) where x is 
> > > > > a pointer. But x can be a scalar float. 
> > > > > and then call cublas gemm, the alpha/beta parameter can be 
> > > > > When an outer map(x) is placed,  does print something meaningful.
> > 

[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14
+{
+#pragma omp target data use_device_addr(x)
+{

doru1004 wrote:
> doru1004 wrote:
> > ye-luo wrote:
> > > doru1004 wrote:
> > > > ye-luo wrote:
> > > > > doru1004 wrote:
> > > > > > ye-luo wrote:
> > > > > > > doru1004 wrote:
> > > > > > > > ye-luo wrote:
> > > > > > > > > In my understanding of the spec.
> > > > > > > > > `map(tofrom:x[0:256])` only maps the memory segment that x 
> > > > > > > > > points to. x itself as a pointer scalar is not mapped.
> > > > > > > > > use_device_addr(x) should fail to find the map of x scalar.
> > > > > > > > > 5.2 spec.
> > > > > > > > > If the list item is not a mapped list item, it is assumed to 
> > > > > > > > > be accessible on the target device.
> > > > > > > > > To me, it seems just keep  as it was, in this case  
> > > > > > > > > remains a host address.
> > > > > > > > > 
> > > > > > > > > But in your patch description, it seems treating x 
> > > > > > > > > differently from a scalar.
> > > > > > > > > 
> > > > > > > > > I also applied your patch on main and got segfault because 
> > > > > > > > > the x has a value of device address and x[0] fails. This 
> > > > > > > > > should be the behavior of use_device_ptr instead of 
> > > > > > > > > use_device_addr.
> > > > > > > > > To me, it seems just keep  as it was, in this case  
> > > > > > > > > remains a host address.
> > > > > > > > 
> > > > > > > > So does this mean that if I do something like this in the 
> > > > > > > > target data I should get different addresses for x:
> > > > > > > > 
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > #pragma omp target data use_device_ptr(x)
> > > > > > > > {
> > > > > > > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > > > > > > }
> > > > > > > > 
> > > > > > > > #pragma omp target data use_device_addr(x)
> > > > > > > > {
> > > > > > > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > > > > > > }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > 
> > > > > > > > > I also applied your patch on main and got segfault because 
> > > > > > > > > the x has a value of device address and x[0] fails.
> > > > > > > > 
> > > > > > > > That's my fault x[0] was the wrong thing to use actually.
> > > > > > > > 
> > > > > > > > 
> > > > > > > When you have an outer `target data map(x)`, then two printf 
> > > > > > > differ. If there is no outer `map(x)`, two printf should be 
> > > > > > > identical.
> > > > > > > When you have an outer `target data map(x)`, then two printf 
> > > > > > > differ. If there is no outer `map(x)`, two printf should be 
> > > > > > > identical.
> > > > > > 
> > > > > > This is super helpful thank you! I'll make sure that happens.
> > > > > > 
> > > > > > In the case when an outer target data exists, the print of the x 
> > > > > > which is under use_device_addr should print the same address as 
> > > > > > printing x on the host? 
> > > > > I need a correction. When outer map(x) exists, actually the 
> > > > > address(not value) of x should be a device address, and the code 
> > > > > cannot even print x. Printing  should be fine.
> > > > In the context of the above comment, should  on the device be an 
> > > > address I can verify, somehow, to make sure that it's correct or is it 
> > > > a completely new device address?
> > > > 
> > > > So for example, should it be the same as when I do a use_device_ptr but 
> > > > print the  in that case? (With the current master those two addresses 
> > > > are not the same.)
> > > > 
> > > > I guess what I need is an example of using use_device_addr that 
> > > > actually does something meaningful because with the current main branch 
> > > > printing the  of a use_device_addr(x) is nil.
> > > When an outer map(x) is placed,  does print something meaningful.
> > > I tried to access `omp_get_mapped_ptr(, omp_get_default_device())` but 
> > > got link time error about missing omp_get_mapped_ptr definition. It seems 
> > > missing an implementation of this OpenMP API.
> > > 
> > > When there is no map(x), I also got nil, I think this is a bug,  should 
> > > keep the host value.
> > > 
> > > I cannot think of a useful example with use_device_addr(x) where x is a 
> > > pointer. But x can be a scalar float. 
> > > and then call cublas gemm, the alpha/beta parameter can be 
> > > When an outer map(x) is placed,  does print something meaningful.
> > 
> > For me, in the same scenario, it prints nil.
> > 
> > Here's the full example to avoid any confusion:
> > 
> > ```
> > float *x = (float *) malloc(10*sizeof(float));
> > 
> > #pragma omp target data map(to:x[0:10])
> > {
> > #pragma omp target data use_device_ptr(x)
> > {
> > fprintf(stderr, "line %d x: %p\n", __LINE__, x); // prints 
> > address 0x7f0bda40
> > }
> > 
> > #pragma 

[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14
+{
+#pragma omp target data use_device_addr(x)
+{

doru1004 wrote:
> ye-luo wrote:
> > doru1004 wrote:
> > > ye-luo wrote:
> > > > doru1004 wrote:
> > > > > ye-luo wrote:
> > > > > > In my understanding of the spec.
> > > > > > `map(tofrom:x[0:256])` only maps the memory segment that x points 
> > > > > > to. x itself as a pointer scalar is not mapped.
> > > > > > use_device_addr(x) should fail to find the map of x scalar.
> > > > > > 5.2 spec.
> > > > > > If the list item is not a mapped list item, it is assumed to be 
> > > > > > accessible on the target device.
> > > > > > To me, it seems just keep  as it was, in this case  remains a 
> > > > > > host address.
> > > > > > 
> > > > > > But in your patch description, it seems treating x differently from 
> > > > > > a scalar.
> > > > > > 
> > > > > > I also applied your patch on main and got segfault because the x 
> > > > > > has a value of device address and x[0] fails. This should be the 
> > > > > > behavior of use_device_ptr instead of use_device_addr.
> > > > > > To me, it seems just keep  as it was, in this case  remains a 
> > > > > > host address.
> > > > > 
> > > > > So does this mean that if I do something like this in the target data 
> > > > > I should get different addresses for x:
> > > > > 
> > > > > 
> > > > > ```
> > > > > #pragma omp target data use_device_ptr(x)
> > > > > {
> > > > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > > > }
> > > > > 
> > > > > #pragma omp target data use_device_addr(x)
> > > > > {
> > > > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > > > }
> > > > > ```
> > > > > 
> > > > > 
> > > > > > I also applied your patch on main and got segfault because the x 
> > > > > > has a value of device address and x[0] fails.
> > > > > 
> > > > > That's my fault x[0] was the wrong thing to use actually.
> > > > > 
> > > > > 
> > > > When you have an outer `target data map(x)`, then two printf differ. If 
> > > > there is no outer `map(x)`, two printf should be identical.
> > > > When you have an outer `target data map(x)`, then two printf differ. If 
> > > > there is no outer `map(x)`, two printf should be identical.
> > > 
> > > This is super helpful thank you! I'll make sure that happens.
> > > 
> > > In the case when an outer target data exists, the print of the x which is 
> > > under use_device_addr should print the same address as printing x on the 
> > > host? 
> > I need a correction. When outer map(x) exists, actually the address(not 
> > value) of x should be a device address, and the code cannot even print x. 
> > Printing  should be fine.
> In the context of the above comment, should  on the device be an address I 
> can verify, somehow, to make sure that it's correct or is it a completely new 
> device address?
> 
> So for example, should it be the same as when I do a use_device_ptr but print 
> the  in that case? (With the current master those two addresses are not the 
> same.)
> 
> I guess what I need is an example of using use_device_addr that actually does 
> something meaningful because with the current main branch printing the  of 
> a use_device_addr(x) is nil.
When an outer map(x) is placed,  does print something meaningful.
I tried to access `omp_get_mapped_ptr(, omp_get_default_device())` but got 
link time error about missing omp_get_mapped_ptr definition. It seems missing 
an implementation of this OpenMP API.

When there is no map(x), I also got nil, I think this is a bug,  should keep 
the host value.

I cannot think of a useful example with use_device_addr(x) where x is a 
pointer. But x can be a scalar float. 
and then call cublas gemm, the alpha/beta parameter can be 


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133694/new/

https://reviews.llvm.org/D133694

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14
+{
+#pragma omp target data use_device_addr(x)
+{

doru1004 wrote:
> ye-luo wrote:
> > doru1004 wrote:
> > > ye-luo wrote:
> > > > In my understanding of the spec.
> > > > `map(tofrom:x[0:256])` only maps the memory segment that x points to. x 
> > > > itself as a pointer scalar is not mapped.
> > > > use_device_addr(x) should fail to find the map of x scalar.
> > > > 5.2 spec.
> > > > If the list item is not a mapped list item, it is assumed to be 
> > > > accessible on the target device.
> > > > To me, it seems just keep  as it was, in this case  remains a host 
> > > > address.
> > > > 
> > > > But in your patch description, it seems treating x differently from a 
> > > > scalar.
> > > > 
> > > > I also applied your patch on main and got segfault because the x has a 
> > > > value of device address and x[0] fails. This should be the behavior of 
> > > > use_device_ptr instead of use_device_addr.
> > > > To me, it seems just keep  as it was, in this case  remains a host 
> > > > address.
> > > 
> > > So does this mean that if I do something like this in the target data I 
> > > should get different addresses for x:
> > > 
> > > 
> > > ```
> > > #pragma omp target data use_device_ptr(x)
> > > {
> > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > }
> > > 
> > > #pragma omp target data use_device_addr(x)
> > > {
> > > fprintf(stderr, "x: %p\n", __LINE__, x);
> > > }
> > > ```
> > > 
> > > 
> > > > I also applied your patch on main and got segfault because the x has a 
> > > > value of device address and x[0] fails.
> > > 
> > > That's my fault x[0] was the wrong thing to use actually.
> > > 
> > > 
> > When you have an outer `target data map(x)`, then two printf differ. If 
> > there is no outer `map(x)`, two printf should be identical.
> > When you have an outer `target data map(x)`, then two printf differ. If 
> > there is no outer `map(x)`, two printf should be identical.
> 
> This is super helpful thank you! I'll make sure that happens.
> 
> In the case when an outer target data exists, the print of the x which is 
> under use_device_addr should print the same address as printing x on the 
> host? 
I need a correction. When outer map(x) exists, actually the address(not value) 
of x should be a device address, and the code cannot even print x. Printing  
should be fine.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133694/new/

https://reviews.llvm.org/D133694

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14
+{
+#pragma omp target data use_device_addr(x)
+{

doru1004 wrote:
> ye-luo wrote:
> > In my understanding of the spec.
> > `map(tofrom:x[0:256])` only maps the memory segment that x points to. x 
> > itself as a pointer scalar is not mapped.
> > use_device_addr(x) should fail to find the map of x scalar.
> > 5.2 spec.
> > If the list item is not a mapped list item, it is assumed to be accessible 
> > on the target device.
> > To me, it seems just keep  as it was, in this case  remains a host 
> > address.
> > 
> > But in your patch description, it seems treating x differently from a 
> > scalar.
> > 
> > I also applied your patch on main and got segfault because the x has a 
> > value of device address and x[0] fails. This should be the behavior of 
> > use_device_ptr instead of use_device_addr.
> > To me, it seems just keep  as it was, in this case  remains a host 
> > address.
> 
> So does this mean that if I do something like this in the target data I 
> should get different addresses for x:
> 
> 
> ```
> #pragma omp target data use_device_ptr(x)
> {
> fprintf(stderr, "x: %p\n", __LINE__, x);
> }
> 
> #pragma omp target data use_device_addr(x)
> {
> fprintf(stderr, "x: %p\n", __LINE__, x);
> }
> ```
> 
> 
> > I also applied your patch on main and got segfault because the x has a 
> > value of device address and x[0] fails.
> 
> That's my fault x[0] was the wrong thing to use actually.
> 
> 
When you have an outer `target data map(x)`, then two printf differ. If there 
is no outer `map(x)`, two printf should be identical.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133694/new/

https://reviews.llvm.org/D133694

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D133694: [Clang][OpenMP] Fix use_device_addr

2022-09-12 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/test/OpenMP/target_data_use_device_addr_codegen_ptr.cpp:14
+{
+#pragma omp target data use_device_addr(x)
+{

In my understanding of the spec.
`map(tofrom:x[0:256])` only maps the memory segment that x points to. x itself 
as a pointer scalar is not mapped.
use_device_addr(x) should fail to find the map of x scalar.
5.2 spec.
If the list item is not a mapped list item, it is assumed to be accessible on 
the target device.
To me, it seems just keep  as it was, in this case  remains a host address.

But in your patch description, it seems treating x differently from a scalar.

I also applied your patch on main and got segfault because the x has a value of 
device address and x[0] fails. This should be the behavior of use_device_ptr 
instead of use_device_addr.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D133694/new/

https://reviews.llvm.org/D133694

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D130020: [OpenMP] Deprecate the old driver for OpenMP offloading

2022-08-24 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.

The new driver remains failing offload to x86 in certain scenarios when linking 
static libraries. Once I link object files directly there is no issue. This not 
worse than old driver that only supports linking object files directly.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130020/new/

https://reviews.llvm.org/D130020

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D130020: [OpenMP] Deprecate the old driver for OpenMP offloading

2022-07-27 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

I noticed that in one of my applications, offload to x86 is not fully working 
with static libraries but directly linking all object files resolves the issue. 
So the new driver doesn't cause regression compared to the old driver which 
doesn't work with static libraries anyway. So we can say good bye to the old 
drivers.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D130020/new/

https://reviews.llvm.org/D130020

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129393: [Clang] Fix the wrong features being derivec in the offload packager

2022-07-08 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129393/new/

https://reviews.llvm.org/D129393

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D129383: [LinkerWrapper] Fix use of string savers and correctly pass bitcode libraries

2022-07-08 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

Confirm that #56445 is fixed now


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D129383/new/

https://reviews.llvm.org/D129383

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D128206: [Clang] Allow multiple comma separated arguments to `--offload-arch=`

2022-06-20 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

LGTM. This allows me to write concise compile lines.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D128206/new/

https://reviews.llvm.org/D128206

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123498: [clang] Adding Platform/Architecture Specific Resource Header Installation Targets

2022-04-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

@qiongsiwu1 Tested the updated patch. Works fine now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123498/new/

https://reviews.llvm.org/D123498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D123498: [clang] Adding Platform/Architecture Specific Resource Header Installation Targets

2022-04-23 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

See https://github.com/llvm/llvm-project/issues/55002


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D123498/new/

https://reviews.llvm.org/D123498

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D122592: [OpenMP] Fix library path missing when using OpenMP

2022-03-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

> using the multiarch directory

If we can cross compile libomp and libomptarget to the target system. We may 
have
lib/x86_64-unknown-linux-gnu/libomp.so
lib/aarch64-unknown-linux-gnu/libomp.so
Compile clang once but compile runtime library for multiple architectures.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D122592/new/

https://reviews.llvm.org/D122592

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120106: [OpenMP] Add flag for disabling threat state in runtime

2022-02-17 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Change title
threat state
to
thread state


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120106/new/

https://reviews.llvm.org/D120106

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D116865: [OpenMP][FIX] Emit debug declares only if debug info is available

2022-01-08 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Confirm that #52938 is fixed by this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D116865/new/

https://reviews.llvm.org/D116865

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-12-20 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/tools/amdgpu-arch/CMakeLists.txt:13
+
+find_package(hsa-runtime64 QUIET 1.2.0 HINTS ${CMAKE_INSTALL_PREFIX} PATH 
${ROCM_PATH})
 if (NOT ${hsa-runtime64_FOUND})

JonChesterfield wrote:
> arsenm wrote:
> > I also think CMAKE_INSTALL_PREFIX in the search hints is bad practice. I 
> > don't recall ever seeing a project do this. Depending on the install path 
> > for anything leads to flaky builds
> This was copied from the amdgpu plugin. I can't remember where I copied that 
> from. Alternatives welcome - what's the proper way to indicate 'this library? 
> it's probably next to all the other llvm libraries'
CMAKE_INSTALL_PREFIX should be removed.
If that search location is intended, pass it via _ROOT, 
CMAKE_PREFIX_PATH or CMAKE_LIBRARY_PATH.
See searching order 
https://cmake.org/cmake/help/latest/command/find_library.html

The search line should be as simple as
```
find_package(hsa-runtime64 QUIET 1.2.0 PATH /opt/rocm)
```
`/opt/rocm` can be kept as the lowest priority searching guess.

There is no good reason to use `ENV{ROCM_PATH}` nor `ROCM_PATH`. I have enough 
pain with `XXX_ROOT`, `XXX_HOME`, `XXX_PATH` environment variables. If you need 
`ROCM_PATH` because it comes from modules (environment modules, lmod), ask the 
maintainer to add proper CMAKE_PREFIX_PATH. If hsa is not pulled in via 
modules, it is user responsible to tell CMake where the hsa installation is.

In addition, please ask the hsa maintainer to move hsa-runtime64 cmake files 
from "/opt/rocm/lib/cmake/hsa-runtime64" to the hsa library 
"/opt/rocm/hsa/lib/cmake/hsa-runtime64" and then softlink to /opt/rocm/lib not 
the other way round. In such way, the hsa library from /opt/rocm can be used as 
an independent library without pulling the whole /opt/rocm if needed.



Comment at: mlir/lib/Dialect/GPU/CMakeLists.txt:137
   set(CMAKE_MODULE_PATH "${HIP_PATH}/cmake" ${CMAKE_MODULE_PATH})
   find_package(HIP)
   if (NOT HIP_FOUND)

dbhaskaran wrote:
> ye-luo wrote:
> > Both ROCM_PATH HIP_PATH are used as hints without verification.
> > But they are used later for generating include paths. Overall logic is 
> > broken.
> > 
> > if ROCM_PATH takes the precedence over everything else
> > You can do this
> > if ROCM_PATH defined
> > find_path(
> >   HIP_MODULE_FILE_DIR FindHIP.cmake
> >   HINTS ${ROCM_PATH}
> >   PATH_SUFFIXES hip/cmake REQUIRED
> >   NO_DEFAULT_PATH)
> > else
> > find_path(
> >   HIP_MODULE_FILE_DIR FindHIP.cmake
> >   HINTS $ENV{ROCM_PATH} /opt/rocm
> >   PATH_SUFFIXES hip/cmake REQUIRED)
> > endif
> > 
> > by doing this, you can verify that ROCM_PATH is correct if provided and the 
> > path the hip module file has been verified. then it is safe to do
> > set(CMAKE_MODULE_PATH "${HIP_MODULE_FILE_DIR}" ${CMAKE_MODULE_PATH})
> > find_package(HIP)
> The primary intention here is to avoid fallback to a default ROCM path  to 
> avoid unwanted issues related to multiple installation, however I agree the 
> we should use paths with verification so I have accommodated the changes for 
> HIP excluding the hint. Thanks for the code snippets. 
Do you know by chance that whether MLIR really need the hip runtime library or 
it only needs the HSA as well?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109885/new/

https://reviews.llvm.org/D109885

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D113140: [OpenMP][NFCI] Introduce the kernel environment for target regions

2021-11-11 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

> the kernel environment which contains information passed by the compiler to a 
> GPU kernel.

DId you mean this environment is baked into kernel at compile time? So there is 
no additional H2D cost at each call, right?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113140/new/

https://reviews.llvm.org/D113140

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add partial support for Static Device Libraries

2021-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add partial support for Static Device Libraries

2021-10-05 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/lib/Driver/ToolChains/CommonArgs.h:62
+ bool postClangLink);
+void AddStaticDeviceLibs(Compilation *C, const Tool *T, const JobAction *JA,
+ const InputInfoList *Inputs, const Driver ,

ye-luo wrote:
> saiislam wrote:
> > ye-luo wrote:
> > > ye-luo wrote:
> > > > Differentiate the names of all the three AddStaticDeviceLibs functions 
> > > > and add documentation. Be sure to do document every function added in 
> > > > this patch.
> > > @saiislam Fix this?
> > I have added documentation along with function definition in 
> > CommonArgs.cpp. Should I move it here? I thought keep documentation and 
> > code at the same place will improve readability.
> The docs are good. I was looking for "Differentiate the names of all the 
> three AddStaticDeviceLibs functions"
ping @saiislam 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add partial support for Static Device Libraries

2021-09-29 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/lib/Driver/ToolChains/CommonArgs.h:62
+ bool postClangLink);
+void AddStaticDeviceLibs(Compilation *C, const Tool *T, const JobAction *JA,
+ const InputInfoList *Inputs, const Driver ,

saiislam wrote:
> ye-luo wrote:
> > ye-luo wrote:
> > > Differentiate the names of all the three AddStaticDeviceLibs functions 
> > > and add documentation. Be sure to do document every function added in 
> > > this patch.
> > @saiislam Fix this?
> I have added documentation along with function definition in CommonArgs.cpp. 
> Should I move it here? I thought keep documentation and code at the same 
> place will improve readability.
The docs are good. I was looking for "Differentiate the names of all the three 
AddStaticDeviceLibs functions"


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add partial support for Static Device Libraries

2021-09-29 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/lib/Driver/ToolChains/CommonArgs.h:62
+ bool postClangLink);
+void AddStaticDeviceLibs(Compilation *C, const Tool *T, const JobAction *JA,
+ const InputInfoList *Inputs, const Driver ,

ye-luo wrote:
> Differentiate the names of all the three AddStaticDeviceLibs functions and 
> add documentation. Be sure to do document every function added in this patch.
@saiislam Fix this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-21 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

This patch doesn't seem to break anything on my side.
@saiislam could you

1. address all the in-source review comments
2. update the title to `[Clang][OpenMP] Add partial support for Static Device 
Libraries`
3. update the patch description about what works and what doesn't.

I will accept this patch once these minor revisions are added.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109885: [MLIR][[amdgpu-arch]][OpenMP] Remove direct dependency on /opt/rocm

2021-09-16 Thread Ye Luo via Phabricator via cfe-commits
ye-luo requested changes to this revision.
ye-luo added a comment.
This revision now requires changes to proceed.

The fallback opt/rocm is desired. If a module system needs to point to the 
specific rocm installation. Set CMAKE_PREFIX_PATH= in the 
module file.
If you would like to honor ROCM_PATH, then do one find_library with explicit 
ROCM_PATH and another with all the defaults.




Comment at: mlir/lib/Dialect/GPU/CMakeLists.txt:137
   set(CMAKE_MODULE_PATH "${HIP_PATH}/cmake" ${CMAKE_MODULE_PATH})
   find_package(HIP)
   if (NOT HIP_FOUND)

Both ROCM_PATH HIP_PATH are used as hints without verification.
But they are used later for generating include paths. Overall logic is broken.

if ROCM_PATH takes the precedence over everything else
You can do this
if ROCM_PATH defined
find_path(
  HIP_MODULE_FILE_DIR FindHIP.cmake
  HINTS ${ROCM_PATH}
  PATH_SUFFIXES hip/cmake REQUIRED
  NO_DEFAULT_PATH)
else
find_path(
  HIP_MODULE_FILE_DIR FindHIP.cmake
  HINTS $ENV{ROCM_PATH} /opt/rocm
  PATH_SUFFIXES hip/cmake REQUIRED)
endif

by doing this, you can verify that ROCM_PATH is correct if provided and the 
path the hip module file has been verified. then it is safe to do
set(CMAKE_MODULE_PATH "${HIP_MODULE_FILE_DIR}" ${CMAKE_MODULE_PATH})
find_package(HIP)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109885/new/

https://reviews.llvm.org/D109885

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-15 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

> The option of adding sm_XX in Bundle Entry ID when user hasn't used -march 
> flag, comes under command line simplification. I have a bunch of upcoming 
> patches which will significantly simplify OpenMP command line for GPU 
> offloading. But, don't you think this feature is different than supporting 
> static device libraries and should be dealt separately?

Simplifying command line options is a different topic and we have not agreed 
upon how to do the simplification. In general, that is unrelated to this broken 
test case. But it can probably be fixed later with a separate patch.

> Our plan was always to support SDLs using -l and -L options, as described in 
> Greg Rodgers's presentation about static device libraries 
>  in last year's LLVM-CTH 
> Workshop. It was also discussed in multi company meetings. We can explore 
> supporting direct linking of SDLs as described by you, but it seems to me 
> that it is out of scope of this patch. What do you think?

To me, supporting both -L/-l and libXXX.a are both required for static 
libraries. It is not reasonable to request users to use one way not the other. 
AOMP had no issue working in both ways.

> PS: We (mulit-company OpenMP-dev meetings) have been tracking this feature 
> for a while and we would very much like it to be picked for llvm-13.

Although we wanted to deliver this feature to users, I don't think we should 
tell users that linking static archive works with the current quality. So in my 
view, It is not suitable to backport it o 13.

This patch is good. It makes part of the static linking feature work but we can 
not claim "Static Device Libraries" is fully working. So please update the tile 
and also in the description saying what is working and what is not. Also need a 
test case for nvptx64.
In addition, I'd like to see commitment of addressing both above failed test 
cases before accepting this patch to help incremental development.

by the way, I found offload to x86 with static linking doesn't work with -L. 
-lmylib
https://github.com/ye-luo/openmp-target/blob/master/tests/linking/link_static_fat_bin/compile-x86.sh
I think this can also be addressed separately. FYI, AOMP happily makes it 
working.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

1. modf works now.

2. if I modify the complile.sh

  clang++ -fopenmp -fopenmp-targets=nvptx64 -c classA.cpp
  rm -f libmylib.a
  ar qc libmylib.a classA.o
  ranlib libmylib.a
  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib
  ./a.out

doesn't work. I think the solution is adding sm_XX to the module name 
regardless of user command line.

3,  directly linking static archive doesn't work.

  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp libmylib.a

CMake generates this style of link line. So this really needs to work.

only the following case works right now.

  clang++ -fopenmp -fopenmp-targets=nvptx64 main.cpp -L. -lmylib


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

  yeluo@epyc-server:~/opt/openmp-target/tests/math$ clang++ -fopenmp 
-fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 -march=sm_80 modf.cpp -c
  yeluo@epyc-server:~/opt/openmp-target/tests/math$ clang-offload-bundler 
-type=o --inputs=modf.o --list
  openmp-nvptx64
  host-x86_64-unknown-linux-gnu
  yeluo@epyc-server:~/opt/openmp-target/tests/math$ clang++ -fopenmp 
-fopenmp-targets=nvptx64 modf.cpp -c
  yeluo@epyc-server:~/opt/openmp-target/tests/math$ clang-offload-bundler 
-type=o --inputs=modf.o --list
  openmp-nvptx64
  host-x86_64-unknown-linux-gnu
  yeluo@epyc-server:~/opt/openmp-target/tests/math$ clang++ -fopenmp 
-fopenmp-targets=nvptx64-nvidia-cuda modf.cpp -c
  warning: linking module 
'/soft/llvm/main-20210910/lib/libomptarget-nvptx-sm_80.bc': Linking two modules 
of different target triples: 
'/soft/llvm/main-20210910/lib/libomptarget-nvptx-sm_80.bc' is 'nvptx64' whereas 
'modf.cpp' is 'nvptx64-nvidia-cuda'
   [-Wlinker-warnings]
  1 warning generated.
  yeluo@epyc-server:~/opt/openmp-target/tests/math$ clang-offload-bundler 
-type=o --inputs=modf.o --list
  openmp-nvptx64-nvidia-cuda
  host-x86_64-unknown-linux-gnu

Here is my clang build recipe

  cmake -DCMAKE_C_COMPILER=gcc -DCMAKE_CXX_COMPILER=g++ \
  -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=$INSTALL_FOLDER \
  -DLLVM_ENABLE_BACKTRACES=ON \
  -DLLVM_ENABLE_WERROR=OFF \
  -DBUILD_SHARED_LIBS=OFF \
  -DLLVM_ENABLE_RTTI=ON \
  -DLLVM_TARGETS_TO_BUILD="X86;AMDGPU;NVPTX" \
  -DLLVM_ENABLE_ASSERTIONS=ON \
  -DLLVM_ENABLE_PROJECTS="clang;lld" \
  -DLLVM_ENABLE_RUNTIMES="libcxxabi;libcxx;openmp" \
  -DLIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES="80,61" \
  -DCLANG_OPENMP_NVPTX_DEFAULT_ARCH=sm_80 \
  -DLIBOMPTARGET_NVPTX_MAX_SM=38 \
  -DLIBOMPTARGET_ENABLE_DEBUG=ON \
  ../llvm-project/llvm


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-13 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

the modf test still doesn't work. The issue was from unbundle.
case 1 works.

  clang++ -fopenmp -fopenmp-targets=nvptx64 modf.cpp -c
  clang++ -fopenmp -fopenmp-targets=nvptx64 modf.o

case 2

  clang++ -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 
-march=sm_80 modf.cpp -c
  clang++ -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 
-march=sm_80 modf.o 

doesn't work. The issue was a failure in unbundling

  "/soft/llvm/main-patched/bin/clang-offload-bundler" -type=o 
-targets=host-x86_64-unknown-linux-gnu,openmp-nvptx64-sm_80 -inputs=modf.o 
-outputs=/tmp/modf-99be57.o,/tmp/modf-88a9eb.cubin -unbundle 
-allow-missing-bundles

ends up an empty cubin file. If I remove -sm_80 on the unbundle line
It seems that we need to have sm_80 when the object file is created.

When clang is compiled CLANG_OPENMP_NVPTX_DEFAULT_ARCH can be used to set the 
default sm_XX so it doesn't needs to be provided at compile time. If not set 
default is sm_35.
So in general sm_XX is always known when the compiler is used for compiling.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-13 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

@saiislam did you turn on offload? 
https://github.com/ye-luo/openmp-target/wiki/OpenMP-offload-compilers#llvm-clang
On NVIDIA, it fails at CMake step. On AMD, make step stops because of unrelated 
issue.

Please make the exact reproducer 1 working. Right now I got

  $ clang++ -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 
-march=sm_80 modf.o
  nvlink fatal   : Could not open input file '/tmp/modf-88b730.cubin'
  /soft/llvm/main-patched/bin/clang-nvlink-wrapper: error: 'nvlink' failed
  clang-14: error: nvlink command failed with exit code 1 (use -v to see 
invocation)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-11 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

@saiislam do my test cases work on your side? I tried this patch and still got 
linking failure.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-09-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

@saiislam since clang-nvlink-wrapper has landed, could you update this patch?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-09-05 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

This patch has landed as 83f3782c6129e7a5df3faaf0ae576611d16a8d49 
 but not 
reflected on Phabricator


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108291/new/

https://reviews.llvm.org/D108291

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-08-31 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

Documentation is much improved. LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108291/new/

https://reviews.llvm.org/D108291

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D101960: [openmp] Drop requirement on library path environment variables

2021-08-30 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D101960#2961133 , @jdoerfert wrote:

> There are 3 problems here (ignoring our test setup which should be discussed 
> separately):
>
> 1. make sure clang finds libomp.so
> 2. make sure libomp.so (or clang?) finds libomptarget.so
> 3. make sure libomptartget.so finds the plugins
>
> All of which have to work nicely with LD_LIBRARY_PATH.
>
> I think for 3 you can look at the current dir. That was discussed and, as 
> long as it does work with LD_LIBRARY_PATH, that seems a win.
>
> For 2, why don't we install them in the same place? If so, we reduce it to 
> problem 1) [after applying solution to 3)] no?
>
> For 1, doing something always backwards compatible that may or may not work 
> on some platforms and configurations seems best. You had a proposal here 
> already. If that works, let's do it.

For 1. If users prefer linking an alternative libomp.so, they should not use 
-fopenmp at linking and link libomp.so explicitly as a regular library. If 
users add -fopenmp to clang at linking, clang needs to link the libomp.so 
shipped with clang and set rpath to ensure libomp.so can be found at run even 
libomp.so doesn't exist on LD_LIBRARY_PATH. In this way, if a user would like 
to switch to anther libomp.so, they can still specify LD_LIBRARY_PATH.
For 2. Is libomp.so aware of libomptarget.so? I doubt. Anyway I'd like to see a 
similar logic as case 1 and the control option is -fopenmp-targets.

So addOpenMPRuntimeSpecificRPath seems aligned with what I described.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101960/new/

https://reviews.llvm.org/D101960

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D101935: [clang] Search runtimes build tree for openmp runtime

2021-08-30 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

It seems that this path is baked in to clang executable even after make 
install. I'm not convinced this is the right direction.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D101935/new/

https://reviews.llvm.org/D101935

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-08-23 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/lib/Driver/ToolChains/CommonArgs.h:62
+ bool postClangLink);
+void AddStaticDeviceLibs(Compilation *C, const Tool *T, const JobAction *JA,
+ const InputInfoList *Inputs, const Driver ,

Differentiate the names of all the three AddStaticDeviceLibs functions and add 
documentation. Be sure to do document every function added in this patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-08-23 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp:19
+/// Such an archive is then passed to this tool to extract cubin files before
+/// passing to nvlink.
+///

Right now clang-offload-bundler is only used to create an object file for the 
host and a cubin file for the device.
Then cubin files are passed to the nvlink.
This is different from what you described
```
clang-offload-bundler creates a device specific archive of cubin files.
Such an archive is then passed to this tool to extract cubin files before 
passing to nvlink.
```
Is this caused by changes in https://reviews.llvm.org/D105191?
Do you have any reading materials which documents the whole linking flow of 
D105191?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108291/new/

https://reviews.llvm.org/D108291

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D108291: [clang-nvlink-wrapper] Wrapper around nvlink for archive files

2021-08-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

this is the working steps in the linking script.

  clang-offload-bundler (host,device)
  in: complex_reduction.cpp.o
  out: complex_reduction-494ba8.o, complex_reduction-5aba63.cubin
  
  nvlink (device)
  in: complex_reduction-5aba63.cubin
  out: complex_reduction-b1898c.out
  
  clang-offload-wrapper (device)
  in: complex_reduction-b1898c.out
  out: cxx-a8318a.bc
  
  clang (device)
  in: cxx-a8318a.bc cxx-e54e6f.o
  
  ld (host, device)
  in: complex_reduction-494ba8.o, cxx-e54e6f.o
  out: executable

I'm not quite understand what this wrapper replaces and why.
"It is required for linking static device libraries on nvptx" is not explaining 
what is not working with existing steps and what the clang-nvlink-wrapper 
changes to make it work. Need elaboration.




Comment at: clang/tools/clang-nvlink-wrapper/ClangNvlinkWrapper.cpp:41
+For descriptions of the options please run 'nvlink --help'
+The wrapper extracts any arcive objects and call nvlink with the
+individual files instead, plus any other options/object.

arcive -> archive
 is input already
"The wrapper extracts any arcive objects " what does it mean?
"call nvlink with the individual files" waht individual files.
What is the output?
Please make this documentation more clear.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108291/new/

https://reviews.llvm.org/D108291

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106960: [OffloadArch] Library to query properties of current offload archicture

2021-08-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

I testing with aomp 13.0-5 on ubuntu 20.04.2 LTS (Focal Fossa)

  yeluo@epyc-server:~$ offload-arch -a
  gfx906
  ERROR: offload-arch not found for 10de:2486.
  yeluo@epyc-server:~$ offload-arch -c
  gfx906   sramecc+ xnack-
  yeluo@epyc-server:~$ offload-arch -n
  gfx906 1002:66AF

my second GPU is NVIDIA 3060Ti (sm_86)
I build my app daily with -Xopenmp-target=nvptx64-nvidia-cuda -march=sm_80.

About sm_80 binary able ot run on sm_86
https://docs.nvidia.com/cuda/ampere-compatibility-guide/index.html#application-compatibility-on-ampere


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106960/new/

https://reviews.llvm.org/D106960

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-30 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Unforuantely I hit error

  #include 
  int main()
  { }

~/opt/llvm-clang/build_mirror_offload_main/bin/clang++ -fopenmp 
-fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa 
-march=gfx906 main.cpp -c
works fine

~/opt/llvm-clang/build_mirror_offload_main/bin/clang++ -fopenmp 
-fopenmp-targets=nvptx64 main.cpp

  $ ~/opt/llvm-clang/build_mirror_offload_main/bin/clang++ -fopenmp 
-fopenmp-targets=nvptx64 
--libomptarget-nvptx-bc-path=/home/yeluo/opt/llvm-clang/build_mirror_offload_main/runtimes/runtimes-bins/openmp/libomptarget/libomptarget-nvptx-sm_80.bc
 main.cpp -c --std=c++14
  clang-14: warning: Unknown CUDA version. cuda.h: CUDA_VERSION=11040. Assuming 
the latest supported version 10.1 [-Wunknown-cuda-version]
  In file included from main.cpp:2:
  In file included from 
/home/yeluo/opt/llvm-clang/build_mirror_offload_main/lib/clang/14.0.0/include/openmp_wrappers/complex:26:
  In file included from 
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/complex:45:
  In file included from 
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/sstream:38:
  In file included from 
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/istream:38:
  In file included from 
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/ios:38:
  In file included from 
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/iosfwd:39:
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:73:30:
 error: no template named 'allocator'
 typename _Alloc = allocator<_CharT> >
   ^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:73:21:
 error: template parameter missing a default argument
 typename _Alloc = allocator<_CharT> >
  ^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:72:48:
 note: previous default template argument defined here
template,
 ^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:79:11:
 error: too few template arguments for class template 'basic_string'
typedef basic_stringstring;   
^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:74:11:
 note: template is declared here
  class basic_string;
^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:83:11:
 error: too few template arguments for class template 'basic_string'
typedef basic_string wstring;   
^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:74:11:
 note: template is declared here
  class basic_string;
^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:93:11:
 error: too few template arguments for class template 'basic_string'
typedef basic_string u16string; 
^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:74:11:
 note: template is declared here
  class basic_string;
^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:96:11:
 error: too few template arguments for class template 'basic_string'
typedef basic_string u32string; 
^
  
/usr/lib/gcc/x86_64-linux-gnu/9/../../../../include/c++/9/bits/stringfwd.h:74:11:
 note: template is declared here
  class basic_string;
^

Manually add  fixes the problem.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104904/new/

https://reviews.llvm.org/D104904

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104904: [OpenMP][AMDGCN] Initial math headers support

2021-07-29 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

how to get this moving?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104904/new/

https://reviews.llvm.org/D104904

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D105191: [Clang][OpenMP] Add support for Static Device Libraries

2021-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Do I must use llvm-ar/ranlib  or system ar/ranlib is OK?

1. existing use case breaks

Use https://github.com/ye-luo/openmp-target/blob/master/tests/math/modf.cpp
$ clang++ -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 
-march=sm_80 modf.cpp  # still OK
$ clang++ -fopenmp -fopenmp-targets=nvptx64 -Xopenmp-target=nvptx64 
-march=sm_80 modf.o 
clang-14: warning: Unknown CUDA version. version.txt: 11.0.228. Assuming the 
latest supported version 10.1 [-Wunknown-cuda-version]
nvlink fatal   : Could not open input file '/tmp/modf-0bf89b.cubin'
clang-14: error: nvlink command failed with exit code 1 (use -v to see 
invocation)

2. could you make my test case working?

https://github.com/ye-luo/openmp-target/tree/master/tests/link_static_fat_bin
both compile-amd.sh and compile.sh doesn't work for me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105191/new/

https://reviews.llvm.org/D105191

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106870: [OpenMP] Multi architecture compilation support

2021-07-27 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D106870#2907257 , @saiislam wrote:

> In D106870#2907252 , @ye-luo wrote:
>
>> `-fopenmp-targets=amdgcn-amd-amdhsa,amdgcn-amd-amdhsa` seems burdensome. 
>> Could you just count how many `-Xopenmp-target=amdgcn-amd-amdhsa` there are 
>> on the comand line and then count the unique ones?
>
> I have a patch in pipeline which will eliminate need of (-fopenmp-targets, 
> -Xopenmp-target, and -march) altogether. User will be able to compile with 
> just "--offload-arch=gfx906" instead of using the other three flags.
> It is working in our downstream AOMP Compiler but I haven't posted a phab 
> review yet.

That is just a convenient option and separate topic. I'm commenting on the 
current generic option you are fiddle with.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106870/new/

https://reviews.llvm.org/D106870

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106870: [OpenMP] Multi architecture compilation support

2021-07-27 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

`-fopenmp-targets=amdgcn-amd-amdhsa,amdgcn-amd-amdhsa` seems burdensome. Could 
you just count how many `-Xopenmp-target=amdgcn-amd-amdhsa` there are on the 
comand line and then count the unique ones?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106870/new/

https://reviews.llvm.org/D106870

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106793: [OpenMP] Add a driver flag to enable the new device runtime library

2021-07-26 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D106793#2904943 , @jhuber6 wrote:

> In D106793#2904661 , @ye-luo wrote:
>
>> Not clear from the summary what is the new driver option.
>
> It's a rewrite of the current device runtime. This option will enable it.

What is the option name? I tried not to read the source code but only the 
summary


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106793/new/

https://reviews.llvm.org/D106793

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106793: [OpenMP] Add a driver flag to enable the new device runtime library

2021-07-26 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Not clear from the summary what is the new driver option.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106793/new/

https://reviews.llvm.org/D106793

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-22 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D96877#2578752 , @tianshilei1992 
wrote:

> In D96877#2578748 , @ye-luo wrote:
>
>> to me this is still desired + cmake creating libomptarget-nvptx-unknown.bc 
>> as a solution for forward compatibility until a clean solution lands.
>
> We’ll have newer version LLVM like 12.1 or 12.01 w/ a *right* solution. I 
> don’t think we need to think that further.

This doesn't help people who needs to run exactly 12.0.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96877/new/

https://reviews.llvm.org/D96877

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-22 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

to me this is still desired + cmake creating libomptarget-nvptx-unknown.bc as a 
solution for forward compatibility until a clean solution lands.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96877/new/

https://reviews.llvm.org/D96877

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

Got it. Copy a file can be tricky. Compile one more can be easily done in cmake.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96877/new/

https://reviews.llvm.org/D96877

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo requested changes to this revision.
ye-luo added a comment.
This revision now requires changes to proceed.

Let user to copy the bc file is not feasible.
Handle this in CMake please.
libomptarget-nvptx-unknown.bc


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96877/new/

https://reviews.llvm.org/D96877

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D96877: [libomptarget] Try a fallback devicertl if the preferred one is missing

2021-02-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Does this patch includes creating 'libomptarget-nvptx-unknown.bc'?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D96877/new/

https://reviews.llvm.org/D96877

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-07 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ABataev wrote:
> ye-luo wrote:
> > ABataev wrote:
> > > ye-luo wrote:
> > > > ABataev wrote:
> > > > > ye-luo wrote:
> > > > > > ye-luo wrote:
> > > > > > > ye-luo wrote:
> > > > > > > > ABataev wrote:
> > > > > > > > > ABataev wrote:
> > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > > > > > > > grokos wrote:
> > > > > > > > > > > > > > > > > > > > > > > What is the current status of the 
> > > > > > > > > > > > > > > > > > > > > > > order of the arguments clang 
> > > > > > > > > > > > > > > > > > > > > > > emits? Is it still necessary to 
> > > > > > > > > > > > > > > > > > > > > > > traverse arguments in reverse 
> > > > > > > > > > > > > > > > > > > > > > > order here?
> > > > > > > > > > > > > > > > > > > > > > Yes, still required
> > > > > > > > > > > > > > > > > > > > > Based on the conversation in
> > > > > > > > > > > > > > > > > > > > > https://reviews.llvm.org/D85216
> > > > > > > > > > > > > > > > > > > > > This line of code neither before nor 
> > > > > > > > > > > > > > > > > > > > > after the change plays well.
> > > > > > > > > > > > > > > > > > > > > Shall we fix the order in 
> > > > > > > > > > > > > > > > > > > > > targetDataEnd first?
> > > > > > > > > > > > > > > > > > > > This change is part of this patch and 
> > > > > > > > > > > > > > > > > > > > cannot be committed separately. 
> > > > > > > > > > > > > > > > > > > I mean could you fix that issue as a 
> > > > > > > > > > > > > > > > > > > parent of this patch?
> > > > > > > > > > > > > > > > > > > This change is part of this patch and 
> > > > > > > > > > > > > > > > > > > cannot be committed separately. 
> > > > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > > > If fixing the reordering is part of this 
> > > > > > > > > > > > > > > > > > patch, I should have seen 
> > > > > > > > > > > > > > > > > > "target_data_function == targetDataEnd ?" 
> > > > > > > > > > > > > > > > > > branches disappear.
> > > > > > > > > > > > > > > > > Nope, just with this patch. It reorders the 
> > > > > > > > > > > > > > > > > maps and need to change the cleanup order 
> > > > > > > > > > > > > > > > > too. 
> > > > > > > > > > > > > > > > It works just like constructors/destructors: 
> > > > > > > > > > > > > > > > allocate in direct order, deallocate in 
> > > > > > > > > > > > > > > > reversed to correctly handle map order. 
> > > > > > > > > > > > > > > The description says that "present and alloc 
> > > > > > > > > > > > > > > mappings are processed first and then all others."
> > > > > > > > > > > > > > > Why the order of arguments in targetDataBegin, 
> > > > > > > > > > > > > > > targetDataEnd and targetDataUpdate all get 
> > > > > > > > > > > > > > > reversed.
> > > > > > > > > > > > > > Because this is for mappers. Mapper maps are 
> > > > > > > > > > > > > > ordered by the compiler in the direct order (alloc, 
> > > > > > > > > > > > > > maps, delete) but when we need to do exit, we need 
> > > > > > > > > > > > > > to release the data in reversed order (deletes, 
> > > > > > > > > > > > > > maps, allocs). 
> > > > > > > > > > > > > I was not making the question clear. My question 
> > > > > > > > > > > > > about "reverse" is not about having a reverse order 
> > > > > > > > > > > > > for targetDataBegin. My question was about 
> > > > > > > > > > > > > "reversing" from the the old code. Your change put 
> > > > > > > > > > > > > the opposite order for targetDataBegin, targetDataEnd 
> > > > > > > > > > > > > and targetDataUpdate cases. 
> > > > > > > > > > > > > I was not making the question clear. My question 
> > > > > > > > > > > > > about "reverse" is not about having a reverse order 
> > > > > > > > > > > > > for targetDataBegin. My question was about 
> > > > > > > > > > > > > "reversing" from the the old code. Your change put 
> > > > > > > > > > > > > the opposite order for targetDataBegin, targetDataEnd 
> > > > > > > > > > > > > and targetDataUpdate cases. 
> > > > > > > > > > > > 
> > > > > > > > > > > > typo correction
> > > > > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > > > > "reverse" is not about having a 

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-07 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ABataev wrote:
> ye-luo wrote:
> > ABataev wrote:
> > > ye-luo wrote:
> > > > ye-luo wrote:
> > > > > ye-luo wrote:
> > > > > > ABataev wrote:
> > > > > > > ABataev wrote:
> > > > > > > > ye-luo wrote:
> > > > > > > > > ye-luo wrote:
> > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > > > > > grokos wrote:
> > > > > > > > > > > > > > > > > > > > > What is the current status of the 
> > > > > > > > > > > > > > > > > > > > > order of the arguments clang emits? 
> > > > > > > > > > > > > > > > > > > > > Is it still necessary to traverse 
> > > > > > > > > > > > > > > > > > > > > arguments in reverse order here?
> > > > > > > > > > > > > > > > > > > > Yes, still required
> > > > > > > > > > > > > > > > > > > Based on the conversation in
> > > > > > > > > > > > > > > > > > > https://reviews.llvm.org/D85216
> > > > > > > > > > > > > > > > > > > This line of code neither before nor 
> > > > > > > > > > > > > > > > > > > after the change plays well.
> > > > > > > > > > > > > > > > > > > Shall we fix the order in targetDataEnd 
> > > > > > > > > > > > > > > > > > > first?
> > > > > > > > > > > > > > > > > > This change is part of this patch and 
> > > > > > > > > > > > > > > > > > cannot be committed separately. 
> > > > > > > > > > > > > > > > > I mean could you fix that issue as a parent 
> > > > > > > > > > > > > > > > > of this patch?
> > > > > > > > > > > > > > > > > This change is part of this patch and cannot 
> > > > > > > > > > > > > > > > > be committed separately. 
> > > > > > > > > > > > > > > > 
> > > > > > > > > > > > > > > > If fixing the reordering is part of this patch, 
> > > > > > > > > > > > > > > > I should have seen "target_data_function == 
> > > > > > > > > > > > > > > > targetDataEnd ?" branches disappear.
> > > > > > > > > > > > > > > Nope, just with this patch. It reorders the maps 
> > > > > > > > > > > > > > > and need to change the cleanup order too. 
> > > > > > > > > > > > > > It works just like constructors/destructors: 
> > > > > > > > > > > > > > allocate in direct order, deallocate in reversed to 
> > > > > > > > > > > > > > correctly handle map order. 
> > > > > > > > > > > > > The description says that "present and alloc mappings 
> > > > > > > > > > > > > are processed first and then all others."
> > > > > > > > > > > > > Why the order of arguments in targetDataBegin, 
> > > > > > > > > > > > > targetDataEnd and targetDataUpdate all get reversed.
> > > > > > > > > > > > Because this is for mappers. Mapper maps are ordered by 
> > > > > > > > > > > > the compiler in the direct order (alloc, maps, delete) 
> > > > > > > > > > > > but when we need to do exit, we need to release the 
> > > > > > > > > > > > data in reversed order (deletes, maps, allocs). 
> > > > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > > > > targetDataBegin. My question was about "reversing" from 
> > > > > > > > > > > the the old code. Your change put the opposite order for 
> > > > > > > > > > > targetDataBegin, targetDataEnd and targetDataUpdate 
> > > > > > > > > > > cases. 
> > > > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > > > > targetDataBegin. My question was about "reversing" from 
> > > > > > > > > > > the the old code. Your change put the opposite order for 
> > > > > > > > > > > targetDataBegin, targetDataEnd and targetDataUpdate 
> > > > > > > > > > > cases. 
> > > > > > > > > > 
> > > > > > > > > > typo correction
> > > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > > > **targetDataEnd**. My question was about "reversing" from 
> > > > > > > > > > the the old code. Your change put the opposite order for 
> > > > > > > > > > targetDataBegin, targetDataEnd and targetDataUpdate cases.
> > > > > > > > > My separate question specifically for targetDataEnd is the 
> > > > > > > > > following.
> > > > > > > > > 
> > > > > > > > > In target(), we call
> > > > > > > > > ```
> > > > > > > > > targetDataBegin(args)
> > > > > > > > 

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-07 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ABataev wrote:
> ye-luo wrote:
> > ye-luo wrote:
> > > ye-luo wrote:
> > > > ABataev wrote:
> > > > > ABataev wrote:
> > > > > > ye-luo wrote:
> > > > > > > ye-luo wrote:
> > > > > > > > ye-luo wrote:
> > > > > > > > > ABataev wrote:
> > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > > > > > grokos wrote:
> > > > > > > > > > > > > > > > > > > What is the current status of the order 
> > > > > > > > > > > > > > > > > > > of the arguments clang emits? Is it still 
> > > > > > > > > > > > > > > > > > > necessary to traverse arguments in 
> > > > > > > > > > > > > > > > > > > reverse order here?
> > > > > > > > > > > > > > > > > > Yes, still required
> > > > > > > > > > > > > > > > > Based on the conversation in
> > > > > > > > > > > > > > > > > https://reviews.llvm.org/D85216
> > > > > > > > > > > > > > > > > This line of code neither before nor after 
> > > > > > > > > > > > > > > > > the change plays well.
> > > > > > > > > > > > > > > > > Shall we fix the order in targetDataEnd first?
> > > > > > > > > > > > > > > > This change is part of this patch and cannot be 
> > > > > > > > > > > > > > > > committed separately. 
> > > > > > > > > > > > > > > I mean could you fix that issue as a parent of 
> > > > > > > > > > > > > > > this patch?
> > > > > > > > > > > > > > > This change is part of this patch and cannot be 
> > > > > > > > > > > > > > > committed separately. 
> > > > > > > > > > > > > > 
> > > > > > > > > > > > > > If fixing the reordering is part of this patch, I 
> > > > > > > > > > > > > > should have seen "target_data_function == 
> > > > > > > > > > > > > > targetDataEnd ?" branches disappear.
> > > > > > > > > > > > > Nope, just with this patch. It reorders the maps and 
> > > > > > > > > > > > > need to change the cleanup order too. 
> > > > > > > > > > > > It works just like constructors/destructors: allocate 
> > > > > > > > > > > > in direct order, deallocate in reversed to correctly 
> > > > > > > > > > > > handle map order. 
> > > > > > > > > > > The description says that "present and alloc mappings are 
> > > > > > > > > > > processed first and then all others."
> > > > > > > > > > > Why the order of arguments in targetDataBegin, 
> > > > > > > > > > > targetDataEnd and targetDataUpdate all get reversed.
> > > > > > > > > > Because this is for mappers. Mapper maps are ordered by the 
> > > > > > > > > > compiler in the direct order (alloc, maps, delete) but when 
> > > > > > > > > > we need to do exit, we need to release the data in reversed 
> > > > > > > > > > order (deletes, maps, allocs). 
> > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > > targetDataBegin. My question was about "reversing" from the 
> > > > > > > > > the old code. Your change put the opposite order for 
> > > > > > > > > targetDataBegin, targetDataEnd and targetDataUpdate cases. 
> > > > > > > > > I was not making the question clear. My question about 
> > > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > > targetDataBegin. My question was about "reversing" from the 
> > > > > > > > > the old code. Your change put the opposite order for 
> > > > > > > > > targetDataBegin, targetDataEnd and targetDataUpdate cases. 
> > > > > > > > 
> > > > > > > > typo correction
> > > > > > > > I was not making the question clear. My question about 
> > > > > > > > "reverse" is not about having a reverse order for 
> > > > > > > > **targetDataEnd**. My question was about "reversing" from the 
> > > > > > > > the old code. Your change put the opposite order for 
> > > > > > > > targetDataBegin, targetDataEnd and targetDataUpdate cases.
> > > > > > > My separate question specifically for targetDataEnd is the 
> > > > > > > following.
> > > > > > > 
> > > > > > > In target(), we call
> > > > > > > ```
> > > > > > > targetDataBegin(args)
> > > > > > > {  // forward order
> > > > > > >   for (int32_t i = 0; i < arg_num; ++i) { ... }
> > > > > > > }
> > > > > > > launch_kernels
> > > > > > > targetDataEnd(args)
> > > > > > > {  // reverse order
> > > > > > >   for (int32_t I = ArgNum - 1; I >= 0; --I) { }
> > > > > > > }
> > > > > > > ```
> > > > > > > 
> > > > > > > At a mapper,
> > > > > > > ```
> > > > > > > targetDataMapper
> > > > > > > {
> > > > > > >   // 

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ABataev wrote:
> ABataev wrote:
> > ye-luo wrote:
> > > ye-luo wrote:
> > > > ye-luo wrote:
> > > > > ABataev wrote:
> > > > > > ye-luo wrote:
> > > > > > > ABataev wrote:
> > > > > > > > ABataev wrote:
> > > > > > > > > ye-luo wrote:
> > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > ye-luo wrote:
> > > > > > > > > > > > > ABataev wrote:
> > > > > > > > > > > > > > grokos wrote:
> > > > > > > > > > > > > > > What is the current status of the order of the 
> > > > > > > > > > > > > > > arguments clang emits? Is it still necessary to 
> > > > > > > > > > > > > > > traverse arguments in reverse order here?
> > > > > > > > > > > > > > Yes, still required
> > > > > > > > > > > > > Based on the conversation in
> > > > > > > > > > > > > https://reviews.llvm.org/D85216
> > > > > > > > > > > > > This line of code neither before nor after the change 
> > > > > > > > > > > > > plays well.
> > > > > > > > > > > > > Shall we fix the order in targetDataEnd first?
> > > > > > > > > > > > This change is part of this patch and cannot be 
> > > > > > > > > > > > committed separately. 
> > > > > > > > > > > I mean could you fix that issue as a parent of this patch?
> > > > > > > > > > > This change is part of this patch and cannot be committed 
> > > > > > > > > > > separately. 
> > > > > > > > > > 
> > > > > > > > > > If fixing the reordering is part of this patch, I should 
> > > > > > > > > > have seen "target_data_function == targetDataEnd ?" 
> > > > > > > > > > branches disappear.
> > > > > > > > > Nope, just with this patch. It reorders the maps and need to 
> > > > > > > > > change the cleanup order too. 
> > > > > > > > It works just like constructors/destructors: allocate in direct 
> > > > > > > > order, deallocate in reversed to correctly handle map order. 
> > > > > > > The description says that "present and alloc mappings are 
> > > > > > > processed first and then all others."
> > > > > > > Why the order of arguments in targetDataBegin, targetDataEnd and 
> > > > > > > targetDataUpdate all get reversed.
> > > > > > Because this is for mappers. Mapper maps are ordered by the 
> > > > > > compiler in the direct order (alloc, maps, delete) but when we need 
> > > > > > to do exit, we need to release the data in reversed order (deletes, 
> > > > > > maps, allocs). 
> > > > > I was not making the question clear. My question about "reverse" is 
> > > > > not about having a reverse order for targetDataBegin. My question was 
> > > > > about "reversing" from the the old code. Your change put the opposite 
> > > > > order for targetDataBegin, targetDataEnd and targetDataUpdate cases. 
> > > > > I was not making the question clear. My question about "reverse" is 
> > > > > not about having a reverse order for targetDataBegin. My question was 
> > > > > about "reversing" from the the old code. Your change put the opposite 
> > > > > order for targetDataBegin, targetDataEnd and targetDataUpdate cases. 
> > > > 
> > > > typo correction
> > > > I was not making the question clear. My question about "reverse" is not 
> > > > about having a reverse order for **targetDataEnd**. My question was 
> > > > about "reversing" from the the old code. Your change put the opposite 
> > > > order for targetDataBegin, targetDataEnd and targetDataUpdate cases.
> > > My separate question specifically for targetDataEnd is the following.
> > > 
> > > In target(), we call
> > > ```
> > > targetDataBegin(args)
> > > {  // forward order
> > >   for (int32_t i = 0; i < arg_num; ++i) { ... }
> > > }
> > > launch_kernels
> > > targetDataEnd(args)
> > > {  // reverse order
> > >   for (int32_t I = ArgNum - 1; I >= 0; --I) { }
> > > }
> > > ```
> > > 
> > > At a mapper,
> > > ```
> > > targetDataMapper
> > > {
> > >   // generate args_reverse in reverse order for targetDataEnd
> > >   targetDataEnd(args_reverse)
> > > }
> > > ```
> > > Are we actually getting the original forward order due to one reverse in 
> > > targetDataMapper and second reverse in targetDataEnd? Is this the desired 
> > > behavior? This part confused me. Do I miss something? Could you explain a 
> > > bit?
> > Yes, something like this. targetDataEnd reverses the order of mapping 
> > arrays. But mapper generator always generates mapping arrays in the direct 
> > order (it fills mapping arrays that later processed by the targetDataEnd 
> > function). We could fix this by passing extra Boolean flag to the generator 
> > function but it means the redesign of the mappers. That's why we have to 
> > reverse it in the libomptarget.
> You can check it yourself. Apply the  patch, restore the original 

[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ye-luo wrote:
> ye-luo wrote:
> > ABataev wrote:
> > > ye-luo wrote:
> > > > ABataev wrote:
> > > > > ABataev wrote:
> > > > > > ye-luo wrote:
> > > > > > > ye-luo wrote:
> > > > > > > > ABataev wrote:
> > > > > > > > > ye-luo wrote:
> > > > > > > > > > ABataev wrote:
> > > > > > > > > > > grokos wrote:
> > > > > > > > > > > > What is the current status of the order of the 
> > > > > > > > > > > > arguments clang emits? Is it still necessary to 
> > > > > > > > > > > > traverse arguments in reverse order here?
> > > > > > > > > > > Yes, still required
> > > > > > > > > > Based on the conversation in
> > > > > > > > > > https://reviews.llvm.org/D85216
> > > > > > > > > > This line of code neither before nor after the change plays 
> > > > > > > > > > well.
> > > > > > > > > > Shall we fix the order in targetDataEnd first?
> > > > > > > > > This change is part of this patch and cannot be committed 
> > > > > > > > > separately. 
> > > > > > > > I mean could you fix that issue as a parent of this patch?
> > > > > > > > This change is part of this patch and cannot be committed 
> > > > > > > > separately. 
> > > > > > > 
> > > > > > > If fixing the reordering is part of this patch, I should have 
> > > > > > > seen "target_data_function == targetDataEnd ?" branches disappear.
> > > > > > Nope, just with this patch. It reorders the maps and need to change 
> > > > > > the cleanup order too. 
> > > > > It works just like constructors/destructors: allocate in direct 
> > > > > order, deallocate in reversed to correctly handle map order. 
> > > > The description says that "present and alloc mappings are processed 
> > > > first and then all others."
> > > > Why the order of arguments in targetDataBegin, targetDataEnd and 
> > > > targetDataUpdate all get reversed.
> > > Because this is for mappers. Mapper maps are ordered by the compiler in 
> > > the direct order (alloc, maps, delete) but when we need to do exit, we 
> > > need to release the data in reversed order (deletes, maps, allocs). 
> > I was not making the question clear. My question about "reverse" is not 
> > about having a reverse order for targetDataBegin. My question was about 
> > "reversing" from the the old code. Your change put the opposite order for 
> > targetDataBegin, targetDataEnd and targetDataUpdate cases. 
> > I was not making the question clear. My question about "reverse" is not 
> > about having a reverse order for targetDataBegin. My question was about 
> > "reversing" from the the old code. Your change put the opposite order for 
> > targetDataBegin, targetDataEnd and targetDataUpdate cases. 
> 
> typo correction
> I was not making the question clear. My question about "reverse" is not about 
> having a reverse order for **targetDataEnd**. My question was about 
> "reversing" from the the old code. Your change put the opposite order for 
> targetDataBegin, targetDataEnd and targetDataUpdate cases.
My separate question specifically for targetDataEnd is the following.

In target(), we call
```
targetDataBegin(args)
{  // forward order
  for (int32_t i = 0; i < arg_num; ++i) { ... }
}
launch_kernels
targetDataEnd(args)
{  // reverse order
  for (int32_t I = ArgNum - 1; I >= 0; --I) { }
}
```

At a mapper,
```
targetDataMapper
{
  // generate args_reverse in reverse order for targetDataEnd
  targetDataEnd(args_reverse)
}
```
Are we actually getting the original forward order due to one reverse in 
targetDataMapper and second reverse in targetDataEnd? Is this the desired 
behavior? This part confused me. Do I miss something? Could you explain a bit?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ye-luo wrote:
> ABataev wrote:
> > ye-luo wrote:
> > > ABataev wrote:
> > > > ABataev wrote:
> > > > > ye-luo wrote:
> > > > > > ye-luo wrote:
> > > > > > > ABataev wrote:
> > > > > > > > ye-luo wrote:
> > > > > > > > > ABataev wrote:
> > > > > > > > > > grokos wrote:
> > > > > > > > > > > What is the current status of the order of the arguments 
> > > > > > > > > > > clang emits? Is it still necessary to traverse arguments 
> > > > > > > > > > > in reverse order here?
> > > > > > > > > > Yes, still required
> > > > > > > > > Based on the conversation in
> > > > > > > > > https://reviews.llvm.org/D85216
> > > > > > > > > This line of code neither before nor after the change plays 
> > > > > > > > > well.
> > > > > > > > > Shall we fix the order in targetDataEnd first?
> > > > > > > > This change is part of this patch and cannot be committed 
> > > > > > > > separately. 
> > > > > > > I mean could you fix that issue as a parent of this patch?
> > > > > > > This change is part of this patch and cannot be committed 
> > > > > > > separately. 
> > > > > > 
> > > > > > If fixing the reordering is part of this patch, I should have seen 
> > > > > > "target_data_function == targetDataEnd ?" branches disappear.
> > > > > Nope, just with this patch. It reorders the maps and need to change 
> > > > > the cleanup order too. 
> > > > It works just like constructors/destructors: allocate in direct order, 
> > > > deallocate in reversed to correctly handle map order. 
> > > The description says that "present and alloc mappings are processed first 
> > > and then all others."
> > > Why the order of arguments in targetDataBegin, targetDataEnd and 
> > > targetDataUpdate all get reversed.
> > Because this is for mappers. Mapper maps are ordered by the compiler in the 
> > direct order (alloc, maps, delete) but when we need to do exit, we need to 
> > release the data in reversed order (deletes, maps, allocs). 
> I was not making the question clear. My question about "reverse" is not about 
> having a reverse order for targetDataBegin. My question was about "reversing" 
> from the the old code. Your change put the opposite order for 
> targetDataBegin, targetDataEnd and targetDataUpdate cases. 
> I was not making the question clear. My question about "reverse" is not about 
> having a reverse order for targetDataBegin. My question was about "reversing" 
> from the the old code. Your change put the opposite order for 
> targetDataBegin, targetDataEnd and targetDataUpdate cases. 

typo correction
I was not making the question clear. My question about "reverse" is not about 
having a reverse order for **targetDataEnd**. My question was about "reversing" 
from the the old code. Your change put the opposite order for targetDataBegin, 
targetDataEnd and targetDataUpdate cases.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ABataev wrote:
> ye-luo wrote:
> > ABataev wrote:
> > > ABataev wrote:
> > > > ye-luo wrote:
> > > > > ye-luo wrote:
> > > > > > ABataev wrote:
> > > > > > > ye-luo wrote:
> > > > > > > > ABataev wrote:
> > > > > > > > > grokos wrote:
> > > > > > > > > > What is the current status of the order of the arguments 
> > > > > > > > > > clang emits? Is it still necessary to traverse arguments in 
> > > > > > > > > > reverse order here?
> > > > > > > > > Yes, still required
> > > > > > > > Based on the conversation in
> > > > > > > > https://reviews.llvm.org/D85216
> > > > > > > > This line of code neither before nor after the change plays 
> > > > > > > > well.
> > > > > > > > Shall we fix the order in targetDataEnd first?
> > > > > > > This change is part of this patch and cannot be committed 
> > > > > > > separately. 
> > > > > > I mean could you fix that issue as a parent of this patch?
> > > > > > This change is part of this patch and cannot be committed 
> > > > > > separately. 
> > > > > 
> > > > > If fixing the reordering is part of this patch, I should have seen 
> > > > > "target_data_function == targetDataEnd ?" branches disappear.
> > > > Nope, just with this patch. It reorders the maps and need to change the 
> > > > cleanup order too. 
> > > It works just like constructors/destructors: allocate in direct order, 
> > > deallocate in reversed to correctly handle map order. 
> > The description says that "present and alloc mappings are processed first 
> > and then all others."
> > Why the order of arguments in targetDataBegin, targetDataEnd and 
> > targetDataUpdate all get reversed.
> Because this is for mappers. Mapper maps are ordered by the compiler in the 
> direct order (alloc, maps, delete) but when we need to do exit, we need to 
> release the data in reversed order (deletes, maps, allocs). 
I was not making the question clear. My question about "reverse" is not about 
having a reverse order for targetDataBegin. My question was about "reversing" 
from the the old code. Your change put the opposite order for targetDataBegin, 
targetDataEnd and targetDataUpdate cases. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ABataev wrote:
> ABataev wrote:
> > ye-luo wrote:
> > > ye-luo wrote:
> > > > ABataev wrote:
> > > > > ye-luo wrote:
> > > > > > ABataev wrote:
> > > > > > > grokos wrote:
> > > > > > > > What is the current status of the order of the arguments clang 
> > > > > > > > emits? Is it still necessary to traverse arguments in reverse 
> > > > > > > > order here?
> > > > > > > Yes, still required
> > > > > > Based on the conversation in
> > > > > > https://reviews.llvm.org/D85216
> > > > > > This line of code neither before nor after the change plays well.
> > > > > > Shall we fix the order in targetDataEnd first?
> > > > > This change is part of this patch and cannot be committed separately. 
> > > > I mean could you fix that issue as a parent of this patch?
> > > > This change is part of this patch and cannot be committed separately. 
> > > 
> > > If fixing the reordering is part of this patch, I should have seen 
> > > "target_data_function == targetDataEnd ?" branches disappear.
> > Nope, just with this patch. It reorders the maps and need to change the 
> > cleanup order too. 
> It works just like constructors/destructors: allocate in direct order, 
> deallocate in reversed to correctly handle map order. 
The description says that "present and alloc mappings are processed first and 
then all others."
Why the order of arguments in targetDataBegin, targetDataEnd and 
targetDataUpdate all get reversed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ye-luo wrote:
> ABataev wrote:
> > ye-luo wrote:
> > > ABataev wrote:
> > > > grokos wrote:
> > > > > What is the current status of the order of the arguments clang emits? 
> > > > > Is it still necessary to traverse arguments in reverse order here?
> > > > Yes, still required
> > > Based on the conversation in
> > > https://reviews.llvm.org/D85216
> > > This line of code neither before nor after the change plays well.
> > > Shall we fix the order in targetDataEnd first?
> > This change is part of this patch and cannot be committed separately. 
> I mean could you fix that issue as a parent of this patch?
> This change is part of this patch and cannot be committed separately. 

If fixing the reordering is part of this patch, I should have seen 
"target_data_function == targetDataEnd ?" branches disappear.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ABataev wrote:
> ye-luo wrote:
> > ABataev wrote:
> > > grokos wrote:
> > > > What is the current status of the order of the arguments clang emits? 
> > > > Is it still necessary to traverse arguments in reverse order here?
> > > Yes, still required
> > Based on the conversation in
> > https://reviews.llvm.org/D85216
> > This line of code neither before nor after the change plays well.
> > Shall we fix the order in targetDataEnd first?
> This change is part of this patch and cannot be committed separately. 
I mean could you fix that issue as a parent of this patch?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-12-04 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/src/omptarget.cpp:233
 MapperComponents
-.Components[target_data_function == targetDataEnd ? I : E - I - 1];
+.Components[target_data_function == targetDataEnd ? E - I - 1 : I];
 MapperArgsBase[I] = C.Base;

ABataev wrote:
> grokos wrote:
> > What is the current status of the order of the arguments clang emits? Is it 
> > still necessary to traverse arguments in reverse order here?
> Yes, still required
Based on the conversation in
https://reviews.llvm.org/D85216
This line of code neither before nor after the change plays well.
Shall we fix the order in targetDataEnd first?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D80743: (PR46111) Properly handle elaborated types in an implicit deduction guide

2020-11-27 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

This patch caused severe regression in Clang 11.
https://bugs.llvm.org/show_bug.cgi?id=48177


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D80743/new/

https://reviews.llvm.org/D80743

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86119: [OPENMP50]Allow overlapping mapping in target constrcuts.

2020-11-18 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

1. Could you separate the reordering related changes to separate patch?
2. Could you mention which line in spec 4.5 was the restriction? Even 5.0/5.1 
has some restrictions. Need to be clear which one you refer to.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86119/new/

https://reviews.llvm.org/D86119

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89844: [Clang][OpenMP][WIP] Fixed an issue of segment fault when using target nowait

2020-10-21 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

Getting this even when compiling without offload. You can use the reproducer 
from the original bug report.

  clang++: 
/home/yeluo/opt/llvm-clang/llvm-project/llvm/include/llvm/ADT/APInt.h:1151: 
bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == 
RHS.BitWidth && "Comparison requires equal bit widths"' failed.
  PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash 
backtrace, preprocessed source, and associated run script.
  Stack dump:
  0.Program arguments: /home/packages/llvm/master-patched/bin/clang++ 
-DADD_ -DH5_USE_16_API -DHAVE_CONFIG_H -Drestrict=__restrict__ 
-I/home/yeluo/opt/miniqmc/src 
-I/home/yeluo/opt/miniqmc/build_clang_offlaod_nowait/src -fopenmp 
-fomit-frame-pointer -fstrict-aliasing -D__forceinline=inline -march=native -O3 
-DNDEBUG -ffast-math -std=c++11 -o 
CMakeFiles/qmcutil.dir/Utilities/tinyxml/tinyxml2.cpp.o -c 
/home/yeluo/opt/miniqmc/src/Utilities/tinyxml/tinyxml2.cpp 
  1. parser at end of file
  2.Per-module optimization passes
  3.Running pass 'CallGraph Pass Manager' on module 
'/home/yeluo/opt/miniqmc/src/Utilities/tinyxml/tinyxml2.cpp'.
  4.Running pass 'Combine redundant instructions' on function 
'@_ZN8tinyxml27XMLUtil10IsNameCharEh'
   #0 0x01ecc523 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) 
(/home/packages/llvm/master-patched/bin/clang+++0x1ecc523)
   #1 0x01eca25e llvm::sys::RunSignalHandlers() 
(/home/packages/llvm/master-patched/bin/clang+++0x1eca25e)
   #2 0x01ecb8cd llvm::sys::CleanupOnSignal(unsigned long) 
(/home/packages/llvm/master-patched/bin/clang+++0x1ecb8cd)
   #3 0x01e513b3 (anonymous 
namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) 
(/home/packages/llvm/master-patched/bin/clang+++0x1e513b3)
   #4 0x01e514ee CrashRecoverySignalHandler(int) 
(/home/packages/llvm/master-patched/bin/clang+++0x1e514ee)
   #5 0x7f18f56923c0 __restore_rt 
(/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
   #6 0x7f18f512718b raise 
/build/glibc-ZN95T4/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
   #7 0x7f18f5106859 abort 
/build/glibc-ZN95T4/glibc-2.31/stdlib/abort.c:81:7
   #8 0x7f18f5106729 get_sysdep_segment_value 
/build/glibc-ZN95T4/glibc-2.31/intl/loadmsgcat.c:509:8
   #9 0x7f18f5106729 _nl_load_domain 
/build/glibc-ZN95T4/glibc-2.31/intl/loadmsgcat.c:970:34
  #10 0x7f18f5117f36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
  #11 0x019f4c00 llvm::InstCombinerImpl::foldOrOfICmps(llvm::ICmpInst*, 
llvm::ICmpInst*, llvm::BinaryOperator&) 
(/home/packages/llvm/master-patched/bin/clang+++0x19f4c00)
  #12 0x019fb023 llvm::InstCombinerImpl::visitOr(llvm::BinaryOperator&) 
(/home/packages/llvm/master-patched/bin/clang+++0x19fb023)
  #13 0x019d354c llvm::InstCombinerImpl::run() 
(/home/packages/llvm/master-patched/bin/clang+++0x19d354c)
  #14 0x019d5788 combineInstructionsOverFunction(llvm::Function&, 
llvm::InstCombineWorklist&, llvm::AAResults*, llvm::AssumptionCache&, 
llvm::TargetLibraryInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, 
llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, 
llvm::ProfileSummaryInfo*, unsigned int, llvm::LoopInfo*) 
(/home/packages/llvm/master-patched/bin/clang+++0x19d5788)
  #15 0x019d70b1 
llvm::InstructionCombiningPass::runOnFunction(llvm::Function&) 
(/home/packages/llvm/master-patched/bin/clang+++0x19d70b1)
  #16 0x017c7a68 llvm::FPPassManager::runOnFunction(llvm::Function&) 
(/home/packages/llvm/master-patched/bin/clang+++0x17c7a68)
  #17 0x010d0033 (anonymous 
namespace)::CGPassManager::runOnModule(llvm::Module&) 
(/home/packages/llvm/master-patched/bin/clang+++0x10d0033)
  #18 0x017c8117 llvm::legacy::PassManagerImpl::run(llvm::Module&) 
(/home/packages/llvm/master-patched/bin/clang+++0x17c8117)
  #19 0x020fed4a clang::EmitBackendOutput(clang::DiagnosticsEngine&, 
clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, 
clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout 
const&, llvm::Module*, clang::BackendAction, 
std::unique_ptr >) 
(/home/packages/llvm/master-patched/bin/clang+++0x20fed4a)
  #20 0x02d29c9c 
clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) 
(/home/packages/llvm/master-patched/bin/clang+++0x2d29c9c)
  #21 0x037e77e3 clang::ParseAST(clang::Sema&, bool, bool) 
(/home/packages/llvm/master-patched/bin/clang+++0x37e77e3)
  #22 0x026dc383 clang::FrontendAction::Execute() 
(/home/packages/llvm/master-patched/bin/clang+++0x26dc383)
  #23 0x0266e4f2 
clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) 
(/home/packages/llvm/master-patched/bin/clang+++0x266e4f2)
  #24 0x02789bb2 
clang::ExecuteCompilerInvocation(clang::CompilerInstance*) 
(/home/packages/llvm/master-patched/bin/clang+++0x2789bb2)
  #25 0x00a4568c cc1_main(llvm::ArrayRef, 

[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-07 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:80
+  else()
+list(APPEND compute_capabilities ${CMAKE_MATCH_1})
+  endif()

jhuber6 wrote:
> ye-luo wrote:
> > 1. Doesn't work right now. Missing comma ",${CMAKE_MATCH_1}"
> > 2. using CUDA_ARCH as "string(REGEX MATCH" output causes problems.
> > 3. "append" needs to protect redundant 35.
> > 4. I think it is better to move this part of logic to 
> > `openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake`
> > after `find_package(CUDA QUIET)`.
> 1. I noticed the comma problem, it's because the compute capabilities uses 
> commas instead of semicolons like the rest of CMake to delimit the values. 
> There's another weird error I'm getting while trying to build with this where 
> it will add `sm_70` as an argument causing an nvcc error. like in `nvcc -o 
> out file.cpp sm_70`.
> 2. What's the problem here? Pretty much the output gives some string that you 
> would pass to nvcc like `--arch=sm_70` and I'm regexing out the `sm_70` if 
> there was no match or the architecture is too small it doesn't add anything.
> 3. I'm thinking it would just be easiest to change it do something like 
> `set(default_capabilities "35,${CMAKE_MATCH_1}")`
> 4. My feeling is that there's not enough complexity here to justify moving it 
> since I'd need to add just as much code to generate the output and then even 
> more to use it here. Since the LibomptargetGetDependencies.cmake doesn't 
> bother checking whether or not the `find_package(CUDA)` was successful I feel 
> like there's no need to add logic that requires checking if it was there when 
> we already have it here.
3. better than append but still need a check to avoid "35,35"
4. 
https://github.com/ye-luo/llvm-project/commit/ac5f20f9770e894ff48467a9317ec0649f5c7562
libomptarget part should be fulling working.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:92
   foreach(sm ${nvptx_sm_list})
 set(CUDA_ARCH ${CUDA_ARCH} -gencode arch=compute_${sm},code=sm_${sm})
   endforeach()

my point 2 refers to here CUDA_ARCH which gets into the compile line, your 
point 1 issue. rename your output variable ot CUDA_ARCH_MATCH_OUTPUT should 
solve the problem.

I still think it is better to move default_capabilities. Very natural to have 
cuda_select_nvcc_arch_flags next to find_package(cuda) in one place.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo requested changes to this revision.
ye-luo added inline comments.
This revision now requires changes to proceed.



Comment at: openmp/libomptarget/deviceRTLs/nvptx/CMakeLists.txt:80
+  else()
+list(APPEND compute_capabilities ${CMAKE_MATCH_1})
+  endif()

1. Doesn't work right now. Missing comma ",${CMAKE_MATCH_1}"
2. using CUDA_ARCH as "string(REGEX MATCH" output causes problems.
3. "append" needs to protect redundant 35.
4. I think it is better to move this part of logic to 
`openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake`
after `find_package(CUDA QUIET)`.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D88929#2315640 , @JonChesterfield 
wrote:

> An alternative approach is to build the deviceRTL for multiple cuda versions 
> and then pick whichever one is the best fit when compiling application code. 
> That has advantages when building the deviceRTL libraries on a different 
> machine to the one that intends to use it.
>
> Cmake isn't my thing, but I see that my trunk build only has 
> libomptarget-nvptx-sm_35.bc when the local card is a sm_50. The downstream 
> amd toolchain builds lots of this library, my install dir has fifteen of them 
> (including sm_50).

You can build multiple deviceRTL today with 
LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITIES=50,61,70. This patch tries to add the 
high arch automatically.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D88929#2315538 , @jhuber6 wrote:

> In D88929#2315519 , @ye-luo wrote:
>
>> Probably not messing with `enable_language(CUDA)` at the moment, just add 
>> `cuda_select_nvcc_arch_flags(CUDA_ARCH_FLAGS)` to  
>> `openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake?
>
> That only controls loading the library, since this is where we set all the 
> CUDA options I think it's fine to call it here.

Yes. I'm reducing my review to libomptarget only. I could not comment on 
clang/CMakeLists.txt. Regarding libomptarget, keep all the CUDA detection 
inside `LibomptargetGetDependencies.cmake`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D88929#2315513 , @jhuber6 wrote:

> In D88929#2315451 , @ye-luo wrote:
>
>> I just realized that this patch affects clang and libomptarget.
>> I cannot comment on clang. Regarding libomptarget, Could you explain why the 
>> detection is not put together with other cuda stuff in 
>> `openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake`
>
> If we're sticking with using FindCUDA it's definitely redundant here since it 
> was already called by the time we get here. The support for CUDA language 
> would use the same method but have `enable_language(CUDA)` somewhere instead 
> of `find_package(CUDA)`

Probably not messing with `enable_language(CUDA)` at the moment, just add 
`cuda_select_nvcc_arch_flags(CUDA_ARCH_FLAGS)` to  
`openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

I just realized that this patch affects clang and libomptarget.
I cannot comment on clang. Regarding libomptarget, Could you explain why the 
detection is not put together with other cuda stuff in 
`openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

3.18 introduces CMAKE_CUDA_ARCHITECTURES. Does 3.18 supports detection? If we 
know a new way works since 3.18, I think putting both with if-else makes sense.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

The link I posted indicated that independent feature is merged since 3.12. 
Better to avoid deprecated stuff when introducing new cmake lines even though 
some existing lines may rely on deprecated cmake.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88929: [OpenMP] Change CMake Configuration to Build for Highest CUDA Architecture by Default

2020-10-06 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

FindCUDA has been deprecated.
Please explore the following feature without directly calling FindCUDA.
https://gitlab.kitware.com/cmake/cmake/-/merge_requests/1856


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88929/new/

https://reviews.llvm.org/D88929

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D88384: [OpenMP][FIX] Verify compatible types for declare variant calls

2020-09-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

The minimal reproducer and full app work now.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D88384/new/

https://reviews.llvm.org/D88384

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78075: [WIP][Clang][OpenMP] Added support for nowait target in CodeGen

2020-09-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D78075#2272474 , @tianshilei1992 
wrote:

> In D78075#2272398 , @ye-luo wrote:
>
>>> However, OpenMP task has a problem that it must be within
>>> to a parallel region; otherwise the task will be executed immediately. As a
>>> result, if we directly wrap to a regular task, the nowait target outside of 
>>> a
>>> parallel region is still a synchronous version.
>>
>> The spec says an implicit task can be generated by an implicit parallel 
>> region which can be the whole OpenMP program. For this reason, the need of 
>> explicit parallel region is a limitation of the llvm OpenMP runtime, right?
>>
>> Can I have an option to run the nowait region as a regular task instead of 
>> an unshackled task? So I can use "parallel" and well established ways to 
>> control the thread affinity.
>
> According to the spec, an implicit parallel region is an inactive parallel 
> region that is not generated from a parallel construct. And based on the 
> definition of active parallel region, which is a parallel region that is 
> executed by a team consisting of more than one thread, an inactive parallel 
> region only has one thread. Since we only have one thread, if we encounter a 
> task, executing it immediately does make sense as we don't have another 
> thread to execute it.

If I remember correctly, you may yield the thread inside a target region after 
enqueuing kernels and transfers. So even with 1 thread, there is chance to run 
other tasks without finishing this target. Isn't that possible?

> I do remember your request about the regular task. This patch is exactly what 
> you need. Later when I finish the RTL, I could provide an option.

Thanks. I see, we will be able to control that in the runtime library.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78075/new/

https://reviews.llvm.org/D78075

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D78075: [WIP][Clang][OpenMP] Added support for nowait target in CodeGen

2020-09-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

> However, OpenMP task has a problem that it must be within
> to a parallel region; otherwise the task will be executed immediately. As a
> result, if we directly wrap to a regular task, the nowait target outside of a
> parallel region is still a synchronous version.

The spec says an implicit task can be generated by an implicit parallel region 
which can be the whole OpenMP program. For this reason, the need of explicit 
parallel region is a limitation of the llvm OpenMP runtime, right?

Can I have an option to run the nowait region as a regular task instead of an 
unshackled task? So I can use "parallel" and well established ways to control 
the thread affinity.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D78075/new/

https://reviews.llvm.org/D78075

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.

2020-07-29 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

LGTM. My applications run as expected now. PR46824, PR46012, PR46868 all work 
fine.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84767/new/

https://reviews.llvm.org/D84767

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.

2020-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D84767#2180280 , @ye-luo wrote:

> This patch
> GPU activities:   96.99%  350.05ms10  35.005ms  1.5680us  350.00ms  
> [CUDA memcpy HtoD]
> before the July21 change
> GPU activities:   95.33%  20.317ms 4  5.0793ms  1.6000us  20.305ms  
> [CUDA memcpy HtoD]
> Still more transfer than it should.

@ABataev could you have a look. My July 21 compiler was built before 
"[OPENMP]Fix PR46012: declare target pointer cannot be accessed in target 
region." gets in.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84767/new/

https://reviews.llvm.org/D84767

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.

2020-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

This patch
GPU activities:   96.99%  350.05ms10  35.005ms  1.5680us  350.00ms  
[CUDA memcpy HtoD]
before the July21 change
GPU activities:   95.33%  20.317ms 4  5.0793ms  1.6000us  20.305ms  
[CUDA memcpy HtoD]
Still more transfer than it should.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84767/new/

https://reviews.llvm.org/D84767

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region.

2020-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo requested changes to this revision.
ye-luo added a comment.
This revision now requires changes to proceed.

Please check the reproducer in https://bugs.llvm.org/show_bug.cgi?id=46868 with 
LIBOMPTARGET_DEBUG=1.
The reference counting on the base pointer variable has side effects. It was 
not cleaned up when these variables leave its scope.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84767/new/

https://reviews.llvm.org/D84767

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D84182: [OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region.

2020-07-28 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

In D84182#2173578 , @grokos wrote:

> @ABataev:
>
> After this patch was committed, I tried to run the following example:
>
>   #include 
>   
>   int *yptr;
>   
>   int main() {
> int y[10];
> y[1] = 1;
> yptr = [0];
>   
> printf(" = %p\n", );
> printf("[0] = %p\n", [0]);
>   
> #pragma omp target data map(to: yptr[0:5])
> #pragma omp target
> {
>   printf("y = %d\n", yptr[1]);
>   yptr[1] = 10;
>   printf("y = %d\n", yptr[1]);
> }
>   
> printf("y = %d\n", yptr[1]);
> return 0;
>   }
>
> The arguments clang generates are:
>
>   1) base = [0], begin = , size = 8, type = TARGET_PARAM | TO
>   2) base = , begin = [0], size = 8, type = PTR_AND_OBJ | TO
>
> The second argument is correct, the first argument doesn't make much sense. I 
> believe it should have its base set to , not [0].
> y[0] is not the base for anything, it's only the pointee object.

I hit the same issue and it results incorrect mapping.
https://bugs.llvm.org/show_bug.cgi?id=46868
In addition, only yptr is touched inside the target region. After this patch, 
two mappings instead of one occurs.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84182/new/

https://reviews.llvm.org/D84182

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D84182: [OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region.

2020-07-21 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

I verified that 46012 is fixed with this patch


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D84182/new/

https://reviews.llvm.org/D84182



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83707: [OpenMP][NFC] Emit remarks during GPU state machine optimization

2020-07-14 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83707/new/

https://reviews.llvm.org/D83707



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83707: [OpenMP][NFC] Emit remarks during GPU state machine optimization

2020-07-13 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added inline comments.



Comment at: clang/test/OpenMP/remarks_parallel_in_target_state_machine.c:12
+#pragma omp parallel // #1
+// expected-remark@#1 {{Found parallel region that is called through a 
state machine__omp_outlined__2_wrapper in non-SPMD target region. This can lead 
to excessive register usage in unrelated kernels in the same translation unit 
due to spurious call edges assumed by ptxas.}}
+// expected-remark@#1 {{Parallel region __omp_outlined__2_wrapper is not 
known to be called from a single target region only, maybe the surrounding 
function has external linkage?; will not attempt to rewrite the state machine 
use.}}

Add a space "machine__omp_outlined__2_wrapper" to "machine 
__omp_outlined__2_wrapper"


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83707/new/

https://reviews.llvm.org/D83707



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75788: [OpenMP] Provide math functions in OpenMP device code via OpenMP variants

2020-04-02 Thread Ye Luo via Phabricator via cfe-commits
ye-luo added a comment.

My RHEL issue was caused by a CPLUS_INCLUDE_PATH environment variable. So this 
is feature not a bug. After removing it, everything works smoothly for me.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75788/new/

https://reviews.llvm.org/D75788



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D75788: [OpenMP] Provide math functions in OpenMP device code via OpenMP variants

2020-04-02 Thread Ye Luo via Phabricator via cfe-commits
ye-luo accepted this revision.
ye-luo added a comment.
This revision is now accepted and ready to land.

Good work.
I verified that PR42798 and PR42799 are fixed by this.
Tests are completed on Ubuntu 18.04.
Clang now becomes usable for application developers.

There are still issues on RHEL that openmp_wrappers is not added before the 
system math searching path. Option -isystem openmp_wrappers can be used as a 
workaround.
This remaining issue should not be a blocker of accepting this patch. It can be 
dealt separately later without complicating the review of the current patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D75788/new/

https://reviews.llvm.org/D75788



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits