Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao abandoned this revision. sfantao added a comment. Closing revision. It has been replaced by http://reviews.llvm.org/D12871 has suggested by John. Thanks! Samuel http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
rjmccall added a comment. Sorry for putting off the final review on this; I was heads-down trying to get the alignment patch done. It's looking good; obviously you'll need to update it to work with Addresses properly, but hopefully that won't be too much of a problem. When you do, maybe you should start a new review; I think there's some way to do that in Phabricator that ties it to the old one. Phabricator seems to not be very happy with the extent to which the code has changed, and the old comments now just make it harder to review the current patch. http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
Hahnfeld added a comment. Needs two small changes to work with current trunk Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2135-2136 @@ +2134,4 @@ + const Expr *IfCond = nullptr; + if (auto C = S.getSingleClause(OMPC_if)) { +IfCond = cast(C)->getCondition(); + } This now has to be `S.getSingleClause()` and we can therefore omit the cast... Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2141-2142 @@ +2140,4 @@ + const Expr *Device = nullptr; + if (auto C = S.getSingleClause(OMPC_device)) { +Device = cast(C)->getDevice(); + } Likewise `S.getSingleClause()` without the need for an extra cast http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
ABataev added a comment. Seems good to me, but it would be good if John McCall could look at the patch. http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao updated this revision to Diff 33640. sfantao added a comment. Address last review comments. http://reviews.llvm.org/D11361 Files: lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h lib/CodeGen/CGStmtOpenMP.cpp test/OpenMP/target_codegen.cpp Index: test/OpenMP/target_codegen.cpp === --- /dev/null +++ test/OpenMP/target_codegen.cpp @@ -0,0 +1,753 @@ +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -emit-llvm %s -o - | FileCheck %s +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -triple powerpc64le-unknown-unknown -std=c++11 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple i386-unknown-unknown -emit-llvm %s -o - | FileCheck %s +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple i386-unknown-unknown -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -triple i386-unknown-unknown -std=c++11 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +// CHECK-DAG: [[TT:%.+]] = type { i64, i8 } +// CHECK-DAG: [[S1:%.+]] = type { double } + +// We have 8 target regions, but only 7 that actually will generate offloading +// code, only 6 will have mapped arguments, and only 4 have all-constant map +// sizes. + +// CHECK-DAG: [[SIZET2:@.+]] = private unnamed_addr constant [1 x i{{32|64}}] [i[[SZ:32|64]] 2] +// CHECK-DAG: [[MAPT2:@.+]] = private unnamed_addr constant [1 x i32] [i32 3] +// CHECK-DAG: [[SIZET3:@.+]] = private unnamed_addr constant [2 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2] +// CHECK-DAG: [[MAPT3:@.+]] = private unnamed_addr constant [2 x i32] [i32 3, i32 3] +// CHECK-DAG: [[MAPT4:@.+]] = private unnamed_addr constant [9 x i32] [i32 3, i32 3, i32 1, i32 3, i32 3, i32 1, i32 1, i32 3, i32 3] +// CHECK-DAG: [[SIZET5:@.+]] = private unnamed_addr constant [3 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2, i[[SZ]] 40] +// CHECK-DAG: [[MAPT5:@.+]] = private unnamed_addr constant [3 x i32] [i32 3, i32 3, i32 3] +// CHECK-DAG: [[SIZET6:@.+]] = private unnamed_addr constant [4 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2, i[[SZ]] 1, i[[SZ]] 40] +// CHECK-DAG: [[MAPT6:@.+]] = private unnamed_addr constant [4 x i32] [i32 3, i32 3, i32 3, i32 3] +// CHECK-DAG: [[MAPT7:@.+]] = private unnamed_addr constant [5 x i32] [i32 3, i32 3, i32 1, i32 1, i32 3] +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 + +template +struct TT{ + tx X; + ty Y; +}; + +// CHECK: define {{.*}}[[FOO:@.+]]( +int foo(int n) { + int a = 0; + short aa = 0; + float b[10]; + float bn[n]; + double c[5][10]; + double cn[5][n]; + TT d; + + // CHECK: br label %[[TRY:[^,]+]] + // CHECK: [[TRY]] + // CHECK: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8* @{{[^,]+}}, i32 0, i8** null, i8** null, i[[SZ]]* null, i32* null) + // CHECK-NEXT: [[ERROR:%.+]] = icmp ne i32 [[RET]], 0 + // CHECK-NEXT: br i1 [[ERROR]], label %[[FAIL:[^,]+]], label %[[END:[^,]+]] + // CHECK: [[FAIL]] + // CHECK: call void [[HVT0:@.+]]() + // CHECK-NEXT: br label %[[END]] + // CHECK: [[END]] + #pragma omp target + { + } + + // CHECK: call void [[HVT1:@.+]](i32* {{[^,]+}}) + #pragma omp target if(0) + { +a += 1; + } + + // CHECK: br label %[[TRY:[^,]+]] + // CHECK: [[TRY]] + // CHECK-DAG: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8* @{{[^,]+}}, i32 1, i8** [[BP:%[^,]+]], i8** [[P:%[^,]+]], i[[SZ]]* getelementptr inbounds ([1 x i[[SZ]]], [1 x i[[SZ]]]* [[SIZET2]], i32 0, i32 0), i32* getelementptr inbounds ([1 x i32], [1 x i32]* [[MAPT2]], i32 0, i32 0)) + // CHECK-DAG: [[BP]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[BPR:%[^,]+]], i32 0, i32 0 + // CHECK-DAG: [[P]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[PR:%[^,]+]], i32 0, i32 0 + // CHECK-DAG: [[BPADDR0:%.+]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[BPR]], i32 0, i32 [[IDX0:[0-9]+]] + // CHECK-DAG: [[PADDR0:%.+]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[PR]], i32 0, i32 [[IDX0]] + // CHECK-DAG: store i8* [[BP0:%[^,]+]], i8** [[BPADDR0]] + // CHECK-DAG: store i8* [[P0:%[^,]+]], i8** [[PADDR0]] + // CHECK-DAG: [[BP0]] = bitcast i16* %{{.+}} to i8* + // CHECK-DAG: [[P0]] = bitcast i16* %{{.+}} to i8* + + // CHECK: [[ERROR:%.+]] = icmp ne i32 [[RET]], 0 + // CHECK-NEXT: br i1 [[ERROR]], label %[[FAIL:[^,]+]], label %[[END:[^,]+]] + // CHECK: [[FAIL]] + // CHECK: call void [[HVT2:@.+]](i16* {{[^,]+}}) + // CHECK-NEXT: br label %[[END]] + // CHECK: [[END]] + #pragma omp
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2991-3005 @@ +2990,17 @@ + +/// \brief Values for bit flags used to specify the mapping type for +/// offloading. +enum OpenMPOffloadMappingFlags { + /// \brief Allocate memory on the device and move data from host to device. + OMP_MAP_TO = 0x01, + /// \brief Allocate memory on the device and move data from device to host. + OMP_MAP_FROM = 0x02, +}; + +enum OpenMPOffloadingReservedDeviceIDs { + /// \brief Device ID if the device was not defined, runtime should get it + /// from environment variables in the spec. + OMP_DEVICEID_UNDEF = -1, +}; + +void CGOpenMPRuntime::emitTargetCall(CodeGenFunction , ABataev wrote: > Move them to CGOpenMPRuntime::emitTargetCall(), they can be made local Ok, done! Comment at: test/OpenMP/target_codegen.cpp:8 @@ +7,3 @@ +// expected-no-diagnostics +// REQUIRES: powerpc-registered-target +#ifndef HEADER ABataev wrote: > Some of your tests has triple i386, they don't need PowerPC target True, I'm not using any target specific property here. So I guess it is safe to remove the requirement. Not using // REQUIRES anymore. Thanks. http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao updated this revision to Diff 33111. sfantao added a comment. Move map type and device id enums from CGOpenMPRuntime.h to CGOpenMPRuntime.cpp. http://reviews.llvm.org/D11361 Files: lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h lib/CodeGen/CGStmtOpenMP.cpp test/OpenMP/target_codegen.cpp Index: test/OpenMP/target_codegen.cpp === --- /dev/null +++ test/OpenMP/target_codegen.cpp @@ -0,0 +1,754 @@ +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -emit-llvm %s -o - | FileCheck %s +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -triple powerpc64le-unknown-unknown -std=c++11 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple i386-unknown-unknown -emit-llvm %s -o - | FileCheck %s +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple i386-unknown-unknown -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -triple i386-unknown-unknown -std=c++11 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s +// expected-no-diagnostics +// REQUIRES: powerpc-registered-target +#ifndef HEADER +#define HEADER + +// CHECK-DAG: [[TT:%.+]] = type { i64, i8 } +// CHECK-DAG: [[S1:%.+]] = type { double } + +// We have 8 target regions, but only 7 that actually will generate offloading +// code, only 6 will have mapped arguments, and only 4 have all-constant map +// sizes. + +// CHECK-DAG: [[SIZET2:@.+]] = private unnamed_addr constant [1 x i{{32|64}}] [i[[SZ:32|64]] 2] +// CHECK-DAG: [[MAPT2:@.+]] = private unnamed_addr constant [1 x i32] [i32 3] +// CHECK-DAG: [[SIZET3:@.+]] = private unnamed_addr constant [2 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2] +// CHECK-DAG: [[MAPT3:@.+]] = private unnamed_addr constant [2 x i32] [i32 3, i32 3] +// CHECK-DAG: [[MAPT4:@.+]] = private unnamed_addr constant [9 x i32] [i32 3, i32 3, i32 1, i32 3, i32 3, i32 1, i32 1, i32 3, i32 3] +// CHECK-DAG: [[SIZET5:@.+]] = private unnamed_addr constant [3 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2, i[[SZ]] 40] +// CHECK-DAG: [[MAPT5:@.+]] = private unnamed_addr constant [3 x i32] [i32 3, i32 3, i32 3] +// CHECK-DAG: [[SIZET6:@.+]] = private unnamed_addr constant [4 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2, i[[SZ]] 1, i[[SZ]] 40] +// CHECK-DAG: [[MAPT6:@.+]] = private unnamed_addr constant [4 x i32] [i32 3, i32 3, i32 3, i32 3] +// CHECK-DAG: [[MAPT7:@.+]] = private unnamed_addr constant [5 x i32] [i32 3, i32 3, i32 1, i32 1, i32 3] +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 + +templatetypename tx, typename ty +struct TT{ + tx X; + ty Y; +}; + +// CHECK: define {{.*}}[[FOO:@.+]]( +int foo(int n) { + int a = 0; + short aa = 0; + float b[10]; + float bn[n]; + double c[5][10]; + double cn[5][n]; + TTlong long, char d; + + // CHECK: br label %[[TRY:[^,]+]] + // CHECK: [[TRY]] + // CHECK: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8* @{{[^,]+}}, i32 0, i8** null, i8** null, i[[SZ]]* null, i32* null) + // CHECK-NEXT: [[ERROR:%.+]] = icmp ne i32 [[RET]], 0 + // CHECK-NEXT: br i1 [[ERROR]], label %[[FAIL:[^,]+]], label %[[END:[^,]+]] + // CHECK: [[FAIL]] + // CHECK: call void [[HVT0:@.+]]() + // CHECK-NEXT: br label %[[END]] + // CHECK: [[END]] + #pragma omp target + { + } + + // CHECK: call void [[HVT1:@.+]](i32* {{[^,]+}}) + #pragma omp target if(0) + { +a += 1; + } + + // CHECK: br label %[[TRY:[^,]+]] + // CHECK: [[TRY]] + // CHECK-DAG: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8* @{{[^,]+}}, i32 1, i8** [[BP:%[^,]+]], i8** [[P:%[^,]+]], i[[SZ]]* getelementptr inbounds ([1 x i[[SZ]]], [1 x i[[SZ]]]* [[SIZET2]], i32 0, i32 0), i32* getelementptr inbounds ([1 x i32], [1 x i32]* [[MAPT2]], i32 0, i32 0)) + // CHECK-DAG: [[BP]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[BPR:%[^,]+]], i32 0, i32 0 + // CHECK-DAG: [[P]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[PR:%[^,]+]], i32 0, i32 0 + // CHECK-DAG: [[BPADDR0:%.+]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[BPR]], i32 0, i32 [[IDX0:[0-9]+]] + // CHECK-DAG: [[PADDR0:%.+]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[PR]], i32 0, i32 [[IDX0]] + // CHECK-DAG: store i8* [[BP0:%[^,]+]], i8** [[BPADDR0]] + // CHECK-DAG: store i8* [[P0:%[^,]+]], i8** [[PADDR0]] + // CHECK-DAG: [[BP0]] = bitcast i16* %{{.+}} to i8* + // CHECK-DAG: [[P0]] = bitcast i16* %{{.+}} to i8* + + // CHECK: [[ERROR:%.+]] = icmp ne i32 [[RET]], 0 + // CHECK-NEXT: br i1 [[ERROR]], label %[[FAIL:[^,]+]], label %[[END:[^,]+]] + // CHECK: [[FAIL]] + //
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao added a comment. Thanks for the review! In http://reviews.llvm.org/D11361#232045, @ABataev wrote: Samuel, Yes, I thought about different files and different classes. Runtime for offloading codegen is not a part of libomp and it would be good to have separate runtime handler class for target codegen also. We need to think about it in future. Sure, we can definitely do something like that in the future if we detect a clear place to separate. I guess that when we do that, the design of the offloading-unrelated lib has to change a little. There are things like the capture statement info that may make sense to share between the two implementation. Comment at: lib/CodeGen/CGOpenMPRuntime.h:188-203 @@ -179,2 +187,18 @@ }; + + /// \brief Values for bit flags used to specify the mapping type for + /// offloading. + enum OpenMPOffloadMappingFlags { +/// \brief Allocate memory on the device and move data from host to device. +OMP_MAP_TO = 0x01, +/// \brief Allocate memory on the device and move data from device to host. +OMP_MAP_FROM = 0x02, + }; + + enum OpenMPOffloadingReservedDeviceIDs { +/// \brief Device ID if the device was not defined, runtime should get it +/// from environment variables in the spec. +OMP_DEVICEID_UNDEF = -1, + }; + CodeGenModule CGM; ABataev wrote: Move them to .cpp file. Done! Comment at: lib/CodeGen/CGOpenMPRuntime.h:761 @@ -714,2 +760,2 @@ #endif ABataev wrote: sfantao wrote: Unlike the other enums, more than one map types need to be combined. E.g., to/from are two different enums. Once the map clause and 4.1 get to be support, we will have more combinations. I see two options here: add enums for all combinations or use a typedef each time an ineger refer to map types, so the code is more readable. Let me know your thoughts. Yes, I think we need to add separate enums for different combination in Basic/OpenMPKinds.{def, h} for AST. In runtime support library we can represent these combinations as a bit-or of single mapping types. Ok, I see, it makes sense to do that from the AST for the map clause SEMA. I'm keeping the bit-or in the runtime library call codegen as you say. http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao added a comment. Two more inlined comments that I forgot to integrate in my previous response. Thanks! Samuel Comment at: lib/CodeGen/CGOpenMPRuntime.h:190-204 @@ -180,2 +189,17 @@ + /// \brief Values for bit flags used to specify the mapping type for + /// offloading. + enum OpenMPOffloadMappingFlags { +/// \brief Allocate memory on the device and move data from host to device. +OMP_MAP_TO = 0x01, +/// \brief Allocate memory on the device and move data from device to host. +OMP_MAP_FROM = 0x02, + }; + + enum OpenMPOffloadingReservedDeviceIDs { +/// \brief Device ID if the device was not defined, runtime should get it +/// from environment variables in the spec. +OMP_DEVICEID_UNDEF = -1, + }; + CodeGenModule CGM; /// \brief Default const ident_t object used for initialization of all other Got it, not exposed anymore. Comment at: lib/CodeGen/CGOpenMPRuntime.h:761 @@ -714,2 +760,2 @@ #endif Unlike the other enums, more than one map types need to be combined. E.g., to/from are two different enums. Once the map clause and 4.1 get to be support, we will have more combinations. I see two options here: add enums for all combinations or use a typedef each time an ineger refer to map types, so the code is more readable. Let me know your thoughts. http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao updated this revision to Diff 32843. sfantao added a comment. Address reviewer concerns. http://reviews.llvm.org/D11361 Files: lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h lib/CodeGen/CGStmtOpenMP.cpp test/OpenMP/target_codegen.cpp Index: test/OpenMP/target_codegen.cpp === --- /dev/null +++ test/OpenMP/target_codegen.cpp @@ -0,0 +1,754 @@ +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -emit-llvm %s -o - | FileCheck %s +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -triple powerpc64le-unknown-unknown -std=c++11 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple i386-unknown-unknown -emit-llvm %s -o - | FileCheck %s +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple i386-unknown-unknown -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -triple i386-unknown-unknown -std=c++11 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s +// expected-no-diagnostics +// REQUIRES: powerpc-registered-target +#ifndef HEADER +#define HEADER + +// CHECK-DAG: [[TT:%.+]] = type { i64, i8 } +// CHECK-DAG: [[S1:%.+]] = type { double } + +// We have 8 target regions, but only 7 that actually will generate offloading +// code, only 6 will have mapped arguments, and only 4 have all-constant map +// sizes. + +// CHECK-DAG: [[SIZET2:@.+]] = private unnamed_addr constant [1 x i{{32|64}}] [i[[SZ:32|64]] 2] +// CHECK-DAG: [[MAPT2:@.+]] = private unnamed_addr constant [1 x i32] [i32 3] +// CHECK-DAG: [[SIZET3:@.+]] = private unnamed_addr constant [2 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2] +// CHECK-DAG: [[MAPT3:@.+]] = private unnamed_addr constant [2 x i32] [i32 3, i32 3] +// CHECK-DAG: [[MAPT4:@.+]] = private unnamed_addr constant [9 x i32] [i32 3, i32 3, i32 1, i32 3, i32 3, i32 1, i32 1, i32 3, i32 3] +// CHECK-DAG: [[SIZET5:@.+]] = private unnamed_addr constant [3 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2, i[[SZ]] 40] +// CHECK-DAG: [[MAPT5:@.+]] = private unnamed_addr constant [3 x i32] [i32 3, i32 3, i32 3] +// CHECK-DAG: [[SIZET6:@.+]] = private unnamed_addr constant [4 x i[[SZ]]] [i[[SZ]] 4, i[[SZ]] 2, i[[SZ]] 1, i[[SZ]] 40] +// CHECK-DAG: [[MAPT6:@.+]] = private unnamed_addr constant [4 x i32] [i32 3, i32 3, i32 3, i32 3] +// CHECK-DAG: [[MAPT7:@.+]] = private unnamed_addr constant [5 x i32] [i32 3, i32 3, i32 1, i32 1, i32 3] +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 + +templatetypename tx, typename ty +struct TT{ + tx X; + ty Y; +}; + +// CHECK: define {{.*}}[[FOO:@.+]]( +int foo(int n) { + int a = 0; + short aa = 0; + float b[10]; + float bn[n]; + double c[5][10]; + double cn[5][n]; + TTlong long, char d; + + // CHECK: br label %[[TRY:[^,]+]] + // CHECK: [[TRY]] + // CHECK: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8* @{{[^,]+}}, i32 0, i8** null, i8** null, i[[SZ]]* null, i32* null) + // CHECK-NEXT: [[ERROR:%.+]] = icmp ne i32 [[RET]], 0 + // CHECK-NEXT: br i1 [[ERROR]], label %[[FAIL:[^,]+]], label %[[END:[^,]+]] + // CHECK: [[FAIL]] + // CHECK: call void [[HVT0:@.+]]() + // CHECK-NEXT: br label %[[END]] + // CHECK: [[END]] + #pragma omp target + { + } + + // CHECK: call void [[HVT1:@.+]](i32* {{[^,]+}}) + #pragma omp target if(0) + { +a += 1; + } + + // CHECK: br label %[[TRY:[^,]+]] + // CHECK: [[TRY]] + // CHECK-DAG: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8* @{{[^,]+}}, i32 1, i8** [[BP:%[^,]+]], i8** [[P:%[^,]+]], i[[SZ]]* getelementptr inbounds ([1 x i[[SZ]]], [1 x i[[SZ]]]* [[SIZET2]], i32 0, i32 0), i32* getelementptr inbounds ([1 x i32], [1 x i32]* [[MAPT2]], i32 0, i32 0)) + // CHECK-DAG: [[BP]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[BPR:%[^,]+]], i32 0, i32 0 + // CHECK-DAG: [[P]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[PR:%[^,]+]], i32 0, i32 0 + // CHECK-DAG: [[BPADDR0:%.+]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[BPR]], i32 0, i32 [[IDX0:[0-9]+]] + // CHECK-DAG: [[PADDR0:%.+]] = getelementptr inbounds [1 x i8*], [1 x i8*]* [[PR]], i32 0, i32 [[IDX0]] + // CHECK-DAG: store i8* [[BP0:%[^,]+]], i8** [[BPADDR0]] + // CHECK-DAG: store i8* [[P0:%[^,]+]], i8** [[PADDR0]] + // CHECK-DAG: [[BP0]] = bitcast i16* %{{.+}} to i8* + // CHECK-DAG: [[P0]] = bitcast i16* %{{.+}} to i8* + + // CHECK: [[ERROR:%.+]] = icmp ne i32 [[RET]], 0 + // CHECK-NEXT: br i1 [[ERROR]], label %[[FAIL:[^,]+]], label %[[END:[^,]+]] + // CHECK: [[FAIL]] + // CHECK: call void [[HVT2:@.+]](i16* {{[^,]+}}) + //
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao added a comment. In http://reviews.llvm.org/D11361#229744, @ABataev wrote: Another one thing I forget to mention. Current implementation of CGOpenMPRuntime is libomp-specific. You're trying to add functionality that is libtarget-specific. Maybe it is a good idea to separate support for libomp and libtarget runtime libraries? Not sure what do you mean by separation. Different files? Different codegen class? My perspective is that the two things should be together given that they both address the same specification, and I see that interaction is required between the two components. E.g. teams codegen will have to interact with the target codegen (communicate number of teams/threads ) and the teams codegen will require the libomp interface in its implementation. We could always separate the two things in the future if we see that is a better way to organize the code. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2887 @@ +2886,3 @@ +llvm::Value * +CGOpenMPRuntime::emitTargetOutlinedFunction(CodeGenFunction CGF, +const OMPExecutableDirective D, ABataev wrote: I don't think you need this argument. You're emitting a new outlined function here and don't need info about your current function. Done! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2906-2911 @@ +2905,8 @@ + + CodeGenFunction TargetAuxCGF(CGM, true); + CGOpenMPTargetRegionInfo CGInfo(CS, CodeGen); + CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(TargetAuxCGF, CGInfo); + auto *TargetAuxFn = TargetAuxCGF.GenerateCapturedStmtFunction(CS); + TargetAuxFn-addFnAttr(llvm::Attribute::AlwaysInline); + + // Collect the arguments of the main function. ABataev wrote: You'd better to emit internal function separately in a new static function. Then you don't need to create TargetAuxCGF and TargetMainCGF. You should use just CGF everywhere. One CodeGenFunction instance per function. Done! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2970-2972 @@ +2969,5 @@ + + auto ai = Args.begin(); + for (RecordDecl::field_iterator ri = RD-field_begin(), re = RD-field_end(); + ri != re; ++ri, ++ai) { + ABataev wrote: Variable names should start with an upper case letter (e.g. Leader or Boats). Ok, thought iterators were an exception to that rule. Fixed now! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3070-3107 @@ +3069,40 @@ +} else { + // We expect all the sizes to be constant, so we collect them to create + // a constant array. + SmallVectoruint64_t, 16 ConstSizes; + for (auto *V : Sizes) +ConstSizes.push_back(castllvm::ConstantInt(V)-getZExtValue()); + + auto SizeTypeBytes = + CGF.getContext() + .getTypeSizeInChars(CGF.getContext().getSizeType()) + .getQuantity(); + + llvm::Constant *SizesArrayInit; + switch (SizeTypeBytes) { + default: +llvm_unreachable(Unexpected size-type type!); + case 1: { +SmallVectoruint8_t, 16 ConstSizesL(ConstSizes.begin(), + ConstSizes.end()); +SizesArrayInit = +llvm::ConstantDataArray::get(CGM.getLLVMContext(), ConstSizesL); + } break; + case 2: { +SmallVectoruint16_t, 16 ConstSizesL(ConstSizes.begin(), + ConstSizes.end()); +SizesArrayInit = +llvm::ConstantDataArray::get(CGM.getLLVMContext(), ConstSizesL); + } break; + case 4: { +SmallVectoruint32_t, 16 ConstSizesL(ConstSizes.begin(), + ConstSizes.end()); +SizesArrayInit = +llvm::ConstantDataArray::get(CGM.getLLVMContext(), ConstSizesL); + } break; + case 8: { +SizesArrayInit = +llvm::ConstantDataArray::get(CGM.getLLVMContext(), ConstSizes); + } break; + } + auto *SizesArrayGbl = new llvm::GlobalVariable( ABataev wrote: Try instead: SizesArrayInit = llvm::ConstantArray::get(llvm::ArrayType::get(CGM.SizeTy, Sizes.size()), Sizes); Done! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3161-3164 @@ +3160,6 @@ + } else { +BasePointersArray = llvm::Constant::getNullValue(CGM.VoidPtrPtrTy); +PointersArray = llvm::Constant::getNullValue(CGM.VoidPtrPtrTy); +SizesArray = llvm::Constant::getNullValue(CGM.SizeTy-getPointerTo()); +MapTypesArray = llvm::Constant::getNullValue(CGM.Int32Ty-getPointerTo()); + } ABataev wrote: llvm::ConstantPointerNull::get(type); Done! Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2139-2203 @@ +2138,67 @@ + + bool hasVLACaptures = false; + const CapturedStmt CS = *castCapturedStmt(S.getAssociatedStmt()); + auto ri = CS.getCapturedRecordDecl()-field_begin(); + auto ii = CS.capture_init_begin(); + for
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
ABataev added inline comments. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2887 @@ +2886,3 @@ +llvm::Value * +CGOpenMPRuntime::emitTargetOutlinedFunction(CodeGenFunction CGF, +const OMPExecutableDirective D, I don't think you need this argument. You're emitting a new outlined function here and don't need info about your current function. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2906-2911 @@ +2905,8 @@ + + CodeGenFunction TargetAuxCGF(CGM, true); + CGOpenMPTargetRegionInfo CGInfo(CS, CodeGen); + CodeGenFunction::CGCapturedStmtRAII CapInfoRAII(TargetAuxCGF, CGInfo); + auto *TargetAuxFn = TargetAuxCGF.GenerateCapturedStmtFunction(CS); + TargetAuxFn-addFnAttr(llvm::Attribute::AlwaysInline); + + // Collect the arguments of the main function. You'd better to emit internal function separately in a new static function. Then you don't need to create TargetAuxCGF and TargetMainCGF. You should use just CGF everywhere. One CodeGenFunction instance per function. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2970-2972 @@ +2969,5 @@ + + auto ai = Args.begin(); + for (RecordDecl::field_iterator ri = RD-field_begin(), re = RD-field_end(); + ri != re; ++ri, ++ai) { + Variable names should start with an upper case letter (e.g. Leader or Boats). Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3070-3107 @@ +3069,40 @@ +} else { + // We expect all the sizes to be constant, so we collect them to create + // a constant array. + SmallVectoruint64_t, 16 ConstSizes; + for (auto *V : Sizes) +ConstSizes.push_back(castllvm::ConstantInt(V)-getZExtValue()); + + auto SizeTypeBytes = + CGF.getContext() + .getTypeSizeInChars(CGF.getContext().getSizeType()) + .getQuantity(); + + llvm::Constant *SizesArrayInit; + switch (SizeTypeBytes) { + default: +llvm_unreachable(Unexpected size-type type!); + case 1: { +SmallVectoruint8_t, 16 ConstSizesL(ConstSizes.begin(), + ConstSizes.end()); +SizesArrayInit = +llvm::ConstantDataArray::get(CGM.getLLVMContext(), ConstSizesL); + } break; + case 2: { +SmallVectoruint16_t, 16 ConstSizesL(ConstSizes.begin(), + ConstSizes.end()); +SizesArrayInit = +llvm::ConstantDataArray::get(CGM.getLLVMContext(), ConstSizesL); + } break; + case 4: { +SmallVectoruint32_t, 16 ConstSizesL(ConstSizes.begin(), + ConstSizes.end()); +SizesArrayInit = +llvm::ConstantDataArray::get(CGM.getLLVMContext(), ConstSizesL); + } break; + case 8: { +SizesArrayInit = +llvm::ConstantDataArray::get(CGM.getLLVMContext(), ConstSizes); + } break; + } + auto *SizesArrayGbl = new llvm::GlobalVariable( Try instead: SizesArrayInit = llvm::ConstantArray::get(llvm::ArrayType::get(CGM.SizeTy, Sizes.size()), Sizes); Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3161-3164 @@ +3160,6 @@ + } else { +BasePointersArray = llvm::Constant::getNullValue(CGM.VoidPtrPtrTy); +PointersArray = llvm::Constant::getNullValue(CGM.VoidPtrPtrTy); +SizesArray = llvm::Constant::getNullValue(CGM.SizeTy-getPointerTo()); +MapTypesArray = llvm::Constant::getNullValue(CGM.Int32Ty-getPointerTo()); + } llvm::ConstantPointerNull::get(type); Comment at: lib/CodeGen/CGOpenMPRuntime.h:190-204 @@ +189,17 @@ +public: + /// \brief Values for bit flags used to specify the mapping type for + /// offloading. + enum OpenMPOffloadMappingFlags { +/// \brief Allocate memory on the device and move data from host to device. +OMP_MAP_TO = 0x01, +/// \brief Allocate memory on the device and move data from device to host. +OMP_MAP_FROM = 0x02, + }; + +private: + enum OpenMPOffloadingReservedDeviceIDs { +/// \brief Device ID if the device was not defined, runtime should get it +/// from environment variables in the spec. +OMP_DEVICEID_UNDEF = -1, + }; + These enums must not be exposed by CGOpenMPRuntime until they are used in arguments of runtime member functions. Comment at: lib/CodeGen/CGOpenMPRuntime.h:768 @@ -710,1 +767,3 @@ + ArrayRefllvm::Value * Sizes, + ArrayRefunsigned MapTypes, bool hasVLACaptures); }; I don't like the idea of using 'unsigned' as a map type. You should create some particular OpenMPMapClauseKind (just like OpenMPDefaultClauseKind, OpenMPDependClauseKind, OpenMPProcBindClauseKind etc.) and use it where required. Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2139-2203 @@
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
ABataev added a comment. Another one thing I forget to mention. Current implementation of CGOpenMPRuntime is libomp-specific. You're trying to add functionality that is libtarget-specific. Maybe it is a good idea to separate support for libomp and libtarget runtime libraries? http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao added a comment. Alexey, Thanks for the review! Find my comments inlined. Thanks again! Samuel Comment at: lib/CodeGen/CGExpr.cpp:1969 @@ -1945,3 +1968,3 @@ else - return EmitCapturedFieldLValue(*this, CapturedStmtInfo-lookup(VD), - CapturedStmtInfo-getContextValue()); + return EmitCapturedValue(*this, CapturedStmtInfo-lookup(VD), + CapturedStmtInfo-getContextValue()); ABataev wrote: Samuel, why you don't want to capture all used variables in CapturedDecl instead of creating ImplicitParamDecl for each captured variable? Also, you will solve possible trouble with VLAs automatically using CapturedDecl. Alexey, I'm not sure I understand what you mean here. Unlike the other captured regions, the target region outlined function does not take a context that captures all the variables in fields of a record as argument. Instead, it takes all the captured references as arguments. This will enable the device runtime library to decide what is best to pass the arguments to the device (see my response to John's question in my previous diff). It happens that all the machinery in the common infrastructure that creates the outlined functions (`CodeGenFunction::StartFunction` and `GenerateCapturedStmtFunction`) is prepared to get the context record from the `CapuredDecl`. Therefore, in order to not disrupt the common infrastructure, in `Sema::ActOnOpenMPRegionEnd` I am creating a new `CapturedDecl` that contains implicit parameters. I gather the information to build the new `CapturedDecl` from the `CapturedDecl` that is created with the context argument and the `RecordDecl` fields so that I don't need to touch the capturing code in Sema. Having `CapturedDecl` with implicit parameters will drive `StartFunction` to create the outlined region with the right signature without having to change anything in there. I only had to guard the initialization of VLAs and 'this' in `GenerateCapturedStmtFunction` to not do anything that expects the context argument. However, during the emission of the VLAs that happens in `StartFunction`, the emission of these implicit parameters is attempted based on references that are marked as `refersToEnclosingVariableOrCapture`- this is the reason for the change in `EmitDeclRefLValue`. Given that the references in the outlined function statements are still expecting the VLAs expression used in the caller of the outlined function, `PrepareOMPTargetDirectiveBodyEmission` will make sure that the mapped values to those expressions will be the same as the ones that use the new expression based on implicit parameters. Let me know if you need me to clarify anything. Thanks! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2953 @@ +2952,3 @@ + BasePointer = Pointer = LV.getAddress(); + uint64_t SizeVal = CGM.getContext().getTypeSize(ri-getType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); ABataev wrote: Use CGM.getContext().getTypeSizeInChars() instead of CGM.getContext().getTypeSize() / 8. Done! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2954 @@ +2953,3 @@ + uint64_t SizeVal = CGM.getContext().getTypeSize(ri-getType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); + ABataev wrote: Maybe llvm::ConstantInt::get(CGF.SizeTy, SizeVal)? I agree, it makes more sense to use size_t. Thanks for the suggestion! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2963 @@ +2962,3 @@ + uint64_t SizeVal = + CGM.getContext().getTypeSize(PtrTy-getPointeeType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); ABataev wrote: Use CGM.getContext().getTypeSizeInChars() instead of CGM.getContext().getTypeSize() / 8. done! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2980 @@ +2979,3 @@ +uint64_t ElementTypeSize = +CGM.getContext().getTypeSize(ElementType) / 8; +Size = CGF.Builder.getInt64(ElementTypeSize); ABataev wrote: The same Done! Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2144-2145 @@ +2143,4 @@ + auto *ThisRef = LocalDeclMap[*pi]; + LValue Addr = LValue::MakeAddr(ThisRef, ri-getType(), CharUnits(), + CGM.getContext()); + CXXThisValue = EmitLoadOfLValue(Addr, CS.getLocStart()).getScalarVal(); ABataev wrote: MakeNaturalAlignAddrLValue(ThisRef, ri-getType())? Now using `MakeNaturalAlignAddrLValue`. Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2147 @@ +2146,3 @@ + CXXThisValue = EmitLoadOfLValue(Addr, CS.getLocStart()).getScalarVal(); + ; + continue; ABataev wrote: Extra semicolon Fixed. http://reviews.llvm.org/D11361 ___ cfe-commits mailing list
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
ABataev added inline comments. Comment at: lib/CodeGen/CGExpr.cpp:1969 @@ -1945,3 +1968,3 @@ else - return EmitCapturedFieldLValue(*this, CapturedStmtInfo-lookup(VD), - CapturedStmtInfo-getContextValue()); + return EmitCapturedValue(*this, CapturedStmtInfo-lookup(VD), + CapturedStmtInfo-getContextValue()); Samuel, why you don't want to capture all used variables in CapturedDecl instead of creating ImplicitParamDecl for each captured variable? Also, you will solve possible trouble with VLAs automatically using CapturedDecl. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2953 @@ +2952,3 @@ + BasePointer = Pointer = LV.getAddress(); + uint64_t SizeVal = CGM.getContext().getTypeSize(ri-getType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); Use CGM.getContext().getTypeSizeInChars() instead of CGM.getContext().getTypeSize() / 8. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2954 @@ +2953,3 @@ + uint64_t SizeVal = CGM.getContext().getTypeSize(ri-getType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); + Maybe llvm::ConstantInt::get(CGF.SizeTy, SizeVal)? Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2963 @@ +2962,3 @@ + uint64_t SizeVal = + CGM.getContext().getTypeSize(PtrTy-getPointeeType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); Use CGM.getContext().getTypeSizeInChars() instead of CGM.getContext().getTypeSize() / 8. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2980 @@ +2979,3 @@ +uint64_t ElementTypeSize = +CGM.getContext().getTypeSize(ElementType) / 8; +Size = CGF.Builder.getInt64(ElementTypeSize); The same Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2144-2145 @@ +2143,4 @@ + auto *ThisRef = LocalDeclMap[*pi]; + LValue Addr = LValue::MakeAddr(ThisRef, ri-getType(), CharUnits(), + CGM.getContext()); + CXXThisValue = EmitLoadOfLValue(Addr, CS.getLocStart()).getScalarVal(); MakeNaturalAlignAddrLValue(ThisRef, ri-getType())? Comment at: lib/CodeGen/CGStmtOpenMP.cpp:2147 @@ +2146,3 @@ + CXXThisValue = EmitLoadOfLValue(Addr, CS.getLocStart()).getScalarVal(); + ; + continue; Extra semicolon http://reviews.llvm.org/D11361 ___ cfe-commits mailing list cfe-commits@lists.llvm.org http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao updated this revision to Diff 32211. sfantao added a comment. This patch tries to avoid as much as possible changing the common infrastructure, by adapting the CapturedDecl creation in SEMA and by adding support to a second type of capture - ImplicitParamDecl (on top of the existent FieldDecl). Also adds codegen for device clause as the Parsing and SEMA support was added in the meantime. The regression test was not updated yet. I wanted to make sure the direction this is taking is approved before diving into that. Thanks! Samuel http://reviews.llvm.org/D11361 Files: include/clang/AST/Decl.h include/clang/AST/Stmt.h include/clang/Basic/CapturedStmt.h include/clang/Sema/ScopeInfo.h lib/CodeGen/CGExpr.cpp lib/CodeGen/CGOpenMPRuntime.cpp lib/CodeGen/CGOpenMPRuntime.h lib/CodeGen/CGStmt.cpp lib/CodeGen/CGStmtOpenMP.cpp lib/CodeGen/CodeGenFunction.cpp lib/CodeGen/CodeGenFunction.h lib/Sema/SemaOpenMP.cpp test/OpenMP/target_codegen.cpp Index: test/OpenMP/target_codegen.cpp === --- /dev/null +++ test/OpenMP/target_codegen.cpp @@ -0,0 +1,583 @@ +// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -emit-llvm %s -o - | FileCheck %s +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple powerpc64le-unknown-unknown -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -triple powerpc64le-unknown-unknown -std=c++11 -include-pch %t -verify %s -emit-llvm -o - | FileCheck %s +// expected-no-diagnostics +// REQUIRES: powerpc-registered-target +#ifndef HEADER +#define HEADER + +// CHECK-DAG: [[TT:%.+]] = type { i64, i8 } +// CHECK-DAG: [[S1:%.+]] = type { double } + +// We have 8 target regions, but only 7 that actually will generate offloading +// code, and only 6 will have mapped arguments. + +// CHECK-DAG: [[MAPT2:@.+]] = private constant [1 x i32] [i32 3] +// CHECK-DAG: [[MAPT3:@.+]] = private constant [2 x i32] [i32 3, i32 3] +// CHECK-DAG: [[MAPT4:@.+]] = private constant [9 x i32] [i32 3, i32 3, i32 1, i32 3, i32 3, i32 1, i32 1, i32 3, i32 3] +// CHECK-DAG: [[MAPT5:@.+]] = private constant [3 x i32] [i32 3, i32 3, i32 3] +// CHECK-DAG: [[MAPT6:@.+]] = private constant [4 x i32] [i32 3, i32 3, i32 3, i32 3] +// CHECK-DAG: [[MAPT7:@.+]] = private constant [5 x i32] [i32 3, i32 3, i32 1, i32 1, i32 3] +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 +// CHECK-DAG: @{{.*}} = private constant i8 0 + +templatetypename tx, typename ty +struct TT{ + tx X; + ty Y; +}; + +// CHECK: define {{.*}}[[FOO:@.+]]( +int foo(int n) { + int a = 0; + short aa = 0; + float b[10]; + float bn[n]; + double c[5][10]; + double cn[5][n]; + TTlong, char d; + + // CHECK: br label %[[TRY:[^,]+]] + // CHECK: [[TRY]] + // CHECK: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8* @{{[^,]+}}, i32 0, i8** null, i8** null, i64* null, i32* null) + // CHECK-NEXT: [[ERROR:%.+]] = icmp ne i32 [[RET]], 0 + // CHECK-NEXT: br i1 [[ERROR]], label %[[FAIL:[^,]+]], label %[[END:[^,]+]] + // CHECK: [[FAIL]] + // CHECK: call void [[HVT0:@.+]]() + // CHECK-NEXT: br label %[[END]] + // CHECK: [[END]] + #pragma omp target + { + } + + // CHECK: call void [[HVT1:@.+]](i32* {{[^,]+}}) + #pragma omp target if(0) + { +a += 1; + } + + // CHECK: br label %[[TRY:[^,]+]] + // CHECK: [[TRY]] + // CHECK-DAG: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8* @{{[^,]+}}, i32 1, i8** [[BP:%[^,]+]], i8** [[P:%[^,]+]], i64* [[S:%[^,]+]], i32* getelementptr inbounds ([1 x i32], [1 x i32]* [[MAPT2]], i32 0, i32 0)) + + // CHECK-DAG: store i64 4, i64* [[SADDR0:%.+]] + // CHECK-DAG: [[SADDR0]] = getelementptr inbounds i64, i64* [[S]], i32 [[IDX0:[0-9]+]] + // CHECK-DAG: [[BPADDR0:%.+]] = getelementptr inbounds i8*, i8** [[BP]], i32 [[IDX0]] + // CHECK-DAG: [[PADDR0:%.+]] = getelementptr inbounds i8*, i8** [[P]], i32 [[IDX0]] + // CHECK-DAG: store i8* [[BP0:%[^,]+]], i8** [[BPADDR0]] + // CHECK-DAG: store i8* [[P0:%[^,]+]], i8** [[PADDR0]] + // CHECK-DAG: [[BP0]] = bitcast i32* %{{.+}} to i8* + // CHECK-DAG: [[P0]] = bitcast i32* %{{.+}} to i8* + + // CHECK: [[ERROR:%.+]] = icmp ne i32 [[RET]], 0 + // CHECK-NEXT: br i1 [[ERROR]], label %[[FAIL:[^,]+]], label %[[END:[^,]+]] + // CHECK: [[FAIL]] + // CHECK: call void [[HVT2:@.+]](i32* {{[^,]+}}) + // CHECK-NEXT: br label %[[END]] + // CHECK: [[END]] + #pragma omp target if(1) + { +a += 1; + } + + // CHECK: [[IF:%.+]] = icmp sgt i32 {{[^,]+}}, 10 + // CHECK: br i1 [[IF]], label %[[TRY:[^,]+]], label %[[FAIL:[^,]+]] + // CHECK: [[TRY]] + // CHECK-DAG: [[RET:%.+]] = call i32 @__tgt_target(i32 -1, i8*
Re: [PATCH] D11361: [OpenMP] Target directive host codegen
sfantao added a comment. Alexey, John, Thanks for the review! I've tried to address your concerns in the last diff. Please, check the inlined comments to find answers for the remarks of the previous diff. Thanks again! Samuel Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:863-864 @@ -840,1 +862,4 @@ } + case OMPRTL__tgt_target: { +// Build to int32_t __tgt_target(int32_t device_id, void *host_ptr, int32_t +// arg_num, void** args_base, void **args, int64_t *arg_sizes, int32_t rjmccall wrote: Spurious to at the start. Fixed! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2921 @@ +2920,3 @@ + uint64_t SizeVal = + CGM.getDataLayout().getTypeSizeInBits(V-getType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); rjmccall wrote: getTypeStoreSize() Now using getTypeSizeInChars from the ASTcontext as suggested bellow. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2930 @@ +2929,3 @@ + uint64_t SizeVal = + CGM.getDataLayout().getTypeSizeInBits(PtrTy-getElementType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); rjmccall wrote: You should ask the ASTContext to compute this size instead of making assumptions about the layout size of the IR type. Also, what are the semantics supposed to be for mapping to and from? Do referents need to be trivially copyable? What if there are pointers or references? What happens to virtual bases? Using using getTypeSizeInChars from the ASTcontext. This patch only deals with trivially copiable types. By default, a variable that is captured in the target region is mapped by value using a to-from policy. In order to do something different than the default, the user has to use a map clause (I'll send out a patch for it once we have the Parsing/SEMA in place). As per the current spec, the map clause allows a user to map the pointee of a pointer as well as only mapping a section of an array or pointee. In the next version of the OpenMP spec we will have the ability to map aggregate fields and more support for deep copy. We expect to be able to handle all the cases with the proper flags in OpenMPOffloadMappingFlags. About virtual bases: OpenMP 4.0 forbids virtual members in mappable variable. Nevertheless, it is possible this constraint be lifted in future versions. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:2940 @@ +2939,3 @@ + uint64_t SizeVal = + CGM.getDataLayout().getTypeSizeInBits(PtrTy-getElementType()) / 8; + Size = CGF.Builder.getInt64(SizeVal); rjmccall wrote: Same thing, please ask the ASTContext. Also, you might need to care about variably-sized types here. Now using ASTContext. I was planing to deal with the VLA types once I add support for the map clause, but I agree it makes more sense to do it now. Thanks for the suggestion. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3009 @@ +3008,3 @@ +SizesArray = +CGF.Builder.CreateAlloca(CGM.Int64Ty, PointerNum, .offload_sizes); + rjmccall wrote: This is creating a bunch of dynamic allocas instead of just temporary values. Please call CreateMemTemp with an array type instead. Done! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3014 @@ +3013,3 @@ +llvm::Constant *MapTypesArrayInit = +llvm::ConstantDataArray::get(CGF.Builder.getContext(), MapTypes); +MapTypesArray = rjmccall wrote: The sizes aren't constant if you've captured a VLA. But this comment is actually just wrong, because this isn't building something for the sizes at all; it's building something for the flags. That said, I think you ought to be able to do the same thing with the sizes when you don't have a VLA. Sorry for the error in the comment. It is fixed now. Also, I added code to deal with constant sizes for when we don't have VLAs. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3017 @@ +3016,3 @@ +new llvm::GlobalVariable(CGM.getModule(), MapTypesArrayInit-getType(), + true, llvm::GlobalValue::PrivateLinkage, + MapTypesArrayInit, .offload_maptypes); rjmccall wrote: Please comment boolean arguments like this: /*constant*/ true And please mark this variable unnamed_addr. Added comments for the boolean. Setting unnamed_addr now, thanks! Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3037 @@ +3036,3 @@ + llvm::Value *S = CGF.Builder.CreateConstInBoundsGEP1_32( + SPtrTy-getElementType(), SizesArray, i); + rjmccall wrote: You already know the element types for all of these. The code will be much more readable if you just use those types directly. Ok, now using the types explicitly. Comment at: lib/CodeGen/CGOpenMPRuntime.cpp:3044 @@ +3043,3 @@ +