[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-02-06 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

doru1004 wrote:

I don't think it is possible to have the analysis in Sema and not use a flag 
here.

The two options we have are:
1. Do the analysis in Sema, store the result in the flag, and then read the flag in CG.
2. Do the analysis in CG, in which case there's no reason to pass anything around 
and CG can call the function when needed.

There is a third, hybrid way to do this, where this function is moved back into CG:
```
bool Sema::teamsLoopCanBeParallelFor(Stmt *AStmt) {
  TeamsLoopChecker Checker(*this);
  Checker.Visit(AStmt);
  return Checker.teamsLoopCanBeParallelFor();
}
```

But then I don't know how CG could call the TeamsLoopChecker, which lives in Sema.
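
To make option 1 concrete, here is a minimal, self-contained sketch of the "analyze once, cache a flag, read it later" shape. It is not Clang code: `LoopDirective`, `analyzeInSema`, `emitInCodeGen`, the two body-description fields, and the rule used by the analysis are all hypothetical stand-ins chosen for illustration. Only the flag name `CanBeParallelFor` comes from the patch.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a loop directive AST node; not Clang's real class.
struct LoopDirective {
  std::vector<std::string> BodyCalls; // names of functions called in the body
  bool HasNestedOpenMP = false;       // body nests another OpenMP directive
  bool CanBeParallelFor = false;      // written by the analysis, read by codegen
};

// "Sema" phase: run the analysis once and cache the answer on the node.
// The rule below (no nested OpenMP, no calls) is an illustrative assumption.
void analyzeInSema(LoopDirective &D) {
  D.CanBeParallelFor = !D.HasNestedOpenMP && D.BodyCalls.empty();
}

// "CodeGen" phase: only reads the cached flag and never re-runs the analysis.
// The returned strings are placeholders for the chosen lowering.
std::string emitInCodeGen(const LoopDirective &D) {
  return D.CanBeParallelFor ? "teams distribute parallel for"
                            : "teams distribute";
}
```

Under option 2 the flag would disappear and `emitInCodeGen` would call the analysis directly, which is only possible if the checker is reachable from CG.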

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[openmp] [clang] [llvm] [OpenMP] Remove `register_requires` global constructor (PR #80460)

2024-02-02 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy ) {
 Entry.size) != OFFLOAD_SUCCESS)
   REPORT("Failed to write symbol for USM %s\n", Entry.name);
   }
-} else {
+} else if (Entry.addr) {

doru1004 wrote:

So it will only enter this branch if the size is 0 and the address is not 
nullptr.
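
A small sketch of the branch structure being discussed, to make the fall-through case explicit. This assumes, based only on the comment above, that the preceding `if` handles entries with a nonzero size; the real condition is not visible in the quoted hunk. `OffloadEntry`, `EntryAction`, and `classify` are hypothetical names, not the libomptarget types.

```cpp
#include <cstddef>

// Hypothetical, simplified offload entry; the real runtime struct has more fields.
struct OffloadEntry {
  void *addr;
  std::size_t size;
};

enum class EntryAction { SizedBranch, AddrBranch, NoBranch };

// Mirrors the shape of the reviewed condition. The first branch is ASSUMED to
// take entries with a nonzero size; the `else if` then only fires for
// size == 0 entries whose address is non-null. Anything else is skipped.
EntryAction classify(const OffloadEntry &E) {
  if (E.size != 0)
    return EntryAction::SizedBranch;
  else if (E.addr)
    return EntryAction::AddrBranch;
  return EntryAction::NoBranch;
}
```

With this shape, an entry with size 0 and a null address takes neither branch and is silently ignored.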

https://github.com/llvm/llvm-project/pull/80460


[openmp] [clang] [llvm] [OpenMP] Remove `register_requires` global constructor (PR #80460)

2024-02-02 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy ) {
 Entry.size) != OFFLOAD_SUCCESS)
   REPORT("Failed to write symbol for USM %s\n", Entry.name);
   }
-} else {
+} else if (Entry.addr) {

doru1004 wrote:

Or is that too restrictive?

https://github.com/llvm/llvm-project/pull/80460


[openmp] [clang] [llvm] [OpenMP] Remove `register_requires` global constructor (PR #80460)

2024-02-02 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy ) {
 Entry.size) != OFFLOAD_SUCCESS)
   REPORT("Failed to write symbol for USM %s\n", Entry.name);
   }
-} else {
+} else if (Entry.addr) {

doru1004 wrote:

Should we check for size > 0 too? 

https://github.com/llvm/llvm-project/pull/80460


[llvm] [openmp] [clang] [OpenMP] Remove `register_requires` global constructor (PR #80460)

2024-02-02 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy ) {
 Entry.size) != OFFLOAD_SUCCESS)
   REPORT("Failed to write symbol for USM %s\n", Entry.name);
   }
-} else {
+} else if (Entry.addr) {

doru1004 wrote:

Also, could you explain in a comment why Entry.addr is used here?

https://github.com/llvm/llvm-project/pull/80460


[llvm] [openmp] [clang] [OpenMP] Remove `register_requires` global constructor (PR #80460)

2024-02-02 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -199,7 +199,7 @@ static int initLibrary(DeviceTy ) {
 Entry.size) != OFFLOAD_SUCCESS)
   REPORT("Failed to write symbol for USM %s\n", Entry.name);
   }
-} else {
+} else if (Entry.addr) {

doru1004 wrote:

So now we don't have a "default" else branch here. Is it not needed?

https://github.com/llvm/llvm-project/pull/80460


[flang] [clang] [mlir] [Flang][OpenMP][MLIR] Add support for -nogpulib option (PR #71045)

2024-01-09 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 approved this pull request.

LG

https://github.com/llvm/llvm-project/pull/71045


[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)

2023-12-19 Thread Gheorghe-Teodor Bercea via cfe-commits

doru1004 wrote:

> The newly added test `offloading/struct_mapping_with_pointers.cpp` fails on 
> NVIDIA GPUs as well.
> 
> ```
>  TEST 'libomptarget :: nvptx64-nvidia-cuda :: 
> offloading/struct_mapping_with_pointers.cpp' FAILED 
> Exit Code: 1
> 
> Command Output (stdout):
> --
> # RUN: at line 2
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/bin/clang++ -fopenmp 
> -pthread   -I 
> /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test 
> -I /gpfs/jlse
> -fs0/users/ac.shilei.tian/build/openmp/release/runtime/src -L 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget -L 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/ll
> vm/release/./lib -L 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src  
> -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget
>  -Wl,-rpa
> th,/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src 
> -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/./lib 
> -Wl,-rpath,/soft/compilers/cuda/cud
> a-11.8.0/targets/x86_64-linux/lib 
> --libomptarget-nvptx-bc-path=/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/DeviceRTL
>  -fopenmp-targets=nvptx64-nvidia-cuda
>  
> /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp
>  -o /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/releas
> e/libomptarget/test/nvptx64-nvidia-cuda/offloading/Output/struct_mapping_with_pointers.cpp.tmp
>  
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/libomptarget.d
> evicertl.a && env LIBOMPTARGET_DEBUG=1 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/test/nvptx64-nvidia-cuda/offloading/Output/struct_mapping_with_pointer
> s.cpp.tmp 2>&1 | 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/bin/FileCheck 
> /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct
> _mapping_with_pointers.cpp
> # executed command: 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/bin/clang++ -fopenmp 
> -pthread -I 
> /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/
> test -I /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src 
> -L /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget -L 
> /gpfs/jlse-fs0/users/ac.sh
> ilei.tian/build/llvm/release/./lib -L 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src 
> -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libo
> mptarget 
> -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/runtime/src
>  -Wl,-rpath,/gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/./lib 
> -Wl,-rpath,/soft/c
> ompilers/cuda/cuda-11.8.0/targets/x86_64-linux/lib 
> --libomptarget-nvptx-bc-path=/gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/DeviceRTL
>  -fopenmp-targets=nv
> ptx64-nvidia-cuda 
> /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp
>  -o /gpfs/jlse-fs0/users/ac.shilei.tian/bu
> ild/openmp/release/libomptarget/test/nvptx64-nvidia-cuda/offloading/Output/struct_mapping_with_pointers.cpp.tmp
>  /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarg
> et/libomptarget.devicertl.a
> # executed command: env LIBOMPTARGET_DEBUG=1 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/openmp/release/libomptarget/test/nvptx64-nvidia-cuda/offloading/Output/struct_mapping_with_p
> ointers.cpp.tmp
> # executed command: 
> /gpfs/jlse-fs0/users/ac.shilei.tian/build/llvm/release/bin/FileCheck 
> /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/str
> uct_mapping_with_pointers.cpp
> # .---command stderr
> # | 
> /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp:106:12:
>  error: CHECK: expected string not found in inpu
> t
> # |  // CHECK: dat.datum[dat.arr[0][0]] = 0
> # |^
> # | :124:24: note: scanning from here
> # | dat.val_more_datum = 18
> # |^
> # | :125:1: note: possible intended match here
> # | dat.datum[dat.arr[0][0]] = 32542
> # | ^
> # |
> # | Input file: 
> # | Check file: 
> /home/ac.shilei.tian/Documents/vscode/llvm-project/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp
> # |
> # | -dump-input=help explains the following input dump.
> # |
> # | Input was:
> # | <<
> # |  .
> # |  .
> # |  .
> # |119: omptarget --> Done unregistering library!
> # |120: omptarget --> Deinit offload library!
> # |121: TARGET CUDA RTL --> Missing 2 resources to be returned
> # |122: dat.xi = 4
> # |123: dat.val_datum = 8
> # |124: dat.val_more_datum = 18
> # | check:106'0  

[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-12-18 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 closed 
https://github.com/llvm/llvm-project/pull/72410


[openmp] [clang] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)

2023-12-18 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 closed 
https://github.com/llvm/llvm-project/pull/75642


[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)

2023-12-15 Thread Gheorghe-Teodor Bercea via cfe-commits

doru1004 wrote:

@alexey-bataev I have reworked the previous patch with your advice in mind. The 
emitCombinedEntry function was not changed, since eliminating the combined entry 
has many ramifications which would need to be handled in a separate patch. For 
now this fixes the immediate error in a way that allows us to get rid of 
the combined entry later on if we want to.

https://github.com/llvm/llvm-project/pull/75642


[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)

2023-12-15 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/75642

>From 32454489d4e77f22ab935827dffe0febbb7b0626 Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Fri, 15 Dec 2023 10:22:38 -0500
Subject: [PATCH] Fix mapping of structs to device.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 148 +++
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 114 
 3 files changed, 401 insertions(+), 33 deletions(-)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 7f7e6f53066644..ea6645a39e8321 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -6811,8 +6811,10 @@ class MappableExprsHandler {
   OpenMPMapClauseKind MapType, ArrayRef 
MapModifiers,
   ArrayRef MotionModifiers,
   OMPClauseMappableExprCommon::MappableExprComponentListRef Components,
-  MapCombinedInfoTy , StructRangeInfoTy ,
-  bool IsFirstComponentList, bool IsImplicit,
+  MapCombinedInfoTy ,
+  MapCombinedInfoTy ,
+  StructRangeInfoTy , bool IsFirstComponentList,
+  bool IsImplicit, bool GenerateAllInfoForClauses,
   const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false,
   const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr,
   ArrayRef
@@ -7098,6 +7100,25 @@ class MappableExprsHandler {
 bool IsNonContiguous = CombinedInfo.NonContigInfo.IsNonContiguous;
 bool IsPrevMemberReference = false;
 
+// We need to check if we will be encountering any MEs. If we do not
+// encounter any ME expression it means we will be mapping the whole struct.
+// In that case we need to skip adding an entry for the struct to the
+// CombinedInfo list and instead add an entry to the StructBaseCombinedInfo
+// list only when generating all info for clauses.
+bool IsMappingWholeStruct = true;
+if (!GenerateAllInfoForClauses) {
+  IsMappingWholeStruct = false;
+} else {
+  for (auto TempI = I; TempI != CE; ++TempI) {
+const MemberExpr *PossibleME =
+dyn_cast<MemberExpr>(TempI->getAssociatedExpression());
+if (PossibleME) {
+  IsMappingWholeStruct = false;
+  break;
+}
+  }
+}
+
 for (; I != CE; ++I) {
   // If the current component is member of a struct (parent struct) mark it.
   if (!EncounteredME) {
@@ -7317,21 +7338,41 @@ class MappableExprsHandler {
   break;
 }
 llvm::Value *Size = getExprTypeSize(I->getAssociatedExpression());
+// Skip adding an entry in the CurInfo of this combined entry if the
+// whole struct is currently being mapped. The struct needs to be added
+// in the first position before any data internal to the struct is being
+// mapped.
 if (!IsMemberPointerOrAddr ||
 (Next == CE && MapType != OMPC_MAP_unknown)) {
-  CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
-  CombinedInfo.BasePointers.push_back(BP.getPointer());
-  CombinedInfo.DevicePtrDecls.push_back(nullptr);
-  CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
-  CombinedInfo.Pointers.push_back(LB.getPointer());
-  CombinedInfo.Sizes.push_back(
-  CGF.Builder.CreateIntCast(Size, CGF.Int64Ty, /*isSigned=*/true));
-  CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
-: 1);
+  if (!IsMappingWholeStruct) {
+CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
+CombinedInfo.BasePointers.push_back(BP.getPointer());
+CombinedInfo.DevicePtrDecls.push_back(nullptr);
+CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
+CombinedInfo.Pointers.push_back(LB.getPointer());
+CombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast(
+Size, CGF.Int64Ty, /*isSigned=*/true));
+CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
+  : 1);
+  } else {
+StructBaseCombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
+StructBaseCombinedInfo.BasePointers.push_back(BP.getPointer());
+StructBaseCombinedInfo.DevicePtrDecls.push_back(nullptr);
+StructBaseCombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
+StructBaseCombinedInfo.Pointers.push_back(LB.getPointer());
+StructBaseCombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast(
+Size, CGF.Int64Ty, /*isSigned=*/true));
+StructBaseCombinedInfo.NonContigInfo.Dims.push_back(
+

[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)

2023-12-15 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/75642

>From e0e1f5e7bb2f95f2568b5dd647b883f4740bcafd Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Fri, 15 Dec 2023 10:22:38 -0500
Subject: [PATCH] Fix mapping of structs to device.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 146 +++
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 114 
 3 files changed, 399 insertions(+), 33 deletions(-)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 7f7e6f53066644..350e7108b8d5a7 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -6811,8 +6811,10 @@ class MappableExprsHandler {
   OpenMPMapClauseKind MapType, ArrayRef 
MapModifiers,
   ArrayRef MotionModifiers,
   OMPClauseMappableExprCommon::MappableExprComponentListRef Components,
-  MapCombinedInfoTy , StructRangeInfoTy ,
-  bool IsFirstComponentList, bool IsImplicit,
+  MapCombinedInfoTy ,
+  MapCombinedInfoTy ,
+  StructRangeInfoTy , bool IsFirstComponentList,
+  bool IsImplicit, bool GenerateAllInfoForClauses,
   const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false,
   const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr,
   ArrayRef
@@ -7098,6 +7100,25 @@ class MappableExprsHandler {
 bool IsNonContiguous = CombinedInfo.NonContigInfo.IsNonContiguous;
 bool IsPrevMemberReference = false;
 
+// We need to check if we will be encountering any MEs. If we do not
+// encounter any ME expression it means we will be mapping the whole struct.
+// In that case we need to skip adding an entry for the struct to the
+// CombinedInfo list and instead add an entry to the StructBaseCombinedInfo
+// list only when generating all info for clauses.
+bool IsMappingWholeStruct = true;
+if (!GenerateAllInfoForClauses) {
+  IsMappingWholeStruct = false;
+} else {
+  for (auto TempI = I; TempI != CE; ++TempI) {
+const MemberExpr *PossibleME =
+dyn_cast<MemberExpr>(TempI->getAssociatedExpression());
+if (PossibleME) {
+  IsMappingWholeStruct = false;
+  break;
+}
+  }
+}
+
 for (; I != CE; ++I) {
   // If the current component is member of a struct (parent struct) mark it.
   if (!EncounteredME) {
@@ -7317,21 +7338,41 @@ class MappableExprsHandler {
   break;
 }
 llvm::Value *Size = getExprTypeSize(I->getAssociatedExpression());
+// Skip adding an entry in the CurInfo of this combined entry if the
+// whole struct is currently being mapped. The struct needs to be added
+// in the first position before any data internal to the struct is being
+// mapped.
 if (!IsMemberPointerOrAddr ||
 (Next == CE && MapType != OMPC_MAP_unknown)) {
-  CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
-  CombinedInfo.BasePointers.push_back(BP.getPointer());
-  CombinedInfo.DevicePtrDecls.push_back(nullptr);
-  CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
-  CombinedInfo.Pointers.push_back(LB.getPointer());
-  CombinedInfo.Sizes.push_back(
-  CGF.Builder.CreateIntCast(Size, CGF.Int64Ty, /*isSigned=*/true));
-  CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
-: 1);
+  if (!IsMappingWholeStruct) {
+CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
+CombinedInfo.BasePointers.push_back(BP.getPointer());
+CombinedInfo.DevicePtrDecls.push_back(nullptr);
+CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
+CombinedInfo.Pointers.push_back(LB.getPointer());
+CombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast(
+Size, CGF.Int64Ty, /*isSigned=*/true));
+CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
+  : 1);
+  } else {
+StructBaseCombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
+StructBaseCombinedInfo.BasePointers.push_back(BP.getPointer());
+StructBaseCombinedInfo.DevicePtrDecls.push_back(nullptr);
+StructBaseCombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
+StructBaseCombinedInfo.Pointers.push_back(LB.getPointer());
+StructBaseCombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast(
+Size, CGF.Int64Ty, /*isSigned=*/true));
+StructBaseCombinedInfo.NonContigInfo.Dims.push_back(
+

[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)

2023-12-15 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/75642

>From ae6cf04a149f00f52c1da8e7b9c1ca3af5393f99 Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Fri, 15 Dec 2023 10:22:38 -0500
Subject: [PATCH] Fix mapping of structs to device.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 147 +++
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 114 
 3 files changed, 400 insertions(+), 33 deletions(-)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 7f7e6f53066644..02f5d8fca7090c 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -6811,8 +6811,10 @@ class MappableExprsHandler {
   OpenMPMapClauseKind MapType, ArrayRef 
MapModifiers,
   ArrayRef MotionModifiers,
   OMPClauseMappableExprCommon::MappableExprComponentListRef Components,
-  MapCombinedInfoTy , StructRangeInfoTy ,
-  bool IsFirstComponentList, bool IsImplicit,
+  MapCombinedInfoTy ,
+  MapCombinedInfoTy ,
+  StructRangeInfoTy , bool IsFirstComponentList,
+  bool IsImplicit, bool GenerateAllInfoForClauses,
   const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false,
   const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr,
   ArrayRef
@@ -7098,6 +7100,25 @@ class MappableExprsHandler {
 bool IsNonContiguous = CombinedInfo.NonContigInfo.IsNonContiguous;
 bool IsPrevMemberReference = false;
 
+// We need to check if we will be encountering any MEs. If we do not
+// encounter any ME expression it means we will be mapping the whole struct.
+// In that case we need to skip adding an entry for the struct to the
+// CombinedInfo list and instead add an entry to the StructBaseCombinedInfo
+// list only when generating all info for clauses.
+bool IsMappingWholeStruct = true;
+if (!GenerateAllInfoForClauses) {
+  IsMappingWholeStruct = false;
+} else {
+  for (auto TempI = I; TempI != CE; ++TempI) {
+const MemberExpr *PossibleME =
+dyn_cast<MemberExpr>(TempI->getAssociatedExpression());
+if (PossibleME) {
+  IsMappingWholeStruct = false;
+  break;
+}
+  }
+}
+
 for (; I != CE; ++I) {
   // If the current component is member of a struct (parent struct) mark it.
   if (!EncounteredME) {
@@ -7317,21 +7338,41 @@ class MappableExprsHandler {
   break;
 }
 llvm::Value *Size = getExprTypeSize(I->getAssociatedExpression());
+// Skip adding an entry in the CurInfo of this combined entry if the
+// whole struct is currently being mapped. The struct needs to be added
+// in the first position before any data internal to the struct is being
+// mapped.
 if (!IsMemberPointerOrAddr ||
 (Next == CE && MapType != OMPC_MAP_unknown)) {
-  CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
-  CombinedInfo.BasePointers.push_back(BP.getPointer());
-  CombinedInfo.DevicePtrDecls.push_back(nullptr);
-  CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
-  CombinedInfo.Pointers.push_back(LB.getPointer());
-  CombinedInfo.Sizes.push_back(
-  CGF.Builder.CreateIntCast(Size, CGF.Int64Ty, /*isSigned=*/true));
-  CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
-: 1);
+  if (!IsMappingWholeStruct) {
+CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
+CombinedInfo.BasePointers.push_back(BP.getPointer());
+CombinedInfo.DevicePtrDecls.push_back(nullptr);
+CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
+CombinedInfo.Pointers.push_back(LB.getPointer());
+CombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast(
+Size, CGF.Int64Ty, /*isSigned=*/true));
+CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
+  : 1);
+  } else {
+StructBaseCombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
+StructBaseCombinedInfo.BasePointers.push_back(BP.getPointer());
+StructBaseCombinedInfo.DevicePtrDecls.push_back(nullptr);
+StructBaseCombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
+StructBaseCombinedInfo.Pointers.push_back(LB.getPointer());
+StructBaseCombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast(
+Size, CGF.Int64Ty, /*isSigned=*/true));
+StructBaseCombinedInfo.NonContigInfo.Dims.push_back(
+

[clang] [openmp] [Clang][OpenMP] Fix mapping of structs to device (PR #75642)

2023-12-15 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 created 
https://github.com/llvm/llvm-project/pull/75642

Fix mapping of structs to device.

The following example fails:

```
#include <stdio.h>
#include <stdlib.h>

struct Descriptor {
  int *datum;
  long int x;
  int xi;
  long int arr[1][30];
};

int main() {
  Descriptor dat = Descriptor();
  dat.datum = (int *)malloc(sizeof(int) * 10);
  dat.xi = 3;
  dat.arr[0][0] = 1;

  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)

  #pragma omp target
  {
    dat.xi = 4;
    dat.datum[dat.arr[0][0]] = dat.xi;
  }

  #pragma omp target exit data map(from: dat)

  return 0;
}
```

This is a rework of the previous attempt: 
https://github.com/llvm/llvm-project/pull/72410
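
The ordering idea behind the fix can be sketched outside of Clang as follows. This is a heavily simplified model, not the CGOpenMPRuntime implementation: `MapRequest`, `MapList`, and `orderEntries` are hypothetical names, and the `IsWholeStruct` field stands in for the patch's check that no member expression appears in the component list. The sketch only shows that whole-struct entries are collected in a separate list and placed ahead of entries for data reached through the struct's members.

```cpp
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for a list of map entries.
using MapList = std::vector<std::string>;

struct MapRequest {
  std::string Expr;   // e.g. "dat" or "dat.datum[:10]"
  bool IsWholeStruct; // true when the expression contains no member access
};

// Route whole-struct entries to a separate list, then emit that list first so
// the struct is mapped before any data reached through its members.
MapList orderEntries(const std::vector<MapRequest> &Requests) {
  MapList StructBase, Members;
  for (const MapRequest &R : Requests)
    (R.IsWholeStruct ? StructBase : Members).push_back(R.Expr);
  MapList Result = StructBase;
  Result.insert(Result.end(), Members.begin(), Members.end());
  return Result;
}
```

For the failing example, the clauses arrive as `dat.datum[:10]` followed by `dat`; the sketch yields `dat` first regardless of the order in which the clauses were written.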

>From 2dc40b67e55985de4e9e89758d6c65eb73faac02 Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Fri, 15 Dec 2023 10:22:38 -0500
Subject: [PATCH] Fix mapping of structs to device.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 147 +++
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 115 
 3 files changed, 401 insertions(+), 33 deletions(-)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 7f7e6f53066644..02f5d8fca7090c 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -6811,8 +6811,10 @@ class MappableExprsHandler {
   OpenMPMapClauseKind MapType, ArrayRef 
MapModifiers,
   ArrayRef MotionModifiers,
   OMPClauseMappableExprCommon::MappableExprComponentListRef Components,
-  MapCombinedInfoTy , StructRangeInfoTy ,
-  bool IsFirstComponentList, bool IsImplicit,
+  MapCombinedInfoTy ,
+  MapCombinedInfoTy ,
+  StructRangeInfoTy , bool IsFirstComponentList,
+  bool IsImplicit, bool GenerateAllInfoForClauses,
   const ValueDecl *Mapper = nullptr, bool ForDeviceAddr = false,
   const ValueDecl *BaseDecl = nullptr, const Expr *MapExpr = nullptr,
   ArrayRef
@@ -7098,6 +7100,25 @@ class MappableExprsHandler {
 bool IsNonContiguous = CombinedInfo.NonContigInfo.IsNonContiguous;
 bool IsPrevMemberReference = false;
 
+// We need to check if we will be encountering any MEs. If we do not
+// encounter any ME expression it means we will be mapping the whole struct.
+// In that case we need to skip adding an entry for the struct to the
+// CombinedInfo list and instead add an entry to the StructBaseCombinedInfo
+// list only when generating all info for clauses.
+bool IsMappingWholeStruct = true;
+if (!GenerateAllInfoForClauses) {
+  IsMappingWholeStruct = false;
+} else {
+  for (auto TempI = I; TempI != CE; ++TempI) {
+const MemberExpr *PossibleME =
+dyn_cast<MemberExpr>(TempI->getAssociatedExpression());
+if (PossibleME) {
+  IsMappingWholeStruct = false;
+  break;
+}
+  }
+}
+
 for (; I != CE; ++I) {
   // If the current component is member of a struct (parent struct) mark it.
   if (!EncounteredME) {
@@ -7317,21 +7338,41 @@ class MappableExprsHandler {
   break;
 }
 llvm::Value *Size = getExprTypeSize(I->getAssociatedExpression());
+// Skip adding an entry in the CurInfo of this combined entry if the
+// whole struct is currently being mapped. The struct needs to be added
+// in the first position before any data internal to the struct is being
+// mapped.
 if (!IsMemberPointerOrAddr ||
 (Next == CE && MapType != OMPC_MAP_unknown)) {
-  CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
-  CombinedInfo.BasePointers.push_back(BP.getPointer());
-  CombinedInfo.DevicePtrDecls.push_back(nullptr);
-  CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
-  CombinedInfo.Pointers.push_back(LB.getPointer());
-  CombinedInfo.Sizes.push_back(
-  CGF.Builder.CreateIntCast(Size, CGF.Int64Ty, /*isSigned=*/true));
-  CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
-: 1);
+  if (!IsMappingWholeStruct) {
+CombinedInfo.Exprs.emplace_back(MapDecl, MapExpr);
+CombinedInfo.BasePointers.push_back(BP.getPointer());
+CombinedInfo.DevicePtrDecls.push_back(nullptr);
+CombinedInfo.DevicePointers.push_back(DeviceInfoTy::None);
+CombinedInfo.Pointers.push_back(LB.getPointer());
+CombinedInfo.Sizes.push_back(CGF.Builder.CreateIntCast(
+Size, CGF.Int64Ty, /*isSigned=*/true));
+CombinedInfo.NonContigInfo.Dims.push_back(IsNonContiguous ? DimSize
+  : 

[clang] [openmp] [OpenMP][Fix] Fix test initializations (PR #74797)

2023-12-07 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 closed 
https://github.com/llvm/llvm-project/pull/74797


[openmp] [clang] [OpenMP][Fix] Fix test initializations (PR #74797)

2023-12-07 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 created 
https://github.com/llvm/llvm-project/pull/74797

Make sure arrays used in test are properly initialized.
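
As an illustration of the kind of initialization meant here, a minimal sketch follows. The description does not say which arrays are affected, so this is a generic pattern and not the actual test change; `makeInitializedArray` is a hypothetical helper. The point is that memory returned by `malloc` has indeterminate contents, so a test that later checks element values should write every element first.

```cpp
#include <cstdlib>

// Allocate N ints and give every element a defined value, so later checks on
// the contents do not depend on whatever malloc happened to return.
int *makeInitializedArray(std::size_t N, int Value) {
  int *Data = static_cast<int *>(std::malloc(N * sizeof(int)));
  if (!Data)
    return nullptr; // allocation failed; the caller must handle this
  for (std::size_t I = 0; I < N; ++I)
    Data[I] = Value;
  return Data;
}
```

The caller owns the returned buffer and releases it with `std::free`.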

>From 6712acd1175d1d6d55ce261651a543872a221c9a Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 15 Nov 2023 11:07:09 -0500
Subject: [PATCH 1/2] Fix ordering when mapping a struct.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp |  22 +++
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 114 
 3 files changed, 308 insertions(+)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b..84a6b36646897 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
 continue;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && !isa(*EI)) {
+SortedMapClauses.emplace_back(C);
+  }
+}
+for (const auto *Cl : Clauses) {
+  const auto *C = dyn_cast(Cl);
+  if (!C)
+continue;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && isa(*EI)) {
+SortedMapClauses.emplace_back(C);
+  }
+}
+
+// Iterate over all map clauses:
+for (const OMPMapClause *C : SortedMapClauses) {
   MapKind Kind = Other;
   if (llvm::is_contained(C->getMapTypeModifiers(),
  OMPC_MAP_MODIFIER_present))
@@ -7751,6 +7771,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7767,6 +7788,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
diff --git a/clang/test/OpenMP/map_struct_ordering.cpp 
b/clang/test/OpenMP/map_struct_ordering.cpp
new file mode 100644
index 0..035b39b5b12ab
--- /dev/null
+++ b/clang/test/OpenMP/map_struct_ordering.cpp
@@ -0,0 +1,172 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4
+
+// RUN: %clang_cc1  -verify -fopenmp -x c++ -std=c++11 -triple 
powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu 
-emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK
+
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+struct Descriptor {
+  int *datum;
+  long int x;
+  int xi;
+  long int arr[1][30];
+};
+
+int map_struct() {
+  Descriptor dat = Descriptor();
+  dat.xi = 3;
+  dat.arr[0][0] = 1;
+
+  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)
+
+  #pragma omp target
+  {
+dat.xi = 4;
+dat.datum[dat.arr[0][0]] = dat.xi;
+  }
+
+  #pragma omp target exit data map(from: dat)
+
+  return dat.xi;
+}
+
+#endif
+// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS4:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS5:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS6:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[KERNEL_ARGS:%.*]] = alloca 
[[STRUCT___TGT_KERNEL_ARGUMENTS:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS7:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS8:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS9:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:call void @llvm.memset.p0.i64(ptr align 8 [[DAT]], i8 0, i64 
264, i1 false)
+// CHECK-NEXT:[[XI:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], 
ptr [[DAT]], i32 0, i32 2
+// CHECK-NEXT:store i32 3, ptr [[XI]], align 8
+// CHECK-NEXT:[[ARR:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], 
ptr [[DAT]], i32 0, i32 3
+// CHECK-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [1 x [30 x i64]], 
ptr [[ARR]], i64 0, i64 0
+// CHECK-NEXT:

[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

@alexey-bataev I have looked at the code again and I really can't see another 
solution to this problem. If you have a different fix in mind please let me 
know.

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

I can't find any "bug" in the existing code. It works as intended. The problem 
is that it doesn't handle these types of situations and I don't see how else to 
fix an ordering problem other than by re-ordering. If you have a different 
solution in mind please let me know.

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

Well, I don't see anything wrong other than the order, and the order comes from 
how the user wrote the code, so I am not sure how else to fix it.

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

Implicit ordering isn't working for the example above; please see the code. 
The entries are in the wrong order in the runtime, and the problem starts here.

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

At runtime, if things happen in the wrong order, the processing of the base 
struct overwrites the pointer attachment for the array.
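
A toy model of that overwrite, purely as an illustration: the names below (`ToyDevice`, `mapSection`, `mapStruct`, `pointerAttached`) are hypothetical and do not mirror libomptarget's real data structures. It assumes the struct map copies the host bytes, pointer field included, over the device copy.

```cpp
#include <cassert>
#include <cstring>

// Simplified version of the struct from the test; only the fields needed here.
struct Descriptor {
  int *datum;
  int xi;
};

// Hypothetical "device" state: a device copy of the struct plus a device
// buffer standing in for the mapped section dat.datum[:10].
struct ToyDevice {
  Descriptor DevDat{};
  int DevBuf[10] = {};
};

// Toy pointer attachment: the device struct's pointer field is made to point
// at the device buffer.
void mapSection(ToyDevice &D) { D.DevDat.datum = D.DevBuf; }

// Toy map(to: dat): a bytewise copy of the host struct, which also copies the
// host value of the pointer field (an assumption of this model).
void mapStruct(ToyDevice &D, const Descriptor &Host) {
  std::memcpy(&D.DevDat, &Host, sizeof(Descriptor));
}

// True if the device struct still points at the device buffer.
bool pointerAttached(const ToyDevice &D) { return D.DevDat.datum == D.DevBuf; }
```

In this model, calling `mapSection` and then `mapStruct` leaves the device struct holding the host pointer, while the reverse order keeps the attachment.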

https://github.com/llvm/llvm-project/pull/72410


[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

@alexey-bataev I agree it's not ideal. The problem is related to the order in 
which the clauses are processed. We cannot process the base struct after we 
have processed an array section inside the struct.

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

It's a form of sorting; it's more like a split between the maps that contain 
sections and the ones that don't.
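
As a rough sketch of that split, assuming a hypothetical `MapSketch` stand-in rather than Clang's `OMPMapClause`: the two-pass loop has the same shape as the patch, and `std::stable_partition` produces the same order. The sketch does not model clauses whose first variable reference is null.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a map clause; this is not a Clang type.
struct MapSketch {
  std::string Text; // e.g. "dat" or "dat.datum[:10]"
  bool HasSection;  // true if the mapped expression is an array section
};

// Two-pass split: non-section maps first, then section maps. Each group keeps
// the order in which the user wrote the clauses.
std::vector<MapSketch> sectionsLast(const std::vector<MapSketch> &Clauses) {
  std::vector<MapSketch> Sorted;
  Sorted.reserve(Clauses.size());
  for (const MapSketch &C : Clauses)
    if (!C.HasSection)
      Sorted.push_back(C);
  for (const MapSketch &C : Clauses)
    if (C.HasSection)
      Sorted.push_back(C);
  assert(Sorted.size() == Clauses.size() && "every clause lands in one group");
  return Sorted;
}

// The same ordering expressed with the standard library algorithm.
std::vector<MapSketch> sectionsLastStd(std::vector<MapSketch> Clauses) {
  std::stable_partition(
      Clauses.begin(), Clauses.end(),
      [](const MapSketch &C) { return !C.HasSection; });
  return Clauses;
}
```

Either form is a stable two-way partition rather than a general sort, so no comparison between individual clauses is needed.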

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

Ah yes, so I just moved all the maps containing sections to the end of the 
clause list. I want those maps to happen last, after all the structs and other 
maps have happened.

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

Are you asking what the sorting criterion is?

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-21 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;

doru1004 wrote:

I don't understand the question.

https://github.com/llvm/llvm-project/pull/72410


[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-20 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/72410

>From 6712acd1175d1d6d55ce261651a543872a221c9a Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 15 Nov 2023 11:07:09 -0500
Subject: [PATCH] Fix ordering when mapping a struct.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp |  22 +++
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 114 
 3 files changed, 308 insertions(+)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b31..84a6b36646897d7 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7731,10 +7731,30 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
 continue;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && !isa(*EI)) {
+SortedMapClauses.emplace_back(C);
+  }
+}
+for (const auto *Cl : Clauses) {
+  const auto *C = dyn_cast(Cl);
+  if (!C)
+continue;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && isa(*EI)) {
+SortedMapClauses.emplace_back(C);
+  }
+}
+
+// Iterate over all map clauses:
+for (const OMPMapClause *C : SortedMapClauses) {
   MapKind Kind = Other;
   if (llvm::is_contained(C->getMapTypeModifiers(),
  OMPC_MAP_MODIFIER_present))
@@ -7751,6 +7771,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7767,6 +7788,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
diff --git a/clang/test/OpenMP/map_struct_ordering.cpp 
b/clang/test/OpenMP/map_struct_ordering.cpp
new file mode 100644
index 000..035b39b5b12ab4a
--- /dev/null
+++ b/clang/test/OpenMP/map_struct_ordering.cpp
@@ -0,0 +1,172 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4
+
+// RUN: %clang_cc1  -verify -fopenmp -x c++ -std=c++11 -triple 
powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu 
-emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK
+
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+struct Descriptor {
+  int *datum;
+  long int x;
+  int xi;
+  long int arr[1][30];
+};
+
+int map_struct() {
+  Descriptor dat = Descriptor();
+  dat.xi = 3;
+  dat.arr[0][0] = 1;
+
+  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)
+
+  #pragma omp target
+  {
+dat.xi = 4;
+dat.datum[dat.arr[0][0]] = dat.xi;
+  }
+
+  #pragma omp target exit data map(from: dat)
+
+  return dat.xi;
+}
+
+#endif
+// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS4:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS5:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS6:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[KERNEL_ARGS:%.*]] = alloca 
[[STRUCT___TGT_KERNEL_ARGUMENTS:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS7:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS8:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS9:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:call void @llvm.memset.p0.i64(ptr align 8 [[DAT]], i8 0, i64 
264, i1 false)
+// CHECK-NEXT:[[XI:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], 
ptr [[DAT]], i32 0, i32 2
+// CHECK-NEXT:store i32 3, ptr [[XI]], align 8
+// CHECK-NEXT:[[ARR:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], 
ptr [[DAT]], i32 0, i32 3
+// CHECK-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [1 x [30 x i64]], 
ptr [[ARR]], i64 0, i64 0
+// CHECK-NEXT:[[ARRAYIDX1:%.*]] = getelementptr inbounds [30 x i64], ptr 

[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-20 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/72410

>From 2ea93a7b4841671dc12ee39a25a66c536d92d83f Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 15 Nov 2023 11:07:09 -0500
Subject: [PATCH] Fix ordering when mapping a struct.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp |  23 +++
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 114 
 3 files changed, 309 insertions(+)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b31..b4b8794947687c0 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7731,10 +7731,31 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
 continue;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && !isa(*EI)) {
+SortedMapClauses.emplace_back(C);
+  }
+}
+for (const auto *Cl : Clauses) {
+  const auto *C = dyn_cast(Cl);
+  if (!C)
+continue;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && isa(*EI)) {
+SortedMapClauses.emplace_back(C);
+  }
+}
+
+// Iterate over all non-section maps first to avoid overwriting pointer
+// attachment.
+for (const OMPMapClause *C : SortedMapClauses) {
   MapKind Kind = Other;
   if (llvm::is_contained(C->getMapTypeModifiers(),
  OMPC_MAP_MODIFIER_present))
@@ -7751,6 +7772,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7767,6 +7789,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
diff --git a/clang/test/OpenMP/map_struct_ordering.cpp 
b/clang/test/OpenMP/map_struct_ordering.cpp
new file mode 100644
index 000..035b39b5b12ab4a
--- /dev/null
+++ b/clang/test/OpenMP/map_struct_ordering.cpp
@@ -0,0 +1,172 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4
+
+// RUN: %clang_cc1  -verify -fopenmp -x c++ -std=c++11 -triple 
powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu 
-emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK
+
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+struct Descriptor {
+  int *datum;
+  long int x;
+  int xi;
+  long int arr[1][30];
+};
+
+int map_struct() {
+  Descriptor dat = Descriptor();
+  dat.xi = 3;
+  dat.arr[0][0] = 1;
+
+  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)
+
+  #pragma omp target
+  {
+dat.xi = 4;
+dat.datum[dat.arr[0][0]] = dat.xi;
+  }
+
+  #pragma omp target exit data map(from: dat)
+
+  return dat.xi;
+}
+
+#endif
+// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS4:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS5:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS6:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[KERNEL_ARGS:%.*]] = alloca 
[[STRUCT___TGT_KERNEL_ARGUMENTS:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS7:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS8:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS9:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:call void @llvm.memset.p0.i64(ptr align 8 [[DAT]], i8 0, i64 
264, i1 false)
+// CHECK-NEXT:[[XI:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], 
ptr [[DAT]], i32 0, i32 2
+// CHECK-NEXT:store i32 3, ptr [[XI]], align 8
+// CHECK-NEXT:[[ARR:%.*]] = getelementptr inbounds [[STRUCT_DESCRIPTOR]], 
ptr [[DAT]], i32 0, i32 3
+// CHECK-NEXT:[[ARRAYIDX:%.*]] = getelementptr inbounds [1 x [30 x i64]], 
ptr [[ARR]], i64 0, i64 0
+// CHECK-NEXT:

[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-20 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/72410

>From d29229095203dccdee5ded18c0df0474e006ad53 Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 15 Nov 2023 11:07:09 -0500
Subject: [PATCH] Fix ordering when mapping a struct.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp |  27 ++-
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 114 
 3 files changed, 311 insertions(+), 2 deletions(-)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b31..a39115300fa641e 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7731,10 +7731,31 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Sort all map clauses and make sure all the maps containing array
+// sections are processed last.
+llvm::SmallVector SortedMapClauses;
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
 continue;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && !isa(*EI)) {
+SortedMapClauses.emplace_back(C);
+  }
+}
+for (const auto *Cl : Clauses) {
+  const auto *C = dyn_cast(Cl);
+  if (!C)
+continue;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && isa(*EI)) {
+SortedMapClauses.emplace_back(C);
+  }
+}
+
+// Iterate over all non-section maps first to avoid overwriting pointer
+// attachment.
+for (const OMPMapClause *C : SortedMapClauses) {
   MapKind Kind = Other;
   if (llvm::is_contained(C->getMapTypeModifiers(),
  OMPC_MAP_MODIFIER_present))
@@ -7746,11 +7767,12 @@ class MappableExprsHandler {
 const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
 InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
 C->getMapTypeModifiers(), std::nullopt,
-/*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L),
-E);
+/*ReturnDevicePointer=*/false, C->isImplicit(),
+std::get<2>(L), E);
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7767,6 +7789,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
diff --git a/clang/test/OpenMP/map_struct_ordering.cpp 
b/clang/test/OpenMP/map_struct_ordering.cpp
new file mode 100644
index 000..035b39b5b12ab4a
--- /dev/null
+++ b/clang/test/OpenMP/map_struct_ordering.cpp
@@ -0,0 +1,172 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4
+
+// RUN: %clang_cc1  -verify -fopenmp -x c++ -std=c++11 -triple 
powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu 
-emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK
+
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+struct Descriptor {
+  int *datum;
+  long int x;
+  int xi;
+  long int arr[1][30];
+};
+
+int map_struct() {
+  Descriptor dat = Descriptor();
+  dat.xi = 3;
+  dat.arr[0][0] = 1;
+
+  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)
+
+  #pragma omp target
+  {
+dat.xi = 4;
+dat.datum[dat.arr[0][0]] = dat.xi;
+  }
+
+  #pragma omp target exit data map(from: dat)
+
+  return dat.xi;
+}
+
+#endif
+// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS4:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS5:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS6:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[KERNEL_ARGS:%.*]] = alloca 
[[STRUCT___TGT_KERNEL_ARGUMENTS:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS7:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS8:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS9:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:call void @llvm.memset.p0.i64(ptr align 8 [[DAT]], 

[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-15 Thread Gheorghe-Teodor Bercea via cfe-commits

doru1004 wrote:

> This being in clang instead seems like a good change. Are there no CodeGen 
> tests changed? We should add one if so. Probably just take your 
> `libomptarget` test and run `update_cc_test_checks` on it with the arguments 
> found in other test files.

Just added the test.

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-15 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/72410

>From a16ffab67e8f8134fd943761da730c120bbae88d Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 15 Nov 2023 11:07:09 -0500
Subject: [PATCH] Fix ordering when mapping a struct.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp |  44 -
 clang/test/OpenMP/map_struct_ordering.cpp | 172 ++
 .../struct_mapping_with_pointers.cpp  | 114 
 3 files changed, 323 insertions(+), 7 deletions(-)
 create mode 100644 clang/test/OpenMP/map_struct_ordering.cpp
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b31..0079530f90f723d 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7731,6 +7731,8 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Iterate over all non-section maps first to avoid overwriting pointer
+// attachment.
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7742,15 +7744,42 @@ class MappableExprsHandler {
   else if (C->getMapType() == OMPC_MAP_alloc)
 Kind = Allocs;
   const auto *EI = C->getVarRefs().begin();
-  for (const auto L : C->component_lists()) {
-const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
-InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
-C->getMapTypeModifiers(), std::nullopt,
-/*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L),
-E);
-++EI;
+  if (*EI && !isa(*EI)) {
+for (const auto L : C->component_lists()) {
+  const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
+  InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
+  C->getMapTypeModifiers(), std::nullopt,
+  /*ReturnDevicePointer=*/false, C->isImplicit(),
+  std::get<2>(L), E);
+  ++EI;
+}
+  }
+}
+
+// Process the maps with sections.
+for (const auto *Cl : Clauses) {
+  const auto *C = dyn_cast(Cl);
+  if (!C)
+continue;
+  MapKind Kind = Other;
+  if (llvm::is_contained(C->getMapTypeModifiers(),
+ OMPC_MAP_MODIFIER_present))
+Kind = Present;
+  else if (C->getMapType() == OMPC_MAP_alloc)
+Kind = Allocs;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && isa(*EI)) {
+for (const auto L : C->component_lists()) {
+  const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
+  InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
+  C->getMapTypeModifiers(), std::nullopt,
+  /*ReturnDevicePointer=*/false, C->isImplicit(),
+  std::get<2>(L), E);
+  ++EI;
+}
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7767,6 +7796,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
diff --git a/clang/test/OpenMP/map_struct_ordering.cpp 
b/clang/test/OpenMP/map_struct_ordering.cpp
new file mode 100644
index 000..035b39b5b12ab4a
--- /dev/null
+++ b/clang/test/OpenMP/map_struct_ordering.cpp
@@ -0,0 +1,172 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" --prefix-filecheck-ir-name _ --version 4
+
+// RUN: %clang_cc1  -verify -fopenmp -x c++ -std=c++11 -triple 
powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu 
-emit-llvm %s -o - -Wno-openmp-mapping | FileCheck %s --check-prefix=CHECK
+
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+struct Descriptor {
+  int *datum;
+  long int x;
+  int xi;
+  long int arr[1][30];
+};
+
+int map_struct() {
+  Descriptor dat = Descriptor();
+  dat.xi = 3;
+  dat.arr[0][0] = 1;
+
+  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)
+
+  #pragma omp target
+  {
+dat.xi = 4;
+dat.datum[dat.arr[0][0]] = dat.xi;
+  }
+
+  #pragma omp target exit data map(from: dat)
+
+  return dat.xi;
+}
+
+#endif
+// CHECK-LABEL: define dso_local noundef signext i32 @_Z10map_structv(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[DAT:%.*]] = alloca [[STRUCT_DESCRIPTOR:%.*]], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_MAPPERS:%.*]] = alloca [3 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_SIZES:%.*]] = alloca [3 x i64], 

[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-15 Thread Gheorghe-Teodor Bercea via cfe-commits

doru1004 wrote:

> This being in clang instead seems like a good change. Are there no CodeGen 
> tests changed? We should add one if so. Probably just take your 
> `libomptarget` test and run `update_cc_test_checks` on it with the arguments 
> found in other test files.

No code gen test changes. Happy to add one no problem.

https://github.com/llvm/llvm-project/pull/72410


[clang] [openmp] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-15 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/72410

>From ed9d50576cf167b4d9017e55333220d1601d088f Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 15 Nov 2023 11:07:09 -0500
Subject: [PATCH] Fix ordering when mapping a struct.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp |  44 +--
 .../struct_mapping_with_pointers.cpp  | 114 ++
 2 files changed, 151 insertions(+), 7 deletions(-)
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b31..0079530f90f723d 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7731,6 +7731,8 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Iterate over all non-section maps first to avoid overwriting pointer
+// attachment.
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7742,15 +7744,42 @@ class MappableExprsHandler {
   else if (C->getMapType() == OMPC_MAP_alloc)
 Kind = Allocs;
   const auto *EI = C->getVarRefs().begin();
-  for (const auto L : C->component_lists()) {
-const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
-InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
-C->getMapTypeModifiers(), std::nullopt,
-/*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L),
-E);
-++EI;
+  if (*EI && !isa(*EI)) {
+for (const auto L : C->component_lists()) {
+  const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
+  InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
+  C->getMapTypeModifiers(), std::nullopt,
+  /*ReturnDevicePointer=*/false, C->isImplicit(),
+  std::get<2>(L), E);
+  ++EI;
+}
+  }
+}
+
+// Process the maps with sections.
+for (const auto *Cl : Clauses) {
+  const auto *C = dyn_cast(Cl);
+  if (!C)
+continue;
+  MapKind Kind = Other;
+  if (llvm::is_contained(C->getMapTypeModifiers(),
+ OMPC_MAP_MODIFIER_present))
+Kind = Present;
+  else if (C->getMapType() == OMPC_MAP_alloc)
+Kind = Allocs;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && isa(*EI)) {
+for (const auto L : C->component_lists()) {
+  const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
+  InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
+  C->getMapTypeModifiers(), std::nullopt,
+  /*ReturnDevicePointer=*/false, C->isImplicit(),
+  std::get<2>(L), E);
+  ++EI;
+}
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7767,6 +7796,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
diff --git 
a/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp 
b/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp
new file mode 100644
index 000..c7ce4bade8de9a2
--- /dev/null
+++ b/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp
@@ -0,0 +1,114 @@
+// clang-format off
+// RUN: %libomptarget-compilexx-generic && env LIBOMPTARGET_DEBUG=1 
%libomptarget-run-generic 2>&1 | %fcheck-generic
+// clang-format on
+
+#include <stdio.h>
+#include <stdlib.h>
+
+struct Descriptor {
+  int *datum;
+  long int x;
+  int *more_datum;
+  int xi;
+  int val_datum, val_more_datum;
+  long int arr[1][30];
+  int val_arr;
+};
+
+int main() {
+  Descriptor dat = Descriptor();
+  dat.datum = (int *)malloc(sizeof(int) * 10);
+  dat.more_datum = (int *)malloc(sizeof(int) * 20);
+  dat.xi = 3;
+  dat.arr[0][0] = 1;
+
+  dat.datum[7] = 7;
+  dat.more_datum[17] = 17;
+
+  /// The struct is mapped with type 0x0 when the pointer fields are mapped.
+  /// The struct is also mapped explicitly by the user. The second mapping by
+  /// the user must not overwrite the mapping set up for the pointer fields
+  /// when mapping the struct happens after the mapping of the pointers.
+
+  // clang-format off
+  // CHECK: Libomptarget --> Entry  0: Base=[[DAT_HST_PTR_BASE:0x.*]], 
Begin=[[DAT_HST_PTR_BASE]], Size=288, Type=0x0, Name=unknown
+  // CHECK: Libomptarget --> Entry  1: Base=[[DAT_HST_PTR_BASE]], 
Begin=[[DAT_HST_PTR_BASE]], Size=288, Type=0x10001, Name=unknown
+  // CHECK: Libomptarget --> Entry  2: Base=[[DAT_HST_PTR_BASE]], 
Begin=[[DATUM_HST_PTR_BASE:0x.*]], Size=40, Type=0x10011, Name=unknown
+  // CHECK: Libomptarget --> Entry  3: Base=[[MORE_DATUM_HST_PTR_BASE:0x.*]], 
Begin=[[MORE_DATUM_HST_PTR_BEGIN:0x.*]], 

[openmp] [clang] [Clang][OpenMP] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-15 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 edited 
https://github.com/llvm/llvm-project/pull/72410
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [openmp] Fix ordering of processing of map clauses when mapping a struct. (PR #72410)

2023-11-15 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 created 
https://github.com/llvm/llvm-project/pull/72410

Mapping a struct in the wrong order can overwrite the pointer attachment 
details. This patch fixes that by processing map clauses without array 
sections before map clauses with array sections.

Original failing example:

```
#include <stdio.h>
#include <stdlib.h>

struct Descriptor {
  int *datum;
  long int x;
  int xi;
  long int arr[1][30];
};

int main() {
  Descriptor dat = Descriptor();
  dat.datum = (int *)malloc(sizeof(int)*10);
  dat.xi = 3;
  dat.arr[0][0] = 1;

  #pragma omp target enter data map(to: dat.datum[:10]) map(to: dat)

  #pragma omp target
  {
    dat.xi = 4;
    dat.datum[dat.arr[0][0]] = dat.xi;
  }

  #pragma omp target exit data map(from: dat)

  return 0;
}
```
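
The ordering idea behind the fix can be pictured with a small standalone sketch. This is not the Clang implementation: `MapEntry`, `IsSection` and `orderForProcessing` are hypothetical names introduced only for illustration. The sketch models one thing, namely that whole-object maps (such as `dat`) are handled before array-section maps (such as `dat.datum[:10]`), regardless of the order in which the user wrote the clauses.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one map clause entry; not a Clang type.
struct MapEntry {
  std::string Expr; // expression as written in the map clause
  bool IsSection;   // true for array sections such as dat.datum[:10]
};

// Return the entries with all non-section maps first, then section maps.
// The relative order inside each group is preserved (stable partition).
std::vector<MapEntry> orderForProcessing(std::vector<MapEntry> Entries) {
  std::stable_partition(Entries.begin(), Entries.end(),
                        [](const MapEntry &E) { return !E.IsSection; });
  return Entries;
}

// Usage with the clause order from the failing example above:
// the section is written first, the struct second.
std::vector<MapEntry> orderedFailingExample() {
  std::vector<MapEntry> Clauses = {{"dat.datum[:10]", true}, {"dat", false}};
  std::vector<MapEntry> Ordered = orderForProcessing(Clauses);
  assert(Ordered.size() == Clauses.size()); // nothing is dropped
  return Ordered;
}
```

In the patch itself the same effect is obtained with two consecutive loops over the clauses rather than a partition; the sketch only shows the resulting processing order.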

Previous attempt at fixing this: https://github.com/llvm/llvm-project/pull/70821

>From 6f9450b5fa9ff47c35e7498b3a536a218655a9d6 Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 15 Nov 2023 11:07:09 -0500
Subject: [PATCH] Fix ordering when mapping a struct.

---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp |  44 +--
 .../struct_mapping_with_pointers.cpp  | 114 ++
 2 files changed, 151 insertions(+), 7 deletions(-)
 create mode 100644 
openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b31..50518c46152bbaf 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -7731,6 +7731,8 @@ class MappableExprsHandler {
   IsImplicit, Mapper, VarRef, ForDeviceAddr);
 };
 
+// Iterate over all non-section maps first to avoid overwriting pointer
+// attachment.
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast<OMPMapClause>(Cl);
   if (!C)
@@ -7742,15 +7744,42 @@ class MappableExprsHandler {
   else if (C->getMapType() == OMPC_MAP_alloc)
 Kind = Allocs;
   const auto *EI = C->getVarRefs().begin();
-  for (const auto L : C->component_lists()) {
-const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
-InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
-C->getMapTypeModifiers(), std::nullopt,
-/*ReturnDevicePointer=*/false, C->isImplicit(), std::get<2>(L),
-E);
-++EI;
+  if (*EI && !isa<OMPArraySectionExpr>(*EI)) {
+for (const auto L : C->component_lists()) {
+  const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
+  InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
+  C->getMapTypeModifiers(), std::nullopt,
+  /*ReturnDevicePointer=*/false, C->isImplicit(), 
std::get<2>(L),
+  E);
+  ++EI;
+}
+  }
+}
+
+// Process the maps with sections.
+for (const auto *Cl : Clauses) {
+  const auto *C = dyn_cast<OMPMapClause>(Cl);
+  if (!C)
+continue;
+  MapKind Kind = Other;
+  if (llvm::is_contained(C->getMapTypeModifiers(),
+ OMPC_MAP_MODIFIER_present))
+Kind = Present;
+  else if (C->getMapType() == OMPC_MAP_alloc)
+Kind = Allocs;
+  const auto *EI = C->getVarRefs().begin();
+  if (*EI && isa<OMPArraySectionExpr>(*EI)) {
+for (const auto L : C->component_lists()) {
+  const Expr *E = (C->getMapLoc().isValid()) ? *EI : nullptr;
+  InfoGen(std::get<0>(L), Kind, std::get<1>(L), C->getMapType(),
+  C->getMapTypeModifiers(), std::nullopt,
+  /*ReturnDevicePointer=*/false, C->isImplicit(), 
std::get<2>(L),
+  E);
+  ++EI;
+}
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
@@ -7767,6 +7796,7 @@ class MappableExprsHandler {
 ++EI;
   }
 }
+
 for (const auto *Cl : Clauses) {
   const auto *C = dyn_cast(Cl);
   if (!C)
diff --git 
a/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp 
b/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp
new file mode 100644
index 000..c7ce4bade8de9a2
--- /dev/null
+++ b/openmp/libomptarget/test/offloading/struct_mapping_with_pointers.cpp
@@ -0,0 +1,114 @@
+// clang-format off
+// RUN: %libomptarget-compilexx-generic && env LIBOMPTARGET_DEBUG=1 
%libomptarget-run-generic 2>&1 | %fcheck-generic
+// clang-format on
+
+#include <stdio.h>
+#include <stdlib.h>
+
+struct Descriptor {
+  int *datum;
+  long int x;
+  int *more_datum;
+  int xi;
+  int val_datum, val_more_datum;
+  long int arr[1][30];
+  int val_arr;
+};
+
+int main() {
+  Descriptor dat = Descriptor();
+  dat.datum = (int *)malloc(sizeof(int) * 10);
+  dat.more_datum = (int *)malloc(sizeof(int) * 20);
+  dat.xi = 3;
+  dat.arr[0][0] = 1;
+
+  dat.datum[7] = 7;
+  dat.more_datum[17] = 17;
+
+  /// The struct is mapped with type 0x0 when the pointer fields are mapped.
+  /// The struct is also mapped explicitly by the user. The second mapping by
+  /// the user must not 

[libunwind] [OpenMP][libomptarget] Add map checks when running under unified shared memory (PR #69005)

2023-10-16 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/69005

>From cb4121c466a0fc357d6ca129bfdd4e7c5e2d11ee Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 16 Nov 2022 17:23:48 -0600
Subject: [PATCH 1/2] Fix declare target implementation to support enter.

---
 clang/include/clang/Basic/Attr.td |  4 +-
 .../clang/Basic/DiagnosticParseKinds.td   | 12 -
 clang/lib/AST/AttrImpl.cpp|  2 +-
 clang/lib/CodeGen/CGExpr.cpp  | 12 +++--
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 24 ++---
 clang/lib/CodeGen/CodeGenModule.cpp   |  6 ++-
 clang/lib/Parse/ParseOpenMP.cpp   | 39 ++
 clang/lib/Sema/SemaOpenMP.cpp | 10 ++--
 .../test/OpenMP/declare_target_ast_print.cpp  | 53 +++
 9 files changed, 130 insertions(+), 32 deletions(-)

diff --git a/clang/include/clang/Basic/Attr.td 
b/clang/include/clang/Basic/Attr.td
index 16cf932c3760bd3..eaf4a6db3600e07 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -3749,8 +3749,8 @@ def OMPDeclareTargetDecl : InheritableAttr {
   let Documentation = [OMPDeclareTargetDocs];
   let Args = [
 EnumArgument<"MapType", "MapTypeTy",
- [ "to", "link" ],
- [ "MT_To", "MT_Link" ]>,
+ [ "to", "enter", "link" ],
+ [ "MT_To", "MT_Enter", "MT_Link" ]>,
 EnumArgument<"DevType", "DevTypeTy",
  [ "host", "nohost", "any" ],
  [ "DT_Host", "DT_NoHost", "DT_Any" ]>,
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td 
b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 674d6bd34fc544f..27cd3da1f191c3d 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1383,12 +1383,22 @@ def note_omp_assumption_clause_continue_here
 : Note<"the ignored tokens spans until here">;
 def err_omp_declare_target_unexpected_clause: Error<
   "unexpected '%0' clause, only %select{'device_type'|'to' or 'link'|'to', 
'link' or 'device_type'|'device_type', 'indirect'|'to', 'link', 'device_type' 
or 'indirect'}1 clauses expected">;
+def err_omp_declare_target_unexpected_clause_52: Error<
+  "unexpected '%0' clause, only %select{'device_type'|'enter' or 
'link'|'enter', 'link' or 'device_type'|'device_type', 'indirect'|'enter', 
'link', 'device_type' or 'indirect'}1 clauses expected">;
 def err_omp_begin_declare_target_unexpected_implicit_to_clause: Error<
   "unexpected '(', only 'to', 'link' or 'device_type' clauses expected for 
'begin declare target' directive">;
-def err_omp_declare_target_unexpected_clause_after_implicit_to: Error<
+def err_omp_declare_target_wrong_clause_after_implicit_to: Error<
   "unexpected clause after an implicit 'to' clause">;
+def err_omp_declare_target_wrong_clause_after_implicit_enter: Error<
+  "unexpected clause after an implicit 'enter' clause">;
 def err_omp_declare_target_missing_to_or_link_clause: Error<
   "expected at least one %select{'to' or 'link'|'to', 'link' or 'indirect'}0 
clause">;
+def err_omp_declare_target_missing_enter_or_link_clause: Error<
+  "expected at least one %select{'enter' or 'link'|'enter', 'link' or 
'indirect'}0 clause">;
+def err_omp_declare_target_unexpected_to_clause: Error<
+  "unexpected 'to' clause, use 'enter' instead">;
+def err_omp_declare_target_unexpected_enter_clause: Error<
+  "unexpected 'enter' clause, use 'to' instead">;
 def err_omp_declare_target_multiple : Error<
   "%0 appears multiple times in clauses on the same declare target directive">;
 def err_omp_declare_target_indirect_device_type: Error<
diff --git a/clang/lib/AST/AttrImpl.cpp b/clang/lib/AST/AttrImpl.cpp
index cecbd703ac61e8c..da842f6b190e74d 100644
--- a/clang/lib/AST/AttrImpl.cpp
+++ b/clang/lib/AST/AttrImpl.cpp
@@ -137,7 +137,7 @@ void OMPDeclareTargetDeclAttr::printPrettyPragma(
   // Use fake syntax because it is for testing and debugging purpose only.
   if (getDevType() != DT_Any)
 OS << " device_type(" << ConvertDevTypeTyToStr(getDevType()) << ")";
-  if (getMapType() != MT_To)
+  if (getMapType() != MT_To && getMapType() != MT_Enter)
 OS << ' ' << ConvertMapTypeTyToStr(getMapType());
   if (Expr *E = getIndirectExpr()) {
 OS << " indirect(";
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index ee09a8566c3719e..77085ff34fca233 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -2495,14 +2495,16 @@ static Address 
emitDeclTargetVarDeclLValue(CodeGenFunction &CGF,
const VarDecl *VD, QualType T) {
   llvm::Optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
   OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);
-  // Return an invalid address if variable is MT_To and unified
-  // memory is not enabled. For all other cases: MT_Link and
-  // MT_To with unified memory, return a valid address.
-  if (!Res || (*Res 

[clang] [OpenMP][libomptarget] Add map checks when running under unified shared memory (PR #69005)

2023-10-16 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/69005

>From cb4121c466a0fc357d6ca129bfdd4e7c5e2d11ee Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 16 Nov 2022 17:23:48 -0600
Subject: [PATCH 1/2] Fix declare target implementation to support enter.

---
 clang/include/clang/Basic/Attr.td |  4 +-
 .../clang/Basic/DiagnosticParseKinds.td   | 12 -
 clang/lib/AST/AttrImpl.cpp|  2 +-
 clang/lib/CodeGen/CGExpr.cpp  | 12 +++--
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 24 ++---
 clang/lib/CodeGen/CodeGenModule.cpp   |  6 ++-
 clang/lib/Parse/ParseOpenMP.cpp   | 39 ++
 clang/lib/Sema/SemaOpenMP.cpp | 10 ++--
 .../test/OpenMP/declare_target_ast_print.cpp  | 53 +++
 9 files changed, 130 insertions(+), 32 deletions(-)

diff --git a/clang/include/clang/Basic/Attr.td 
b/clang/include/clang/Basic/Attr.td
index 16cf932c3760bd3..eaf4a6db3600e07 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -3749,8 +3749,8 @@ def OMPDeclareTargetDecl : InheritableAttr {
   let Documentation = [OMPDeclareTargetDocs];
   let Args = [
 EnumArgument<"MapType", "MapTypeTy",
- [ "to", "link" ],
- [ "MT_To", "MT_Link" ]>,
+ [ "to", "enter", "link" ],
+ [ "MT_To", "MT_Enter", "MT_Link" ]>,
 EnumArgument<"DevType", "DevTypeTy",
  [ "host", "nohost", "any" ],
  [ "DT_Host", "DT_NoHost", "DT_Any" ]>,
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td 
b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 674d6bd34fc544f..27cd3da1f191c3d 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1383,12 +1383,22 @@ def note_omp_assumption_clause_continue_here
 : Note<"the ignored tokens spans until here">;
 def err_omp_declare_target_unexpected_clause: Error<
   "unexpected '%0' clause, only %select{'device_type'|'to' or 'link'|'to', 
'link' or 'device_type'|'device_type', 'indirect'|'to', 'link', 'device_type' 
or 'indirect'}1 clauses expected">;
+def err_omp_declare_target_unexpected_clause_52: Error<
+  "unexpected '%0' clause, only %select{'device_type'|'enter' or 
'link'|'enter', 'link' or 'device_type'|'device_type', 'indirect'|'enter', 
'link', 'device_type' or 'indirect'}1 clauses expected">;
 def err_omp_begin_declare_target_unexpected_implicit_to_clause: Error<
   "unexpected '(', only 'to', 'link' or 'device_type' clauses expected for 
'begin declare target' directive">;
-def err_omp_declare_target_unexpected_clause_after_implicit_to: Error<
+def err_omp_declare_target_wrong_clause_after_implicit_to: Error<
   "unexpected clause after an implicit 'to' clause">;
+def err_omp_declare_target_wrong_clause_after_implicit_enter: Error<
+  "unexpected clause after an implicit 'enter' clause">;
 def err_omp_declare_target_missing_to_or_link_clause: Error<
   "expected at least one %select{'to' or 'link'|'to', 'link' or 'indirect'}0 
clause">;
+def err_omp_declare_target_missing_enter_or_link_clause: Error<
+  "expected at least one %select{'enter' or 'link'|'enter', 'link' or 
'indirect'}0 clause">;
+def err_omp_declare_target_unexpected_to_clause: Error<
+  "unexpected 'to' clause, use 'enter' instead">;
+def err_omp_declare_target_unexpected_enter_clause: Error<
+  "unexpected 'enter' clause, use 'to' instead">;
 def err_omp_declare_target_multiple : Error<
   "%0 appears multiple times in clauses on the same declare target directive">;
 def err_omp_declare_target_indirect_device_type: Error<
diff --git a/clang/lib/AST/AttrImpl.cpp b/clang/lib/AST/AttrImpl.cpp
index cecbd703ac61e8c..da842f6b190e74d 100644
--- a/clang/lib/AST/AttrImpl.cpp
+++ b/clang/lib/AST/AttrImpl.cpp
@@ -137,7 +137,7 @@ void OMPDeclareTargetDeclAttr::printPrettyPragma(
   // Use fake syntax because it is for testing and debugging purpose only.
   if (getDevType() != DT_Any)
 OS << " device_type(" << ConvertDevTypeTyToStr(getDevType()) << ")";
-  if (getMapType() != MT_To)
+  if (getMapType() != MT_To && getMapType() != MT_Enter)
 OS << ' ' << ConvertMapTypeTyToStr(getMapType());
   if (Expr *E = getIndirectExpr()) {
 OS << " indirect(";
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index ee09a8566c3719e..77085ff34fca233 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -2495,14 +2495,16 @@ static Address 
emitDeclTargetVarDeclLValue(CodeGenFunction &CGF,
const VarDecl *VD, QualType T) {
   llvm::Optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
   OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);
-  // Return an invalid address if variable is MT_To and unified
-  // memory is not enabled. For all other cases: MT_Link and
-  // MT_To with unified memory, return a valid address.
-  if (!Res || (*Res 

[clang] [OpenMP][libomptarget] Add map checks when running under unified shared memory (PR #69005)

2023-10-16 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -444,6 +486,29 @@ DeviceTy::getTgtPtrBegin(void *HstPtrBegin, int64_t Size, 
bool UpdateRefCount,
  LR.TPR.getEntry()->dynRefCountToStr().c_str(), DynRefCountAction,
  LR.TPR.getEntry()->holdRefCountToStr().c_str(), HoldRefCountAction);
 LR.TPR.TargetPointer = (void *)TP;
+
+// If this entry is not marked as being host pointer (the way the
+// implementation works today this is never true, mistake?) then we
+// have to check if this is a host pointer or not. This is a host pointer
+// if the host address matches the target address.
+if ((PM->RTLs.RequiresFlags & OMP_REQ_UNIFIED_SHARED_MEMORY) &&
+!LR.TPR.Flags.IsHostPointer) {

doru1004 wrote:

Several tests exercise the call to getTgtPtrBegin. This change is needed 
because the first condition can be true even when USM is enabled. When it is, 
the USM branch is not taken at all, so IsHostPointer and IsPresent are not set 
correctly.
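
The rule stated in the patch comment can be sketched in isolation. The snippet below is a simplified model and not libomptarget code: `LookupInfo`, `markHostPointerUnderUSM` and their fields are hypothetical names. It only models the check described above, that under unified shared memory an entry not yet marked as a host pointer is one when its host address matches its target address.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of a mapping lookup result.
struct LookupInfo {
  std::uintptr_t HostAddr;
  std::uintptr_t TargetAddr;
  bool IsHostPointer;
};

// Under USM, mark the entry as a host pointer when both addresses match.
// Does nothing when USM is off or when the flag is already set.
void markHostPointerUnderUSM(LookupInfo &Info, bool UnifiedSharedMemory) {
  if (UnifiedSharedMemory && !Info.IsHostPointer)
    Info.IsHostPointer = (Info.HostAddr == Info.TargetAddr);
}

// Usage: an entry found through the non-USM lookup path, then re-checked.
bool exampleRecheck() {
  LookupInfo Info{0x1000, 0x1000, false};
  markHostPointerUnderUSM(Info, /*UnifiedSharedMemory=*/true);
  assert(Info.HostAddr == Info.TargetAddr); // addresses are left untouched
  return Info.IsHostPointer;
}
```

The point of the extra check is that it runs after the regular lookup succeeded, which is the path where the flag would otherwise stay unset.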

https://github.com/llvm/llvm-project/pull/69005
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][libomptarget] Add map checks when running under unified shared memory (PR #69005)

2023-10-13 Thread Gheorghe-Teodor Bercea via cfe-commits

https://github.com/doru1004 updated 
https://github.com/llvm/llvm-project/pull/69005

>From cb4121c466a0fc357d6ca129bfdd4e7c5e2d11ee Mon Sep 17 00:00:00 2001
From: Doru Bercea 
Date: Wed, 16 Nov 2022 17:23:48 -0600
Subject: [PATCH 1/2] Fix declare target implementation to support enter.

---
 clang/include/clang/Basic/Attr.td |  4 +-
 .../clang/Basic/DiagnosticParseKinds.td   | 12 -
 clang/lib/AST/AttrImpl.cpp|  2 +-
 clang/lib/CodeGen/CGExpr.cpp  | 12 +++--
 clang/lib/CodeGen/CGOpenMPRuntime.cpp | 24 ++---
 clang/lib/CodeGen/CodeGenModule.cpp   |  6 ++-
 clang/lib/Parse/ParseOpenMP.cpp   | 39 ++
 clang/lib/Sema/SemaOpenMP.cpp | 10 ++--
 .../test/OpenMP/declare_target_ast_print.cpp  | 53 +++
 9 files changed, 130 insertions(+), 32 deletions(-)

diff --git a/clang/include/clang/Basic/Attr.td 
b/clang/include/clang/Basic/Attr.td
index 16cf932c3760bd3..eaf4a6db3600e07 100644
--- a/clang/include/clang/Basic/Attr.td
+++ b/clang/include/clang/Basic/Attr.td
@@ -3749,8 +3749,8 @@ def OMPDeclareTargetDecl : InheritableAttr {
   let Documentation = [OMPDeclareTargetDocs];
   let Args = [
 EnumArgument<"MapType", "MapTypeTy",
- [ "to", "link" ],
- [ "MT_To", "MT_Link" ]>,
+ [ "to", "enter", "link" ],
+ [ "MT_To", "MT_Enter", "MT_Link" ]>,
 EnumArgument<"DevType", "DevTypeTy",
  [ "host", "nohost", "any" ],
  [ "DT_Host", "DT_NoHost", "DT_Any" ]>,
diff --git a/clang/include/clang/Basic/DiagnosticParseKinds.td 
b/clang/include/clang/Basic/DiagnosticParseKinds.td
index 674d6bd34fc544f..27cd3da1f191c3d 100644
--- a/clang/include/clang/Basic/DiagnosticParseKinds.td
+++ b/clang/include/clang/Basic/DiagnosticParseKinds.td
@@ -1383,12 +1383,22 @@ def note_omp_assumption_clause_continue_here
 : Note<"the ignored tokens spans until here">;
 def err_omp_declare_target_unexpected_clause: Error<
   "unexpected '%0' clause, only %select{'device_type'|'to' or 'link'|'to', 
'link' or 'device_type'|'device_type', 'indirect'|'to', 'link', 'device_type' 
or 'indirect'}1 clauses expected">;
+def err_omp_declare_target_unexpected_clause_52: Error<
+  "unexpected '%0' clause, only %select{'device_type'|'enter' or 
'link'|'enter', 'link' or 'device_type'|'device_type', 'indirect'|'enter', 
'link', 'device_type' or 'indirect'}1 clauses expected">;
 def err_omp_begin_declare_target_unexpected_implicit_to_clause: Error<
   "unexpected '(', only 'to', 'link' or 'device_type' clauses expected for 
'begin declare target' directive">;
-def err_omp_declare_target_unexpected_clause_after_implicit_to: Error<
+def err_omp_declare_target_wrong_clause_after_implicit_to: Error<
   "unexpected clause after an implicit 'to' clause">;
+def err_omp_declare_target_wrong_clause_after_implicit_enter: Error<
+  "unexpected clause after an implicit 'enter' clause">;
 def err_omp_declare_target_missing_to_or_link_clause: Error<
   "expected at least one %select{'to' or 'link'|'to', 'link' or 'indirect'}0 
clause">;
+def err_omp_declare_target_missing_enter_or_link_clause: Error<
+  "expected at least one %select{'enter' or 'link'|'enter', 'link' or 
'indirect'}0 clause">;
+def err_omp_declare_target_unexpected_to_clause: Error<
+  "unexpected 'to' clause, use 'enter' instead">;
+def err_omp_declare_target_unexpected_enter_clause: Error<
+  "unexpected 'enter' clause, use 'to' instead">;
 def err_omp_declare_target_multiple : Error<
   "%0 appears multiple times in clauses on the same declare target directive">;
 def err_omp_declare_target_indirect_device_type: Error<
diff --git a/clang/lib/AST/AttrImpl.cpp b/clang/lib/AST/AttrImpl.cpp
index cecbd703ac61e8c..da842f6b190e74d 100644
--- a/clang/lib/AST/AttrImpl.cpp
+++ b/clang/lib/AST/AttrImpl.cpp
@@ -137,7 +137,7 @@ void OMPDeclareTargetDeclAttr::printPrettyPragma(
   // Use fake syntax because it is for testing and debugging purpose only.
   if (getDevType() != DT_Any)
 OS << " device_type(" << ConvertDevTypeTyToStr(getDevType()) << ")";
-  if (getMapType() != MT_To)
+  if (getMapType() != MT_To && getMapType() != MT_Enter)
 OS << ' ' << ConvertMapTypeTyToStr(getMapType());
   if (Expr *E = getIndirectExpr()) {
 OS << " indirect(";
diff --git a/clang/lib/CodeGen/CGExpr.cpp b/clang/lib/CodeGen/CGExpr.cpp
index ee09a8566c3719e..77085ff34fca233 100644
--- a/clang/lib/CodeGen/CGExpr.cpp
+++ b/clang/lib/CodeGen/CGExpr.cpp
@@ -2495,14 +2495,16 @@ static Address 
emitDeclTargetVarDeclLValue(CodeGenFunction &CGF,
const VarDecl *VD, QualType T) {
   llvm::Optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
   OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);
-  // Return an invalid address if variable is MT_To and unified
-  // memory is not enabled. For all other cases: MT_Link and
-  // MT_To with unified memory, return a valid address.
-  if (!Res || (*Res 

r368491 - [OpenMP] Add support for close map modifier in Clang

2019-08-09 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Fri Aug  9 14:42:13 2019
New Revision: 368491

URL: http://llvm.org/viewvc/llvm-project?rev=368491&view=rev
Log:
[OpenMP] Add support for close map modifier in Clang

Summary:
This patch adds support for the close map modifier in Clang.

This ensures that the new map type is marked and passed to the OpenMP runtime 
appropriately.

Additional regression tests have been merged from patch D55892 (author @saghir).
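
The effect of the new flag on the emitted map-type constants can be sketched as follows. This is an illustration only: `mapTypeBits` is a hypothetical helper, and the values of `OMP_MAP_TO`, `OMP_MAP_ALWAYS` and `OMP_MAP_TARGET_PARAM` are assumed to match the enum in CGOpenMPRuntime.cpp at this revision (only `OMP_MAP_CLOSE = 0x400` is taken from the diff below). Under those assumptions the constants 1057 and 1061 checked in target_data_codegen.cpp are consistent with a `close` mapping without and with `always`.

```cpp
#include <cstdint>

// Subset of the offloading map-type bits. OMP_MAP_CLOSE comes from this patch;
// the other values are assumptions about the surrounding enum.
enum OpenMPOffloadMappingFlags : uint64_t {
  OMP_MAP_TO = 0x01,
  OMP_MAP_ALWAYS = 0x04,
  OMP_MAP_TARGET_PARAM = 0x20,
  OMP_MAP_CLOSE = 0x400,
};

// Hypothetical helper: combine the bits for a "to" mapping that is also a
// target parameter, optionally adding the always and close modifiers.
constexpr uint64_t mapTypeBits(bool Always, bool Close) {
  uint64_t Bits = OMP_MAP_TO | OMP_MAP_TARGET_PARAM;
  if (Always)
    Bits |= OMP_MAP_ALWAYS;
  if (Close)
    Bits |= OMP_MAP_CLOSE;
  return Bits;
}

// 0x400 | 0x20 | 0x01 and 0x400 | 0x20 | 0x04 | 0x01.
static_assert(mapTypeBits(false, true) == 1057, "close, to");
static_assert(mapTypeBits(true, true) == 1061, "close, always, to");
```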

Reviewers: ABataev, caomhin, jdoerfert, kkwli0

Reviewed By: ABataev

Subscribers: kkwli0, Hahnfeld, saghir, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D65341

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/test/OpenMP/target_data_codegen.cpp
cfe/trunk/test/OpenMP/target_enter_data_codegen.cpp
cfe/trunk/test/OpenMP/target_exit_data_codegen.cpp
cfe/trunk/test/OpenMP/target_map_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=368491&r1=368490&r2=368491&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Fri Aug  9 14:42:13 2019
@@ -7116,6 +7116,9 @@ public:
 OMP_MAP_LITERAL = 0x100,
 /// Implicit map
 OMP_MAP_IMPLICIT = 0x200,
+/// Close is a hint to the runtime to allocate memory close to
+/// the target device.
+OMP_MAP_CLOSE = 0x400,
 /// The 16 MSBs of the flags indicate whether the entry is member of some
 /// struct/class.
 OMP_MAP_MEMBER_OF = 0x,
@@ -7296,6 +7299,9 @@ private:
 if (llvm::find(MapModifiers, OMPC_MAP_MODIFIER_always)
 != MapModifiers.end())
   Bits |= OMP_MAP_ALWAYS;
+if (llvm::find(MapModifiers, OMPC_MAP_MODIFIER_close)
+!= MapModifiers.end())
+  Bits |= OMP_MAP_CLOSE;
 return Bits;
   }
 
@@ -7724,10 +7730,10 @@ private:
 
   if (!IsExpressionFirstInfo) {
 // If we have a PTR_AND_OBJ pair where the OBJ is a pointer as 
well,
-// then we reset the TO/FROM/ALWAYS/DELETE flags.
+// then we reset the TO/FROM/ALWAYS/DELETE/CLOSE flags.
 if (IsPointer)
   Flags &= ~(OMP_MAP_TO | OMP_MAP_FROM | OMP_MAP_ALWAYS |
- OMP_MAP_DELETE);
+ OMP_MAP_DELETE | OMP_MAP_CLOSE);
 
 if (ShouldBeMemberOf) {
   // Set placeholder value MEMBER_OF= to indicate that the flag

Modified: cfe/trunk/test/OpenMP/target_data_codegen.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/target_data_codegen.cpp?rev=368491&r1=368490&r2=368491&view=diff
==
--- cfe/trunk/test/OpenMP/target_data_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/target_data_codegen.cpp Fri Aug  9 14:42:13 2019
@@ -40,6 +40,10 @@ double gc[100];
 // CK1: [[SIZE04:@.+]] = {{.+}}constant [2 x i64] [i64 sdiv exact (i64 sub 
(i64 ptrtoint (double** getelementptr (double*, double** getelementptr inbounds 
(%struct.ST, %struct.ST* @gb, i32 0, i32 1), i32 1) to i64), i64 ptrtoint 
(double** getelementptr inbounds (%struct.ST, %struct.ST* @gb, i32 0, i32 1) to 
i64)), i64 ptrtoint (i8* getelementptr (i8, i8* null, i32 1) to i64)), i64 24]
 // CK1: [[MTYPE04:@.+]] = {{.+}}constant [2 x i64] [i64 32, i64 
281474976710673]
 
+// CK1: [[MTYPE05:@.+]] = {{.+}}constant [1 x i64] [i64 1057]
+
+// CK1: [[MTYPE06:@.+]] = {{.+}}constant [1 x i64] [i64 1061]
+
 // CK1-LABEL: _Z3fooi
 void foo(int arg) {
   int la;
@@ -163,6 +167,64 @@ void foo(int arg) {
   // CK1-DAG: [[GEPP]] = getelementptr inbounds {{.+}}[[P]]
   #pragma omp target data map(to: gb.b[:3])
   {++arg;}
+
+  // CK1: %{{.+}} = add nsw i32 %{{[^,]+}}, 1
+  {++arg;}
+
+  // Region 05
+  // CK1-DAG: call void @__tgt_target_data_begin(i64 -1, i32 1, i8** 
[[GEPBP:%.+]], i8** [[GEPP:%.+]], i[[sz]]* [[GEPS:%.+]], {{.+}}getelementptr 
{{.+}}[1 x i{{.+}}]* [[MTYPE05]]{{.+}})
+  // CK1-DAG: [[GEPBP]] = getelementptr inbounds {{.+}}[[BP:%[^,]+]]
+  // CK1-DAG: [[GEPP]] = getelementptr inbounds {{.+}}[[P:%[^,]+]]
+  // CK1-DAG: [[GEPS]] = getelementptr inbounds {{.+}}[[S:%[^,]+]]
+
+  // CK1-DAG: [[BP0:%.+]] = getelementptr inbounds {{.+}}[[BP]], i{{.+}} 0, 
i{{.+}} 0
+  // CK1-DAG: [[P0:%.+]] = getelementptr inbounds {{.+}}[[P]], i{{.+}} 0, 
i{{.+}} 0
+  // CK1-DAG: [[S0:%.+]] = getelementptr inbounds {{.+}}[[S]], i{{.+}} 0, 
i{{.+}} 0
+  // CK1-DAG: [[CBP0:%.+]] = bitcast i8** [[BP0]] to float**
+  // CK1-DAG: [[CP0:%.+]] = bitcast i8** [[P0]] to float**
+  // CK1-DAG: store float* [[VAR0:%.+]], float** [[CBP0]]
+  // CK1-DAG: store float* [[VAR0]], float** [[CP0]]
+  // CK1-DAG: store i[[sz]] [[CSVAL0:%[^,]+]], i[[sz]]* [[S0]]
+  // CK1-64-DAG: [[CSVAL0]] = mul nuw i64 %{{[^,]+}}, 4
+  // CK1-32-DAG: [[CSVAL0]] = sext i32 [[CSVAL032:%.+]] to i64
+  // CK1-32-DAG: 

r367613 - [OpenMP] Fix declare target link implementation

2019-08-01 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Aug  1 14:15:58 2019
New Revision: 367613

URL: http://llvm.org/viewvc/llvm-project?rev=367613&view=rev
Log:
[OpenMP] Fix declare target link implementation

Summary:
This patch fixes the case where variables that have the same name, whether in 
different compilation units or in the same compilation unit, appear in a 
declare target link clause.
This also fixes the name clash error that occurs when unified memory is 
activated.
The changes in this patch include:
- Pointers to internal variables are given unique names.
- Externally visible variables are given the same name as before.
- All pointer variables (external or internal) are weakly linked.
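
The naming rule in the list above can be sketched in isolation. This is a simplified stand-in, not the code from the patch: `declTargetRefPtrName` is a hypothetical function, and the mangled name and file ID are passed in directly instead of being queried from the compiler.

```cpp
#include <cstdio>
#include <string>

// Hypothetical stand-in for the naming logic: internal variables get the
// file ID (in hex) appended so that equal names from different files do not
// clash; externally visible variables keep the old name.
std::string declTargetRefPtrName(const std::string &MangledName,
                                 bool IsExternallyVisible, unsigned FileID) {
  std::string Name = MangledName;
  if (!IsExternallyVisible) {
    char Suffix[16]; // '_' plus at most 8 hex digits plus the terminator
    std::snprintf(Suffix, sizeof(Suffix), "_%x", FileID);
    Name += Suffix;
  }
  return Name + "_decl_tgt_ref_ptr";
}
```

For example, an internal variable `var` in a file with ID 0x2a would get the pointer name `var_2a_decl_tgt_ref_ptr`, while an externally visible `var` keeps `var_decl_tgt_ref_ptr`.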

Reviewers: ABataev, jdoerfert, caomhin

Reviewed By: ABataev

Subscribers: lebedev.ri, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D64592

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/test/OpenMP/declare_target_codegen.cpp
cfe/trunk/test/OpenMP/declare_target_link_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=367613&r1=367612&r2=367613&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Thu Aug  1 14:15:58 2019
@@ -2552,6 +2552,32 @@ CGOpenMPRuntime::createDispatchNextFunct
   return CGM.CreateRuntimeFunction(FnTy, Name);
 }
 
+/// Obtain information that uniquely identifies a target entry. This
+/// consists of the file and device IDs as well as line number associated with
+/// the relevant entry source location.
+static void getTargetEntryUniqueInfo(ASTContext &C, SourceLocation Loc,
+                                     unsigned &DeviceID, unsigned &FileID,
+                                     unsigned &LineNum) {
+  SourceManager &SM = C.getSourceManager();
+
+  // The loc should be always valid and have a file ID (the user cannot use
+  // #pragma directives in macros)
+
+  assert(Loc.isValid() && "Source location is expected to be always valid.");
+
+  PresumedLoc PLoc = SM.getPresumedLoc(Loc);
+  assert(PLoc.isValid() && "Source location is expected to be always valid.");
+
+  llvm::sys::fs::UniqueID ID;
+  if (auto EC = llvm::sys::fs::getUniqueID(PLoc.getFilename(), ID))
+SM.getDiagnostics().Report(diag::err_cannot_open_file)
+<< PLoc.getFilename() << EC.message();
+
+  DeviceID = ID.getDevice();
+  FileID = ID.getFile();
+  LineNum = PLoc.getLine();
+}
+
 Address CGOpenMPRuntime::getAddrOfDeclareTargetVar(const VarDecl *VD) {
   if (CGM.getLangOpts().OpenMPSimd)
 return Address::invalid();
@@ -2563,19 +2589,27 @@ Address CGOpenMPRuntime::getAddrOfDeclar
 SmallString<64> PtrName;
 {
   llvm::raw_svector_ostream OS(PtrName);
-  OS << CGM.getMangledName(GlobalDecl(VD)) << "_decl_tgt_ref_ptr";
+  OS << CGM.getMangledName(GlobalDecl(VD));
+  if (!VD->isExternallyVisible()) {
+unsigned DeviceID, FileID, Line;
+getTargetEntryUniqueInfo(CGM.getContext(),
+ VD->getCanonicalDecl()->getBeginLoc(),
+ DeviceID, FileID, Line);
+OS << llvm::format("_%x", FileID);
+  }
+  OS << "_decl_tgt_ref_ptr";
 }
 llvm::Value *Ptr = CGM.getModule().getNamedValue(PtrName);
 if (!Ptr) {
   QualType PtrTy = CGM.getContext().getPointerType(VD->getType());
   Ptr = 
getOrCreateInternalVariable(CGM.getTypes().ConvertTypeForMem(PtrTy),
 PtrName);
-  if (!CGM.getLangOpts().OpenMPIsDevice) {
-auto *GV = cast<llvm::GlobalVariable>(Ptr);
-GV->setLinkage(llvm::GlobalValue::ExternalLinkage);
+
+  auto *GV = cast<llvm::GlobalVariable>(Ptr);
+  GV->setLinkage(llvm::GlobalValue::WeakAnyLinkage);
+
+  if (!CGM.getLangOpts().OpenMPIsDevice)
 GV->setInitializer(CGM.GetAddrOfGlobal(VD));
-  }
-  CGM.addUsedGlobal(cast(Ptr));
   registerTargetGlobalVariable(VD, cast(Ptr));
 }
 return Address(Ptr, CGM.getContext().getDeclAlign(VD));
@@ -2749,32 +2783,6 @@ llvm::Function *CGOpenMPRuntime::emitThr
   return nullptr;
 }
 
-/// Obtain information that uniquely identifies a target entry. This
-/// consists of the file and device IDs as well as line number associated with
-/// the relevant entry source location.
-static void getTargetEntryUniqueInfo(ASTContext &C, SourceLocation Loc,
-                                     unsigned &DeviceID, unsigned &FileID,
-                                     unsigned &LineNum) {
-  SourceManager &SM = C.getSourceManager();
-
-  // The loc should be always valid and have a file ID (the user cannot use
-  // #pragma directives in macros)
-
-  assert(Loc.isValid() && "Source location is expected to be always valid.");
-
-  PresumedLoc PLoc = SM.getPresumedLoc(Loc);
-  assert(PLoc.isValid() && "Source location is expected to be always 

r363959 - [OpenMP] Add support for handling declare target to clause when unified memory is required

2019-06-20 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Jun 20 11:04:47 2019
New Revision: 363959

URL: http://llvm.org/viewvc/llvm-project?rev=363959&view=rev
Log:
[OpenMP] Add support for handling declare target to clause when unified memory 
is required

Summary:
This patch adds support for the handling of the variables under the declare 
target to clause.

In this case the variables are handled the same way as link variables. A 
pointer is created on the host and then mapped to the device. The runtime then 
copies the address of the host variable into the device pointer.
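
The decision described above can be sketched as a small predicate. This is a simplified model under stated assumptions: `DeclTargetMapType` and `usesDeclTargetRefPtr` are hypothetical names, and the two boolean inputs stand in for the attribute lookup and the runtime query made in the patch.

```cpp
// Hypothetical model of the two declare target map types involved.
enum class DeclTargetMapType { To, Link };

// Returns true when device code should access the variable through the
// host-created pointer: always for link, and for to only when unified
// shared memory is required.
constexpr bool usesDeclTargetRefPtr(bool IsDeclareTarget, DeclTargetMapType MT,
                                    bool HasUnifiedSharedMemory) {
  if (!IsDeclareTarget)
    return false;
  return MT == DeclTargetMapType::Link ||
         (MT == DeclTargetMapType::To && HasUnifiedSharedMemory);
}

static_assert(usesDeclTargetRefPtr(true, DeclTargetMapType::Link, false),
              "link always goes through the pointer");
static_assert(!usesDeclTargetRefPtr(true, DeclTargetMapType::To, false),
              "to without unified memory is accessed directly");
```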

Reviewers: ABataev, AlexEichenberger, caomhin

Reviewed By: ABataev

Subscribers: guansong, jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D63108

Modified:
cfe/trunk/lib/CodeGen/CGDeclCXX.cpp
cfe/trunk/lib/CodeGen/CGExpr.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CodeGenModule.cpp
cfe/trunk/test/OpenMP/declare_target_codegen.cpp
cfe/trunk/test/OpenMP/declare_target_link_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp

Modified: cfe/trunk/lib/CodeGen/CGDeclCXX.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGDeclCXX.cpp?rev=363959&r1=363958&r2=363959&view=diff
==
--- cfe/trunk/lib/CodeGen/CGDeclCXX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGDeclCXX.cpp Thu Jun 20 11:04:47 2019
@@ -74,7 +74,7 @@ static void EmitDeclDestroy(CodeGenFunct
   // bails even if the attribute is not present.
   if (D.isNoDestroy(CGF.getContext()))
 return;
-  
+
   CodeGenModule  = CGF.CGM;
 
   // FIXME:  __attribute__((cleanup)) ?

Modified: cfe/trunk/lib/CodeGen/CGExpr.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGExpr.cpp?rev=363959&r1=363958&r2=363959&view=diff
==
--- cfe/trunk/lib/CodeGen/CGExpr.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGExpr.cpp Thu Jun 20 11:04:47 2019
@@ -2295,15 +2295,22 @@ static LValue EmitThreadPrivateVarDeclLV
   return CGF.MakeAddrLValue(Addr, T, AlignmentSource::Decl);
 }
 
-static Address emitDeclTargetLinkVarDeclLValue(CodeGenFunction &CGF,
-                                               const VarDecl *VD, QualType T) {
+static Address emitDeclTargetVarDeclLValue(CodeGenFunction &CGF,
+                                           const VarDecl *VD, QualType T) {
   llvm::Optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
   OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);
-  if (!Res || *Res == OMPDeclareTargetDeclAttr::MT_To)
+  // Return an invalid address if variable is MT_To and unified
+  // memory is not enabled. For all other cases: MT_Link and
+  // MT_To with unified memory, return a valid address.
+  if (!Res || (*Res == OMPDeclareTargetDeclAttr::MT_To &&
+   !CGF.CGM.getOpenMPRuntime().hasRequiresUnifiedSharedMemory()))
 return Address::invalid();
-  assert(*Res == OMPDeclareTargetDeclAttr::MT_Link && "Expected link clause");
+  assert(((*Res == OMPDeclareTargetDeclAttr::MT_Link) ||
+  (*Res == OMPDeclareTargetDeclAttr::MT_To &&
+   CGF.CGM.getOpenMPRuntime().hasRequiresUnifiedSharedMemory())) &&
+ "Expected link clause OR to clause with unified memory enabled.");
   QualType PtrTy = CGF.getContext().getPointerType(VD->getType());
-  Address Addr = CGF.CGM.getOpenMPRuntime().getAddrOfDeclareTargetLink(VD);
+  Address Addr = CGF.CGM.getOpenMPRuntime().getAddrOfDeclareTargetVar(VD);
   return CGF.EmitLoadOfPointer(Addr, PtrTy->castAs<PointerType>());
 }
 
@@ -2359,7 +2366,7 @@ static LValue EmitGlobalVarDeclLValue(Co
   // Check if the variable is marked as declare target with link clause in
   // device codegen.
   if (CGF.getLangOpts().OpenMPIsDevice) {
-Address Addr = emitDeclTargetLinkVarDeclLValue(CGF, VD, T);
+Address Addr = emitDeclTargetVarDeclLValue(CGF, VD, T);
 if (Addr.isValid())
   return CGF.MakeAddrLValue(Addr, T, AlignmentSource::Decl);
   }

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=363959&r1=363958&r2=363959&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Thu Jun 20 11:04:47 2019
@@ -2552,16 +2552,18 @@ CGOpenMPRuntime::createDispatchNextFunct
   return CGM.CreateRuntimeFunction(FnTy, Name);
 }
 
-Address CGOpenMPRuntime::getAddrOfDeclareTargetLink(const VarDecl *VD) {
+Address CGOpenMPRuntime::getAddrOfDeclareTargetVar(const VarDecl *VD) {
   if (CGM.getLangOpts().OpenMPSimd)
 return Address::invalid();
   llvm::Optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
   OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);
-  if (Res && *Res == OMPDeclareTargetDeclAttr::MT_Link) {
+  if (Res && (*Res == OMPDeclareTargetDeclAttr::MT_Link ||
+  (*Res == 

r363809 - [OpenMP] Strengthen regression tests for task allocation under nowait depend clauses NFC

2019-06-19 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Wed Jun 19 07:26:43 2019
New Revision: 363809

URL: http://llvm.org/viewvc/llvm-project?rev=363809&view=rev
Log:
[OpenMP] Strengthen regression tests for task allocation under nowait depend 
clauses NFC

Summary:
This patch strengthens the tests introduced in D63009 by:
- adding a new test for the default device ID.
- modifying existing tests to pass the device ID local variable to the task 
allocation function.
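
What the strengthened tests check can be sketched as follows. This is an illustration, not code from the patch: `taskAllocDeviceID` is a hypothetical helper, and the default of -1 corresponds to the `OMP_DEVICEID_UNDEF` value introduced in r363451. The optional argument models whether a `device` clause is present.

```cpp
#include <cstdint>
#include <optional>

// Value used when no device clause is given; the runtime is then expected to
// pick the device itself.
constexpr int64_t OMP_DEVICEID_UNDEF = -1;

// Hypothetical helper: the 32-bit device clause value is sign-extended to the
// 64-bit device_id argument of __kmpc_omp_target_task_alloc.
constexpr int64_t taskAllocDeviceID(std::optional<int32_t> DeviceClause) {
  return DeviceClause ? static_cast<int64_t>(*DeviceClause)
                      : OMP_DEVICEID_UNDEF;
}

static_assert(taskAllocDeviceID(std::nullopt) == -1, "no device clause");
static_assert(taskAllocDeviceID(1) == 1, "device(1)");
```

These two cases correspond to the `i64 -1` and `i64 1` arguments checked in target_constant_device_codegen.cpp.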

Reviewers: ABataev, Hahnfeld, caomhin, jdoerfert

Reviewed By: ABataev

Subscribers: guansong, jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D63454

Added:
cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp
Modified:
cfe/trunk/test/OpenMP/target_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_enter_data_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_exit_data_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_for_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_for_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_update_depend_codegen.cpp

Added: cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp?rev=363809&view=auto
==
--- cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp (added)
+++ cfe/trunk/test/OpenMP/target_constant_device_codegen.cpp Wed Jun 19 
07:26:43 2019
@@ -0,0 +1,34 @@
+// Test host codegen.
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown 
-fopenmp-targets=powerpc64le-ibm-linux-gnu -emit-llvm %s -o - | FileCheck %s 
--check-prefix CHECK --check-prefix CHECK-64
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple 
powerpc64le-unknown-unknown -fopenmp-targets=powerpc64le-ibm-linux-gnu 
-emit-pch -o %t %s
+// RUN: %clang_cc1 -fopenmp -x c++ -triple powerpc64le-unknown-unknown 
-fopenmp-targets=powerpc64le-ibm-linux-gnu -std=c++11 -include-pch %t -verify 
%s -emit-llvm -o - | FileCheck %s --check-prefix CHECK --check-prefix CHECK-64
+
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+int global;
+extern int global;
+
+// CHECK: define {{.*}}[[FOO:@.+]](
+int foo(int n) {
+  int a = 0;
+  float b[10];
+  double cn[5][n];
+
+  #pragma omp target nowait depend(in: global) depend(out: a, b, cn[4])
+  {
+  }
+
+  // CHECK: call i8* @__kmpc_omp_target_task_alloc({{.*}}, i64 -1)
+
+  #pragma omp target device(1) nowait depend(in: global) depend(out: a, b, 
cn[4])
+  {
+  }
+
+  // CHECK: call i8* @__kmpc_omp_target_task_alloc({{.*}}, i64 1)
+
+  return a;
+}
+
+#endif

Modified: cfe/trunk/test/OpenMP/target_depend_codegen.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/target_depend_codegen.cpp?rev=363809&r1=363808&r2=363809&view=diff
==
--- cfe/trunk/test/OpenMP/target_depend_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/target_depend_codegen.cpp Wed Jun 19 07:26:43 2019
@@ -132,8 +132,10 @@ int foo(int n) {
   // CHECK:   [[GEP:%.+]] = getelementptr inbounds %{{.+}}, %{{.+}}* 
%{{.+}}, i32 0, i32 2
   // CHECK:   [[DEV:%.+]] = load i32, i32* [[DEVICE_CAP]],
   // CHECK:   store i32 [[DEV]], i32* [[GEP]],
+  // CHECK:   [[DEV1:%.+]] = load i32, i32* [[DEVICE_CAP]],
+  // CHECK:   [[DEV2:%.+]] = sext i32 [[DEV1]] to i64
 
-  // CHECK:   [[TASK:%.+]] = call i8* 
@__kmpc_omp_target_task_alloc(%struct.ident_t* @0, i32 [[GTID]], i32 1, i[[SZ]] 
{{104|52}}, i[[SZ]] {{16|12}}, i32 (i32, i8*)* bitcast (i32 (i32, %{{.+}}*)* 
[[TASK_ENTRY1_:@.+]] to i32 (i32, i8*)*), i64
+  // CHECK:   [[TASK:%.+]] = call i8* 
@__kmpc_omp_target_task_alloc(%struct.ident_t* @0, i32 [[GTID]], i32 1, i[[SZ]] 
{{104|52}}, i[[SZ]] {{16|12}}, i32 (i32, i8*)* bitcast (i32 (i32, %{{.+}}*)* 
[[TASK_ENTRY1_:@.+]] to i32 (i32, i8*)*), i64 [[DEV2]])
   // CHECK:   [[BC_TASK:%.+]] = bitcast i8* [[TASK]] to [[TASK_TY1_:%.+]]*
   // CHECK:   getelementptr inbounds [3 x %struct.kmp_depend_info], [3 x 
%struct.kmp_depend_info]* %{{.+}}, i[[SZ]] 0, i[[SZ]] 0
   // CHECK:   getelementptr inbounds [3 x %struct.kmp_depend_info], [3 x 
%struct.kmp_depend_info]* %{{.+}}, i[[SZ]] 0, i[[SZ]] 1
@@ -148,8 +150,10 @@ int foo(int n) {
   // CHECK:   [[GEP:%.+]] = getelementptr inbounds %{{.+}}, %{{.+}}* 
%{{.+}}, i32 0, i32 2
   // CHECK:   [[DEV:%.+]] = load i32, i32* [[DEVICE_CAP]],
   // CHECK:   store 

r363451 - [OpenMP] Add target task alloc function with device ID

2019-06-14 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Fri Jun 14 13:19:54 2019
New Revision: 363451

URL: http://llvm.org/viewvc/llvm-project?rev=363451&view=rev
Log:
[OpenMP] Add target task alloc function with device ID

Summary: Add a new call to Clang to perform task allocation for the target.

Reviewers: ABataev, AlexEichenberger, caomhin

Reviewed By: ABataev, AlexEichenberger

Subscribers: openmp-commits, Hahnfeld, guansong, jdoerfert, cfe-commits

Tags: #clang, #openmp

Differential Revision: https://reviews.llvm.org/D63009

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/test/OpenMP/target_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_enter_data_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_exit_data_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_for_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_for_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_update_depend_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=363451&r1=363450&r2=363451&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Fri Jun 14 13:19:54 2019
@@ -475,6 +475,12 @@ enum OpenMPOffloadingRequiresDirFlags :
   OMP_REQ_DYNAMIC_ALLOCATORS  = 0x010,
   LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/OMP_REQ_DYNAMIC_ALLOCATORS)
 };
+
+enum OpenMPOffloadingReservedDeviceIDs {
+  /// Device ID if the device was not defined, runtime should get it
+  /// from environment variables in the spec.
+  OMP_DEVICEID_UNDEF = -1,
+};
 } // anonymous namespace
 
 /// Describes ident structure that describes a source location.
@@ -604,6 +610,11 @@ enum OpenMPRTLFunction {
   // kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds,
   // kmp_routine_entry_t *task_entry);
   OMPRTL__kmpc_omp_task_alloc,
+  // Call to kmp_task_t * __kmpc_omp_target_task_alloc(ident_t *,
+  // kmp_int32 gtid, kmp_int32 flags, size_t sizeof_kmp_task_t,
+  // size_t sizeof_shareds, kmp_routine_entry_t *task_entry,
+  // kmp_int64 device_id);
+  OMPRTL__kmpc_omp_target_task_alloc,
   // Call to kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t *
   // new_task);
   OMPRTL__kmpc_omp_task,
@@ -1912,6 +1923,21 @@ llvm::FunctionCallee CGOpenMPRuntime::cr
 RTLFn = CGM.CreateRuntimeFunction(FnTy, /*Name=*/"__kmpc_omp_task_alloc");
 break;
   }
+  case OMPRTL__kmpc_omp_target_task_alloc: {
+// Build kmp_task_t *__kmpc_omp_target_task_alloc(ident_t *, kmp_int32 
gtid,
+// kmp_int32 flags, size_t sizeof_kmp_task_t, size_t sizeof_shareds,
+// kmp_routine_entry_t *task_entry, kmp_int64 device_id);
+assert(KmpRoutineEntryPtrTy != nullptr &&
+   "Type kmp_routine_entry_t must be created.");
+llvm::Type *TypeParams[] = {getIdentTyPointerTy(), CGM.Int32Ty, 
CGM.Int32Ty,
+CGM.SizeTy, CGM.SizeTy, KmpRoutineEntryPtrTy,
+CGM.Int64Ty};
+// Return void * and then cast to particular kmp_task_t type.
+auto *FnTy =
+llvm::FunctionType::get(CGM.VoidPtrTy, TypeParams, /*isVarArg=*/false);
+RTLFn = CGM.CreateRuntimeFunction(FnTy, 
/*Name=*/"__kmpc_omp_target_task_alloc");
+break;
+  }
   case OMPRTL__kmpc_omp_task: {
 // Build kmp_int32 __kmpc_omp_task(ident_t *, kmp_int32 gtid, kmp_task_t
 // *new_task);
@@ -5074,13 +5100,30 @@ CGOpenMPRuntime::emitTaskInit(CodeGenFun
   : CGF.Builder.getInt32(Data.Final.getInt() ? FinalFlag : 0);
   TaskFlags = CGF.Builder.CreateOr(TaskFlags, CGF.Builder.getInt32(Flags));
   llvm::Value *SharedsSize = CGM.getSize(C.getTypeSizeInChars(SharedsTy));
-  llvm::Value *AllocArgs[] = {emitUpdateLocation(CGF, Loc),
-  getThreadID(CGF, Loc), TaskFlags,
-  KmpTaskTWithPrivatesTySize, SharedsSize,
-  CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(
-  TaskEntry, KmpRoutineEntryPtrTy)};
-  llvm::Value *NewTask = CGF.EmitRuntimeCall(
+  SmallVector AllocArgs = {emitUpdateLocation(CGF, Loc),
+  getThreadID(CGF, Loc), TaskFlags, KmpTaskTWithPrivatesTySize,
+  SharedsSize, CGF.Builder.CreatePointerBitCastOrAddrSpaceCast(
+  TaskEntry, KmpRoutineEntryPtrTy)};
+  llvm::Value *NewTask;
+  if (D.hasClausesOfKind()) {
+// Check if we have any device 

r363435 - [OpenMP] Avoid emitting maps for target link variables when unified memory is used

2019-06-14 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Fri Jun 14 10:58:26 2019
New Revision: 363435

URL: http://llvm.org/viewvc/llvm-project?rev=363435&view=rev
Log:
[OpenMP] Avoid emitting maps for target link variables when unified memory is 
used

Summary: This patch avoids the emission of maps for target link variables when 
unified memory is present.
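
The condition this patch changes can be modelled by a single predicate. The sketch below is a simplification with hypothetical names (`LinkMapType`, `emitsLinkMap`); the boolean inputs stand in for the attribute lookup and for `hasRequiresUnifiedSharedMemory()`.

```cpp
// Hypothetical model of the declare target map type of a variable.
enum class LinkMapType { To, Link };

// A map entry for a declare target variable is emitted only when the variable
// is under a link clause and unified shared memory is not required.
constexpr bool emitsLinkMap(bool IsDeclareTarget, LinkMapType MT,
                            bool HasUnifiedSharedMemory) {
  return !HasUnifiedSharedMemory && IsDeclareTarget && MT == LinkMapType::Link;
}

static_assert(emitsLinkMap(true, LinkMapType::Link, false),
              "link variables are mapped without unified memory");
static_assert(!emitsLinkMap(true, LinkMapType::Link, true),
              "no map entry once unified memory is required");
```

This matches the test update in this commit, where the offload arrays shrink from three entries to two once the link variable is no longer mapped.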

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60883

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=363435&r1=363434&r2=363435&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Fri Jun 14 10:58:26 2019
@@ -8266,7 +8266,8 @@ public:
   continue;
 llvm::Optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
 OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);
-if (!Res || *Res != OMPDeclareTargetDeclAttr::MT_Link)
+if (CGF.CGM.getOpenMPRuntime().hasRequiresUnifiedSharedMemory() ||
+!Res || *Res != OMPDeclareTargetDeclAttr::MT_Link)
   continue;
 StructRangeInfoTy PartialStruct;
 generateInfoForComponentList(
@@ -9251,6 +9252,10 @@ bool CGOpenMPRuntime::hasAllocateAttribu
   return false;
 }
 
+bool CGOpenMPRuntime::hasRequiresUnifiedSharedMemory() const {
+  return HasRequiresUnifiedSharedMemory;
+}
+
 CGOpenMPRuntime::DisableAutoDeclareTargetRAII::DisableAutoDeclareTargetRAII(
     CodeGenModule &CGM)
 : CGM(CGM) {

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h?rev=363435&r1=363434&r2=363435&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h Fri Jun 14 10:58:26 2019
@@ -1623,6 +1623,9 @@ public:
   /// the predefined allocator and translates it into the corresponding address
   /// space.
   virtual bool hasAllocateAttributeForGlobalVar(const VarDecl *VD, LangAS );
+
+  /// Return whether the unified_shared_memory has been specified.
+  bool hasRequiresUnifiedSharedMemory() const;
 };
 
 /// Class supports emissionof SIMD-only code.

Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=363435&r1=363434&r2=363435&view=diff
==
--- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original)
+++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Fri Jun 14 10:58:26 2019
@@ -2667,7 +2667,8 @@ public:
   llvm::Optional<OMPDeclareTargetDeclAttr::MapTypeTy> Res =
   OMPDeclareTargetDeclAttr::isDeclareTargetDeclaration(VD);
   if (VD->hasGlobalStorage() && CS && !CS->capturesVariable(VD) &&
-  (!Res || *Res != OMPDeclareTargetDeclAttr::MT_Link))
+  (Stack->hasRequiresDeclWithClause() ||
+   !Res || *Res != OMPDeclareTargetDeclAttr::MT_Link))
 return;
 
   SourceLocation ELoc = E->getExprLoc();

Modified: cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp?rev=363435&r1=363434&r2=363435&view=diff
==
--- cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp 
(original)
+++ cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp Fri 
Jun 14 10:58:26 2019
@@ -26,42 +26,35 @@ int bar(int n){
 // CHECK: [[VAR:@.+]] = global double 1.00e+01
 // CHECK: [[VAR_DECL_TGT_LINK_PTR:@.+]] = global double* [[VAR]]
 
-// CHECK: [[OFFLOAD_SIZES:@.+]] = private unnamed_addr constant [3 x i64] [i64 
4, i64 8, i64 8]
-// CHECK: [[OFFLOAD_MAPTYPES:@.+]] = private unnamed_addr constant [3 x i64] 
[i64 800, i64 800, i64 531]
+// CHECK: [[OFFLOAD_SIZES:@.+]] = private unnamed_addr constant [2 x i64] [i64 
4, i64 8]
+// CHECK: [[OFFLOAD_MAPTYPES:@.+]] = private unnamed_addr constant [2 x i64] 
[i64 800, i64 800]
 
 // CHECK: [[N_CASTED:%.+]] = alloca i64
 // CHECK: [[SUM_CASTED:%.+]] = alloca i64
 
-// CHECK: [[OFFLOAD_BASEPTRS:%.+]] = alloca [3 x i8*]
-// CHECK: [[OFFLOAD_PTRS:%.+]] = alloca [3 x i8*]
+// CHECK: [[OFFLOAD_BASEPTRS:%.+]] = alloca [2 x i8*]
+// CHECK: [[OFFLOAD_PTRS:%.+]] = alloca [2 x i8*]
 
 // CHECK: [[LOAD1:%.+]] = load i64, i64* [[N_CASTED]]
 // CHECK: [[LOAD2:%.+]] = load i64, i64* [[SUM_CASTED]]
 
-// CHECK: [[BPTR1:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* 
[[OFFLOAD_BASEPTRS]], i32 0, i32 0
+// CHECK: [[BPTR1:%.+]] = getelementptr inbounds [2 x i8*], [2 

r361658 - [OpenMP] Add test for requires and unified shared memory clause with declare target link

2019-05-24 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Fri May 24 11:48:42 2019
New Revision: 361658

URL: http://llvm.org/viewvc/llvm-project?rev=361658&view=rev
Log:
[OpenMP] Add test for requires and unified shared memory clause with declare 
target link

Summary:
This patch adds a test for requires with the unified shared memory clause when 
a declare target link is present.

This test needs to land before the changes to declare target link so that the 
two behaviors can be compared.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D62407

Added:
cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp

Added: cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp?rev=361658&view=auto
==
--- cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp 
(added)
+++ cfe/trunk/test/OpenMP/nvptx_target_requires_unified_shared_memory.cpp Fri 
May 24 11:48:42 2019
@@ -0,0 +1,67 @@
+// Test declare target link under unified memory requirement.
+// RUN: %clang_cc1 -verify -fopenmp -fopenmp-cuda-mode -x c++ -triple 
powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s 
-o - | FileCheck %s --check-prefix CHECK
+// expected-no-diagnostics
+
+#ifndef HEADER
+#define HEADER
+
+#define N 1000
+
+double var = 10.0;
+
+#pragma omp requires unified_shared_memory
+#pragma omp declare target link(var)
+
+int bar(int n){
+  double sum = 0;
+
+#pragma omp target
+  for(int i = 0; i < n; i++) {
+sum += var;
+  }
+
+  return sum;
+}
+
+// CHECK: [[VAR:@.+]] = global double 1.00e+01
+// CHECK: [[VAR_DECL_TGT_LINK_PTR:@.+]] = global double* [[VAR]]
+
+// CHECK: [[OFFLOAD_SIZES:@.+]] = private unnamed_addr constant [3 x i64] [i64 4, i64 8, i64 8]
+// CHECK: [[OFFLOAD_MAPTYPES:@.+]] = private unnamed_addr constant [3 x i64] [i64 800, i64 800, i64 531]
+
+// CHECK: [[N_CASTED:%.+]] = alloca i64
+// CHECK: [[SUM_CASTED:%.+]] = alloca i64
+
+// CHECK: [[OFFLOAD_BASEPTRS:%.+]] = alloca [3 x i8*]
+// CHECK: [[OFFLOAD_PTRS:%.+]] = alloca [3 x i8*]
+
+// CHECK: [[LOAD1:%.+]] = load i64, i64* [[N_CASTED]]
+// CHECK: [[LOAD2:%.+]] = load i64, i64* [[SUM_CASTED]]
+
+// CHECK: [[BPTR1:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 0
+// CHECK: [[BCAST1:%.+]] = bitcast i8** [[BPTR1]] to i64*
+// CHECK: store i64 [[LOAD1]], i64* [[BCAST1]]
+// CHECK: [[BPTR2:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_PTRS]], i32 0, i32 0
+// CHECK: [[BCAST2:%.+]] = bitcast i8** [[BPTR2]] to i64*
+// CHECK: store i64 [[LOAD1]], i64* [[BCAST2]]
+
+// CHECK: [[BPTR3:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 1
+// CHECK: [[BCAST3:%.+]] = bitcast i8** [[BPTR3]] to i64*
+// CHECK: store i64 [[LOAD2]], i64* [[BCAST3]]
+// CHECK: [[BPTR4:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_PTRS]], i32 0, i32 1
+// CHECK: [[BCAST4:%.+]] = bitcast i8** [[BPTR4]] to i64*
+// CHECK: store i64 [[LOAD2]], i64* [[BCAST4]]
+
+// CHECK: [[BPTR5:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 2
+// CHECK: [[BCAST5:%.+]] = bitcast i8** [[BPTR5]] to double***
+// CHECK: store double** [[VAR_DECL_TGT_LINK_PTR]], double*** [[BCAST5]]
+// CHECK: [[BPTR6:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_PTRS]], i32 0, i32 2
+// CHECK: [[BCAST6:%.+]] = bitcast i8** [[BPTR6]] to double**
+// CHECK: store double* [[VAR]], double** [[BCAST6]]
+
+// CHECK: [[BPTR7:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_BASEPTRS]], i32 0, i32 0
+// CHECK: [[BPTR8:%.+]] = getelementptr inbounds [3 x i8*], [3 x i8*]* [[OFFLOAD_PTRS]], i32 0, i32 0
+
+// CHECK: call i32 @__tgt_target(i64 -1, i8* @{{.*}}.region_id, i32 3, i8** [[BPTR7]], i8** [[BPTR8]], i64* getelementptr inbounds ([3 x i64], [3 x i64]* [[OFFLOAD_SIZES]], i32 0, i32 0), i64* getelementptr inbounds ([3 x i64], [3 x i64]* [[OFFLOAD_MAPTYPES]], i32 0, i32 0))
+
+#endif




r361298 - [OpenMP] Add support for registering requires directives with the runtime

2019-05-21 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Tue May 21 12:42:01 2019
New Revision: 361298

URL: http://llvm.org/viewvc/llvm-project?rev=361298=rev
Log:
[OpenMP] Add support for registering requires directives with the runtime

Summary:
This patch adds support for the registration of the requires directives with 
the runtime.

Each requires directive clause will enable a particular flag to be set.

The set of flags is passed to the runtime to be checked for compatibility with 
other such flags coming from other object files.

The registration function is called whenever OpenMP is present even if a 
requires directive is not present. This helps detect cases in which requires 
directives are used inconsistently.
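
As a rough illustration of the scheme described above, the sketch below folds the clauses seen in one object file into a single flag word and checks it against what was registered earlier. The flag values mirror the `OpenMPOffloadingRequiresDirFlags` enum added by this patch; `combineRequires` and `RequiresRegistry` are hypothetical names, and the exact-match compatibility rule is an assumption made for the example, not necessarily what the runtime behind `__tgt_register_requires` does.

```cpp
#include <cassert>
#include <cstdint>

// Flag values copied from the enum in this patch; names shortened.
enum RequiresFlags : int64_t {
  REQ_UNDEFINED = 0x000,
  REQ_NONE = 0x001,
  REQ_REVERSE_OFFLOAD = 0x002,
  REQ_UNIFIED_ADDRESS = 0x004,
  REQ_UNIFIED_SHARED_MEMORY = 0x008,
  REQ_DYNAMIC_ALLOCATORS = 0x010,
};

// Hypothetical helper: each clause present in the translation unit sets one
// bit. When no clause is present the result is REQ_NONE, so a registration
// still happens and carries a well-defined value.
inline int64_t combineRequires(bool ReverseOffload, bool UnifiedAddress,
                               bool UnifiedSharedMemory,
                               bool DynamicAllocators) {
  int64_t Flags = REQ_UNDEFINED;
  if (ReverseOffload)
    Flags |= REQ_REVERSE_OFFLOAD;
  if (UnifiedAddress)
    Flags |= REQ_UNIFIED_ADDRESS;
  if (UnifiedSharedMemory)
    Flags |= REQ_UNIFIED_SHARED_MEMORY;
  if (DynamicAllocators)
    Flags |= REQ_DYNAMIC_ALLOCATORS;
  return Flags == REQ_UNDEFINED ? REQ_NONE : Flags;
}

// Hypothetical stand-in for the runtime side. The first registration is
// recorded; later ones are accepted only if they match it exactly
// (an assumed rule, chosen to keep the sketch short).
struct RequiresRegistry {
  int64_t Seen = REQ_UNDEFINED;

  bool registerFlags(int64_t Flags) {
    if (Seen == REQ_UNDEFINED) {
      Seen = Flags;
      return true;
    }
    return Seen == Flags;
  }
};
```

Under these assumptions, an object file compiled without any requires directive registers `REQ_NONE`, which is how a mismatch with another object file that uses `unified_shared_memory` would become visible.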

Reviewers: ABataev, AlexEichenberger, caomhin

Reviewed By: ABataev, AlexEichenberger

Subscribers: jholewinski, guansong, jfb, jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60568

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
cfe/trunk/lib/CodeGen/CodeGenModule.cpp
cfe/trunk/test/OpenMP/openmp_offload_registration.cpp
cfe/trunk/test/OpenMP/target_codegen.cpp
cfe/trunk/test/OpenMP/target_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_parallel_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_for_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_for_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_parallel_for_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_for_simd_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_for_simd_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_parallel_for_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_if_codegen.cpp
cfe/trunk/test/OpenMP/target_parallel_num_threads_codegen.cpp
cfe/trunk/test/OpenMP/target_simd_codegen.cpp
cfe/trunk/test/OpenMP/target_simd_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_teams_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_simd_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_parallel_for_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_simd_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_simd_codegen_registration.cpp
cfe/trunk/test/OpenMP/target_teams_distribute_simd_depend_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_num_teams_codegen.cpp
cfe/trunk/test/OpenMP/target_teams_thread_limit_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=361298=361297=361298=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Tue May 21 12:42:01 2019
@@ -457,6 +457,26 @@ enum OpenMPLocationFlags : unsigned {
   LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/OMP_IDENT_WORK_DISTRIBUTE)
 };
 
+namespace {
+LLVM_ENABLE_BITMASK_ENUMS_IN_NAMESPACE();
+/// Values for bit flags for marking which requires clauses have been used.
+enum OpenMPOffloadingRequiresDirFlags : int64_t {
+  /// flag undefined.
+  OMP_REQ_UNDEFINED   = 0x000,
+  /// no requires clause present.
+  OMP_REQ_NONE= 0x001,
+  /// reverse_offload clause.
+  OMP_REQ_REVERSE_OFFLOAD = 0x002,
+  /// unified_address clause.
+  OMP_REQ_UNIFIED_ADDRESS = 0x004,
+  /// unified_shared_memory clause.
+  OMP_REQ_UNIFIED_SHARED_MEMORY   = 0x008,
+  /// dynamic_allocators clause.
+  OMP_REQ_DYNAMIC_ALLOCATORS  = 0x010,
+  LLVM_MARK_AS_BITMASK_ENUM(/*LargestValue=*/OMP_REQ_DYNAMIC_ALLOCATORS)
+};
+} // anonymous namespace
+
 /// Describes ident structure that describes a source location.
 /// All descriptions are taken from
 /// https://github.com/llvm/llvm-project/blob/master/openmp/runtime/src/kmp.h
@@ -694,6 +714,8 @@ enum OpenMPRTLFunction {
   // *host_ptr, int32_t arg_num, void** args_base, void **args, size_t
   // *arg_sizes, int64_t *arg_types, int32_t num_teams, int32_t thread_limit);
   OMPRTL__tgt_target_teams_nowait,
+  // Call to void __tgt_register_requires(int64_t flags);
+  

r361066 - [OpenMP][bugfix] Add missing math functions variants for log and abs.

2019-05-17 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Fri May 17 12:15:53 2019
New Revision: 361066

URL: http://llvm.org/viewvc/llvm-project?rev=361066=rev
Log:
[OpenMP][bugfix] Add missing math functions variants for log and abs.

Summary: When including the random header in C++, some of the math functions it 
relies on are not present in the CUDA headers. We include these variants in this 
case.

Reviewers: jdoerfert, hfinkel, tra, caomhin

Reviewed By: tra

Subscribers: efriedma, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D62046

Modified:
cfe/trunk/lib/Headers/__clang_cuda_cmath.h
cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h

Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=361066=361065=361066=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Fri May 17 12:15:53 2019
@@ -51,6 +51,11 @@ __DEVICE__ long abs(long __n) { return :
 __DEVICE__ float abs(float __x) { return ::fabsf(__x); }
 __DEVICE__ double abs(double __x) { return ::fabs(__x); }
 #endif
+// TODO: remove once variat is supported.
+#if defined(_OPENMP) && defined(__cplusplus)
+__DEVICE__ const float abs(const float __x) { return ::fabsf((float)__x); }
+__DEVICE__ const double abs(const double __x) { return ::fabs((double)__x); }
+#endif
 __DEVICE__ float acos(float __x) { return ::acosf(__x); }
 __DEVICE__ float asin(float __x) { return ::asinf(__x); }
 __DEVICE__ float atan(float __x) { return ::atanf(__x); }

Modified: cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h?rev=361066=361065=361066=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h Fri May 17 12:15:53 2019
@@ -42,6 +42,14 @@ __DEVICE__ long long abs(long long);
 __DEVICE__ double abs(double);
 __DEVICE__ float abs(float);
 #endif
+// While providing the CUDA declarations and definitions for math functions,
+// we may manually define additional functions.
+// TODO: Once variant is supported the additional functions will have
+// to be removed.
+#if defined(_OPENMP) && defined(__cplusplus)
+__DEVICE__ const double abs(const double);
+__DEVICE__ const float abs(const float);
+#endif
 __DEVICE__ int abs(int) __NOEXCEPT;
 __DEVICE__ double acos(double);
 __DEVICE__ float acos(float);
@@ -144,6 +152,9 @@ __DEVICE__ double log2(double);
 __DEVICE__ float log2(float);
 __DEVICE__ double logb(double);
 __DEVICE__ float logb(float);
+#if defined(_OPENMP) && defined(__cplusplus)
+__DEVICE__ long double log(long double);
+#endif
 __DEVICE__ double log(double);
 __DEVICE__ float log(float);
 __DEVICE__ long lrint(double);




r360809 - [OpenMP][Bugfix] Move double and float versions of abs under c++ macro

2019-05-15 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Wed May 15 13:28:23 2019
New Revision: 360809

URL: http://llvm.org/viewvc/llvm-project?rev=360809=rev
Log:
[OpenMP][Bugfix] Move double and float versions of abs under c++ macro

Summary:
This is a fix for the reported bug:

[[ https://bugs.llvm.org/show_bug.cgi?id=41861 | 41861 ]]

abs functions need to be moved under the c++ macro to avoid conflicts with 
included headers.

Reviewers: tra, jdoerfert, hfinkel, ABataev, caomhin

Reviewed By: jdoerfert

Subscribers: guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61959

Modified:
cfe/trunk/lib/Headers/__clang_cuda_cmath.h
cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
cfe/trunk/test/Headers/Inputs/include/cstdlib
cfe/trunk/test/Headers/nvptx_device_cmath_functions.c
cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp
cfe/trunk/test/Headers/nvptx_device_cmath_functions_cxx17.cpp
cfe/trunk/test/Headers/nvptx_device_math_functions.c
cfe/trunk/test/Headers/nvptx_device_math_functions.cpp
cfe/trunk/test/Headers/nvptx_device_math_functions_cxx17.cpp

Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=360809=360808=360809=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Wed May 15 13:28:23 2019
@@ -48,9 +48,9 @@
 #if !(defined(_OPENMP) && defined(__cplusplus))
 __DEVICE__ long long abs(long long __n) { return ::llabs(__n); }
 __DEVICE__ long abs(long __n) { return ::labs(__n); }
-#endif
 __DEVICE__ float abs(float __x) { return ::fabsf(__x); }
 __DEVICE__ double abs(double __x) { return ::fabs(__x); }
+#endif
 __DEVICE__ float acos(float __x) { return ::acosf(__x); }
 __DEVICE__ float asin(float __x) { return ::asinf(__x); }
 __DEVICE__ float atan(float __x) { return ::atanf(__x); }

Modified: cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h?rev=360809=360808=360809=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h Wed May 15 13:28:23 2019
@@ -39,10 +39,10 @@
 #if !(defined(_OPENMP) && defined(__cplusplus))
 __DEVICE__ long abs(long);
 __DEVICE__ long long abs(long long);
-#endif
-__DEVICE__ int abs(int) __NOEXCEPT;
 __DEVICE__ double abs(double);
 __DEVICE__ float abs(float);
+#endif
+__DEVICE__ int abs(int) __NOEXCEPT;
 __DEVICE__ double acos(double);
 __DEVICE__ float acos(float);
 __DEVICE__ double acosh(double);

Modified: cfe/trunk/test/Headers/Inputs/include/cstdlib
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Headers/Inputs/include/cstdlib?rev=360809=360808=360809=diff
==
--- cfe/trunk/test/Headers/Inputs/include/cstdlib (original)
+++ cfe/trunk/test/Headers/Inputs/include/cstdlib Wed May 15 13:28:23 2019
@@ -3,9 +3,11 @@
 #if __cplusplus >= 201703L
 extern int abs (int __x) throw()  __attribute__ ((__const__)) ;
 extern long int labs (long int __x) throw() __attribute__ ((__const__)) ;
+extern float fabs (float __x) throw() __attribute__ ((__const__)) ;
 #else
 extern int abs (int __x) __attribute__ ((__const__)) ;
 extern long int labs (long int __x) __attribute__ ((__const__)) ;
+extern float fabs (float __x) __attribute__ ((__const__)) ;
 #endif
 
 namespace std

Modified: cfe/trunk/test/Headers/nvptx_device_cmath_functions.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Headers/nvptx_device_cmath_functions.c?rev=360809=360808=360809=diff
==
--- cfe/trunk/test/Headers/nvptx_device_cmath_functions.c (original)
+++ cfe/trunk/test/Headers/nvptx_device_cmath_functions.c Wed May 15 13:28:23 2019
@@ -17,5 +17,9 @@ void test_sqrt(double a1) {
 double l2 = pow(a1, a1);
 // CHECK-YES: call double @__nv_modf(double
 double l3 = modf(a1 + 3.5, );
+// CHECK-YES: call double @__nv_fabs(double
+double l4 = fabs(a1);
+// CHECK-YES: call i32 @__nv_abs(i32
+double l5 = abs((int)a1);
   }
 }

Modified: cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp?rev=360809=360808=360809=diff
==
--- cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp (original)
+++ cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp Wed May 15 13:28:23 2019
@@ -18,5 +18,9 @@ void test_sqrt(double a1) {
 double l2 = pow(a1, a1);
 // CHECK-YES: call double @__nv_modf(double
 double l3 = 

r360804 - [OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions

2019-05-15 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Wed May 15 13:18:21 2019
New Revision: 360804

URL: http://llvm.org/viewvc/llvm-project?rev=360804=rev
Log:
[OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions

Summary: In OpenMP device offloading we must ensure that under C++ 17, the 
inclusion of cstdlib works correctly.
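
A minimal sketch of the idea, not the patch itself: a macro expands to `noexcept` only under C++ 17, so that a definition carries the same exception specification as an earlier declaration of the same function. `SKETCH_NOEXCEPT` and `sketch::my_abs` are hypothetical names; the macro in the patch is `__NOEXCEPT` and it additionally tests `_OPENMP`.

```cpp
#include <cassert>

// Hypothetical counterpart of the patch's __NOEXCEPT macro: it expands to
// noexcept only when compiling as C++ 17 or later.
#if defined(__cplusplus) && __cplusplus >= 201703L
#define SKETCH_NOEXCEPT noexcept
#else
#define SKETCH_NOEXCEPT
#endif

namespace sketch {

// Stands in for the declaration that a system header would provide.
int my_abs(int x) SKETCH_NOEXCEPT;

// The definition repeats the same specification so that it agrees with the
// declaration above in every language mode.
int my_abs(int x) SKETCH_NOEXCEPT { return x < 0 ? -x : x; }

// Reports whether a call to my_abs is known not to throw.
inline bool myAbsIsNoexcept() { return noexcept(my_abs(0)); }

} // namespace sketch
```

Because the declaration and the definition both go through the macro, they stay in agreement whichever language mode is selected.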

Reviewers: ABataev, tra, jdoerfert, hfinkel, caomhin

Reviewed By: jdoerfert

Subscribers: Hahnfeld, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61949

Added:
cfe/trunk/test/Headers/nvptx_device_cmath_functions_cxx17.cpp
cfe/trunk/test/Headers/nvptx_device_math_functions_cxx17.cpp
Modified:
cfe/trunk/lib/Headers/__clang_cuda_cmath.h
cfe/trunk/lib/Headers/__clang_cuda_device_functions.h
cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
cfe/trunk/test/Headers/Inputs/include/cstdlib

Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=360804=360803=360804=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Wed May 15 13:18:21 2019
@@ -36,6 +36,15 @@
 #define __DEVICE__ static __device__ __inline__ __attribute__((always_inline))
 #endif
 
+// For C++ 17 we need to include noexcept attribute to be compatible
+// with the header-defined version. This may be removed once
+// variant is supported.
+#if defined(_OPENMP) && defined(__cplusplus) && __cplusplus >= 201703L
+#define __NOEXCEPT noexcept
+#else
+#define __NOEXCEPT
+#endif
+
 #if !(defined(_OPENMP) && defined(__cplusplus))
 __DEVICE__ long long abs(long long __n) { return ::llabs(__n); }
 __DEVICE__ long abs(long __n) { return ::labs(__n); }
@@ -50,7 +59,7 @@ __DEVICE__ float ceil(float __x) { retur
 __DEVICE__ float cos(float __x) { return ::cosf(__x); }
 __DEVICE__ float cosh(float __x) { return ::coshf(__x); }
 __DEVICE__ float exp(float __x) { return ::expf(__x); }
-__DEVICE__ float fabs(float __x) { return ::fabsf(__x); }
+__DEVICE__ float fabs(float __x) __NOEXCEPT { return ::fabsf(__x); }
 __DEVICE__ float floor(float __x) { return ::floorf(__x); }
 __DEVICE__ float fmod(float __x, float __y) { return ::fmodf(__x, __y); }
 // TODO: remove when variant is supported
@@ -465,6 +474,7 @@ _GLIBCXX_END_NAMESPACE_VERSION
 } // namespace std
 #endif
 
+#undef __NOEXCEPT
 #undef __DEVICE__
 
 #endif

Modified: cfe/trunk/lib/Headers/__clang_cuda_device_functions.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_device_functions.h?rev=360804=360803=360804=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_device_functions.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_device_functions.h Wed May 15 13:18:21 2019
@@ -37,6 +37,15 @@
 #define __FAST_OR_SLOW(fast, slow) slow
 #endif
 
+// For C++ 17 we need to include noexcept attribute to be compatible
+// with the header-defined version. This may be removed once
+// variant is supported.
+#if defined(_OPENMP) && defined(__cplusplus) && __cplusplus >= 201703L
+#define __NOEXCEPT noexcept
+#else
+#define __NOEXCEPT
+#endif
+
 __DEVICE__ int __all(int __a) { return __nvvm_vote_all(__a); }
 __DEVICE__ int __any(int __a) { return __nvvm_vote_any(__a); }
 __DEVICE__ unsigned int __ballot(int __a) { return __nvvm_vote_ballot(__a); }
@@ -1474,7 +1483,8 @@ __DEVICE__ unsigned int __vsubus4(unsign
   return r;
 }
 #endif // CUDA_VERSION >= 9020
-__DEVICE__ int abs(int __a) { return __nv_abs(__a); }
+__DEVICE__ int abs(int __a) __NOEXCEPT { return __nv_abs(__a); }
+__DEVICE__ double fabs(double __a) __NOEXCEPT { return __nv_fabs(__a); }
 __DEVICE__ double acos(double __a) { return __nv_acos(__a); }
 __DEVICE__ float acosf(float __a) { return __nv_acosf(__a); }
 __DEVICE__ double acosh(double __a) { return __nv_acosh(__a); }
@@ -1533,7 +1543,6 @@ __DEVICE__ float exp2f(float __a) { retu
 __DEVICE__ float expf(float __a) { return __nv_expf(__a); }
 __DEVICE__ double expm1(double __a) { return __nv_expm1(__a); }
 __DEVICE__ float expm1f(float __a) { return __nv_expm1f(__a); }
-__DEVICE__ double fabs(double __a) { return __nv_fabs(__a); }
 __DEVICE__ float fabsf(float __a) { return __nv_fabsf(__a); }
 __DEVICE__ double fdim(double __a, double __b) { return __nv_fdim(__a, __b); }
 __DEVICE__ float fdimf(float __a, float __b) { return __nv_fdimf(__a, __b); }
@@ -1572,15 +1581,15 @@ __DEVICE__ float j1f(float __a) { return
 __DEVICE__ double jn(int __n, double __a) { return __nv_jn(__n, __a); }
 __DEVICE__ float jnf(int __n, float __a) { return __nv_jnf(__n, __a); }
 #if defined(__LP64__) || defined(_WIN64)
-__DEVICE__ long labs(long __a) { return __nv_llabs(__a); };
+__DEVICE__ long labs(long __a) __NOEXCEPT { return __nv_llabs(__a); };
 #else
-__DEVICE__ long labs(long __a) { return __nv_abs(__a); };

r360626 - [OpenMP][Clang][BugFix] Split declares and math functions inclusion.

2019-05-13 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Mon May 13 15:11:44 2019
New Revision: 360626

URL: http://llvm.org/viewvc/llvm-project?rev=360626=rev
Log:
[OpenMP][Clang][BugFix] Split declares and math functions inclusion.

Summary: This patch fixes an issue in which the __clang_cuda_cmath.h header 
is being included even when cmath or math.h headers are not included.

Reviewers: jdoerfert, ABataev, hfinkel, caomhin, tra

Reviewed By: tra

Subscribers: tra, mgorny, guansong, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61765

Added:
cfe/trunk/lib/Headers/openmp_wrappers/__clang_openmp_math_declares.h
cfe/trunk/test/Headers/Inputs/include/cstdlib
Modified:
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Headers/CMakeLists.txt
cfe/trunk/lib/Headers/__clang_cuda_cmath.h
cfe/trunk/lib/Headers/__clang_cuda_device_functions.h
cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
cfe/trunk/lib/Headers/openmp_wrappers/__clang_openmp_math.h
cfe/trunk/lib/Headers/openmp_wrappers/cmath
cfe/trunk/lib/Headers/openmp_wrappers/math.h
cfe/trunk/test/Headers/nvptx_device_cmath_functions.c
cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp
cfe/trunk/test/Headers/nvptx_device_math_functions.c
cfe/trunk/test/Headers/nvptx_device_math_functions.cpp

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=360626=360625=360626=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Mon May 13 15:11:44 2019
@@ -1166,7 +1166,7 @@ void Clang::AddPreprocessingOptions(Comp
 }
 
 CmdArgs.push_back("-include");
-CmdArgs.push_back("__clang_openmp_math.h");
+CmdArgs.push_back("__clang_openmp_math_declares.h");
   }
 
   // Add -i* options, and automatically translate to

Modified: cfe/trunk/lib/Headers/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=360626=360625=360626=diff
==
--- cfe/trunk/lib/Headers/CMakeLists.txt (original)
+++ cfe/trunk/lib/Headers/CMakeLists.txt Mon May 13 15:11:44 2019
@@ -132,6 +132,7 @@ set(openmp_wrapper_files
   openmp_wrappers/math.h
   openmp_wrappers/cmath
   openmp_wrappers/__clang_openmp_math.h
+  openmp_wrappers/__clang_openmp_math_declares.h
 )
 
 set(output_dir ${LLVM_LIBRARY_OUTPUT_INTDIR}/clang/${CLANG_VERSION}/include)

Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=360626=360625=360626=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Mon May 13 15:11:44 2019
@@ -36,8 +36,10 @@
 #define __DEVICE__ static __device__ __inline__ __attribute__((always_inline))
 #endif
 
+#if !(defined(_OPENMP) && defined(__cplusplus))
 __DEVICE__ long long abs(long long __n) { return ::llabs(__n); }
 __DEVICE__ long abs(long __n) { return ::labs(__n); }
+#endif
 __DEVICE__ float abs(float __x) { return ::fabsf(__x); }
 __DEVICE__ double abs(double __x) { return ::fabs(__x); }
 __DEVICE__ float acos(float __x) { return ::acosf(__x); }

Modified: cfe/trunk/lib/Headers/__clang_cuda_device_functions.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_device_functions.h?rev=360626=360625=360626=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_device_functions.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_device_functions.h Mon May 13 15:11:44 2019
@@ -1493,8 +1493,10 @@ __DEVICE__ double cbrt(double __a) { ret
 __DEVICE__ float cbrtf(float __a) { return __nv_cbrtf(__a); }
 __DEVICE__ double ceil(double __a) { return __nv_ceil(__a); }
 __DEVICE__ float ceilf(float __a) { return __nv_ceilf(__a); }
+#ifndef _OPENMP
 __DEVICE__ int clock() { return __nvvm_read_ptx_sreg_clock(); }
 __DEVICE__ long long clock64() { return __nvvm_read_ptx_sreg_clock64(); }
+#endif
 __DEVICE__ double copysign(double __a, double __b) {
   return __nv_copysign(__a, __b);
 }

Modified: cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h?rev=360626=360625=360626=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h Mon May 13 15:11:44 2019
@@ -27,11 +27,13 @@
   static __inline__ __attribute__((always_inline)) __attribute__((device))
 #endif
 
-__DEVICE__ double abs(double);
-__DEVICE__ float abs(float);

r360265 - [OpenMP][Clang] Support for target math functions

2019-05-08 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Wed May  8 08:52:33 2019
New Revision: 360265

URL: http://llvm.org/viewvc/llvm-project?rev=360265=rev
Log:
[OpenMP][Clang] Support for target math functions

Summary:
In this patch we propose a temporary solution to resolving math functions for 
the NVPTX toolchain, temporary until OpenMP variant is supported by Clang.

We intercept the inclusion of math.h and cmath headers and if we are in the 
OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.
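
The interception relies on the driver putting the wrapper directory ahead of the system headers. The standalone sketch below mirrors the logic this revision adds to `Clang::AddPreprocessingOptions`, with plain `std::string` in place of the LLVM path utilities. `SketchJob` and `wrapperIncludeArgs` are hypothetical names, and the `/` separator is an assumption (the driver appends path components through `llvm::sys::path::append`).

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver state consulted by the patch.
struct SketchJob {
  bool OpenMPDeviceOffload; // JA.isDeviceOffloading(Action::OFK_OpenMP)
  bool TargetIsNVPTX;       // getToolChain().getTriple().isNVPTX()
  bool NoBuiltinInc;        // -nobuiltininc was passed
  std::string ResourceDir;  // clang resource directory
};

// Returns the extra preprocessor arguments for the OpenMP-NVPTX device
// compilation, or an empty list in every other case.
inline std::vector<std::string> wrapperIncludeArgs(const SketchJob &J) {
  std::vector<std::string> Args;
  if (!J.OpenMPDeviceOffload || !J.TargetIsNVPTX)
    return Args;

  if (!J.NoBuiltinInc) {
    // Put <resource-dir>/include/openmp_wrappers on the system include path
    // so that its math.h and cmath are found before the real ones.
    Args.push_back("-internal-isystem");
    Args.push_back(J.ResourceDir + "/include/openmp_wrappers");
  }

  // Force-include the math header, as this revision does.
  Args.push_back("-include");
  Args.push_back("__clang_openmp_math.h");
  return Args;
}
```

For a host compilation, or for an offload target other than NVPTX, the function returns no arguments, which matches the patch's intent of leaving other configurations untouched.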

Authors:
@gtbercea
@jdoerfert

Reviewers: hfinkel, caomhin, ABataev, tra

Reviewed By: hfinkel, ABataev, tra

Subscribers: JDevlieghere, mgorny, guansong, cfe-commits, jdoerfert

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61399

Added:
cfe/trunk/lib/Headers/openmp_wrappers/
cfe/trunk/lib/Headers/openmp_wrappers/__clang_openmp_math.h
cfe/trunk/lib/Headers/openmp_wrappers/cmath
cfe/trunk/lib/Headers/openmp_wrappers/math.h
cfe/trunk/test/Headers/Inputs/include/cmath
cfe/trunk/test/Headers/Inputs/include/limits
cfe/trunk/test/Headers/nvptx_device_cmath_functions.c
cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp
cfe/trunk/test/Headers/nvptx_device_math_functions.c
cfe/trunk/test/Headers/nvptx_device_math_functions.cpp
Modified:
cfe/trunk/lib/Driver/ToolChain.cpp
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Headers/CMakeLists.txt
cfe/trunk/lib/Headers/__clang_cuda_cmath.h
cfe/trunk/lib/Headers/__clang_cuda_device_functions.h
cfe/trunk/lib/Headers/__clang_cuda_libdevice_declares.h
cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
cfe/trunk/test/Driver/openmp-offload-gpu.c
cfe/trunk/test/Headers/Inputs/include/math.h

Modified: cfe/trunk/lib/Driver/ToolChain.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChain.cpp?rev=360265=360264=360265=diff
==
--- cfe/trunk/lib/Driver/ToolChain.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChain.cpp Wed May  8 08:52:33 2019
@@ -425,7 +425,7 @@ bool ToolChain::needsProfileRT(const Arg
   Args.hasArg(options::OPT_fprofile_instr_generate) ||
   Args.hasArg(options::OPT_fprofile_instr_generate_EQ) ||
   Args.hasArg(options::OPT_fcreate_profile) ||
-  Args.hasArg(options::OPT_forder_file_instrumentation)) 
+  Args.hasArg(options::OPT_forder_file_instrumentation))
 return true;
 
   return false;

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=360265=360264=360265=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Wed May  8 08:52:33 2019
@@ -1151,6 +1151,24 @@ void Clang::AddPreprocessingOptions(Comp
   if (JA.isOffloading(Action::OFK_Cuda))
 getToolChain().AddCudaIncludeArgs(Args, CmdArgs);
 
+  // If we are offloading to a target via OpenMP we need to include the
+  // openmp_wrappers folder which contains alternative system headers.
+  if (JA.isDeviceOffloading(Action::OFK_OpenMP) &&
+  getToolChain().getTriple().isNVPTX()){
+if (!Args.hasArg(options::OPT_nobuiltininc)) {
+  // Add openmp_wrappers/* to our system include path.  This lets us wrap
+  // standard library headers.
+  SmallString<128> P(D.ResourceDir);
+  llvm::sys::path::append(P, "include");
+  llvm::sys::path::append(P, "openmp_wrappers");
+  CmdArgs.push_back("-internal-isystem");
+  CmdArgs.push_back(Args.MakeArgString(P));
+}
+
+CmdArgs.push_back("-include");
+CmdArgs.push_back("__clang_openmp_math.h");
+  }
+
   // Add -i* options, and automatically translate to
   // -include-pch/-include-pth for transparent PCH support. It's
   // wonky, but we include looking for .gch so we can support seamless

Modified: cfe/trunk/lib/Headers/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=360265=360264=360265=diff
==
--- cfe/trunk/lib/Headers/CMakeLists.txt (original)
+++ cfe/trunk/lib/Headers/CMakeLists.txt Wed May  8 08:52:33 2019
@@ -128,6 +128,12 @@ set(ppc_wrapper_files
   ppc_wrappers/mmintrin.h
 )
 
+set(openmp_wrapper_files
+  openmp_wrappers/math.h
+  openmp_wrappers/cmath
+  openmp_wrappers/__clang_openmp_math.h
+)
+
 set(output_dir ${LLVM_LIBRARY_OUTPUT_INTDIR}/clang/${CLANG_VERSION}/include)
 set(out_files)
 set(generated_files)
@@ -156,7 +162,7 @@ endfunction(clang_generate_header)
 
 
 # Copy header files from the source directory to the build directory
-foreach( f ${files} ${cuda_wrapper_files} ${ppc_wrapper_files} )
+foreach( f ${files} ${cuda_wrapper_files} ${ppc_wrapper_files} ${openmp_wrapper_files})
   copy_header_to_output_dir(${CMAKE_CURRENT_SOURCE_DIR} ${f})
 endforeach( f 

r360063 - [OpenMP][Clang] Support for target math functions

2019-05-06 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Mon May  6 11:19:15 2019
New Revision: 360063

URL: http://llvm.org/viewvc/llvm-project?rev=360063=rev
Log:
[OpenMP][Clang] Support for target math functions

Summary:
In this patch we propose a temporary solution to resolving math functions for 
the NVPTX toolchain, temporary until OpenMP variant is supported by Clang.

We intercept the inclusion of math.h and cmath headers and if we are in the 
OpenMP-NVPTX case, we re-use CUDA's math function resolution mechanism.

Authors:
@gtbercea
@jdoerfert

Reviewers: hfinkel, caomhin, ABataev, tra

Reviewed By: hfinkel, ABataev, tra

Subscribers: mgorny, guansong, cfe-commits, jdoerfert

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61399

Added:
cfe/trunk/lib/Headers/openmp_wrappers/
cfe/trunk/lib/Headers/openmp_wrappers/__clang_openmp_math.h
cfe/trunk/lib/Headers/openmp_wrappers/cmath
cfe/trunk/lib/Headers/openmp_wrappers/math.h
cfe/trunk/test/Headers/Inputs/include/cmath
cfe/trunk/test/Headers/Inputs/include/limits
cfe/trunk/test/Headers/nvptx_device_cmath_functions.c
cfe/trunk/test/Headers/nvptx_device_cmath_functions.cpp
cfe/trunk/test/Headers/nvptx_device_math_functions.c
cfe/trunk/test/Headers/nvptx_device_math_functions.cpp
Modified:
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Headers/CMakeLists.txt
cfe/trunk/lib/Headers/__clang_cuda_cmath.h
cfe/trunk/lib/Headers/__clang_cuda_device_functions.h
cfe/trunk/lib/Headers/__clang_cuda_libdevice_declares.h
cfe/trunk/lib/Headers/__clang_cuda_math_forward_declares.h
cfe/trunk/test/Driver/openmp-offload-gpu.c
cfe/trunk/test/Headers/Inputs/include/math.h

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=360063=360062=360063=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Mon May  6 11:19:15 2019
@@ -1151,6 +1151,21 @@ void Clang::AddPreprocessingOptions(Comp
   if (JA.isOffloading(Action::OFK_Cuda))
 getToolChain().AddCudaIncludeArgs(Args, CmdArgs);
 
+  // If we are offloading to a target via OpenMP we need to include the
+  // openmp_wrappers folder which contains alternative system headers.
+  if (JA.isDeviceOffloading(Action::OFK_OpenMP) &&
+  getToolChain().getTriple().isNVPTX()){
+if (!Args.hasArg(options::OPT_nobuiltininc)) {
+  // Add openmp_wrappers/* to our system include path.  This lets us wrap
+  // standard library headers.
+  SmallString<128> P(D.ResourceDir);
+  llvm::sys::path::append(P, "include");
+  llvm::sys::path::append(P, "openmp_wrappers");
+  CmdArgs.push_back("-internal-isystem");
+  CmdArgs.push_back(Args.MakeArgString(P));
+}
+  }
+
   // Add -i* options, and automatically translate to
   // -include-pch/-include-pth for transparent PCH support. It's
   // wonky, but we include looking for .gch so we can support seamless

Modified: cfe/trunk/lib/Headers/CMakeLists.txt
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/CMakeLists.txt?rev=360063=360062=360063=diff
==
--- cfe/trunk/lib/Headers/CMakeLists.txt (original)
+++ cfe/trunk/lib/Headers/CMakeLists.txt Mon May  6 11:19:15 2019
@@ -33,6 +33,9 @@ set(files
   avxintrin.h
   bmi2intrin.h
   bmiintrin.h
+  openmp_wrappers/math.h
+  openmp_wrappers/cmath
+  openmp_wrappers/__clang_openmp_math.h
   __clang_cuda_builtin_vars.h
   __clang_cuda_cmath.h
   __clang_cuda_complex_builtins.h

Modified: cfe/trunk/lib/Headers/__clang_cuda_cmath.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Headers/__clang_cuda_cmath.h?rev=360063=360062=360063=diff
==
--- cfe/trunk/lib/Headers/__clang_cuda_cmath.h (original)
+++ cfe/trunk/lib/Headers/__clang_cuda_cmath.h Mon May  6 11:19:15 2019
@@ -30,7 +30,11 @@
 // implementation.  Declaring in the global namespace and pulling into 
namespace
 // std covers all of the known knowns.
 
+#ifdef _OPENMP
+#define __DEVICE__ static __attribute__((always_inline))
+#else
 #define __DEVICE__ static __device__ __inline__ __attribute__((always_inline))
+#endif
 
 __DEVICE__ long long abs(long long __n) { return ::llabs(__n); }
 __DEVICE__ long abs(long __n) { return ::labs(__n); }
@@ -47,6 +51,8 @@ __DEVICE__ float exp(float __x) { return
 __DEVICE__ float fabs(float __x) { return ::fabsf(__x); }
 __DEVICE__ float floor(float __x) { return ::floorf(__x); }
 __DEVICE__ float fmod(float __x, float __y) { return ::fmodf(__x, __y); }
+// TODO: remove when variant is supported
+#ifndef _OPENMP
 __DEVICE__ int fpclassify(float __x) {
   return __builtin_fpclassify(FP_NAN, FP_INFINITE, FP_NORMAL, FP_SUBNORMAL,
   FP_ZERO, __x);
@@ -55,6 

r359910 - [CUDA][Clang][Bugfix] Add missing CUDA 9.2 case

2019-05-03 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Fri May  3 10:59:18 2019
New Revision: 359910

URL: http://llvm.org/viewvc/llvm-project?rev=359910=rev
Log:
[CUDA][Clang][Bugfix] Add missing CUDA 9.2 case

Summary:
The bug was reported on the OpenMP-dev list:

.../obj-release/lib/clang/9.0.0/include/__clang_cuda_intrinsics.h:173:35: 
error: '__nvvm_shfl_sync_idx_i32' needs target feature ptx60|ptx61|ptx63|ptx64
__MAKE_SYNC_SHUFFLES(__shfl_sync, __nvvm_shfl_sync_idx_i32,

This problem occurs when trying to compile a .cu file that requires a newer ptx 
version (>ptx60 in this case) than ptx42.
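
To illustrate why the missing case matters, here is a minimal sketch of a version-to-feature mapping. The reduced enum and the function name `ptxFeatureFor` are hypothetical, and the "+ptx42" fallback is an assumption inferred from the summary above; only the three explicit cases are taken from the diff below.

```cpp
#include <string>

// Hypothetical, reduced stand-in for the driver's CUDA version enum.
enum class CudaVersion { UNKNOWN, CUDA_91, CUDA_92, CUDA_100 };

// Map a CUDA version to the PTX feature string passed to the target.
// Without the CUDA_92 case, that version would take the default branch
// (assumed here to be "+ptx42") and the sync shuffle builtins, which
// need ptx60 or newer, would be rejected.
std::string ptxFeatureFor(CudaVersion V) {
  switch (V) {
  case CudaVersion::CUDA_100:
    return "+ptx63";
  case CudaVersion::CUDA_92:
    return "+ptx61";
  case CudaVersion::CUDA_91:
    return "+ptx61";
  default:
    return "+ptx42"; // assumed fallback, see the note above
  }
}
```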



Reviewers: tra, ABataev, caomhin

Reviewed By: tra

Subscribers: jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D61474

Modified:
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=359910=359909=359910=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Fri May  3 10:59:18 2019
@@ -656,6 +656,9 @@ void CudaToolChain::addClangTargetOption
 case CudaVersion::CUDA_100:
   PtxFeature = "+ptx63";
   break;
+case CudaVersion::CUDA_92:
+  PtxFeature = "+ptx61";
+  break;
 case CudaVersion::CUDA_91:
   PtxFeature = "+ptx61";
   break;




r358711 - [OpenMP][NFC] Fix requires target test.

2019-04-18 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Apr 18 13:34:43 2019
New Revision: 358711

URL: http://llvm.org/viewvc/llvm-project?rev=358711=rev
Log:
[OpenMP][NFC] Fix requires target test.

Summary:
Fix requires target test.


Reviewers: ABataev

Subscribers: guansong, jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60886

Modified:
cfe/trunk/test/OpenMP/requires_target_messages.cpp

Modified: cfe/trunk/test/OpenMP/requires_target_messages.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/requires_target_messages.cpp?rev=358711=358710=358711=diff
==
--- cfe/trunk/test/OpenMP/requires_target_messages.cpp (original)
+++ cfe/trunk/test/OpenMP/requires_target_messages.cpp Thu Apr 18 13:34:43 2019
@@ -2,14 +2,14 @@
 
 void foo2() {
   int a;
-  #pragma omp target // expected-note 4 {{Target previously encountered here}}
+  #pragma omp target // expected-note 4 {{target previously encountered here}}
   {
 a = a + 1;
   }
 }
 
 #pragma omp requires atomic_default_mem_order(seq_cst)
-#pragma omp requires unified_address //expected-error {{Target region 
encountered before requires directive with 'unified_address' clause}}
-#pragma omp requires unified_shared_memory //expected-error {{Target region 
encountered before requires directive with 'unified_shared_memory' clause}}
-#pragma omp requires reverse_offload //expected-error {{Target region 
encountered before requires directive with 'reverse_offload' clause}}
-#pragma omp requires dynamic_allocators //expected-error {{Target region 
encountered before requires directive with 'dynamic_allocators' clause}}
+#pragma omp requires unified_address //expected-error {{target region 
encountered before requires directive with 'unified_address' clause}}
+#pragma omp requires unified_shared_memory //expected-error {{target region 
encountered before requires directive with 'unified_shared_memory' clause}}
+#pragma omp requires reverse_offload //expected-error {{target region 
encountered before requires directive with 'reverse_offload' clause}}
+#pragma omp requires dynamic_allocators //expected-error {{target region 
encountered before requires directive with 'dynamic_allocators' clause}}




r358709 - [OpenMP] Add checks for requires and target directives.

2019-04-18 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Apr 18 12:53:43 2019
New Revision: 358709

URL: http://llvm.org/viewvc/llvm-project?rev=358709=rev
Log:
[OpenMP] Add checks for requires and target directives.

Summary: The requires directive containing target related clauses must appear 
before any target region in the compilation unit.
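
The ordering rule can be modelled outside the compiler with a small stand-alone sketch. The class `RequiresOrderChecker`, its method names, and the use of plain line numbers are hypothetical simplifications, not the Sema implementation; the four clause names and the diagnostic wording follow the test file and DiagnosticSemaKinds.td in this commit, while the "(line N)" suffix is added here for illustration only.

```cpp
#include <string>
#include <vector>

// Hypothetical miniature of the check: remembers where target regions were
// seen and reports target-related requires clauses that appear afterwards.
class RequiresOrderChecker {
  std::vector<int> TargetLines; // lines of the target regions seen so far

  // Clauses that affect target regions (the four covered by the test file).
  static bool affectsTargetRegions(const std::string &Clause) {
    return Clause == "unified_address" || Clause == "unified_shared_memory" ||
           Clause == "reverse_offload" || Clause == "dynamic_allocators";
  }

public:
  void sawTarget(int Line) { TargetLines.push_back(Line); }

  // Returns the diagnostics for one requires directive: one error per
  // offending clause, followed by one note per earlier target region.
  std::vector<std::string>
  sawRequires(const std::vector<std::string> &Clauses) const {
    std::vector<std::string> Diags;
    for (const std::string &Clause : Clauses) {
      if (TargetLines.empty() || !affectsTargetRegions(Clause))
        continue;
      Diags.push_back("error: target region encountered before requires "
                      "directive with '" + Clause + "' clause");
      for (int Line : TargetLines)
        Diags.push_back("note: target previously encountered here (line " +
                        std::to_string(Line) + ")");
    }
    return Diags;
  }
};
```

A requires directive seen before any target region produces no diagnostics in this model, and a clause such as atomic_default_mem_order is never reported because it does not affect target regions.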

Reviewers: ABataev, AlexEichenberger, caomhin

Reviewed By: ABataev

Subscribers: guansong, jfb, jdoerfert, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D60875

Added:
cfe/trunk/test/OpenMP/requires_target_messages.cpp
Modified:
cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/requires_messages.cpp

Modified: cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td?rev=358709=358708=358709=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticSemaKinds.td Thu Apr 18 12:53:43 
2019
@@ -9132,6 +9132,10 @@ def err_omp_requires_clause_redeclaratio
   "Only one %0 clause can appear on a requires directive in a single 
translation unit">;
 def note_omp_requires_previous_clause : Note <
   "%0 clause previously used here">;
+def err_omp_target_before_requires : Error <
+  "target region encountered before requires directive with '%0' clause">;
+def note_omp_requires_encountered_target : Note <
+  "target previously encountered here">;
 def err_omp_invalid_scope : Error <
   "'#pragma omp %0' directive must appear only in file scope">;
 def note_omp_invalid_length_on_this_ptr_mapping : Note <

Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=358709=358708=358709=diff
==
--- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original)
+++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Thu Apr 18 12:53:43 2019
@@ -193,6 +193,8 @@ private:
   /// Expression for the predefined allocators.
   Expr *OMPPredefinedAllocators[OMPAllocateDeclAttr::OMPUserDefinedMemAlloc] = 
{
   nullptr};
+  /// Vector of previously encountered target directives
+  SmallVector TargetLocations;
 
 public:
   explicit DSAStackTy(Sema &S) : SemaRef(S) {}
@@ -454,6 +456,16 @@ public:
 return IsDuplicate;
   }
 
+  /// Add location of previously encountered target to internal vector
+  void addTargetDirLocation(SourceLocation LocStart) {
+TargetLocations.push_back(LocStart);
+  }
+
+  // Return previously encountered target region locations.
+  ArrayRef getEncounteredTargetLocs() const {
+return TargetLocations;
+  }
+
   /// Set default data sharing attribute to none.
   void setDefaultDSANone(SourceLocation Loc) {
 assert(!isStackEmpty());
@@ -2418,6 +2430,27 @@ Sema::ActOnOpenMPRequiresDirective(Sourc
 
 OMPRequiresDecl *Sema::CheckOMPRequiresDecl(SourceLocation Loc,
 ArrayRef ClauseList) {
+  /// For target specific clauses, the requires directive cannot be
+  /// specified after the handling of any of the target regions in the
+  /// current compilation unit.
+  ArrayRef TargetLocations =
+  DSAStack->getEncounteredTargetLocs();
+  if (!TargetLocations.empty()) {
+for (const OMPClause *CNew : ClauseList) {
+  // Check if any of the requires clauses affect target regions.
+  if (isa(CNew) ||
+  isa(CNew) ||
+  isa(CNew) ||
+  isa(CNew)) {
+Diag(Loc, diag::err_omp_target_before_requires)
+<< getOpenMPClauseName(CNew->getClauseKind());
+for (SourceLocation TargetLoc : TargetLocations) {
+  Diag(TargetLoc, diag::note_omp_requires_encountered_target);
+}
+  }
+}
+  }
+
   if (!DSAStack->hasDuplicateRequiresClause(ClauseList))
 return OMPRequiresDecl::Create(Context, getCurLexicalContext(), Loc,
ClauseList);
@@ -4167,6 +4200,16 @@ StmtResult Sema::ActOnOpenMPExecutableDi
 ->setIsOMPStructuredBlock(true);
   }
 
+  if (!CurContext->isDependentContext() &&
+  isOpenMPTargetExecutionDirective(Kind) &&
+  !(DSAStack->hasRequiresDeclWithClause() ||
+DSAStack->hasRequiresDeclWithClause() ||
+DSAStack->hasRequiresDeclWithClause() ||
+DSAStack->hasRequiresDeclWithClause())) {
+// Register target to DSA Stack.
+DSAStack->addTargetDirLocation(StartLoc);
+  }
+
   return Res;
 }
 

Modified: cfe/trunk/test/OpenMP/requires_messages.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/requires_messages.cpp?rev=358709=358708=358709=diff
==
--- cfe/trunk/test/OpenMP/requires_messages.cpp (original)
+++ cfe/trunk/test/OpenMP/requires_messages.cpp Thu Apr 18 12:53:43 2019

r350759 - [OpenMP] Avoid remainder operations for loop index values on a collapsed loop nest.

2019-01-09 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Wed Jan  9 12:45:26 2019
New Revision: 350759

URL: http://llvm.org/viewvc/llvm-project?rev=350759=rev
Log:
[OpenMP] Avoid remainder operations for loop index values on a collapsed loop 
nest.

Summary: Change the strategy for computing loop index variables after 
collapsing a loop nest via the collapse clause by replacing the expensive 
remainder operation with multiplications and additions.
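
The index recovery described above can be sketched as ordinary C++ operating on plain integers. The function name `recoverIndices` and the use of std::vector are illustrative assumptions; the patch itself builds the equivalent expressions as AST nodes inside Sema.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Recover the per-loop iteration numbers (I0 ... In) from the collapsed
// iteration variable IV, given the iteration counts (N0 ... Nn).
// Only division, multiplication and subtraction are used, no remainder.
std::vector<std::uint64_t>
recoverIndices(std::uint64_t IV, const std::vector<std::uint64_t> &Counts) {
  std::vector<std::uint64_t> Indices(Counts.size());
  std::uint64_t Acc = IV;
  for (std::size_t K = 0; K < Counts.size(); ++K) {
    // Prod = N(k+1) * N(k+2) * ... * Nn; it stays 1 for the innermost loop.
    std::uint64_t Prod = 1;
    for (std::size_t J = K + 1; J < Counts.size(); ++J) {
      assert(Counts[J] != 0 && "iteration counts must be non-zero");
      Prod *= Counts[J];
    }
    Indices[K] = Acc / Prod;  // Ik = Acc / Prod
    Acc -= Indices[K] * Prod; // Acc -= Ik * Prod
  }
  return Indices;
}
```

For iteration counts (2, 3, 4) and IV = 17 this yields (1, 1, 1), since 1*12 + 1*4 + 1 = 17.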

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, arphaman, cfe-commits

Differential Revision: https://reviews.llvm.org/D56413

Modified:
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/for_codegen.cpp
cfe/trunk/test/OpenMP/for_simd_codegen.cpp
cfe/trunk/test/OpenMP/parallel_for_simd_codegen.cpp
cfe/trunk/test/OpenMP/simd_codegen.cpp

Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=350759=350758=350759=diff
==
--- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original)
+++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Wed Jan  9 12:45:26 2019
@@ -5579,31 +5579,59 @@ checkOpenMPLoop(OpenMPDirectiveKind DKin
   Built.Updates.resize(NestedLoopCount);
   Built.Finals.resize(NestedLoopCount);
   {
-ExprResult Div;
-// Go from inner nested loop to outer.
-for (int Cnt = NestedLoopCount - 1; Cnt >= 0; --Cnt) {
+// We implement the following algorithm for obtaining the
+// original loop iteration variable values based on the
+// value of the collapsed loop iteration variable IV.
+//
+// Let n+1 be the number of collapsed loops in the nest.
+// Iteration variables (I0, I1, ... In)
+// Iteration counts (N0, N1, ... Nn)
+//
+// Acc = IV;
+//
+// To compute Ik for loop k, 0 <= k <= n, generate:
+//Prod = N(k+1) * N(k+2) * ... * Nn;
+//Ik = Acc / Prod;
+//Acc -= Ik * Prod;
+//
+ExprResult Acc = IV;
+for (unsigned int Cnt = 0; Cnt < NestedLoopCount; ++Cnt) {
   LoopIterationSpace &IS = IterSpaces[Cnt];
   SourceLocation UpdLoc = IS.IncSrcRange.getBegin();
-  // Build: Iter = (IV / Div) % IS.NumIters
-  // where Div is product of previous iterations' IS.NumIters.
   ExprResult Iter;
-  if (Div.isUsable()) {
-Iter =
-SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Div, IV.get(), Div.get());
-  } else {
-Iter = IV;
-assert((Cnt == (int)NestedLoopCount - 1) &&
-   "unusable div expected on first iteration only");
-  }
 
-  if (Cnt != 0 && Iter.isUsable())
-Iter = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Rem, Iter.get(),
-  IS.NumIterations);
+  // Compute prod
+  ExprResult Prod =
+  SemaRef.ActOnIntegerConstant(SourceLocation(), 1).get();
+  for (unsigned int K = Cnt+1; K < NestedLoopCount; ++K)
+Prod = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Mul, Prod.get(),
+  IterSpaces[K].NumIterations);
+
+  // Iter = Acc / Prod
+  // If there is at least one more inner loop to avoid
+  // multiplication by 1.
+  if (Cnt + 1 < NestedLoopCount)
+Iter = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Div,
+  Acc.get(), Prod.get());
+  else
+Iter = Acc;
   if (!Iter.isUsable()) {
 HasErrors = true;
 break;
   }
 
+  // Update Acc:
+  // Acc -= Iter * Prod
+  // Check if there is at least one more inner loop to avoid
+  // multiplication by 1.
+  if (Cnt + 1 < NestedLoopCount)
+Prod = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Mul,
+  Iter.get(), Prod.get());
+  else
+Prod = Iter;
+  Acc = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Sub,
+   Acc.get(), Prod.get());
+
   // Build update: IS.CounterVar(Private) = IS.Start + Iter * IS.Step
   auto *VD = cast(cast(IS.CounterVar)->getDecl());
   DeclRefExpr *CounterVar = buildDeclRefExpr(
@@ -5632,22 +5660,6 @@ checkOpenMPLoop(OpenMPDirectiveKind DKin
 break;
   }
 
-  // Build Div for the next iteration: Div <- Div * IS.NumIters
-  if (Cnt != 0) {
-if (Div.isUnset())
-  Div = IS.NumIterations;
-else
-  Div = SemaRef.BuildBinOp(CurScope, UpdLoc, BO_Mul, Div.get(),
-   IS.NumIterations);
-
-// Add parentheses (for debugging purposes only).
-if (Div.isUsable())
-  Div = tryBuildCapture(SemaRef, Div.get(), Captures);
-if (!Div.isUsable()) {
-  HasErrors = true;
-  break;
-}
-  }
   if (!Update.isUsable() || !Final.isUsable()) {
 HasErrors = true;
 break;

Modified: cfe/trunk/test/OpenMP/for_codegen.cpp
URL: 

r350758 - [OpenMP] Add flag for preventing the extension to 64 bits for the collapse loop counter

2019-01-09 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Wed Jan  9 12:38:35 2019
New Revision: 350758

URL: http://llvm.org/viewvc/llvm-project?rev=350758=rev
Log:
[OpenMP] Add flag for preventing the extension to 64 bits for the collapse loop 
counter

Summary: Introduce a compiler flag for cases when the user knows that the 
collapsed loop counter can be safely represented using at most 32 bits. This 
will prevent the emission of expensive mathematical operations (such as the div 
operation) on the iteration variable using 64 bits where 32 bit operations are 
sufficient.
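
Whether the optimistic choice is safe for a given loop nest can be checked by hand with a sketch like the following. The helper name `collapsedCountFitsIn32Bits` is hypothetical and an unsigned 32-bit counter is assumed for simplicity; the compiler performs no such runtime check, which is why the flag is opt-in.

```cpp
#include <cstdint>
#include <limits>
#include <vector>

// Returns true if the product of the trip counts of the collapsed loops can
// be represented in an unsigned 32-bit counter without overflow.
bool collapsedCountFitsIn32Bits(const std::vector<std::uint32_t> &TripCounts) {
  const std::uint64_t Max32 = std::numeric_limits<std::uint32_t>::max();
  std::uint64_t Total = 1;
  for (std::uint32_t N : TripCounts) {
    if (N == 0)
      return true; // empty iteration space, nothing to count
    if (Total > Max32 / N)
      return false; // Total * N would exceed 32 bits
    Total *= N;
  }
  return true;
}
```

A 1000 x 1000 nest fits comfortably, while a 65536 x 65536 nest has exactly 2^32 iterations and does not.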

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: hfinkel, kkwli0, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D55928

Modified:
cfe/trunk/docs/OpenMPSupport.rst
cfe/trunk/include/clang/Basic/LangOptions.def
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp

Modified: cfe/trunk/docs/OpenMPSupport.rst
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/docs/OpenMPSupport.rst?rev=350758=350757=350758=diff
==
--- cfe/trunk/docs/OpenMPSupport.rst (original)
+++ cfe/trunk/docs/OpenMPSupport.rst Wed Jan  9 12:38:35 2019
@@ -108,6 +108,16 @@ are stored in the global memory. In `Cud
 between the threads and it is user responsibility to share the required data
 between the threads in the parallel regions.
 
+Collapsed loop nest counter
+---
+
+When using the collapse clause on a loop nest the default behaviour is to
+automatically extend the representation of the loop counter to 64 bits for
+the cases where the sizes of the collapsed loops are not known at compile
+time. To prevent this conservative choice and use at most 32 bits,
+compile your program with the `-fopenmp-optimistic-collapse`.
+
+
 Features not supported or with limited support for Cuda devices
 ---
 

Modified: cfe/trunk/include/clang/Basic/LangOptions.def
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/LangOptions.def?rev=350758=350757=350758=diff
==
--- cfe/trunk/include/clang/Basic/LangOptions.def (original)
+++ cfe/trunk/include/clang/Basic/LangOptions.def Wed Jan  9 12:38:35 2019
@@ -207,6 +207,7 @@ LANGOPT(OpenMPCUDAForceFullRuntime , 1,
 LANGOPT(OpenMPHostCXXExceptions, 1, 0, "C++ exceptions handling in the 
host code.")
 LANGOPT(OpenMPCUDANumSMs  , 32, 0, "Number of SMs for CUDA devices.")
 LANGOPT(OpenMPCUDABlocksPerSM  , 32, 0, "Number of blocks per SM for CUDA 
devices.")
+LANGOPT(OpenMPOptimisticCollapse  , 1, 0, "Use at most 32 bits to represent 
the collapsed loop nest counter.")
 LANGOPT(RenderScript  , 1, 0, "RenderScript")
 
 LANGOPT(CUDAIsDevice  , 1, 0, "compiling for CUDA device")

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=350758=350757=350758=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Wed Jan  9 12:38:35 2019
@@ -1574,6 +1574,10 @@ def fopenmp_cuda_number_of_sm_EQ : Joine
   Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
 def fopenmp_cuda_blocks_per_sm_EQ : Joined<["-"], 
"fopenmp-cuda-blocks-per-sm=">, Group,
   Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
+def fopenmp_optimistic_collapse : Flag<["-"], "fopenmp-optimistic-collapse">, 
Group,
+  Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
+def fno_openmp_optimistic_collapse : Flag<["-"], 
"fno-openmp-optimistic-collapse">, Group,
+  Flags<[NoArgumentUnused, HelpHidden]>;
 def fno_optimize_sibling_calls : Flag<["-"], "fno-optimize-sibling-calls">, 
Group;
 def foptimize_sibling_calls : Flag<["-"], "foptimize-sibling-calls">, 
Group;
 def fno_escaping_block_tail_calls : Flag<["-"], 
"fno-escaping-block-tail-calls">, Group, Flags<[CC1Option]>;

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=350758=350757=350758=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Wed Jan  9 12:38:35 2019
@@ -4434,6 +4434,10 @@ void Clang::ConstructJob(Compilation ,
   Args.AddAllArgs(CmdArgs, options::OPT_fopenmp_version_EQ);
   Args.AddAllArgs(CmdArgs, options::OPT_fopenmp_cuda_number_of_sm_EQ);
   Args.AddAllArgs(CmdArgs, options::OPT_fopenmp_cuda_blocks_per_sm_EQ);
+  if (Args.hasFlag(options::OPT_fopenmp_optimistic_collapse,
+

r347915 - [OpenMP] Add a new version of the SPMD deinit kernel function

2018-11-29 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Nov 29 12:53:49 2018
New Revision: 347915

URL: http://llvm.org/viewvc/llvm-project?rev=347915=rev
Log:
[OpenMP] Add a new version of the SPMD deinit kernel function

Summary: This patch adds a new runtime function for the SPMD kernel deinit 
which replaces the previous function. The new function takes as an argument the 
flag which signals whether the runtime is required or not. This enables the 
compiler to optimize out the parts of the deinit function which are not needed.

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D54970

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp

cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_generic_mode_codegen.cpp

cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_teams_reduction_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=347915=347914=347915=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Thu Nov 29 12:53:49 2018
@@ -33,8 +33,8 @@ enum OpenMPRTLFunctionNVPTX {
   /// Call to void __kmpc_spmd_kernel_init(kmp_int32 thread_limit,
   /// int16_t RequiresOMPRuntime, int16_t RequiresDataSharing);
   OMPRTL_NVPTX__kmpc_spmd_kernel_init,
-  /// Call to void __kmpc_spmd_kernel_deinit();
-  OMPRTL_NVPTX__kmpc_spmd_kernel_deinit,
+  /// Call to void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime);
+  OMPRTL_NVPTX__kmpc_spmd_kernel_deinit_v2,
   /// Call to void __kmpc_kernel_prepare_parallel(void
   /// *outlined_function, int16_t
   /// IsOMPRuntimeInitialized);
@@ -1413,8 +1413,11 @@ void CGOpenMPRuntimeNVPTX::emitSPMDEntry
 
   CGF.EmitBlock(OMPDeInitBB);
   // DeInitialize the OMP state in the runtime; called by all active threads.
+  llvm::Value *Args[] = {/*RequiresOMPRuntime=*/
+ CGF.Builder.getInt16(RequiresFullRuntime ? 1 : 0)};
   CGF.EmitRuntimeCall(
-  createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_spmd_kernel_deinit), None);
+  createNVPTXRuntimeFunction(
+  OMPRTL_NVPTX__kmpc_spmd_kernel_deinit_v2), Args);
   CGF.EmitBranch(EST.ExitBB);
 
   CGF.EmitBlock(EST.ExitBB);
@@ -1597,11 +1600,12 @@ CGOpenMPRuntimeNVPTX::createNVPTXRuntime
 RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_spmd_kernel_init");
 break;
   }
-  case OMPRTL_NVPTX__kmpc_spmd_kernel_deinit: {
-// Build void __kmpc_spmd_kernel_deinit();
+  case OMPRTL_NVPTX__kmpc_spmd_kernel_deinit_v2: {
+// Build void __kmpc_spmd_kernel_deinit_v2(int16_t RequiresOMPRuntime);
+llvm::Type *TypeParams[] = {CGM.Int16Ty};
 auto *FnTy =
-llvm::FunctionType::get(CGM.VoidTy, llvm::None, /*isVarArg*/ false);
-RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_spmd_kernel_deinit");
+llvm::FunctionType::get(CGM.VoidTy, TypeParams, /*isVarArg*/ false);
+RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_spmd_kernel_deinit_v2");
 break;
   }
   case OMPRTL_NVPTX__kmpc_kernel_prepare_parallel: {

Modified: cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp?rev=347915=347914=347915=diff
==
--- cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp Thu Nov 29 12:53:49 
2018
@@ -68,7 +68,7 @@ int bar(int n){
   // CHECK: br label {{%?}}[[DONE:.+]]
   //
   // CHECK: [[DONE]]
-  // CHECK: call void @__kmpc_spmd_kernel_deinit()
+  // CHECK: call void @__kmpc_spmd_kernel_deinit_v2(i16 1)
   // CHECK: br label {{%?}}[[EXIT:.+]]
   //
   // CHECK: [[EXIT]]
@@ -111,7 +111,7 @@ int bar(int n){
   // CHECK: br label {{%?}}[[DONE:.+]]
   //
   // CHECK: [[DONE]]
-  // CHECK: call void @__kmpc_spmd_kernel_deinit()
+  // CHECK: call void @__kmpc_spmd_kernel_deinit_v2(i16 1)
   // CHECK: br label {{%?}}[[EXIT:.+]]
   //
   // CHECK: [[EXIT]]

Modified: cfe/trunk/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp?rev=347915=347914=347915=diff
==
--- 

r345527 - [OpenMP] Fix condition.

2018-10-29 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Mon Oct 29 12:44:25 2018
New Revision: 345527

URL: http://llvm.org/viewvc/llvm-project?rev=345527=rev
Log:
[OpenMP] Fix condition.

Summary: The iteration variable must be strictly less than the number of 
iterations. This fixes a bug introduced by the previous patch, D53448.
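
The off-by-one can be seen in a stand-alone sketch. The function `countIterations` is purely illustrative and assumes a zero-based iteration variable; it is not part of the patch.

```cpp
// Count how many times a loop body runs for a zero-based iteration variable,
// using either the inclusive (<=) or the strict (<) exit condition.
unsigned countIterations(unsigned NumIterations, bool Inclusive) {
  unsigned Executed = 0;
  for (unsigned IV = 0;
       Inclusive ? IV <= NumIterations : IV < NumIterations; ++IV)
    ++Executed;
  return Executed;
}
```

With NumIterations = 4 the strict comparison runs the body 4 times, whereas the inclusive one runs it 5 times, i.e. one iteration too many.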

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53827

Modified:
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp
cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp

Modified: cfe/trunk/lib/Sema/SemaOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Sema/SemaOpenMP.cpp?rev=345527=345526=345527=diff
==
--- cfe/trunk/lib/Sema/SemaOpenMP.cpp (original)
+++ cfe/trunk/lib/Sema/SemaOpenMP.cpp Mon Oct 29 12:44:25 2018
@@ -5299,7 +5299,8 @@ checkOpenMPLoop(OpenMPDirectiveKind DKin
   ExprResult CombDistCond;
   if (isOpenMPLoopBoundSharingDirective(DKind)) {
 CombDistCond =
-SemaRef.BuildBinOp(CurScope, CondLoc, BO_LE, IV.get(), 
NumIterations.get());
+SemaRef.BuildBinOp(
+CurScope, CondLoc, BO_LT, IV.get(), NumIterations.get());
   }
 
   ExprResult CombCond;

Modified: cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp?rev=345527=345526=345527=diff
==
--- cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp Mon Oct 29 
12:44:25 2018
@@ -447,7 +447,7 @@ int main() {
   // LAMBDA-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]],
   // LAMBDA-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}}
   // LAMBDA-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} 
[[OMP_UB_VAL_3]], 1
-  // LAMBDA: [[CMP_IV_UB:%.+]] = icmp sle {{.+}} [[OMP_IV_VAL_1]], 
[[OMP_UB_VAL_3_PLUS_ONE]]
+  // LAMBDA: [[CMP_IV_UB:%.+]] = icmp slt {{.+}} [[OMP_IV_VAL_1]], 
[[OMP_UB_VAL_3_PLUS_ONE]]
   // LAMBDA: br {{.+}} [[CMP_IV_UB]], label %[[DIST_INNER_LOOP_BODY:.+]], 
label %[[DIST_INNER_LOOP_END:.+]]
 
   // check that PrevLB and PrevUB are passed to the 'for'
@@ -1210,7 +1210,7 @@ int main() {
 // CHECK-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]],
 // CHECK-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}}
 // CHECK-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} [[OMP_UB_VAL_3]], 
1
-// CHECK: [[CMP_IV_UB:%.+]] = icmp sle {{.+}} [[OMP_IV_VAL_1]], 
[[OMP_UB_VAL_3_PLUS_ONE]]
+// CHECK: [[CMP_IV_UB:%.+]] = icmp slt {{.+}} [[OMP_IV_VAL_1]], 
[[OMP_UB_VAL_3_PLUS_ONE]]
 // CHECK: br {{.+}} [[CMP_IV_UB]], label %[[DIST_INNER_LOOP_BODY:.+]], 
label %[[DIST_INNER_LOOP_END:.+]]
 
 // check that PrevLB and PrevUB are passed to the 'for'
@@ -1938,7 +1938,7 @@ int main() {
 // CHECK-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]],
 // CHECK-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}}
 // CHECK-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} [[OMP_UB_VAL_3]], 1
-// CHECK: [[CMP_IV_UB:%.+]] = icmp sle {{.+}} [[OMP_IV_VAL_1]], 
[[OMP_UB_VAL_3_PLUS_ONE]]
+// CHECK: [[CMP_IV_UB:%.+]] = icmp slt {{.+}} [[OMP_IV_VAL_1]], 
[[OMP_UB_VAL_3_PLUS_ONE]]
 // CHECK: br {{.+}} [[CMP_IV_UB]], label %[[DIST_INNER_LOOP_BODY:.+]], label 
%[[DIST_INNER_LOOP_END:.+]]
 
 // check that PrevLB and PrevUB are passed to the 'for'

Modified: cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp?rev=345527=345526=345527=diff
==
--- cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp Mon Oct 29 
12:44:25 2018
@@ -446,7 +446,7 @@ int main() {
   // LAMBDA-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]],
   // LAMBDA-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}}
   // LAMBDA-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} 
[[OMP_UB_VAL_3]], 1
-  // LAMBDA: [[CMP_IV_UB:%.+]] = icmp sle {{.+}} [[OMP_IV_VAL_1]], 
[[OMP_UB_VAL_3_PLUS_ONE]]
+  // LAMBDA: [[CMP_IV_UB:%.+]] = icmp slt {{.+}} [[OMP_IV_VAL_1]], 
[[OMP_UB_VAL_3_PLUS_ONE]]
   // LAMBDA: br {{.+}} [[CMP_IV_UB]], label %[[DIST_INNER_LOOP_BODY:.+]], 
label %[[DIST_INNER_LOOP_END:.+]]
 
   // check that PrevLB and PrevUB are passed to the 'for'
@@ -1209,7 +1209,7 @@ int main() {
 // CHECK-DAG: [[OMP_IV_VAL_1:%.+]] = load {{.+}} [[OMP_IV]],
 // CHECK-DAG: [[OMP_UB_VAL_3:%.+]] = load {{.+}}
 // CHECK-DAG: [[OMP_UB_VAL_3_PLUS_ONE:%.+]] = add {{.+}} [[OMP_UB_VAL_3]], 
1
-// CHECK: 

r345509 - [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for

2018-10-29 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Mon Oct 29 08:45:47 2018
New Revision: 345509

URL: http://llvm.org/viewvc/llvm-project?rev=345509=rev
Log:
[OpenMP][NVPTX] Use single loops when generating code for distribute parallel 
for

Summary: This patch adds a new code generation path for bound sharing 
directives containing distribute parallel for. The new code generation scheme 
applies to chunked schedules on distribute and parallel for directives. The 
scheme simplifies the code that is being generated by eliminating the need for 
an outer for loop over chunks for both distribute and parallel for directives. 
In the case of distribute it applies to any sized chunk while in the parallel 
for case it only applies when chunk size is 1.
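
The chunk-size-1 case can be illustrated with a simplified model in which chunks are handed out round-robin to threads. Both function names are hypothetical and the code only models which iterations a thread executes; it is not the code Clang emits.

```cpp
#include <cassert>
#include <vector>

// Two-loop form: an outer loop over the chunks assigned to this thread and
// an inner loop over the iterations inside each chunk.
std::vector<int> chunkedIterations(int ThreadId, int NumThreads, int N,
                                   int Chunk) {
  assert(NumThreads > 0 && Chunk > 0);
  std::vector<int> Result;
  for (int LB = ThreadId * Chunk; LB < N; LB += NumThreads * Chunk)
    for (int I = LB; I < LB + Chunk && I < N; ++I)
      Result.push_back(I);
  return Result;
}

// Single-loop form: with a chunk size of 1 every chunk holds one iteration,
// so the two loops collapse into one loop with a stride of NumThreads.
std::vector<int> stridedIterations(int ThreadId, int NumThreads, int N) {
  assert(NumThreads > 0);
  std::vector<int> Result;
  for (int I = ThreadId; I < N; I += NumThreads)
    Result.push_back(I);
  return Result;
}
```

In this model thread 1 of 4 over 10 iterations executes iterations 1, 5 and 9 in both forms, which is why the outer loop over chunks is not needed when the chunk size is 1.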

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53448

Modified:
cfe/trunk/include/clang/AST/StmtOpenMP.h
cfe/trunk/lib/AST/StmtOpenMP.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
cfe/trunk/lib/Sema/SemaOpenMP.cpp
cfe/trunk/lib/Serialization/ASTReaderStmt.cpp
cfe/trunk/lib/Serialization/ASTWriterStmt.cpp
cfe/trunk/test/OpenMP/distribute_parallel_for_codegen.cpp
cfe/trunk/test/OpenMP/distribute_parallel_for_simd_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp

Modified: cfe/trunk/include/clang/AST/StmtOpenMP.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/AST/StmtOpenMP.h?rev=345509=345508=345509=diff
==
--- cfe/trunk/include/clang/AST/StmtOpenMP.h (original)
+++ cfe/trunk/include/clang/AST/StmtOpenMP.h Mon Oct 29 08:45:47 2018
@@ -392,9 +392,11 @@ class OMPLoopDirective : public OMPExecu
 CombinedConditionOffset = 25,
 CombinedNextLowerBoundOffset = 26,
 CombinedNextUpperBoundOffset = 27,
+CombinedDistConditionOffset = 28,
+CombinedParForInDistConditionOffset = 29,
 // Offset to the end (and start of the following counters/updates/finals
 // arrays) for combined distribute loop directives.
-CombinedDistributeEnd = 28,
+CombinedDistributeEnd = 30,
   };
 
   /// Get the counters storage.
@@ -605,6 +607,17 @@ protected:
"expected loop bound sharing directive");
 *std::next(child_begin(), CombinedNextUpperBoundOffset) = CombNUB;
   }
+  void setCombinedDistCond(Expr *CombDistCond) {
+assert(isOpenMPLoopBoundSharingDirective(getDirectiveKind()) &&
+   "expected loop bound distribute sharing directive");
+*std::next(child_begin(), CombinedDistConditionOffset) = CombDistCond;
+  }
+  void setCombinedParForInDistCond(Expr *CombParForInDistCond) {
+assert(isOpenMPLoopBoundSharingDirective(getDirectiveKind()) &&
+   "expected loop bound distribute sharing directive");
+*std::next(child_begin(),
+   CombinedParForInDistConditionOffset) = CombParForInDistCond;
+  }
   void setCounters(ArrayRef A);
   void setPrivateCounters(ArrayRef A);
   void setInits(ArrayRef A);
@@ -637,6 +650,13 @@ public:
 /// Update of UpperBound for statically scheduled omp loops for
 /// outer loop in combined constructs (e.g. 'distribute parallel for')
 Expr *NUB;
+/// Distribute Loop condition used when composing 'omp distribute'
+///  with 'omp for' in a same construct when schedule is chunked.
+Expr *DistCond;
+/// 'omp parallel for' loop condition used when composed with
+/// 'omp distribute' in the same construct and when schedule is
+/// chunked and the chunk size is 1.
+Expr *ParForInDistCond;
   };
 
   /// The expressions built for the OpenMP loop CodeGen for the
@@ -754,6 +774,8 @@ public:
   DistCombinedFields.Cond = nullptr;
   DistCombinedFields.NLB = nullptr;
   DistCombinedFields.NUB = nullptr;
+  DistCombinedFields.DistCond = nullptr;
+  DistCombinedFields.ParForInDistCond = nullptr;
 }
   };
 
@@ -922,6 +944,18 @@ public:
 return const_cast<Expr *>(reinterpret_cast<const Expr *>(
 *std::next(child_begin(), CombinedNextUpperBoundOffset)));
   }
+  Expr *getCombinedDistCond() const {
+assert(isOpenMPLoopBoundSharingDirective(getDirectiveKind()) &&
+   "expected loop bound distribute sharing directive");
+return const_cast<Expr *>(reinterpret_cast<const Expr *>(
+*std::next(child_begin(), CombinedDistConditionOffset)));
+  }
+  Expr *getCombinedParForInDistCond() const {
+assert(isOpenMPLoopBoundSharingDirective(getDirectiveKind()) &&
+   "expected loop bound distribute sharing directive");
+return const_cast<Expr *>(reinterpret_cast<const Expr *>(
+*std::next(child_begin(), CombinedParForInDistConditionOffset)));
+  }
   const Stmt *getBody() const {
 // This relies on the loop form is already checked 

r345507 - [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases.

2018-10-29 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Mon Oct 29 08:23:23 2018
New Revision: 345507

URL: http://llvm.org/viewvc/llvm-project?rev=345507=rev
Log:
[OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases.

Summary: This patch selects the default schedule, static with a chunk size of 
1, for parallel for loops in non-SPMD mode as well as in SPMD mode.
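
The updated CHECK lines in this commit change the schedule argument of 
__kmpc_for_static_init_4 from 34 to 33 and add a dispatch loop around the inner 
loop. The control flow they describe can be sketched as follows. This is a 
minimal illustration, not Clang or runtime code: iterationsForThread is a 
hypothetical helper, and the initial bounds and stride are an assumption about 
what a static chunked initialization hands to each thread.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (not part of Clang or the OpenMP runtime): returns the
// iterations that thread `tid` of `numThreads` executes for a loop over
// [0, tripCount) under schedule(static, chunk).
std::vector<int> iterationsForThread(int tid, int numThreads, int tripCount,
                                     int chunk) {
  assert(numThreads > 0 && chunk > 0 && tid >= 0 && tid < numThreads);
  std::vector<int> result;
  const int lastIter = tripCount - 1;
  // Assumed initial bounds and stride for this thread's first chunk.
  int lb = tid * chunk;
  int ub = lb + chunk - 1;
  const int stride = numThreads * chunk;
  while (true) {
    // Dispatch condition: clamp the upper bound, then test for remaining work.
    const int clampedUb = ub > lastIter ? lastIter : ub;
    if (lb > clampedUb)
      break; // dispatch end
    // Inner loop over the current chunk.
    for (int iv = lb; iv <= clampedUb; ++iv)
      result.push_back(iv);
    // Dispatch increment: advance both bounds by the stride.
    lb += stride;
    ub += stride;
  }
  return result;
}
```

With 4 threads, 10 iterations and a chunk size of 1, thread 1 in this sketch 
executes iterations 1, 5 and 9.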

Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53443

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=345507=345506=345507=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Mon Oct 29 08:23:23 2018
@@ -4238,16 +4238,17 @@ void CGOpenMPRuntimeNVPTX::getDefaultDis
 Chunk = CGF.EmitScalarConversion(getNVPTXNumThreads(CGF),
 CGF.getContext().getIntTypeForBitwidth(32, /*Signed=*/0),
 S.getIterationVariable()->getType(), S.getBeginLoc());
+return;
   }
+  CGOpenMPRuntime::getDefaultDistScheduleAndChunk(
+  CGF, S, ScheduleKind, Chunk);
 }
 
 void CGOpenMPRuntimeNVPTX::getDefaultScheduleAndChunk(
 CodeGenFunction &CGF, const OMPLoopDirective &S,
 OpenMPScheduleClauseKind &ScheduleKind,
 llvm::Value *&Chunk) const {
-  if (getExecutionMode() == CGOpenMPRuntimeNVPTX::EM_SPMD) {
-ScheduleKind = OMPC_SCHEDULE_static;
-Chunk = CGF.Builder.getIntN(CGF.getContext().getTypeSize(
-S.getIterationVariable()->getType()), 1);
-  }
+  ScheduleKind = OMPC_SCHEDULE_static;
+  Chunk = CGF.Builder.getIntN(CGF.getContext().getTypeSize(
+  S.getIterationVariable()->getType()), 1);
 }

Modified: cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp?rev=345507=345506=345507=diff
==
--- cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp Mon Oct 29 08:23:23 2018
@@ -57,7 +57,10 @@ int bar(int n){
 // CHECK: store i32 0, {{.*}} [[OMP_LB:%.+]],
 // CHECK: store i32 9, {{.*}} [[OMP_UB:%.+]],
 // CHECK: store i32 1, {{.*}} [[OMP_ST:%.+]],
-// CHECK: call void @__kmpc_for_static_init_4({{.*}} i32 34, {{.*}} [[OMP_LB]], {{.*}} [[OMP_UB]], {{.*}} [[OMP_ST]], i32 1, i32 1)
+// CHECK: call void @__kmpc_for_static_init_4({{.*}} i32 33, {{.*}} [[OMP_LB]], {{.*}} [[OMP_UB]], {{.*}} [[OMP_ST]], i32 1, i32 1)
+// CHECK: br label %[[OMP_DISPATCH_COND:.+]]
+
+// CHECK: [[OMP_DISPATCH_COND]]
 // CHECK: [[OMP_UB_1:%.+]] = load {{.*}} [[OMP_UB]]
 // CHECK: [[COMP_1:%.+]] = icmp sgt {{.*}} [[OMP_UB_1]]
 // CHECK: br i1 [[COMP_1]], label %[[COND_TRUE:.+]], label %[[COND_FALSE:.+]]
@@ -74,6 +77,12 @@ int bar(int n){
 // CHECK: store i32 [[COND_RES]], i32* [[OMP_UB]]
 // CHECK: [[OMP_LB_1:%.+]] = load i32, i32* [[OMP_LB]]
 // CHECK: store i32 [[OMP_LB_1]], i32* [[OMP_IV]]
+// CHECK: [[OMP_IV_1:%.+]] = load i32, i32* [[OMP_IV]]
+// CHECK: [[OMP_UB_3:%.+]] = load i32, i32* [[OMP_UB]]
+// CHECK: [[COMP_2:%.+]] = icmp sle i32 [[OMP_IV_1]], [[OMP_UB_3]]
+// CHECK: br i1 [[COMP_2]], label %[[DISPATCH_BODY:.+]], label %[[DISPATCH_END:.+]]
+
+// CHECK: [[DISPATCH_BODY]]
 // CHECK: br label %[[OMP_INNER_FOR_COND:.+]]
 
 // CHECK: [[OMP_INNER_FOR_COND]]
@@ -94,7 +103,20 @@ int bar(int n){
 // CHECK: store i32 [[ADD_1]], i32* [[OMP_IV]]
 // CHECK: br label %[[OMP_INNER_FOR_COND]]
 
-// CHECK: [[OMP_INNER_FOR_END]]
+// CHECK: [[OMP_INNER_FOR_COND]]
+// CHECK: br label %[[OMP_DISPATCH_INC:.+]]
+
+// CHECK: [[OMP_DISPATCH_INC]]
+// CHECK: [[OMP_LB_2:%.+]] = load i32, i32* [[OMP_LB]]
+// CHECK: [[OMP_ST_1:%.+]] = load i32, i32* [[OMP_ST]]
+// CHECK: [[ADD_2:%.+]] = add nsw i32 [[OMP_LB_2]], [[OMP_ST_1]]
+// CHECK: store i32 [[ADD_2]], i32* [[OMP_LB]]
+// CHECK: [[OMP_UB_5:%.+]] = load i32, i32* [[OMP_UB]]
+// CHECK: [[OMP_ST_2:%.+]] = load i32, i32* [[OMP_ST]]
+// CHECK: [[ADD_3:%.+]] = add nsw i32 [[OMP_UB_5]], [[OMP_ST_2]]
+// CHECK: store i32 [[ADD_3]], i32* [[OMP_UB]]
+
+// CHECK: [[DISPATCH_END]]
 // CHECK: call void @__kmpc_for_static_fini(
 // CHECK: ret void
 




r345417 - [NFC][OpenMP] Add new test for parallel for code generation.

2018-10-26 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Fri Oct 26 11:59:52 2018
New Revision: 345417

URL: http://llvm.org/viewvc/llvm-project?rev=345417=rev
Log:
[NFC][OpenMP] Add new test for parallel for code generation.

Summary:
This is a simple test of the parallel for code generation. It will be used to 
showcase the change introduced by patch D53443.


Reviewers: ABataev, caomhin

Reviewed By: ABataev

Subscribers: guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D53772

Added:
cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp

Added: cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp?rev=345417=auto
==
--- cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp (added)
+++ cfe/trunk/test/OpenMP/nvptx_parallel_for_codegen.cpp Fri Oct 26 11:59:52 2018
@@ -0,0 +1,101 @@
+// Test target codegen - host bc file has to be created first.
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple powerpc64le-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm-bc %s -o %t-ppc-host.bc
+// RUN: %clang_cc1 -verify -fopenmp -x c++ -triple nvptx64-unknown-unknown -fopenmp-targets=nvptx64-nvidia-cuda -emit-llvm %s -fopenmp-is-device -fopenmp-host-ir-file-path %t-ppc-host.bc -o - | FileCheck %s --check-prefix CHECK --check-prefix CHECK-64
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+template<typename tx>
+tx ftemplate(int n) {
+  tx b[10];
+
+  #pragma omp target
+  {
+tx d = n;
+#pragma omp parallel for
+for(int i=0; i<10; i++) {
+  b[i] += d;
+}
+b[3] += 1;
+  }
+
+  return b[3];
+}
+
+int bar(int n){
+  int a = 0;
+
+  a += ftemplate<int>(n);
+
+  return a;
+}
+
+// CHECK-LABEL: define {{.*}}void {{@__omp_offloading_.+template.+l12}}_worker()
+// CHECK: call void @llvm.nvvm.barrier0()
+// CHECK: call i1 @__kmpc_kernel_parallel(
+// CHECK: call void @__omp_outlined___wrapper(
+
+// CHECK: define weak void @__omp_offloading_{{.*}}l12(
+// CHECK: call void @__omp_offloading_{{.*}}l12_worker()
+// CHECK: call void @__kmpc_kernel_init(
+// CHECK: call void @__kmpc_data_sharing_init_stack()
+// CHECK: call i8* @__kmpc_data_sharing_push_stack(i64 4, i16 0)
+// CHECK: call void @__kmpc_kernel_prepare_parallel(
+// CHECK: call void @__kmpc_begin_sharing_variables({{.*}}, i64 2)
+// CHECK: call void @llvm.nvvm.barrier0()
+// CHECK: call void @llvm.nvvm.barrier0()
+// CHECK: call void @__kmpc_end_sharing_variables()
+// CHECK: call void @__kmpc_data_sharing_pop_stack(
+// CHECK: call void @__kmpc_kernel_deinit(i16 1)
+
+// CHECK: define internal void @__omp_outlined__(
+// CHECK: alloca
+// CHECK: alloca
+// CHECK: alloca
+// CHECK: alloca
+// CHECK: [[OMP_IV:%.*]] = alloca i32
+// CHECK: store i32 0, {{.*}} [[OMP_LB:%.+]],
+// CHECK: store i32 9, {{.*}} [[OMP_UB:%.+]],
+// CHECK: store i32 1, {{.*}} [[OMP_ST:%.+]],
+// CHECK: call void @__kmpc_for_static_init_4({{.*}} i32 34, {{.*}} [[OMP_LB]], {{.*}} [[OMP_UB]], {{.*}} [[OMP_ST]], i32 1, i32 1)
+// CHECK: [[OMP_UB_1:%.+]] = load {{.*}} [[OMP_UB]]
+// CHECK: [[COMP_1:%.+]] = icmp sgt {{.*}} [[OMP_UB_1]]
+// CHECK: br i1 [[COMP_1]], label %[[COND_TRUE:.+]], label %[[COND_FALSE:.+]]
+
+// CHECK: [[COND_TRUE]]
+// CHECK: br label %[[COND_END:.+]]
+
+// CHECK: [[COND_FALSE]]
+// CHECK: [[OMP_UB_2:%.+]] = load {{.*}}* [[OMP_UB]]
+// CHECK: br label %[[COND_END]]
+
+// CHECK: [[COND_END]]
+// CHECK: [[COND_RES:%.+]] = phi i32 [ 9, %[[COND_TRUE]] ], [ [[OMP_UB_2]], %[[COND_FALSE]] ]
+// CHECK: store i32 [[COND_RES]], i32* [[OMP_UB]]
+// CHECK: [[OMP_LB_1:%.+]] = load i32, i32* [[OMP_LB]]
+// CHECK: store i32 [[OMP_LB_1]], i32* [[OMP_IV]]
+// CHECK: br label %[[OMP_INNER_FOR_COND:.+]]
+
+// CHECK: [[OMP_INNER_FOR_COND]]
+// CHECK: [[OMP_IV_2:%.+]] = load i32, i32* [[OMP_IV]]
+// CHECK: [[OMP_UB_4:%.+]] = load i32, i32* [[OMP_UB]]
+// CHECK: [[COMP_3:%.+]] = icmp sle i32 [[OMP_IV_2]], [[OMP_UB_4]]
+// CHECK: br i1 [[COMP_3]], label %[[OMP_INNER_FOR_BODY:.+]], label %[[OMP_INNER_FOR_END:.+]]
+
+// CHECK: [[OMP_INNER_FOR_BODY]]
+// CHECK: br label %[[OMP_BODY_CONTINUE:.+]]
+
+// CHECK: [[OMP_BODY_CONTINUE]]
+// CHECK: br label %[[OMP_INNER_FOR_INC:.+]]
+
+// CHECK: [[OMP_INNER_FOR_INC]]
+// CHECK: [[OMP_IV_3:%.+]] = load i32, i32* [[OMP_IV]]
+// CHECK: [[ADD_1:%.+]] = add nsw i32 [[OMP_IV_3]], 1
+// CHECK: store i32 [[ADD_1]], i32* [[OMP_IV]]
+// CHECK: br label %[[OMP_INNER_FOR_COND]]
+
+// CHECK: [[OMP_INNER_FOR_END]]
+// CHECK: call void @__kmpc_for_static_fini(
+// CHECK: ret void
+
+#endif




r343260 - [OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing

2018-09-27 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Sep 27 13:29:00 2018
New Revision: 343260

URL: http://llvm.org/viewvc/llvm-project?rev=343260=rev
Log:
[OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD 
mode achieve coalescing

Summary: Set default schedule for parallel for loops to schedule(static, 1) 
when using SPMD mode on the NVPTX device offloading toolchain to ensure 
coalescing.
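
To illustrate why a chunk size of 1 helps coalescing, the sketch below compares 
the array index each thread touches on a given step under schedule(static, 1) 
with the index under an unchunked static schedule. Both helpers are 
hypothetical and exist only for this illustration; the unchunked case assumes 
each thread owns one contiguous block sized by ceiling division, which may 
differ from how a particular runtime splits the iteration space.

```cpp
#include <cassert>

// Hypothetical helper: index touched by thread `tid` on its `step`-th
// iteration under schedule(static, 1). Neighbouring threads get neighbouring
// indices on every step.
int indexStaticChunk1(int tid, int step, int numThreads) {
  assert(numThreads > 0 && tid >= 0 && tid < numThreads && step >= 0);
  return step * numThreads + tid;
}

// Hypothetical helper: index touched under an unchunked static schedule,
// assuming one contiguous block per thread sized by ceiling division.
// Neighbouring threads are a whole block apart on every step.
int indexStaticBlocked(int tid, int step, int numThreads, int tripCount) {
  assert(numThreads > 0 && tid >= 0 && tid < numThreads && step >= 0);
  const int blockSize = (tripCount + numThreads - 1) / numThreads;
  return tid * blockSize + step;
}
```

For 32 threads and 1024 iterations, threads 0 and 1 touch indices 0 and 1 on 
their first step with a chunk size of 1, but indices 0 and 32 with the blocked 
split assumed here.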

Reviewers: ABataev, Hahnfeld, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D52629

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp

cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h?rev=343260=343259=343260=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h Thu Sep 27 13:29:00 2018
@@ -1496,6 +1496,12 @@ public:
   const OMPLoopDirective &S, OpenMPDistScheduleClauseKind &ScheduleKind,
   llvm::Value *&Chunk) const {}
 
+  /// Choose default schedule type and chunk value for the
+  /// schedule clause.
+  virtual void getDefaultScheduleAndChunk(CodeGenFunction &CGF,
+  const OMPLoopDirective &S, OpenMPScheduleClauseKind &ScheduleKind,
+  llvm::Value *&Chunk) const {}
+
   /// Emits call of the outlined function with the provided arguments,
   /// translating these arguments to correct target-specific arguments.
   virtual void

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=343260=343259=343260=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Thu Sep 27 13:29:00 2018
@@ -4093,3 +4093,14 @@ void CGOpenMPRuntimeNVPTX::getDefaultDis
 S.getIterationVariable()->getType(), S.getBeginLoc());
   }
 }
+
+void CGOpenMPRuntimeNVPTX::getDefaultScheduleAndChunk(
+CodeGenFunction &CGF, const OMPLoopDirective &S,
+OpenMPScheduleClauseKind &ScheduleKind,
+llvm::Value *&Chunk) const {
+  if (getExecutionMode() == CGOpenMPRuntimeNVPTX::EM_SPMD) {
+ScheduleKind = OMPC_SCHEDULE_static;
+Chunk = CGF.Builder.getIntN(CGF.getContext().getTypeSize(
+S.getIterationVariable()->getType()), 1);
+  }
+}

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h?rev=343260=343259=343260=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h Thu Sep 27 13:29:00 2018
@@ -340,11 +340,16 @@ public:
   ///
   void functionFinished(CodeGenFunction &CGF) override;
 
-  /// Choose a default value for the schedule clause.
+  /// Choose a default value for the dist_schedule clause.
   void getDefaultDistScheduleAndChunk(CodeGenFunction &CGF,
   const OMPLoopDirective &S, OpenMPDistScheduleClauseKind &ScheduleKind,
   llvm::Value *&Chunk) const override;
 
+  /// Choose a default value for the schedule clause.
+  void getDefaultScheduleAndChunk(CodeGenFunction &CGF,
+  const OMPLoopDirective &S, OpenMPScheduleClauseKind &ScheduleKind,
+  llvm::Value *&Chunk) const override;
+
 private:
   /// Track the execution mode when codegening directives within a target
   /// region. The appropriate mode (SPMD/NON-SPMD) is set on entry to the

Modified: cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp?rev=343260=343259=343260=diff
==
--- cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp Thu Sep 27 13:29:00 2018
@@ -2310,6 +2310,10 @@ bool CodeGenFunction::EmitOMPWorksharing
S.getIterationVariable()->getType(),
S.getBeginLoc());
 }
+  } else {
+// Default behaviour for schedule clause.
+CGM.getOpenMPRuntime().getDefaultScheduleAndChunk(
+*this, S, ScheduleKind.Schedule, Chunk);
   }
   const unsigned IVSize = getContext().getTypeSize(IVExpr->getType());
   const bool IVSigned = IVExpr->getType()->hasSignedIntegerRepresentation();

Modified: 
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
URL: 

r343253 - [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing

2018-09-27 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Sep 27 12:22:56 2018
New Revision: 343253

URL: http://llvm.org/viewvc/llvm-project?rev=343253=rev
Log:
[OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode 
achieve coalescing

Summary: For the OpenMP NVPTX toolchain choose a default distribute schedule 
that ensures coalescing on the GPU when in SPMD mode. This significantly 
increases the performance of offloaded target code and reduces the number of 
registers used on the GPU side.
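
The diff below sets the default dist_schedule chunk to the number of threads, 
so each team receives chunks that are one iteration per thread wide. The 
following sketch shows the resulting chunk bounds under the assumption that 
chunks are handed to teams round-robin. teamChunk is a hypothetical helper 
written for this illustration and is not part of Clang or the runtime.

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper: inclusive [lb, ub] bounds of the `round`-th chunk that
// team `team` receives under dist_schedule(static, threadsPerTeam), assuming
// round-robin chunk assignment. The upper bound is clamped to the last
// iteration; a result with lb > ub means the team has no work in that round.
std::pair<int, int> teamChunk(int team, int round, int numTeams,
                              int threadsPerTeam, int tripCount) {
  assert(numTeams > 0 && threadsPerTeam > 0 && team >= 0 && team < numTeams);
  const int chunkIndex = round * numTeams + team;
  const int lb = chunkIndex * threadsPerTeam;
  int ub = lb + threadsPerTeam - 1;
  if (ub > tripCount - 1)
    ub = tripCount - 1;
  return {lb, ub};
}
```

With 4 teams of 32 threads and 1000 iterations, team 1 starts with iterations 
32 through 63 in this sketch. Combined with a chunk size of 1 for the inner 
parallel for, each thread of the team then handles exactly one iteration of 
that chunk.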

Reviewers: ABataev, caomhin, Hahnfeld

Reviewed By: ABataev, Hahnfeld

Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D52434

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp

cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h?rev=343253=343252=343253=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h Thu Sep 27 12:22:56 2018
@@ -1490,6 +1490,12 @@ public:
   const VarDecl *NativeParam,
   const VarDecl *TargetParam) const;
 
+  /// Choose default schedule type and chunk value for the
+  /// dist_schedule clause.
+  virtual void getDefaultDistScheduleAndChunk(CodeGenFunction &CGF,
+  const OMPLoopDirective &S, OpenMPDistScheduleClauseKind &ScheduleKind,
+  llvm::Value *&Chunk) const {}
+
   /// Emits call of the outlined function with the provided arguments,
   /// translating these arguments to correct target-specific arguments.
   virtual void

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=343253=343252=343253=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Thu Sep 27 12:22:56 2018
@@ -4081,3 +4081,15 @@ void CGOpenMPRuntimeNVPTX::functionFinis
   FunctionGlobalizedDecls.erase(CGF.CurFn);
   CGOpenMPRuntime::functionFinished(CGF);
 }
+
+void CGOpenMPRuntimeNVPTX::getDefaultDistScheduleAndChunk(
+CodeGenFunction &CGF, const OMPLoopDirective &S,
+OpenMPDistScheduleClauseKind &ScheduleKind,
+llvm::Value *&Chunk) const {
+  if (getExecutionMode() == CGOpenMPRuntimeNVPTX::EM_SPMD) {
+ScheduleKind = OMPC_DIST_SCHEDULE_static;
+Chunk = CGF.EmitScalarConversion(getNVPTXNumThreads(CGF),
+CGF.getContext().getIntTypeForBitwidth(32, /*Signed=*/0),
+S.getIterationVariable()->getType(), S.getBeginLoc());
+  }
+}

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h?rev=343253=343252=343253=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h Thu Sep 27 12:22:56 2018
@@ -340,6 +340,11 @@ public:
   ///
   void functionFinished(CodeGenFunction &CGF) override;
 
+  /// Choose a default value for the schedule clause.
+  void getDefaultDistScheduleAndChunk(CodeGenFunction &CGF,
+  const OMPLoopDirective &S, OpenMPDistScheduleClauseKind &ScheduleKind,
+  llvm::Value *&Chunk) const override;
+
 private:
   /// Track the execution mode when codegening directives within a target
   /// region. The appropriate mode (SPMD/NON-SPMD) is set on entry to the

Modified: cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp?rev=343253=343252=343253=diff
==
--- cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp Thu Sep 27 12:22:56 2018
@@ -3325,6 +3325,10 @@ void CodeGenFunction::EmitOMPDistributeL
S.getIterationVariable()->getType(),
S.getBeginLoc());
 }
+  } else {
+// Default behaviour for dist_schedule clause.
+CGM.getOpenMPRuntime().getDefaultDistScheduleAndChunk(
+*this, S, ScheduleKind, Chunk);
   }
   const unsigned IVSize = getContext().getTypeSize(IVExpr->getType());
   const bool IVSigned = IVExpr->getType()->hasSignedIntegerRepresentation();

Modified: 
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp
URL: 

r340772 - [OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading

2018-08-27 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Mon Aug 27 13:16:20 2018
New Revision: 340772

URL: http://llvm.org/viewvc/llvm-project?rev=340772=rev
Log:
[OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading

Summary: When offloading to a device with powerpc64le as the auxiliary triple, 
the _CALL_ELF macro is not set to 2, which results in an attempt to include a 
header that does not exist. This patch defines _CALL_ELF explicitly: 2 for 
ppc64le and 1 for ppc64.
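
The per-architecture selection made by the patch can be sketched in isolation 
as follows. AuxArch, MacroList and auxMacrosFor are hypothetical stand-ins 
introduced only for this illustration; the real code switches on the auxiliary 
triple's architecture and calls Builder.defineMacro, as the diff below shows.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the auxiliary-triple architecture and the macro
// builder; they are not Clang types.
enum class AuxArch { X86_64, PPC64, PPC64LE, Other };
using MacroList = std::vector<std::pair<std::string, std::string>>;

// Returns the macros predefined for the given auxiliary architecture,
// mirroring the switch in the patch: both PowerPC variants define
// __powerpc64__, and _CALL_ELF is 1 for ppc64 and 2 for ppc64le.
MacroList auxMacrosFor(AuxArch arch) {
  switch (arch) {
  case AuxArch::X86_64:
    return {{"__x86_64__", "1"}};
  case AuxArch::PPC64:
    return {{"__powerpc64__", "1"}, {"_CALL_ELF", "1"}};
  case AuxArch::PPC64LE:
    return {{"__powerpc64__", "1"}, {"_CALL_ELF", "2"}};
  case AuxArch::Other:
    break;
  }
  return {};
}
```

Before the patch both PowerPC cases shared a single branch that defined only 
__powerpc64__, so _CALL_ELF was left undefined for the auxiliary target.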

Reviewers: Hahnfeld, ABataev, caomhin

Reviewed By: Hahnfeld

Subscribers: guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D51312

Modified:
cfe/trunk/lib/Frontend/InitPreprocessor.cpp
cfe/trunk/test/Preprocessor/aux-triple.c

Modified: cfe/trunk/lib/Frontend/InitPreprocessor.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/InitPreprocessor.cpp?rev=340772=340771=340772=diff
==
--- cfe/trunk/lib/Frontend/InitPreprocessor.cpp (original)
+++ cfe/trunk/lib/Frontend/InitPreprocessor.cpp Mon Aug 27 13:16:20 2018
@@ -1106,14 +1106,19 @@ static void InitializePredefinedAuxMacro
   auto AuxTriple = AuxTI.getTriple();
 
   // Define basic target macros needed by at least bits/wordsize.h and
-  // bits/mathinline.h
+  // bits/mathinline.h.
+  // On PowerPC, explicitely set _CALL_ELF macro needed for gnu/stubs.h.
   switch (AuxTriple.getArch()) {
   case llvm::Triple::x86_64:
 Builder.defineMacro("__x86_64__");
 break;
   case llvm::Triple::ppc64:
+Builder.defineMacro("__powerpc64__");
+Builder.defineMacro("_CALL_ELF", "1");
+break;
   case llvm::Triple::ppc64le:
 Builder.defineMacro("__powerpc64__");
+Builder.defineMacro("_CALL_ELF", "2");
 break;
   default:
 break;

Modified: cfe/trunk/test/Preprocessor/aux-triple.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Preprocessor/aux-triple.c?rev=340772=340771=340772=diff
==
--- cfe/trunk/test/Preprocessor/aux-triple.c (original)
+++ cfe/trunk/test/Preprocessor/aux-triple.c Mon Aug 27 13:16:20 2018
@@ -14,7 +14,7 @@
 // RUN: %clang_cc1 -x cuda -E -dM -ffreestanding < /dev/null \
 // RUN: -triple nvptx64-none-none -aux-triple powerpc64le-unknown-linux-gnu \
 // RUN:   | FileCheck -match-full-lines %s \
-// RUN: -check-prefixes NVPTX64,PPC64,LINUX,LINUX-CPP
+// RUN: -check-prefixes NVPTX64,PPC64LE,LINUX,LINUX-CPP
 // RUN: %clang_cc1 -x cuda -E -dM -ffreestanding < /dev/null \
 // RUN: -triple nvptx64-none-none -aux-triple x86_64-unknown-linux-gnu \
 // RUN:   | FileCheck -match-full-lines %s \
@@ -24,7 +24,7 @@
 // RUN: %clang_cc1 -E -dM -ffreestanding < /dev/null \
 // RUN: -fopenmp -fopenmp-is-device -triple nvptx64-none-none \
 // RUN: -aux-triple powerpc64le-unknown-linux-gnu \
-// RUN:   | FileCheck -match-full-lines -check-prefixes NVPTX64,PPC64,LINUX %s
+// RUN:   | FileCheck -match-full-lines -check-prefixes NVPTX64,PPC64LE,LINUX %s
 // RUN: %clang_cc1 -E -dM -ffreestanding < /dev/null \
 // RUN: -fopenmp -fopenmp-is-device -triple nvptx64-none-none \
 // RUN: -aux-triple x86_64-unknown-linux-gnu \
@@ -33,13 +33,15 @@
 // RUN: -fopenmp -fopenmp-is-device -triple nvptx64-none-none \
 // RUN: -aux-triple powerpc64le-unknown-linux-gnu \
 // RUN:   | FileCheck -match-full-lines %s \
-// RUN: -check-prefixes NVPTX64,PPC64,LINUX,LINUX-CPP
+// RUN: -check-prefixes NVPTX64,PPC64LE,LINUX,LINUX-CPP
 // RUN: %clang_cc1 -x c++ -E -dM -ffreestanding < /dev/null \
 // RUN: -fopenmp -fopenmp-is-device -triple nvptx64-none-none \
 // RUN: -aux-triple x86_64-unknown-linux-gnu \
 // RUN:   | FileCheck -match-full-lines %s \
 // RUN: -check-prefixes NVPTX64,X86_64,LINUX,LINUX-CPP
 
+// PPC64LE:#define _CALL_ELF 2
+
 // NONE-NOT:#define _GNU_SOURCE
 // LINUX-CPP:#define _GNU_SOURCE 1
 
@@ -56,7 +58,7 @@
 // LINUX:#define __linux__ 1
 
 // NONE-NOT:#define __powerpc64__
-// PPC64:#define __powerpc64__ 1
+// PPC64LE:#define __powerpc64__ 1
 
 // NONE-NOT:#define __x86_64__
 // X86_64:#define __x86_64__ 1




r337015 - [OpenMP] Initialize data sharing stack for SPMD case

2018-07-13 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Fri Jul 13 09:18:24 2018
New Revision: 337015

URL: http://llvm.org/viewvc/llvm-project?rev=337015=rev
Log:
[OpenMP] Initialize data sharing stack for SPMD case

Summary: In the SPMD case, we need to initialize the data sharing and 
globalization infrastructure. This covers the case when an SPMD region calls a 
function in a different compilation unit.

Reviewers: ABataev, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: Hahnfeld, jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D49188

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp
cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_parallel_proc_bind_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_parallel_reduction_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_codegen.cpp

cfe/trunk/test/OpenMP/nvptx_target_teams_distribute_parallel_for_simd_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=337015=337014=337015=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Fri Jul 13 09:18:24 2018
@@ -81,6 +81,8 @@ enum OpenMPRTLFunctionNVPTX {
   OMPRTL_NVPTX__kmpc_end_reduce_nowait,
   /// Call to void __kmpc_data_sharing_init_stack();
   OMPRTL_NVPTX__kmpc_data_sharing_init_stack,
+  /// Call to void __kmpc_data_sharing_init_stack_spmd();
+  OMPRTL_NVPTX__kmpc_data_sharing_init_stack_spmd,
   /// Call to void* __kmpc_data_sharing_push_stack(size_t size,
   /// int16_t UseSharedMemory);
   OMPRTL_NVPTX__kmpc_data_sharing_push_stack,
@@ -1025,6 +1027,12 @@ void CGOpenMPRuntimeNVPTX::emitSPMDEntry
  /*RequiresDataSharing=*/Bld.getInt16(1)};
   CGF.EmitRuntimeCall(
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_spmd_kernel_init), Args);
+
+  // For data sharing, we need to initialize the stack.
+  CGF.EmitRuntimeCall(
+  createNVPTXRuntimeFunction(
+  OMPRTL_NVPTX__kmpc_data_sharing_init_stack_spmd));
+
   CGF.EmitBranch(ExecuteBB);
 
   CGF.EmitBlock(ExecuteBB);
@@ -1107,11 +1115,6 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo
   // Wait for parallel work
   syncCTAThreads(CGF);
 
-  // For data sharing, we need to initialize the stack for workers.
-  CGF.EmitRuntimeCall(
-  createNVPTXRuntimeFunction(
-  OMPRTL_NVPTX__kmpc_data_sharing_init_stack));
-
   Address WorkFn =
   CGF.CreateDefaultAlignTempAlloca(CGF.Int8PtrTy, /*Name=*/"work_fn");
   Address ExecStatus =
@@ -1417,6 +1420,13 @@ CGOpenMPRuntimeNVPTX::createNVPTXRuntime
 RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_data_sharing_init_stack");
 break;
   }
+  case OMPRTL_NVPTX__kmpc_data_sharing_init_stack_spmd: {
+/// Build void __kmpc_data_sharing_init_stack_spmd();
+auto *FnTy =
+llvm::FunctionType::get(CGM.VoidTy, llvm::None, /*isVarArg*/ false);
+RTLFn = CGM.CreateRuntimeFunction(FnTy, "__kmpc_data_sharing_init_stack_spmd");
+break;
+  }
   case OMPRTL_NVPTX__kmpc_data_sharing_push_stack: {
 // Build void *__kmpc_data_sharing_push_stack(size_t size,
 // int16_t UseSharedMemory);

Modified: cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp?rev=337015=337014=337015=diff
==
--- cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp Fri Jul 13 09:18:24 2018
@@ -30,7 +30,7 @@ void test_ds(){
 /// = In the worker function = ///
 // CK1: {{.*}}define internal void @__omp_offloading{{.*}}test_ds{{.*}}_worker()
 // CK1: call void @llvm.nvvm.barrier0()
-// CK1: call void @__kmpc_data_sharing_init_stack
+// CK1-NOT: call void @__kmpc_data_sharing_init_stack
 
 /// = In the kernel function = ///
 

Modified: cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp?rev=337015=337014=337015=diff
==
--- cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_target_parallel_codegen.cpp Fri Jul 13 09:18:24 2018
@@ -60,6 +60,7 @@ int bar(int n){
   // CHECK: [[AA:%.+]] = load i16*, i16** [[AA_ADDR]], align
   // CHECK: [[THREAD_LIMIT:%.+]] = call i32 @llvm.nvvm.read.ptx.sreg.ntid.x()
   // CHECK: call void @__kmpc_spmd_kernel_init(i32 [[THREAD_LIMIT]],
+  // CHECK: call void @__kmpc_data_sharing_init_stack_spmd
   // CHECK: br label 

r328219 - [OpenMP][Clang] Add call to global data sharing stack initialization on the workers side

2018-03-22 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Mar 22 10:33:27 2018
New Revision: 328219

URL: http://llvm.org/viewvc/llvm-project?rev=328219=rev
Log:
[OpenMP][Clang] Add call to global data sharing stack initialization on the 
workers side

Summary: The workers also need to initialize the global stack. The call to the 
initialization function needs to happen after the kernel_init() function is 
called by the master. This ensures that the per-team data structures of the 
runtime have been initialized.

Reviewers: ABataev, grokos, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D44749

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=328219=328218=328219=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Thu Mar 22 10:33:27 2018
@@ -801,6 +801,11 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo
   // Wait for parallel work
   syncCTAThreads(CGF);
 
+  // For data sharing, we need to initialize the stack for workers.
+  CGF.EmitRuntimeCall(
+  createNVPTXRuntimeFunction(
+  OMPRTL_NVPTX__kmpc_data_sharing_init_stack));
+
   Address WorkFn =
   CGF.CreateDefaultAlignTempAlloca(CGF.Int8PtrTy, /*Name=*/"work_fn");
   Address ExecStatus =

Modified: cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp?rev=328219=328218=328219=diff
==
--- cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp (original)
+++ cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp Thu Mar 22 10:33:27 2018
@@ -27,6 +27,11 @@ void test_ds(){
   }
 }
 
+/// = In the worker function = ///
+// CK1: {{.*}}define internal void @__omp_offloading{{.*}}test_ds{{.*}}_worker()
+// CK1: call void @llvm.nvvm.barrier0()
+// CK1: call void @__kmpc_data_sharing_init_stack
+
 /// = In the kernel function = ///
 
 // CK1: {{.*}}define void @__omp_offloading{{.*}}test_ds{{.*}}()




r327513 - [OpenMP] Add OpenMP data sharing infrastructure using global memory

2018-03-14 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Wed Mar 14 07:17:45 2018
New Revision: 327513

URL: http://llvm.org/viewvc/llvm-project?rev=327513=rev
Log:
[OpenMP] Add OpenMP data sharing infrastructure using global memory

Summary:
This patch handles the Clang code generation phase for the OpenMP data sharing 
infrastructure.

TODO: add a more detailed description.

Reviewers: ABataev, carlo.bertolli, caomhin, hfinkel, Hahnfeld

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43660

Added:
cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp
Modified:
cfe/trunk/lib/CodeGen/CGDecl.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
cfe/trunk/lib/CodeGen/CGStmtOpenMP.cpp
cfe/trunk/lib/CodeGen/CodeGenFunction.cpp
cfe/trunk/test/OpenMP/nvptx_parallel_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGDecl.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGDecl.cpp?rev=327513=327512=327513=diff
==
--- cfe/trunk/lib/CodeGen/CGDecl.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGDecl.cpp Wed Mar 14 07:17:45 2018
@@ -1068,9 +1068,17 @@ CodeGenFunction::EmitAutoVarAlloca(const
 }
 
 // A normal fixed sized variable becomes an alloca in the entry block,
-// unless it's an NRVO variable.
-
-if (NRVO) {
+// unless:
+// - it's an NRVO variable.
+// - we are compiling OpenMP and it's an OpenMP local variable.
+
+Address OpenMPLocalAddr =
+getLangOpts().OpenMP
+? CGM.getOpenMPRuntime().getAddressOfLocalVariable(*this, )
+: Address::invalid();
+if (getLangOpts().OpenMP && OpenMPLocalAddr.isValid()) {
+  address = OpenMPLocalAddr;
+} else if (NRVO) {
   // The named return value optimization: allocate this variable in the
   // return slot, so that we can elide the copy when returning this
   // variable (C++0x [class.copy]p34).
@@ -1896,9 +1904,18 @@ void CodeGenFunction::EmitParmDecl(const
   }
 }
   } else {
-// Otherwise, create a temporary to hold the value.
-DeclPtr = CreateMemTemp(Ty, getContext().getDeclAlign(),
-D.getName() + ".addr");
+// Check if the parameter address is controlled by OpenMP runtime.
+Address OpenMPLocalAddr =
+getLangOpts().OpenMP
+? CGM.getOpenMPRuntime().getAddressOfLocalVariable(*this, )
+: Address::invalid();
+if (getLangOpts().OpenMP && OpenMPLocalAddr.isValid()) {
+  DeclPtr = OpenMPLocalAddr;
+} else {
+  // Otherwise, create a temporary to hold the value.
+  DeclPtr = CreateMemTemp(Ty, getContext().getDeclAlign(),
+  D.getName() + ".addr");
+}
 DoStore = true;
   }
 

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=327513&r1=327512&r2=327513&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Wed Mar 14 07:17:45 2018
@@ -8100,6 +8100,11 @@ Address CGOpenMPRuntime::getParameterAdd
   return CGF.GetAddrOfLocalVar(NativeParam);
 }
 
+Address CGOpenMPRuntime::getAddressOfLocalVariable(CodeGenFunction ,
+   const VarDecl *VD) {
+  return Address::invalid();
+}
+
 llvm::Value *CGOpenMPSIMDRuntime::emitParallelOutlinedFunction(
 const OMPExecutableDirective , const VarDecl *ThreadIDVar,
 OpenMPDirectiveKind InnermostKind, const RegionCodeGenTy ) {
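
The CGDecl.cpp hunks above ask the OpenMP runtime for an address first and fall back to the usual allocation when the returned address is invalid; the base-class hook added in CGOpenMPRuntime.cpp always returns an invalid address. A minimal standalone sketch of that pattern follows. `Address`, `Runtime`, `GpuRuntime`, `chooseAddress`, and the `globalized.` prefix are simplified stand-ins invented for illustration; they are not Clang's classes or naming.

```cpp
#include <cassert>
#include <string>

// Simplified stand-in for an address handle (hypothetical, not Clang's type).
struct Address {
  std::string Name; // empty means "invalid"
  static Address invalid() { return Address{}; }
  bool isValid() const { return !Name.empty(); }
};

// Base runtime: manages no locals itself, like the default hook in the patch.
struct Runtime {
  virtual ~Runtime() = default;
  virtual Address getAddressOfLocalVariable(const std::string &Var) {
    (void)Var;
    return Address::invalid();
  }
};

// Hypothetical device runtime that takes over a single variable.
struct GpuRuntime : Runtime {
  Address getAddressOfLocalVariable(const std::string &Var) override {
    return Var == "shared_x" ? Address{"globalized." + Var}
                             : Address::invalid();
  }
};

// Ask the runtime first (only when OpenMP is enabled); otherwise use the
// ordinary stack slot.
Address chooseAddress(Runtime &RT, bool OpenMPEnabled, const std::string &Var) {
  Address A = OpenMPEnabled ? RT.getAddressOfLocalVariable(Var)
                            : Address::invalid();
  if (A.isValid())
    return A;
  return Address{Var + ".addr"};
}
```

Because the default hook returns an invalid address, callers that never override it keep the pre-patch behaviour.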

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h?rev=327513&r1=327512&r2=327513&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.h Wed Mar 14 07:17:45 2018
@@ -676,7 +676,7 @@ public:
 
   /// \brief Cleans up references to the objects in finished function.
   ///
-  void functionFinished(CodeGenFunction );
+  virtual void functionFinished(CodeGenFunction );
 
   /// \brief Emits code for parallel or serial call of the \a OutlinedFn with
   /// variables captured in a record which address is stored in \a
@@ -1362,6 +1362,14 @@ public:
   emitOutlinedFunctionCall(CodeGenFunction , SourceLocation Loc,
llvm::Value *OutlinedFn,
ArrayRef Args = llvm::None) const;
+
+  /// Emits OpenMP-specific function prolog.
+  /// Required for device constructs.
+  virtual void emitFunctionProlog(CodeGenFunction , const Decl *D) {}
+
+  /// Gets the OpenMP-specific address of the local variable.
+  virtual Address 

r327460 - [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Tue Mar 13 16:19:52 2018
New Revision: 327460

URL: http://llvm.org/viewvc/llvm-project?rev=327460&view=rev
Log:
[OpenMP] Add flag for linking runtime bitcode library

Summary: This patch adds an additional flag to the OpenMP device offloading 
toolchain to link in the runtime library bitcode.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel

Reviewed By: ABataev, grokos

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43197

Added:
cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc
Modified:
cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/test/Driver/openmp-offload-gpu.c

Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=327460&r1=327459&r2=327460&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Tue Mar 13 16:19:52 2018
@@ -203,6 +203,9 @@ def err_drv_expecting_fopenmp_with_fopen
 def warn_drv_omp_offload_target_duplicate : Warning<
   "The OpenMP offloading target '%0' is similar to target '%1' already specified - will be ignored.">,
   InGroup;
+def warn_drv_omp_offload_target_missingbcruntime : Warning<
+  "No library '%0' found in the default clang lib directory or in LIBRARY_PATH. Expect degraded performance due to no inlining of runtime functions on target devices.">,
+  InGroup;
 def err_drv_bitcode_unsupported_on_toolchain : Error<
   "-fembed-bitcode is not supported on versions of iOS prior to 6.0">;
 

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=327460&r1=327459&r2=327460&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Mar 13 16:19:52 2018
@@ -581,6 +581,44 @@ void CudaToolChain::addClangTargetOption
 CC1Args.push_back("-target-feature");
 CC1Args.push_back("+ptx42");
   }
+
+  if (DeviceOffloadingKind == Action::OFK_OpenMP) {
+SmallVector LibraryPaths;
+// Add path to lib and/or lib64 folders.
+SmallString<256> DefaultLibPath =
+  llvm::sys::path::parent_path(getDriver().Dir);
+llvm::sys::path::append(DefaultLibPath,
+Twine("lib") + CLANG_LIBDIR_SUFFIX);
+LibraryPaths.emplace_back(DefaultLibPath.c_str());
+
+// Add user defined library paths from LIBRARY_PATH.
+llvm::Optional LibPath =
+llvm::sys::Process::GetEnv("LIBRARY_PATH");
+if (LibPath) {
+  SmallVector Frags;
+  const char EnvPathSeparatorStr[] = {llvm::sys::EnvPathSeparator, '\0'};
+  llvm::SplitString(*LibPath, Frags, EnvPathSeparatorStr);
+  for (StringRef Path : Frags)
+LibraryPaths.emplace_back(Path.trim());
+}
+
+std::string LibOmpTargetName =
+  "libomptarget-nvptx-" + GpuArch.str() + ".bc";
+bool FoundBCLibrary = false;
+for (StringRef LibraryPath : LibraryPaths) {
+  SmallString<128> LibOmpTargetFile(LibraryPath);
+  llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName);
+  if (llvm::sys::fs::exists(LibOmpTargetFile)) {
+CC1Args.push_back("-mlink-cuda-bitcode");
+CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile));
+FoundBCLibrary = true;
+break;
+  }
+}
+if (!FoundBCLibrary)
+  getDriver().Diag(diag::warn_drv_omp_offload_target_missingbcruntime)
+  << LibOmpTargetName;
+  }
 }
 
 void CudaToolChain::AddCudaIncludeArgs(const ArgList ,
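
The hunk above builds a list of candidate directories (the default clang lib directory plus the entries of LIBRARY_PATH), looks for `libomptarget-nvptx-<arch>.bc` in each, and warns if none is found. The search can be sketched outside the driver with the standard library alone. This is an illustrative sketch, not the driver code: `splitPathList` and `findRuntimeBitcode` are hypothetical names, the `:` separator is an assumption (the patch uses the platform's separator), and the whitespace trimming done by the patch is omitted.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <sstream>
#include <string>
#include <vector>

namespace fs = std::filesystem;

// Split a LIBRARY_PATH-style string on the separator, dropping empty pieces.
std::vector<std::string> splitPathList(const std::string &List, char Sep = ':') {
  std::vector<std::string> Out;
  std::stringstream SS(List);
  std::string Item;
  while (std::getline(SS, Item, Sep))
    if (!Item.empty())
      Out.push_back(Item);
  return Out;
}

// Return the first "<dir>/libomptarget-nvptx-<arch>.bc" that exists, if any.
std::optional<fs::path> findRuntimeBitcode(const std::vector<std::string> &Dirs,
                                           const std::string &GpuArch) {
  const std::string Name = "libomptarget-nvptx-" + GpuArch + ".bc";
  for (const std::string &Dir : Dirs) {
    fs::path Candidate = fs::path(Dir) / Name;
    if (fs::exists(Candidate))
      return Candidate; // first match wins, as in the loop above
  }
  return std::nullopt; // the patch emits a driver warning in this case
}
```

Returning an empty optional leaves the "not found" policy to the caller, which in the patch is a warning rather than an error.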

Added: cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc?rev=327460&view=auto
==
(empty)

Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=327460&r1=327459&r2=327460&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload-gpu.c (original)
+++ cfe/trunk/test/Driver/openmp-offload-gpu.c Tue Mar 13 16:19:52 2018
@@ -142,3 +142,26 @@
 // RUN:   | FileCheck -check-prefix=CHK-NOLIBDEVICE %s
 
 // CHK-NOLIBDEVICE-NOT: error:{{.*}}sm_60
+
+/// ###
+
+/// Check that the runtime bitcode library is part of the compile line. Create a bogus
+/// bitcode library and add it to the LIBRARY_PATH.
+// RUN:   env LIBRARY_PATH=%S/Inputs/libomptarget %clang -### -fopenmp=libomp 

r327447 - Revert revision 327438.

2018-03-13 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Tue Mar 13 13:50:12 2018
New Revision: 327447

URL: http://llvm.org/viewvc/llvm-project?rev=327447&view=rev
Log:
Revert revision 327438.

Removed:
cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc
Modified:
cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/test/Driver/openmp-offload-gpu.c

Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=327447&r1=327446&r2=327447&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Tue Mar 13 13:50:12 2018
@@ -203,9 +203,6 @@ def err_drv_expecting_fopenmp_with_fopen
 def warn_drv_omp_offload_target_duplicate : Warning<
   "The OpenMP offloading target '%0' is similar to target '%1' already specified - will be ignored.">,
   InGroup;
-def warn_drv_omp_offload_target_missingbcruntime : Warning<
-  "No library '%0' found in the default clang lib directory or in LIBRARY_PATH. Expect degraded performance due to no inlining of runtime functions on target devices.">,
-  InGroup;
 def err_drv_bitcode_unsupported_on_toolchain : Error<
   "-fembed-bitcode is not supported on versions of iOS prior to 6.0">;
 

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=327447&r1=327446&r2=327447&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Mar 13 13:50:12 2018
@@ -581,44 +581,6 @@ void CudaToolChain::addClangTargetOption
 CC1Args.push_back("-target-feature");
 CC1Args.push_back("+ptx42");
   }
-
-  if (DeviceOffloadingKind == Action::OFK_OpenMP) {
-SmallVector LibraryPaths;
-// Add path to lib and/or lib64 folders.
-SmallString<256> DefaultLibPath =
-  llvm::sys::path::parent_path(getDriver().Dir);
-llvm::sys::path::append(DefaultLibPath,
-Twine("lib") + CLANG_LIBDIR_SUFFIX);
-LibraryPaths.emplace_back(DefaultLibPath.c_str());
-
-// Add user defined library paths from LIBRARY_PATH.
-llvm::Optional LibPath =
-llvm::sys::Process::GetEnv("LIBRARY_PATH");
-if (LibPath) {
-  SmallVector Frags;
-  const char EnvPathSeparatorStr[] = {llvm::sys::EnvPathSeparator, '\0'};
-  llvm::SplitString(*LibPath, Frags, EnvPathSeparatorStr);
-  for (StringRef Path : Frags)
-LibraryPaths.emplace_back(Path.trim());
-}
-
-std::string LibOmpTargetName =
-  "libomptarget-nvptx-" + GpuArch.str() + ".bc";
-bool FoundBCLibrary = false;
-for (StringRef LibraryPath : LibraryPaths) {
-  SmallString<128> LibOmpTargetFile(LibraryPath);
-  llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName);
-  if (llvm::sys::fs::exists(LibOmpTargetFile)) {
-CC1Args.push_back("-mlink-cuda-bitcode");
-CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile));
-FoundBCLibrary = true;
-break;
-  }
-}
-if (!FoundBCLibrary)
-  getDriver().Diag(diag::warn_drv_omp_offload_target_missingbcruntime)
-  << LibOmpTargetName;
-  }
 }
 
 void CudaToolChain::AddCudaIncludeArgs(const ArgList ,

Removed: cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc?rev=327446&view=auto
==
(empty)

Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=327447&r1=327446&r2=327447&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload-gpu.c (original)
+++ cfe/trunk/test/Driver/openmp-offload-gpu.c Tue Mar 13 13:50:12 2018
@@ -142,23 +142,3 @@
 // RUN:   | FileCheck -check-prefix=CHK-NOLIBDEVICE %s
 
 // CHK-NOLIBDEVICE-NOT: error:{{.*}}sm_60
-
-/// ###
-
-/// Check that the runtime bitcode library is part of the compile line. Create a bogus
-/// bitcode library and add it to the LIBRARY_PATH.
-// RUN:   env LIBRARY_PATH=%S/Inputs/libomptarget %clang -### -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda \
-// RUN:   -Xopenmp-target -march=sm_20 -fopenmp-relocatable-target -save-temps \
-// RUN:   -no-canonical-prefixes %s 2>&1 | FileCheck -check-prefix=CHK-BCLIB %s
-
-// CHK-BCLIB: clang{{.*}}-triple{{.*}}nvptx64-nvidia-cuda{{.*}}-mlink-cuda-bitcode{{.*}}libomptarget-nvptx-sm_20.bc
-
-/// 

r327438 - [OpenMP] Add flag for linking runtime bitcode library

2018-03-13 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Tue Mar 13 12:39:19 2018
New Revision: 327438

URL: http://llvm.org/viewvc/llvm-project?rev=327438&view=rev
Log:
[OpenMP] Add flag for linking runtime bitcode library

Summary: This patch adds an additional flag to the OpenMP device offloading 
toolchain to link in the runtime library bitcode.

Reviewers: Hahnfeld, ABataev, carlo.bertolli, caomhin, grokos, hfinkel

Reviewed By: ABataev, grokos

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43197

Added:
cfe/trunk/test/Driver/Inputs/libomptarget/
cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc
Modified:
cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/test/Driver/openmp-offload-gpu.c

Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=327438&r1=327437&r2=327438&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Tue Mar 13 12:39:19 2018
@@ -203,6 +203,9 @@ def err_drv_expecting_fopenmp_with_fopen
 def warn_drv_omp_offload_target_duplicate : Warning<
   "The OpenMP offloading target '%0' is similar to target '%1' already specified - will be ignored.">,
   InGroup;
+def warn_drv_omp_offload_target_missingbcruntime : Warning<
+  "No library '%0' found in the default clang lib directory or in LIBRARY_PATH. Expect degraded performance due to no inlining of runtime functions on target devices.">,
+  InGroup;
 def err_drv_bitcode_unsupported_on_toolchain : Error<
   "-fembed-bitcode is not supported on versions of iOS prior to 6.0">;
 

Modified: cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Cuda.cpp?rev=327438&r1=327437&r2=327438&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Cuda.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Cuda.cpp Tue Mar 13 12:39:19 2018
@@ -21,6 +21,7 @@
 #include "llvm/Option/ArgList.h"
 #include "llvm/Support/FileSystem.h"
 #include "llvm/Support/Path.h"
+#include "llvm/Support/Process.h"
 #include "llvm/Support/Program.h"
 #include 
 
@@ -580,6 +581,44 @@ void CudaToolChain::addClangTargetOption
 CC1Args.push_back("-target-feature");
 CC1Args.push_back("+ptx42");
   }
+
+  if (DeviceOffloadingKind == Action::OFK_OpenMP) {
+SmallVector LibraryPaths;
+// Add path to lib and/or lib64 folders.
+SmallString<256> DefaultLibPath =
+  llvm::sys::path::parent_path(getDriver().Dir);
+llvm::sys::path::append(DefaultLibPath,
+Twine("lib") + CLANG_LIBDIR_SUFFIX);
+LibraryPaths.emplace_back(DefaultLibPath.c_str());
+
+// Add user defined library paths from LIBRARY_PATH.
+llvm::Optional LibPath =
+llvm::sys::Process::GetEnv("LIBRARY_PATH");
+if (LibPath) {
+  SmallVector Frags;
+  const char EnvPathSeparatorStr[] = {llvm::sys::EnvPathSeparator, '\0'};
+  llvm::SplitString(*LibPath, Frags, EnvPathSeparatorStr);
+  for (StringRef Path : Frags)
+LibraryPaths.emplace_back(Path.trim());
+}
+
+std::string LibOmpTargetName =
+  "libomptarget-nvptx-" + GpuArch.str() + ".bc";
+bool FoundBCLibrary = false;
+for (StringRef LibraryPath : LibraryPaths) {
+  SmallString<128> LibOmpTargetFile(LibraryPath);
+  llvm::sys::path::append(LibOmpTargetFile, LibOmpTargetName);
+  if (llvm::sys::fs::exists(LibOmpTargetFile)) {
+CC1Args.push_back("-mlink-cuda-bitcode");
+CC1Args.push_back(DriverArgs.MakeArgString(LibOmpTargetFile));
+FoundBCLibrary = true;
+break;
+  }
+}
+if (!FoundBCLibrary)
+  getDriver().Diag(diag::warn_drv_omp_offload_target_missingbcruntime)
+  << LibOmpTargetName;
+  }
 }
 
 void CudaToolChain::AddCudaIncludeArgs(const ArgList ,

Added: cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/Inputs/libomptarget/libomptarget-nvptx-sm_20.bc?rev=327438&view=auto
==
(empty)

Modified: cfe/trunk/test/Driver/openmp-offload-gpu.c
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload-gpu.c?rev=327438&r1=327437&r2=327438&view=diff
==
--- cfe/trunk/test/Driver/openmp-offload-gpu.c (original)
+++ cfe/trunk/test/Driver/openmp-offload-gpu.c Tue Mar 13 12:39:19 2018
@@ -142,3 +142,23 @@
 // RUN:   | FileCheck -check-prefix=CHK-NOLIBDEVICE %s
 
 // CHK-NOLIBDEVICE-NOT: error:{{.*}}sm_60
+
+/// 

r326948 - [OpenMP] Remove implicit data sharing code gen that aims to use device shared memory

2018-03-07 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Wed Mar  7 13:59:50 2018
New Revision: 326948

URL: http://llvm.org/viewvc/llvm-project?rev=326948&view=rev
Log:
[OpenMP] Remove implicit data sharing code gen that aims to use device shared 
memory

Summary: Remove this scheme for now since it will be covered by another more 
generic scheme using global memory. This code will be worked into an 
optimization for the generic data sharing scheme. Removing this completely and 
then adding it via future patches will make all future data sharing patches 
cleaner.

Reviewers: ABataev, carlo.bertolli, caomhin

Reviewed By: ABataev

Subscribers: jholewinski, guansong, cfe-commits

Differential Revision: https://reviews.llvm.org/D43625

Removed:
cfe/trunk/test/OpenMP/nvptx_data_sharing.cpp
Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.h
cfe/trunk/test/OpenMP/nvptx_parallel_codegen.cpp
cfe/trunk/test/OpenMP/nvptx_target_teams_codegen.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp?rev=326948&r1=326947&r2=326948&view=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntimeNVPTX.cpp Wed Mar  7 13:59:50 2018
@@ -33,11 +33,11 @@ enum OpenMPRTLFunctionNVPTX {
   /// \brief Call to void __kmpc_spmd_kernel_deinit();
   OMPRTL_NVPTX__kmpc_spmd_kernel_deinit,
   /// \brief Call to void __kmpc_kernel_prepare_parallel(void
-  /// *outlined_function, void ***args, kmp_int32 nArgs, int16_t
+  /// *outlined_function, int16_t
   /// IsOMPRuntimeInitialized);
   OMPRTL_NVPTX__kmpc_kernel_prepare_parallel,
-  /// \brief Call to bool __kmpc_kernel_parallel(void **outlined_function, void
-  /// ***args, int16_t IsOMPRuntimeInitialized);
+  /// \brief Call to bool __kmpc_kernel_parallel(void **outlined_function,
+  /// int16_t IsOMPRuntimeInitialized);
   OMPRTL_NVPTX__kmpc_kernel_parallel,
   /// \brief Call to void __kmpc_kernel_end_parallel();
   OMPRTL_NVPTX__kmpc_kernel_end_parallel,
@@ -288,7 +288,6 @@ void CGOpenMPRuntimeNVPTX::emitGenericKe
   EntryFunctionState EST;
   WorkerFunctionState WST(CGM, D.getLocStart());
   Work.clear();
-  WrapperFunctionsMap.clear();
 
   // Emit target region as a standalone region.
   class NVPTXPrePostActionTy : public PrePostActionTy {
@@ -508,11 +507,8 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo
   CGF.InitTempAlloca(ExecStatus, Bld.getInt8(/*C=*/0));
   CGF.InitTempAlloca(WorkFn, llvm::Constant::getNullValue(CGF.Int8PtrTy));
 
-  // Set up shared arguments
-  Address SharedArgs =
-  CGF.CreateDefaultAlignTempAlloca(CGF.Int8PtrPtrTy, "shared_args");
   // TODO: Optimize runtime initialization and pass in correct value.
-  llvm::Value *Args[] = {WorkFn.getPointer(), SharedArgs.getPointer(),
+  llvm::Value *Args[] = {WorkFn.getPointer(),
  /*RequiresOMPRuntime=*/Bld.getInt16(1)};
   llvm::Value *Ret = CGF.EmitRuntimeCall(
   createNVPTXRuntimeFunction(OMPRTL_NVPTX__kmpc_kernel_parallel), Args);
@@ -532,9 +528,6 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo
   // Signal start of parallel region.
   CGF.EmitBlock(ExecuteBB);
 
-  // Current context
-  ASTContext  = CGF.getContext();
-
   // Process work items: outlined parallel functions.
   for (auto *W : Work) {
 // Try to match this outlined function.
@@ -550,19 +543,14 @@ void CGOpenMPRuntimeNVPTX::emitWorkerLoo
 // Execute this outlined function.
 CGF.EmitBlock(ExecuteFNBB);
 
-// Insert call to work function via shared wrapper. The shared
-// wrapper takes exactly three arguments:
-//   - the parallelism level;
-//   - the master thread ID;
-//   - the list of references to shared arguments.
-//
-// TODO: Assert that the function is a wrapper function.s
-Address Capture = CGF.EmitLoadOfPointer(SharedArgs,
-   Ctx.getPointerType(
-  Ctx.getPointerType(Ctx.VoidPtrTy)).castAs());
-emitOutlinedFunctionCall(CGF, WST.Loc, W,
- {Bld.getInt16(/*ParallelLevel=*/0),
-  getMasterThreadID(CGF), Capture.getPointer()});
+// Insert call to work function.
+// FIXME: Pass arguments to outlined function from master thread.
+auto *Fn = cast(W);
+Address ZeroAddr =
+CGF.CreateDefaultAlignTempAlloca(CGF.Int32Ty, /*Name=*/".zero.addr");
+CGF.InitTempAlloca(ZeroAddr, CGF.Builder.getInt32(/*C=*/0));
+llvm::Value *FnArgs[] = {ZeroAddr.getPointer(), ZeroAddr.getPointer()};
+emitCall(CGF, WST.Loc, Fn, FnArgs);
 
 // Go to end of parallel region.
 CGF.EmitBranch(TerminateBB);
@@ -630,10 +618,8 @@ CGOpenMPRuntimeNVPTX::createNVPTXRuntime
   }
   case OMPRTL_NVPTX__kmpc_kernel_prepare_parallel: {
 /// Build void __kmpc_kernel_prepare_parallel(
-/// void *outlined_function, void ***args, kmp_int32 nArgs, int16_t
-

r310282 - Non-functional change. Fix previous patch D34784.

2017-08-07 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Mon Aug  7 11:43:37 2017
New Revision: 310282

URL: http://llvm.org/viewvc/llvm-project?rev=310282&view=rev
Log:
Non-functional change. Fix previous patch D34784.

Modified:
cfe/trunk/lib/Driver/Compilation.cpp

Modified: cfe/trunk/lib/Driver/Compilation.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Compilation.cpp?rev=310282&r1=310281&r2=310282&view=diff
==
--- cfe/trunk/lib/Driver/Compilation.cpp (original)
+++ cfe/trunk/lib/Driver/Compilation.cpp Mon Aug  7 11:43:37 2017
@@ -60,11 +60,15 @@ Compilation::getArgsForToolChain(const T
   DerivedArgList * = TCArgs[{TC, BoundArch, DeviceOffloadKind}];
   if (!Entry) {
 // Translate OpenMP toolchain arguments provided via the -Xopenmp-target flags.
-Entry = TC->TranslateOpenMPTargetArgs(*TranslatedArgs, DeviceOffloadKind);
-if (!Entry)
-  Entry = TranslatedArgs;
+DerivedArgList *OpenMPArgs = TC->TranslateOpenMPTargetArgs(*TranslatedArgs,
+DeviceOffloadKind);
+if (!OpenMPArgs) {
+  Entry = TC->TranslateArgs(*TranslatedArgs, BoundArch, DeviceOffloadKind);
+} else {
+  Entry = TC->TranslateArgs(*OpenMPArgs, BoundArch, DeviceOffloadKind);
+  delete OpenMPArgs;
+}
 
-Entry = TC->TranslateArgs(*Entry, BoundArch, DeviceOffloadKind);
 if (!Entry)
   Entry = TranslatedArgs;
   }
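
The diff above chains two translation steps: the generic `TranslateArgs` step now runs on the OpenMP-specific list when one was produced, that intermediate list is deleted afterwards, and the original list is used if the generic step returns nothing. A minimal sketch of the same control flow follows, using `std::unique_ptr` so the intermediate list is released automatically. `ArgList`, `Step`, and `translateForToolChain` are hypothetical names, and the result is returned by value instead of being cached as in the driver.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <vector>

using ArgList = std::vector<std::string>;
// A translation step returns a new list, or nullptr when it changes nothing.
using Step = std::function<std::unique_ptr<ArgList>(const ArgList &)>;

// Run the OpenMP-specific step, feed its result (or the original list) to the
// generic step, and fall back to the original list if that yields nothing.
// The intermediate list is freed when OpenMPArgs goes out of scope.
ArgList translateForToolChain(const ArgList &Translated, const Step &OpenMPStep,
                              const Step &GenericStep) {
  std::unique_ptr<ArgList> OpenMPArgs = OpenMPStep(Translated);
  const ArgList &Input = OpenMPArgs ? *OpenMPArgs : Translated;
  std::unique_ptr<ArgList> Entry = GenericStep(Input);
  return Entry ? *Entry : Translated;
}
```

As in the diff, when the generic step returns nothing the original list is used, even if the OpenMP-specific step did produce one.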




r310263 - [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading

2017-08-07 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Mon Aug  7 08:39:11 2017
New Revision: 310263

URL: http://llvm.org/viewvc/llvm-project?rev=310263&view=rev
Log:
[OpenMP] Add flag for specifying the target device architecture for OpenMP 
device offloading

Summary:
OpenMP has the ability to offload target regions to devices which may have 
different architectures.

A new -fopenmp-target-arch flag is introduced to specify the device 
architecture.

In this patch I use the new flag to specify the compute capability of the 
underlying NVIDIA architecture for the OpenMP offloading CUDA tool chain.

Only a host-offloading test is provided since full device offloading capability 
will only be available when [[ https://reviews.llvm.org/D29654 | D29654 ]] 
lands.

Reviewers: hfinkel, Hahnfeld, carlo.bertolli, caomhin, ABataev

Reviewed By: hfinkel

Subscribers: guansong, cfe-commits

Tags: #openmp

Differential Revision: https://reviews.llvm.org/D34784

Modified:
cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
cfe/trunk/include/clang/Driver/Options.td
cfe/trunk/include/clang/Driver/ToolChain.h
cfe/trunk/lib/Driver/Compilation.cpp
cfe/trunk/lib/Driver/ToolChain.cpp
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td?rev=310263&r1=310262&r2=310263&view=diff
==
--- cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td (original)
+++ cfe/trunk/include/clang/Basic/DiagnosticDriverKinds.td Mon Aug  7 08:39:11 2017
@@ -69,6 +69,10 @@ def err_drv_invalid_Xarch_argument_with_
   "invalid Xarch argument: '%0', options requiring arguments are unsupported">;
 def err_drv_invalid_Xarch_argument_isdriver : Error<
   "invalid Xarch argument: '%0', cannot change driver behavior inside Xarch argument">;
+def err_drv_Xopenmp_target_missing_triple : Error<
+  "cannot deduce implicit triple value for -Xopenmp-target, specify triple using -Xopenmp-target=">;
+def err_drv_invalid_Xopenmp_target_with_args : Error<
+  "invalid -Xopenmp-target argument: '%0', options requiring arguments are unsupported">;
 def err_drv_argument_only_allowed_with : Error<
   "invalid argument '%0' only allowed with '%1'">;
 def err_drv_argument_not_allowed_with : Error<

Modified: cfe/trunk/include/clang/Driver/Options.td
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/Options.td?rev=310263&r1=310262&r2=310263&view=diff
==
--- cfe/trunk/include/clang/Driver/Options.td (original)
+++ cfe/trunk/include/clang/Driver/Options.td Mon Aug  7 08:39:11 2017
@@ -459,6 +459,10 @@ def Xcuda_fatbinary : Separate<["-"], "X
   HelpText<"Pass  to fatbinary invocation">, MetaVarName<"">;
 def Xcuda_ptxas : Separate<["-"], "Xcuda-ptxas">,
   HelpText<"Pass  to the ptxas assembler">, MetaVarName<"">;
+def Xopenmp_target : Separate<["-"], "Xopenmp-target">,
+  HelpText<"Pass  to the target offloading toolchain.">, MetaVarName<"">;
+def Xopenmp_target_EQ : JoinedAndSeparate<["-"], "Xopenmp-target=">,
+  HelpText<"Pass  to the specified target offloading toolchain. The triple that identifies the toolchain must be provided after the equals sign.">, MetaVarName<"">;
 def z : Separate<["-"], "z">, Flags<[LinkerInput, RenderAsInput]>,
   HelpText<"Pass -z  to the linker">, MetaVarName<"">,
   Group;

Modified: cfe/trunk/include/clang/Driver/ToolChain.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/ToolChain.h?rev=310263&r1=310262&r2=310263&view=diff
==
--- cfe/trunk/include/clang/Driver/ToolChain.h (original)
+++ cfe/trunk/include/clang/Driver/ToolChain.h Mon Aug  7 08:39:11 2017
@@ -217,6 +217,17 @@ public:
 return nullptr;
   }
 
+  /// TranslateOpenMPTargetArgs - Create a new derived argument list for
+  /// that contains the OpenMP target specific flags passed via
+  /// -Xopenmp-target -opt=val OR -Xopenmp-target= -opt=val
+  /// Translation occurs only when the \p DeviceOffloadKind is specified.
+  ///
+  /// \param DeviceOffloadKind - The device offload kind used for the
+  /// translation.
+  virtual llvm::opt::DerivedArgList *
+  TranslateOpenMPTargetArgs(const llvm::opt::DerivedArgList ,
+  Action::OffloadKind DeviceOffloadKind) const;
+
   /// Choose a tool to use to handle the action \p JA.
   ///
   /// This can be overridden when a particular ToolChain needs to use

Modified: cfe/trunk/lib/Driver/Compilation.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Compilation.cpp?rev=310263&r1=310262&r2=310263&view=diff
==
--- cfe/trunk/lib/Driver/Compilation.cpp (original)
+++ cfe/trunk/lib/Driver/Compilation.cpp Mon Aug  7 08:39:11 2017
@@ 

r307272 - [OpenMP] Extend CLANG target options with device offloading kind.

2017-07-06 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Jul  6 09:22:21 2017
New Revision: 307272

URL: http://llvm.org/viewvc/llvm-project?rev=307272&view=rev
Log:
[OpenMP] Extend CLANG target options with device offloading kind.

Summary: Pass the type of the device offloading when building the tool chain 
for a particular target architecture. This is required when supporting multiple 
tool chains that target a single device type. In our particular use case, the 
OpenMP and CUDA tool chains will use the same ```addClangTargetOptions ``` 
method. This enables the reuse of common options and ensures control over 
options only supported by a particular tool chain.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, jlebar, hfinkel, 
tstellar, Hahnfeld

Reviewed By: hfinkel

Subscribers: jgravelle-google, aheejin, rengolin, jfb, dschuff, sbc100, 
cfe-commits

Differential Revision: https://reviews.llvm.org/D29647

Modified:
cfe/trunk/include/clang/Driver/ToolChain.h
cfe/trunk/lib/Driver/ToolChain.cpp
cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp
cfe/trunk/lib/Driver/ToolChains/BareMetal.h
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Driver/ToolChains/Cuda.cpp
cfe/trunk/lib/Driver/ToolChains/Cuda.h
cfe/trunk/lib/Driver/ToolChains/Darwin.cpp
cfe/trunk/lib/Driver/ToolChains/Darwin.h
cfe/trunk/lib/Driver/ToolChains/Fuchsia.cpp
cfe/trunk/lib/Driver/ToolChains/Fuchsia.h
cfe/trunk/lib/Driver/ToolChains/Gnu.cpp
cfe/trunk/lib/Driver/ToolChains/Gnu.h
cfe/trunk/lib/Driver/ToolChains/Hexagon.cpp
cfe/trunk/lib/Driver/ToolChains/Hexagon.h
cfe/trunk/lib/Driver/ToolChains/WebAssembly.cpp
cfe/trunk/lib/Driver/ToolChains/WebAssembly.h
cfe/trunk/lib/Driver/ToolChains/XCore.cpp
cfe/trunk/lib/Driver/ToolChains/XCore.h

Modified: cfe/trunk/include/clang/Driver/ToolChain.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/include/clang/Driver/ToolChain.h?rev=307272&r1=307271&r2=307272&view=diff
==
--- cfe/trunk/include/clang/Driver/ToolChain.h (original)
+++ cfe/trunk/include/clang/Driver/ToolChain.h Thu Jul  6 09:22:21 2017
@@ -411,7 +411,8 @@ public:
 
   /// \brief Add options that need to be passed to cc1 for this target.
   virtual void addClangTargetOptions(const llvm::opt::ArgList ,
- llvm::opt::ArgStringList ) const;
+ llvm::opt::ArgStringList ,
+ Action::OffloadKind DeviceOffloadKind) const;
 
   /// \brief Add warning options that need to be passed to cc1 for this target.
   virtual void addClangWarningOptions(llvm::opt::ArgStringList ) const;

Modified: cfe/trunk/lib/Driver/ToolChain.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChain.cpp?rev=307272&r1=307271&r2=307272&view=diff
==
--- cfe/trunk/lib/Driver/ToolChain.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChain.cpp Thu Jul  6 09:22:21 2017
@@ -544,9 +544,9 @@ void ToolChain::AddClangSystemIncludeArg
   // Each toolchain should provide the appropriate include flags.
 }
 
-void ToolChain::addClangTargetOptions(const ArgList ,
-  ArgStringList ) const {
-}
+void ToolChain::addClangTargetOptions(
+const ArgList , ArgStringList ,
+Action::OffloadKind DeviceOffloadKind) const {}
 
 void ToolChain::addClangWarningOptions(ArgStringList ) const {}
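
The summary of this commit says that passing the offload kind lets the OpenMP and CUDA tool chains share one `addClangTargetOptions` method while keeping control over options that only one of them supports. A minimal sketch of that idea follows. The enum, the function name, and every flag string are placeholders invented for illustration; none of them is taken from the driver.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's offload-kind enumeration.
enum class OffloadKind { None, Cuda, OpenMP };

// Shared options are added unconditionally; the rest are gated on the kind.
// All flag names here are placeholders, not real cc1 options.
void addTargetOptions(std::vector<std::string> &CC1Args, OffloadKind Kind) {
  CC1Args.push_back("-common-device-flag");
  if (Kind == OffloadKind::Cuda)
    CC1Args.push_back("-cuda-only-flag");
  else if (Kind == OffloadKind::OpenMP)
    CC1Args.push_back("-openmp-only-flag");
}
```

Tool chains that do not care about the kind can ignore the extra parameter, which is what the BareMetal override in this patch does by leaving it unnamed.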
 

Modified: cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp?rev=307272&r1=307271&r2=307272&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/BareMetal.cpp Thu Jul  6 09:22:21 2017
@@ -98,7 +98,8 @@ void BareMetal::AddClangSystemIncludeArg
 }
 
 void BareMetal::addClangTargetOptions(const ArgList ,
-  ArgStringList ) const {
+  ArgStringList ,
+  Action::OffloadKind) const {
   CC1Args.push_back("-nostdsysteminc");
 }
 

Modified: cfe/trunk/lib/Driver/ToolChains/BareMetal.h
URL: http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/BareMetal.h?rev=307272&r1=307271&r2=307272&view=diff
==
--- cfe/trunk/lib/Driver/ToolChains/BareMetal.h (original)
+++ cfe/trunk/lib/Driver/ToolChains/BareMetal.h Thu Jul  6 09:22:21 2017
@@ -54,7 +54,8 @@ public:
   void AddClangSystemIncludeArgs(const llvm::opt::ArgList ,
  llvm::opt::ArgStringList ) const override;
   void addClangTargetOptions(const llvm::opt::ArgList ,
- llvm::opt::ArgStringList ) const override;
+ llvm::opt::ArgStringList ,
+   

r307271 - [OpenMP] Customize CUDA-based tool chain selection

2017-07-06 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Jul  6 09:08:15 2017
New Revision: 307271

URL: http://llvm.org/viewvc/llvm-project?rev=307271=rev
Log:
[OpenMP] Customize CUDA-based tool chain selection

Summary: This patch provides a generic way of selecting CUDA-based tool chains
as host-device pairs.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, ABataev, Hahnfeld, jlebar, 
hfinkel, tstellar

Reviewed By: Hahnfeld

Subscribers: rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29658
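
For readers skimming the log, the host-device pairing described in the summary can be sketched outside of clang. This is only an illustration under stated assumptions: `ToyToolChain`, `ToyDriver`, and their member functions are hypothetical names and are not part of the clang driver. The one thing taken from the patch is the idea of caching the paired tool chain under a "device-triple/host-triple" key.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for a tool chain; this is not clang's ToolChain class.
struct ToyToolChain {
  std::string Triple;
  const ToyToolChain *Host; // non-null only for paired device tool chains
  ToyToolChain(std::string T, const ToyToolChain *H)
      : Triple(std::move(T)), Host(H) {}
};

// Hypothetical driver that caches tool chains by a string key.
class ToyDriver {
  std::map<std::string, std::unique_ptr<ToyToolChain>> ToolChains;

public:
  // Unpaired lookup: the cache key is the triple itself.
  const ToyToolChain &getToolChain(const std::string &Triple) {
    auto &TC = ToolChains[Triple];
    if (!TC)
      TC = std::make_unique<ToyToolChain>(Triple, nullptr);
    return *TC;
  }

  // Offload lookup: NVPTX targets are paired with the host tool chain and
  // cached under "<device>/<host>"; every other target is looked up unpaired.
  const ToyToolChain &selectOffloadToolChain(const std::string &DeviceTriple,
                                             const ToyToolChain &Host) {
    bool IsNVPTX = DeviceTriple.rfind("nvptx", 0) == 0; // starts-with check
    if (!IsNVPTX)
      return getToolChain(DeviceTriple);
    auto &TC = ToolChains[DeviceTriple + "/" + Host.Triple];
    if (!TC)
      TC = std::make_unique<ToyToolChain>(DeviceTriple, &Host);
    return *TC;
  }
};
```

Asking twice for the same device and host pair returns the same cached object, while a non-NVPTX target gets an ordinary, unpaired entry.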

Modified:
cfe/trunk/lib/Driver/Driver.cpp

Modified: cfe/trunk/lib/Driver/Driver.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/Driver.cpp?rev=307271=307270=307271=diff
==
--- cfe/trunk/lib/Driver/Driver.cpp (original)
+++ cfe/trunk/lib/Driver/Driver.cpp Thu Jul  6 09:08:15 2017
@@ -572,8 +572,22 @@ void Driver::CreateOffloadingDeviceToolC
   if (TT.getArch() == llvm::Triple::UnknownArch)
 Diag(clang::diag::err_drv_invalid_omp_target) << Val;
   else {
-const ToolChain &TC = getToolChain(C.getInputArgs(), TT);
-C.addOffloadDeviceToolChain(&TC, Action::OFK_OpenMP);
+const ToolChain *TC;
+// CUDA toolchains have to be selected differently. They pair host
+// and device in their implementation.
+if (TT.isNVPTX()) {
+  const ToolChain *HostTC =
+  C.getSingleOffloadToolChain();
+  assert(HostTC && "Host toolchain should be always defined.");
+  auto &CudaTC =
+  ToolChains[TT.str() + "/" + HostTC->getTriple().str()];
+  if (!CudaTC)
+CudaTC = llvm::make_unique(
+*this, TT, *HostTC, C.getInputArgs());
+  TC = CudaTC.get();
+} else
+  TC = &getToolChain(C.getInputArgs(), TT);
+C.addOffloadDeviceToolChain(TC, Action::OFK_OpenMP);
   }
 }
   } else


___
cfe-commits mailing list
cfe-commits@lists.llvm.org
http://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


r306724 - [OpenMP] Fix test for revision D29645. NFC

2017-06-29 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Jun 29 11:49:16 2017
New Revision: 306724

URL: http://llvm.org/viewvc/llvm-project?rev=306724=rev
Log:
[OpenMP] Fix test for revision D29645. NFC
 

Modified:
cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/test/Driver/openmp-offload.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload.c?rev=306724=306723=306724=diff
==
--- cfe/trunk/test/Driver/openmp-offload.c (original)
+++ cfe/trunk/test/Driver/openmp-offload.c Thu Jun 29 11:49:16 2017
@@ -592,10 +592,8 @@
 
 /// ###
 
-/// Check -fopenmp-is-device is also passed when generating the *.i and *.s intermediate files.
-// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -save-temps -no-canonical-prefixes %s 2>&1 \
+/// Check -fopenmp-is-device is passed when compiling for the device.
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu %s 2>&1 \
 // RUN:   | FileCheck -check-prefix=CHK-FOPENMP-IS-DEVICE %s
 
-// CHK-FOPENMP-IS-DEVICE: clang{{.*}}.i" {{.*}}" "-fopenmp-is-device"
-// CHK-FOPENMP-IS-DEVICE-NEXT: clang{{.*}}.bc" {{.*}}.i" "-fopenmp-is-device" "-fopenmp-host-ir-file-path"
-// CHK-FOPENMP-IS-DEVICE-NEXT: clang{{.*}}.s" {{.*}}.bc" "-fopenmp-is-device"
+// CHK-FOPENMP-IS-DEVICE: clang{{.*}} "-aux-triple" "powerpc64le-unknown-linux-gnu" {{.*}}.c" "-fopenmp-is-device" "-fopenmp-host-ir-file-path"




r306691 - [OpenMP] Pass -fopenmp-is-device to preprocessing and machine specific code generation stages

2017-06-29 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Jun 29 08:59:19 2017
New Revision: 306691

URL: http://llvm.org/viewvc/llvm-project?rev=306691=rev
Log:
[OpenMP] Pass -fopenmp-is-device to preprocessing and machine specific code 
generation stages

Summary: The preprocessing, code generation, and optimization stages of the
compiler are now also passed the "-fopenmp-is-device" flag. The flag triggers
machine-specific preprocessing and code generation when offloading to an
NVIDIA GPU via OpenMP directives.

Reviewers: arpith-jacob, caomhin, carlo.bertolli, Hahnfeld, hfinkel, tstellar

Reviewed By: Hahnfeld

Subscribers: Hahnfeld, rengolin

Differential Revision: https://reviews.llvm.org/D29645
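
The change to the condition can be sketched as a standalone function. This is a rough illustration only: `deviceOpenMPArgs` is a hypothetical helper, and plain strings stand in for the driver's input list. It mirrors the control flow of the patched code: the device flag is always passed for a device compile, and the host IR path is added only when a second input is present.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of clang. Inputs models the job's input
// files; by assumption the last of two inputs is the host IR file.
std::vector<std::string>
deviceOpenMPArgs(bool IsOpenMPDevice, const std::vector<std::string> &Inputs) {
  std::vector<std::string> CmdArgs;
  if (IsOpenMPDevice) {
    // Always tell the frontend it is generating code for a device.
    CmdArgs.push_back("-fopenmp-is-device");
    // The host IR file is only available when there are exactly two inputs.
    if (Inputs.size() == 2) {
      CmdArgs.push_back("-fopenmp-host-ir-file-path");
      CmdArgs.push_back(Inputs.back());
    }
  }
  return CmdArgs;
}
```

Before the patch, a device job with a single input (for example a preprocessing step) would have received neither flag, because both were guarded by the size check.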

Modified:
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=306691=306690=306691=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Thu Jun 29 08:59:19 2017
@@ -4429,10 +4429,12 @@ void Clang::ConstructJob(Compilation ,
   // device declarations can be identified. Also, -fopenmp-is-device is passed
   // along to tell the frontend that it is generating code for a device, so that
   // only the relevant declarations are emitted.
-  if (IsOpenMPDevice && Inputs.size() == 2) {
+  if (IsOpenMPDevice) {
 CmdArgs.push_back("-fopenmp-is-device");
-CmdArgs.push_back("-fopenmp-host-ir-file-path");
-CmdArgs.push_back(Args.MakeArgString(Inputs.back().getFilename()));
+if (Inputs.size() == 2) {
+  CmdArgs.push_back("-fopenmp-host-ir-file-path");
+  CmdArgs.push_back(Args.MakeArgString(Inputs.back().getFilename()));
+}
   }
 
   // For all the host OpenMP offloading compile jobs we need to pass the targets

Modified: cfe/trunk/test/Driver/openmp-offload.c
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/test/Driver/openmp-offload.c?rev=306691=306690=306691=diff
==
--- cfe/trunk/test/Driver/openmp-offload.c (original)
+++ cfe/trunk/test/Driver/openmp-offload.c Thu Jun 29 08:59:19 2017
@@ -589,3 +589,13 @@
 // CHK-UBUJOBS-ST-SAME: [[HOSTOBJ:[^\\/]+\.o]]" "{{.*}}[[HOSTASM]]"
 // CHK-UBUJOBS-ST: clang-offload-bundler{{.*}}" "-type=o" "-targets=openmp-powerpc64le-ibm-linux-gnu,openmp-x86_64-pc-linux-gnu,host-powerpc64le--linux" "-outputs=
 // CHK-UBUJOBS-ST-SAME: [[RES:[^\\/]+\.o]]" "-inputs={{.*}}[[T1OBJ]],{{.*}}[[T2OBJ]],{{.*}}[[HOSTOBJ]]"
+
+/// ###
+
+/// Check -fopenmp-is-device is also passed when generating the *.i and *.s intermediate files.
+// RUN:   %clang -### -fopenmp=libomp -fopenmp-targets=powerpc64le-ibm-linux-gnu -save-temps -no-canonical-prefixes %s 2>&1 \
+// RUN:   | FileCheck -check-prefix=CHK-FOPENMP-IS-DEVICE %s
+
+// CHK-FOPENMP-IS-DEVICE: clang{{.*}}.i" {{.*}}" "-fopenmp-is-device"
+// CHK-FOPENMP-IS-DEVICE-NEXT: clang{{.*}}.bc" {{.*}}.i" "-fopenmp-is-device" "-fopenmp-host-ir-file-path"
+// CHK-FOPENMP-IS-DEVICE-NEXT: clang{{.*}}.s" {{.*}}.bc" "-fopenmp-is-device"




r306689 - [OpenMP] Add support for auxiliary triple specification

2017-06-29 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Thu Jun 29 08:49:03 2017
New Revision: 306689

URL: http://llvm.org/viewvc/llvm-project?rev=306689=rev
Log:
[OpenMP] Add support for auxiliary triple specification

Summary: Device offloading requires an additional flag containing the triple of
the //other// architecture the code is being compiled for, if such an
architecture exists. When compiling for the host, the auxiliary triple flag
contains the triple describing the device, and vice versa.

Reviewers: arpith-jacob, sfantao, caomhin, carlo.bertolli, ABataev, Hahnfeld, 
jlebar, hfinkel, tstellar

Reviewed By: Hahnfeld

Subscribers: rengolin, cfe-commits

Differential Revision: https://reviews.llvm.org/D29339
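
The "other side" rule from the summary can be written down as a small function. This is a minimal sketch following the wording of the summary rather than the exact driver code: `auxTripleArgs` and its parameters are hypothetical, and an empty string is assumed to mean that no other architecture exists.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of clang. Returns the "-aux-triple" argument
// pair naming the other architecture, or nothing if there is no other side.
std::vector<std::string> auxTripleArgs(bool CompilingForDevice,
                                       const std::string &HostTriple,
                                       const std::string &DeviceTriple) {
  // A device compile is told about the host; a host compile about the device.
  const std::string &Other = CompilingForDevice ? HostTriple : DeviceTriple;
  if (Other.empty())
    return {};
  return {"-aux-triple", Other};
}
```

For a device compile with a powerpc64le host, the result names the host triple, which matches the `"-aux-triple" "powerpc64le-unknown-linux-gnu"` pair that the updated openmp-offload.c check line in r306724 expects.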

Modified:
cfe/trunk/lib/Driver/ToolChains/Clang.cpp
cfe/trunk/lib/Frontend/CompilerInstance.cpp
cfe/trunk/lib/Frontend/CompilerInvocation.cpp
cfe/trunk/lib/Frontend/InitPreprocessor.cpp
cfe/trunk/test/Driver/openmp-offload.c

Modified: cfe/trunk/lib/Driver/ToolChains/Clang.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Driver/ToolChains/Clang.cpp?rev=306689=306688=306689=diff
==
--- cfe/trunk/lib/Driver/ToolChains/Clang.cpp (original)
+++ cfe/trunk/lib/Driver/ToolChains/Clang.cpp Thu Jun 29 08:49:03 2017
@@ -129,6 +129,13 @@ forAllAssociatedToolChains(Compilation &
   else if (JA.isDeviceOffloading(Action::OFK_Cuda))
 Work(*C.getSingleOffloadToolChain());
 
+  if (JA.isHostOffloading(Action::OFK_OpenMP)) {
+auto TCs = C.getOffloadToolChains();
+for (auto II = TCs.first, IE = TCs.second; II != IE; ++II)
+  Work(*II->second);
+  } else if (JA.isDeviceOffloading(Action::OFK_OpenMP))
+Work(*C.getSingleOffloadToolChain());
+
   //
   // TODO: Add support for other offloading programming models here.
   //
@@ -1991,6 +1998,16 @@ void Clang::ConstructJob(Compilation ,
 CmdArgs.push_back("-aux-triple");
 CmdArgs.push_back(Args.MakeArgString(NormalizedTriple));
   }
+
+  if (IsOpenMPDevice) {
+// We have to pass the triple of the host if compiling for an OpenMP device.
+std::string NormalizedTriple =
+C.getSingleOffloadToolChain()
+->getTriple()
+.normalize();
+CmdArgs.push_back("-aux-triple");
+CmdArgs.push_back(Args.MakeArgString(NormalizedTriple));
+  }
 
   if (Triple.isOSWindows() && (Triple.getArch() == llvm::Triple::arm ||
Triple.getArch() == llvm::Triple::thumb)) {

Modified: cfe/trunk/lib/Frontend/CompilerInstance.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInstance.cpp?rev=306689=306688=306689=diff
==
--- cfe/trunk/lib/Frontend/CompilerInstance.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInstance.cpp Thu Jun 29 08:49:03 2017
@@ -936,8 +936,9 @@ bool CompilerInstance::ExecuteAction(Fro
   if (!hasTarget())
 return false;
 
-  // Create TargetInfo for the other side of CUDA compilation.
-  if (getLangOpts().CUDA && !getFrontendOpts().AuxTriple.empty()) {
+  // Create TargetInfo for the other side of CUDA and OpenMP compilation.
+  if ((getLangOpts().CUDA || getLangOpts().OpenMPIsDevice) &&
+  !getFrontendOpts().AuxTriple.empty()) {
 auto TO = std::make_shared();
 TO->Triple = getFrontendOpts().AuxTriple;
 TO->HostTriple = getTarget().getTriple().str();

Modified: cfe/trunk/lib/Frontend/CompilerInvocation.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/CompilerInvocation.cpp?rev=306689=306688=306689=diff
==
--- cfe/trunk/lib/Frontend/CompilerInvocation.cpp (original)
+++ cfe/trunk/lib/Frontend/CompilerInvocation.cpp Thu Jun 29 08:49:03 2017
@@ -2644,6 +2644,10 @@ bool CompilerInvocation::CreateFromArgs(
   Res.getTargetOpts().HostTriple = Res.getFrontendOpts().AuxTriple;
   }
 
+  // Set the triple of the host for OpenMP device compile.
+  if (LangOpts.OpenMPIsDevice)
+Res.getTargetOpts().HostTriple = Res.getFrontendOpts().AuxTriple;
+
   // FIXME: Override value name discarding when asan or msan is used because the
   // backend passes depend on the name of the alloca in order to print out
   // names.

Modified: cfe/trunk/lib/Frontend/InitPreprocessor.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/Frontend/InitPreprocessor.cpp?rev=306689=306688=306689=diff
==
--- cfe/trunk/lib/Frontend/InitPreprocessor.cpp (original)
+++ cfe/trunk/lib/Frontend/InitPreprocessor.cpp Thu Jun 29 08:49:03 2017
@@ -1043,7 +1043,7 @@ void clang::InitializePreprocessor(
   if (InitOpts.UsePredefines) {
 // FIXME: This will create multiple definitions for most of the predefined
 // macros. This is not the right way to handle this.
-if (LangOpts.CUDA && PP.getAuxTargetInfo())
+   

r305294 - Add comma to comment.

2017-06-13 Thread Gheorghe-Teodor Bercea via cfe-commits
Author: gbercea
Date: Tue Jun 13 10:35:27 2017
New Revision: 305294

URL: http://llvm.org/viewvc/llvm-project?rev=305294=rev
Log:
Add comma to comment.

Modified:
cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp

Modified: cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp
URL: 
http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp?rev=305294=305293=305294=diff
==
--- cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp (original)
+++ cfe/trunk/lib/CodeGen/CGOpenMPRuntime.cpp Tue Jun 13 10:35:27 2017
@@ -6327,7 +6327,7 @@ bool CGOpenMPRuntime::emitTargetGlobalVa
 }
   }
 
-  // If we are in target mode we do not emit any global (declare target is not
+  // If we are in target mode, we do not emit any global (declare target is not
   // implemented yet). Therefore we signal that GD was processed in this case.
   return true;
 }

