[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #87278)

2024-04-10 Thread David Pagan via cfe-commits

https://github.com/ddpagan closed 
https://github.com/llvm/llvm-project/pull/87278
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-03-14 Thread David Pagan via cfe-commits


@@ -1537,6 +1537,12 @@ class CodeGenModule : public CodeGenTypeCache {
   void printPostfixForExternalizedDecl(llvm::raw_ostream ,
const Decl *D) const;
 
+  /// Under debug mode, print status of target teams loop transformation,
+  /// which should be either '#distribute' or '#parallel for'
+  void emitTargetTeamsLoopCodegenStatus(std::string StatusMsg,
+const OMPExecutableDirective ,
+bool IsDevice);
+

ddpagan wrote:

> Don't like the idea of adding new member function specifically for debugging

Agreed. Changed it to a static local function.

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-03-14 Thread David Pagan via cfe-commits


@@ -11311,6 +11311,10 @@ class Sema final {
 OpenMPDirectiveKind ,
 OpenMPDirectiveKind );
 
+  /// [target] teams loop is equivalent to parallel for if associated loop
+  /// nest meets certain critera.
+  bool teamsLoopCanBeParallelFor(Stmt *Astmt);
+

ddpagan wrote:

> Do you really need to expose it in Sema or you can make it just a static 
> local function in SemaOpenMP.cpp?

Actually, no. Thanks for noticing that. I'll make the change.

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-03-11 Thread David Pagan via cfe-commits

ddpagan wrote:

After some additional discussion with Alexey offline, he concluded that the 
current changes are okay, specifically for this reason:

_"Then I realized that actually it does not require AST nodes building. In this 
case, this helper class should be moved to CodeGenStmt and hidden in the 
anonymous namespace. But you also need to use it in CodeGenModule. In this case 
better to use a flag in statement, as you have it right now. I.e. having this 
analysis in Sema looks good_"


https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-02-15 Thread David Pagan via cfe-commits

ddpagan wrote:

Ping.

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-01-10 Thread David Pagan via cfe-commits


@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public 
OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

ddpagan wrote:

Ok. Thanks, Alexey.

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-01-10 Thread David Pagan via cfe-commits


@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public 
OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

ddpagan wrote:

Hi Alexey - thanks for the comment.  A clarification, as I'm not sure exactly 
what you're referring to.

So currently, when OMPTargetTeamsGenericLoopDirective is created in 
SemaOpenMP.cpp, teamsLoopCanBeParallelFor(AStmt) is called as an argument to 
the create, the result of which is stored in CanBeParallelFor. Later, when the 
directive is seen in CodeGen/CGStmtOpenMP.cpp, the boolean CanBeParallelFor 
(via the function canBeParallelFor()) is checked to determine how to emit the 
directive (parallel or distribute).

Are you saying that instead of checking whether the loop can be parallel while 
we're in Sema, and saving that value when we create the target teams loop 
directive, that we should determine this through a call to the Sema function 
teamsLoopCanBeParallelFor() while in CodeGen?



https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-01-01 Thread David Pagan via cfe-commits

ddpagan wrote:

Ping.

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[libc] [mlir] [clang] [compiler-rt] [clang-tools-extra] [llvm] [OpenMP] atomic compare fail : Codegen support (PR #75709)

2023-12-20 Thread David Pagan via cfe-commits

https://github.com/ddpagan commented:

Aside from Alexey's comments, LGTM.

https://github.com/llvm/llvm-project/pull/75709
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-11-16 Thread David Pagan via cfe-commits


@@ -7485,6 +7485,99 @@ void 
CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream ,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parllel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor {

ddpagan wrote:

Alexey - thanks for the review comments. Good suggestion here. I'll look at 
moving the analysis.

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][OpenMP] Fix target data if/logical expression assert fail (PR #70268)

2023-10-26 Thread David Pagan via cfe-commits

https://github.com/ddpagan closed 
https://github.com/llvm/llvm-project/pull/70268
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [clang][OpenMP] Fix target data if/logical expression assert fail (PR #70268)

2023-10-25 Thread David Pagan via cfe-commits

https://github.com/ddpagan created 
https://github.com/llvm/llvm-project/pull/70268

Fixed assertion failure

  Basic Block in function 'main' does not have terminator!
  label %land.end

caused by premature setting of CodeGenIP upon entry to emitTargetDataCalls, 
where subsequent evaluation of logical expression created new basic blocks, 
leaving CodeGenIP pointing to the wrong basic block. CodeGenIP is now set near 
the end of the function, just prior to generating a comparison of the logical 
expression result (from the if clause) which uses CodeGenIP to insert new IR.

>From d027a1c75322c7158560630d382f8ab01726a728 Mon Sep 17 00:00:00 2001
From: Dave Pagan 
Date: Tue, 24 Oct 2023 14:12:56 -0500
Subject: [PATCH] [clang][OpenMP] Fix target data if/logical expression assert
 fail

Fixed assertion failure

  Basic Block in function 'main' does not have terminator!
  label %land.end

caused by premature setting of CodeGenIP upon entry to
emitTargetDataCalls, where subsequent evaluation of logical
expression created new basic blocks, leaving CodeGenIP pointing to
the wrong basic block. CodeGenIP is now set near the end of the
function, just prior to generating a comparison of the logical
expression result (from the if clause) which uses CodeGenIP to
insert new IR.
---
 clang/lib/CodeGen/CGOpenMPRuntime.cpp |  10 +-
 .../OpenMP/target_data_if_logical_codegen.cpp | 120 ++
 2 files changed, 125 insertions(+), 5 deletions(-)
 create mode 100644 clang/test/OpenMP/target_data_if_logical_codegen.cpp

diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp 
b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index aae1a0ea250eea2..75fad160b716207 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -10230,11 +10230,6 @@ void CGOpenMPRuntime::emitTargetDataCalls(
   PrePostActionTy NoPrivAction;
 
   using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy;
-  InsertPointTy AllocaIP(CGF.AllocaInsertPt->getParent(),
- CGF.AllocaInsertPt->getIterator());
-  InsertPointTy CodeGenIP(CGF.Builder.GetInsertBlock(),
-  CGF.Builder.GetInsertPoint());
-  llvm::OpenMPIRBuilder::LocationDescription OmpLoc(CodeGenIP);
 
   llvm::Value *IfCondVal = nullptr;
   if (IfCond)
@@ -10314,6 +10309,11 @@ void CGOpenMPRuntime::emitTargetDataCalls(
   // Source location for the ident struct
   llvm::Value *RTLoc = emitUpdateLocation(CGF, D.getBeginLoc());
 
+  InsertPointTy AllocaIP(CGF.AllocaInsertPt->getParent(),
+ CGF.AllocaInsertPt->getIterator());
+  InsertPointTy CodeGenIP(CGF.Builder.GetInsertBlock(),
+  CGF.Builder.GetInsertPoint());
+  llvm::OpenMPIRBuilder::LocationDescription OmpLoc(CodeGenIP);
   CGF.Builder.restoreIP(OMPBuilder.createTargetData(
   OmpLoc, AllocaIP, CodeGenIP, DeviceID, IfCondVal, Info, GenMapInfoCB,
   /*MapperFunc=*/nullptr, BodyCB, DeviceAddrCB, CustomMapperCB, RTLoc));
diff --git a/clang/test/OpenMP/target_data_if_logical_codegen.cpp 
b/clang/test/OpenMP/target_data_if_logical_codegen.cpp
new file mode 100644
index 000..85d98b0c3bcd4d8
--- /dev/null
+++ b/clang/test/OpenMP/target_data_if_logical_codegen.cpp
@@ -0,0 +1,120 @@
+// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py 
UTC_ARGS: --include-generated-funcs --replace-value-regex 
"__omp_offloading_[0-9a-z]+_[0-9a-z]+" "reduction_size[.].+[.]" 
"pl_cond[.].+[.|,]" --prefix-filecheck-ir-name _ --version 3
+// REQUIRES: amdgpu-registered-target
+
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -o - \
+// RUN: | FileCheck %s
+
+// Check same results after serialization round-trip
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -emit-pch -o %t %s
+// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown 
-fopenmp-targets=amdgcn-amd-amdhsa -include-pch %t -emit-llvm %s -o - \
+// RUN: | FileCheck %s
+
+// expected-no-diagnostics
+#ifndef HEADER
+#define HEADER
+
+extern bool foo(bool);
+
+int if_logical() {
+  bool a = foo(true);
+  bool b = foo(true);
+  int pp = 42;
+  int *p = 
+  #pragma omp target data if(a && b) map(to: p[0])
+  {
+p[0]++;
+  }
+  if (p[0])
+return 1;
+  return 0;
+}
+
+int main() {
+  return if_logical();
+}
+
+#endif
+// CHECK-LABEL: define dso_local noundef i32 @_Z10if_logicalv(
+// CHECK-SAME: ) #[[ATTR0:[0-9]+]] {
+// CHECK-NEXT:  entry:
+// CHECK-NEXT:[[RETVAL:%.*]] = alloca i32, align 4
+// CHECK-NEXT:[[A:%.*]] = alloca i8, align 1
+// CHECK-NEXT:[[B:%.*]] = alloca i8, align 1
+// CHECK-NEXT:[[PP:%.*]] = alloca i32, align 4
+// CHECK-NEXT:[[P:%.*]] = alloca ptr, align 8
+// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [1 x ptr], align 8
+// CHECK-NEXT: