[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #87278)
https://github.com/ddpagan closed https://github.com/llvm/llvm-project/pull/87278 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
@@ -1537,6 +1537,12 @@ class CodeGenModule : public CodeGenTypeCache { void printPostfixForExternalizedDecl(llvm::raw_ostream , const Decl *D) const; + /// Under debug mode, print status of target teams loop transformation, + /// which should be either '#distribute' or '#parallel for' + void emitTargetTeamsLoopCodegenStatus(std::string StatusMsg, +const OMPExecutableDirective , +bool IsDevice); + ddpagan wrote: > Don't like the idea of adding new member function specifically for debugging Agreed. Changed it to a static local function. https://github.com/llvm/llvm-project/pull/72417 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
@@ -11311,6 +11311,10 @@ class Sema final { OpenMPDirectiveKind , OpenMPDirectiveKind ); + /// [target] teams loop is equivalent to parallel for if associated loop + /// nest meets certain critera. + bool teamsLoopCanBeParallelFor(Stmt *Astmt); + ddpagan wrote: > Do you really need to expose it in Sema or you can make it just a static > local function in SemaOpenMP.cpp? Actually, no. Thanks for noticing that. I'll make the change. https://github.com/llvm/llvm-project/pull/72417 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
ddpagan wrote: After some additional discussion with Alexey offline, he concluded that the current changes are okay, specifically for this reason: _"Then I realized that actually it does not require AST nodes building. In this case, this helper class should be moved to CodeGenStmt and hidden in the anonymous namespace. But you also need to use it in CodeGenModule. In this case better to use a flag in statement, as you have it right now. I.e. having this analysis in Sema looks good_" https://github.com/llvm/llvm-project/pull/72417 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
ddpagan wrote: Ping. https://github.com/llvm/llvm-project/pull/72417 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective { class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective { friend class ASTStmtReader; friend class OMPExecutableDirective; + /// true if loop directive's associated loop can be a parallel for. + bool CanBeParallelFor = false; ddpagan wrote: Ok. Thanks, Alexey. https://github.com/llvm/llvm-project/pull/72417 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective { class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective { friend class ASTStmtReader; friend class OMPExecutableDirective; + /// true if loop directive's associated loop can be a parallel for. + bool CanBeParallelFor = false; ddpagan wrote: Hi Alexey - thanks for the comment. A clarification, as I'm not sure exactly what you're referring to. So currently, when OMPTargetTeamsGenericLoopDirective is created in SemaOpenMP.cpp, teamsLoopCanBeParallelFor(AStmt) is called as an argument to the create, the result of which is stored in CanBeParallelFor. Later, when the directive is seen in CodeGen/CGStmtOpenMP.cpp, the boolean CanBeParallelFor (via the function canBeParallelFor()) is checked to determine how to emit the directive (parallel or distribute). Are you saying that instead of checking whether the loop can be parallel while we're in Sema, and saving that value when we create the target teams loop directive, that we should determine this through a call to the Sema function teamsLoopCanBeParallelFor() while in CodeGen? https://github.com/llvm/llvm-project/pull/72417 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
ddpagan wrote: Ping. https://github.com/llvm/llvm-project/pull/72417 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[libc] [mlir] [clang] [compiler-rt] [clang-tools-extra] [llvm] [OpenMP] atomic compare fail : Codegen support (PR #75709)
https://github.com/ddpagan commented: Aside from Alexey's comments, LGTM. https://github.com/llvm/llvm-project/pull/75709 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream , } } +namespace { +/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function +/// call in the associated loop-nest cannot be a 'parllel for'. +class TeamsLoopChecker final : public ConstStmtVisitor { ddpagan wrote: Alexey - thanks for the review comments. Good suggestion here. I'll look at moving the analysis. https://github.com/llvm/llvm-project/pull/72417 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][OpenMP] Fix target data if/logical expression assert fail (PR #70268)
https://github.com/ddpagan closed https://github.com/llvm/llvm-project/pull/70268 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [clang][OpenMP] Fix target data if/logical expression assert fail (PR #70268)
https://github.com/ddpagan created https://github.com/llvm/llvm-project/pull/70268 Fixed assertion failure Basic Block in function 'main' does not have terminator! label %land.end caused by premature setting of CodeGenIP upon entry to emitTargetDataCalls, where subsequent evaluation of logical expression created new basic blocks, leaving CodeGenIP pointing to the wrong basic block. CodeGenIP is now set near the end of the function, just prior to generating a comparison of the logical expression result (from the if clause) which uses CodeGenIP to insert new IR. >From d027a1c75322c7158560630d382f8ab01726a728 Mon Sep 17 00:00:00 2001 From: Dave Pagan Date: Tue, 24 Oct 2023 14:12:56 -0500 Subject: [PATCH] [clang][OpenMP] Fix target data if/logical expression assert fail Fixed assertion failure Basic Block in function 'main' does not have terminator! label %land.end caused by premature setting of CodeGenIP upon entry to emitTargetDataCalls, where subsequent evaluation of logical expression created new basic blocks, leaving CodeGenIP pointing to the wrong basic block. CodeGenIP is now set near the end of the function, just prior to generating a comparison of the logical expression result (from the if clause) which uses CodeGenIP to insert new IR. --- clang/lib/CodeGen/CGOpenMPRuntime.cpp | 10 +- .../OpenMP/target_data_if_logical_codegen.cpp | 120 ++ 2 files changed, 125 insertions(+), 5 deletions(-) create mode 100644 clang/test/OpenMP/target_data_if_logical_codegen.cpp diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp index aae1a0ea250eea2..75fad160b716207 100644 --- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp +++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp @@ -10230,11 +10230,6 @@ void CGOpenMPRuntime::emitTargetDataCalls( PrePostActionTy NoPrivAction; using InsertPointTy = llvm::OpenMPIRBuilder::InsertPointTy; - InsertPointTy AllocaIP(CGF.AllocaInsertPt->getParent(), - CGF.AllocaInsertPt->getIterator()); - InsertPointTy CodeGenIP(CGF.Builder.GetInsertBlock(), - CGF.Builder.GetInsertPoint()); - llvm::OpenMPIRBuilder::LocationDescription OmpLoc(CodeGenIP); llvm::Value *IfCondVal = nullptr; if (IfCond) @@ -10314,6 +10309,11 @@ void CGOpenMPRuntime::emitTargetDataCalls( // Source location for the ident struct llvm::Value *RTLoc = emitUpdateLocation(CGF, D.getBeginLoc()); + InsertPointTy AllocaIP(CGF.AllocaInsertPt->getParent(), + CGF.AllocaInsertPt->getIterator()); + InsertPointTy CodeGenIP(CGF.Builder.GetInsertBlock(), + CGF.Builder.GetInsertPoint()); + llvm::OpenMPIRBuilder::LocationDescription OmpLoc(CodeGenIP); CGF.Builder.restoreIP(OMPBuilder.createTargetData( OmpLoc, AllocaIP, CodeGenIP, DeviceID, IfCondVal, Info, GenMapInfoCB, /*MapperFunc=*/nullptr, BodyCB, DeviceAddrCB, CustomMapperCB, RTLoc)); diff --git a/clang/test/OpenMP/target_data_if_logical_codegen.cpp b/clang/test/OpenMP/target_data_if_logical_codegen.cpp new file mode 100644 index 000..85d98b0c3bcd4d8 --- /dev/null +++ b/clang/test/OpenMP/target_data_if_logical_codegen.cpp @@ -0,0 +1,120 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --include-generated-funcs --replace-value-regex "__omp_offloading_[0-9a-z]+_[0-9a-z]+" "reduction_size[.].+[.]" "pl_cond[.].+[.|,]" --prefix-filecheck-ir-name _ --version 3 +// REQUIRES: amdgpu-registered-target + +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown -fopenmp-targets=amdgcn-amd-amdhsa -emit-llvm %s -o - \ +// RUN: | FileCheck %s + +// Check same results after serialization round-trip +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown -fopenmp-targets=amdgcn-amd-amdhsa -emit-pch -o %t %s +// RUN: %clang_cc1 -fopenmp -x c++ -std=c++11 -triple x86_64-unknown-unknown -fopenmp-targets=amdgcn-amd-amdhsa -include-pch %t -emit-llvm %s -o - \ +// RUN: | FileCheck %s + +// expected-no-diagnostics +#ifndef HEADER +#define HEADER + +extern bool foo(bool); + +int if_logical() { + bool a = foo(true); + bool b = foo(true); + int pp = 42; + int *p = + #pragma omp target data if(a && b) map(to: p[0]) + { +p[0]++; + } + if (p[0]) +return 1; + return 0; +} + +int main() { + return if_logical(); +} + +#endif +// CHECK-LABEL: define dso_local noundef i32 @_Z10if_logicalv( +// CHECK-SAME: ) #[[ATTR0:[0-9]+]] { +// CHECK-NEXT: entry: +// CHECK-NEXT:[[RETVAL:%.*]] = alloca i32, align 4 +// CHECK-NEXT:[[A:%.*]] = alloca i8, align 1 +// CHECK-NEXT:[[B:%.*]] = alloca i8, align 1 +// CHECK-NEXT:[[PP:%.*]] = alloca i32, align 4 +// CHECK-NEXT:[[P:%.*]] = alloca ptr, align 8 +// CHECK-NEXT:[[DOTOFFLOAD_BASEPTRS:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT:[[DOTOFFLOAD_PTRS:%.*]] = alloca [1 x ptr], align 8 +// CHECK-NEXT: