[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)
@@ -1537,6 +1537,12 @@ class CodeGenModule : public CodeGenTypeCache {
   void printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
                                        const Decl *D) const;
+  /// Under debug mode, print status of target teams loop transformation,
+  /// which should be either '#distribute' or '#parallel for'
+  void emitTargetTeamsLoopCodegenStatus(std::string StatusMsg,
+                                        const OMPExecutableDirective &D,
+                                        bool IsDevice);
+

ddpagan wrote:
> Don't like the idea of adding new member function specifically for debugging

Agreed. Changed it to a static local function.

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -11311,6 +11311,10 @@ class Sema final {
   OpenMPDirectiveKind , OpenMPDirectiveKind );
+  /// [target] teams loop is equivalent to parallel for if associated loop
+  /// nest meets certain criteria.
+  bool teamsLoopCanBeParallelFor(Stmt *AStmt);
+

ddpagan wrote:
> Do you really need to expose it in Sema or you can make it just a static
> local function in SemaOpenMP.cpp?

Actually, no. Thanks for noticing that. I'll make the change.
ddpagan wrote: After some additional discussion with Alexey offline, he concluded that the current changes are okay, specifically for this reason: _"Then I realized that actually it does not require AST nodes building. In this case, this helper class should be moved to CodeGenStmt and hidden in the anonymous namespace. But you also need to use it in CodeGenModule. In this case better to use a flag in statement, as you have it right now. I.e. having this analysis in Sema looks good_"
@@ -1537,6 +1537,12 @@ class CodeGenModule : public CodeGenTypeCache { [...]
+  void emitTargetTeamsLoopCodegenStatus(std::string StatusMsg,
+                                        const OMPExecutableDirective &D,
+                                        bool IsDevice);
+

alexey-bataev wrote: Don't like the idea of adding new member function specifically for debugging
@@ -11311,6 +11311,10 @@ class Sema final { [...]
+  bool teamsLoopCanBeParallelFor(Stmt *AStmt);
+

alexey-bataev wrote: Do you really need to expose it in Sema, or can you make it just a static local function in SemaOpenMP.cpp?
alexey-bataev wrote: As I already said, you don't need changes in either Sema or the Stmt class. All functional changes are in codegen; no other changes are required.
ddpagan wrote: Ping.
@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

doru1004 wrote: I don't think it is possible to have the analysis in Sema and not use a flag here. The two options we have are:
1. Do the analysis in Sema and have the flag and then read the flag in CG.
2. Have the analysis in CG and then there's no reason to pass anything around and CG can call the function when needed.

There is a 3rd hybrid way to do this where this function is moved back into CG:

```
bool Sema::teamsLoopCanBeParallelFor(Stmt *AStmt) {
  TeamsLoopChecker Checker(*this);
  Checker.Visit(AStmt);
  return Checker.teamsLoopCanBeParallelFor();
}
```

But then I don't know how you can call the TeamsLoopChecker, which lives in Sema.
@@ -6106,6 +6106,8 @@ [...] + bool CanBeParallelFor = false; ddpagan wrote: Ok. Thanks, Alexey.
@@ -6106,6 +6106,8 @@ [...] + bool CanBeParallelFor = false; alexey-bataev wrote: I'm saying that you can move this function to codegen and call it there
@@ -6106,6 +6106,8 @@ [...] + bool CanBeParallelFor = false;

ddpagan wrote: Hi Alexey - thanks for the comment. A clarification, as I'm not sure exactly what you're referring to. Currently, when OMPTargetTeamsGenericLoopDirective is created in SemaOpenMP.cpp, teamsLoopCanBeParallelFor(AStmt) is called as an argument to the Create, and the result is stored in CanBeParallelFor. Later, when the directive is seen in CodeGen/CGStmtOpenMP.cpp, the boolean CanBeParallelFor (via the function canBeParallelFor()) is checked to determine how to emit the directive (parallel or distribute). Are you saying that instead of checking whether the loop can be parallel while we're in Sema, and saving that value when we create the target teams loop directive, we should determine this through a call to the Sema function teamsLoopCanBeParallelFor() while in CodeGen?
@@ -6106,6 +6106,8 @@ [...] + bool CanBeParallelFor = false; alexey-bataev wrote: Do you still need this new field, taking into account that it does not affect sema anymore, only codegen? You don't need to store this flag; instead, you can call teamsLoopCanBeParallelFor() directly in codegen
ddpagan wrote: Ping.
@@ -7485,6 +7485,99 @@ [...] +IsOpenMPAPI = Name.find("omp_") == 0; alexey-bataev wrote: Ah, ok, then it should be fine
@@ -7485,6 +7485,99 @@ [...] +IsOpenMPAPI = Name.find("omp_") == 0;

dhruvachak wrote:
> No, it is not.

I see the following in the spec:
```
Programs must not declare names that begin with the omp_ or ompx_ prefix, as these are reserved for the OpenMP implementation.
```
@@ -7485,6 +7485,99 @@ [...] +IsOpenMPAPI = Name.find("omp_") == 0; alexey-bataev wrote: No, it is not.
@@ -7485,6 +7485,99 @@ [...] +IsOpenMPAPI = Name.find("omp_") == 0;

dhruvachak wrote:
> What if this is user function with omp_ prefix?

I think that's not allowed. Isn't omp_ a reserved prefix?
@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {

ddpagan wrote: Alexey - thanks for the review comments. Good suggestion here. I'll look at moving the analysis.
@@ -7485,6 +7485,99 @@ [...] +IsOpenMPAPI = Name.find("omp_") == 0; alexey-bataev wrote: What if this is a user function with an omp_ prefix?
@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {
+public:
+  TeamsLoopChecker(CodeGenModule &CGM)
+      : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
+  bool teamsLoopCanBeParallelFor() const {
+    return TeamsLoopCanBeParallelFor;
+  }
+  // Is there a nested OpenMP loop bind(parallel)
+  void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
+    if (D->getDirectiveKind() == llvm::omp::Directive::OMPD_loop) {
+      if (const auto *C = D->getSingleClause<OMPBindClause>())
+        if (C->getBindKind() == OMPC_BIND_parallel) {
+          TeamsLoopCanBeParallelFor = false;
+          // No need to continue visiting any more
+          return;
+        }
+    }
+    for (const Stmt *Child : D->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCallExpr(const CallExpr *C) {
+    // Function calls inhibit parallel loop translation of 'target teams loop'
+    // unless the assume-no-nested-parallelism flag has been specified.
+    // OpenMP API runtime library calls do not inhibit parallel loop
+    // translation, regardless of the assume-no-nested-parallelism.
+    if (C) {
+      bool IsOpenMPAPI = false;
+      auto *FD = dyn_cast_or_null<FunctionDecl>(C->getCalleeDecl());
+      if (FD) {
+        std::string Name = FD->getNameInfo().getAsString();
+        IsOpenMPAPI = Name.find("omp_") == 0;
+      }
+      TeamsLoopCanBeParallelFor =
+          IsOpenMPAPI || CGM.getLangOpts().OpenMPNoNestedParallelism;
+      if (!TeamsLoopCanBeParallelFor)
+        return;
+    }
+    for (const Stmt *Child : C->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCapturedStmt(const CapturedStmt *S) {
+    if (!S)
+      return;
+    Visit(S->getCapturedDecl()->getBody());
+  }
+
+  void VisitStmt(const Stmt *S) {
+    if (!S)
+      return;
+    for (const Stmt *Child : S->children())
+      if (Child)
+        Visit(Child);
+  }
+
+private:
+  CodeGenModule &CGM;
+  bool TeamsLoopCanBeParallelFor;
+};
+} // namespace
+
+/// Determine if 'teams loop' can be emitted using 'parallel for'.
+bool CodeGenModule::teamsLoopCanBeParallelFor(const OMPExecutableDirective &D) {
+  if (D.getDirectiveKind() != llvm::omp::Directive::OMPD_target_teams_loop)
+    return false;
+  assert(D.hasAssociatedStmt() &&
+         "Loop directive must have associated statement.");
+  TeamsLoopChecker Checker(*this);
+  Checker.Visit(D.getAssociatedStmt());
+  return Checker.teamsLoopCanBeParallelFor();
+}
+
+void CodeGenModule::emitTargetTeamsLoopCodegenStatus(
+    std::string StatusMsg, const OMPExecutableDirective &D, bool IsDevice) {
+  if (IsDevice)
+    StatusMsg += ": DEVICE";
+  else
+    StatusMsg += ": HOST";
+  SourceLocation L = D.getBeginLoc();
+  SourceManager &SM = getContext().getSourceManager();
+  PresumedLoc PLoc = SM.getPresumedLoc(L);
+  const char *FileName = PLoc.isValid() ? PLoc.getFilename() : nullptr;
+  unsigned LineNo =
+      PLoc.isValid() ? PLoc.getLine() : SM.getExpansionLineNumber(L);
+  llvm::dbgs() << StatusMsg << ": " << FileName << ": " << LineNo << "\n";
+}

alexey-bataev wrote: This should be enabled only in debug mode
@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {

alexey-bataev wrote: Better to do this analysis in Sema and mark the region appropriately
github-actions[bot] wrote:

:warning: C/C++ code formatter, clang-format found issues in your code. :warning:

You can test this locally with the following command:

```bash
git-clang-format --diff ec64af5994213cf7933e961a2b7fe07193a77b61 8710c48fb90373ebd2afe1afa7399a6643b52c37 -- clang/test/OpenMP/target_teams_generic_loop_codegen_as_distribute.cpp clang/test/OpenMP/target_teams_generic_loop_codegen_as_parallel_for.cpp clang/lib/CodeGen/CGOpenMPRuntime.cpp clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp clang/lib/CodeGen/CGStmtOpenMP.cpp clang/lib/CodeGen/CodeGenModule.cpp clang/lib/CodeGen/CodeGenModule.h clang/lib/Sema/SemaOpenMP.cpp clang/test/OpenMP/nvptx_target_teams_generic_loop_codegen.cpp clang/test/OpenMP/nvptx_target_teams_generic_loop_generic_mode_codegen.cpp clang/test/OpenMP/target_teams_generic_loop_codegen-1.cpp clang/test/OpenMP/target_teams_generic_loop_codegen.cpp clang/test/OpenMP/target_teams_generic_loop_collapse_codegen.cpp clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp clang/test/OpenMP/target_teams_generic_loop_uses_allocators_codegen.cpp clang/test/OpenMP/teams_generic_loop_codegen-1.cpp clang/test/OpenMP/teams_generic_loop_codegen.cpp clang/test/OpenMP/teams_generic_loop_collapse_codegen.cpp clang/test/OpenMP/teams_generic_loop_private_codegen.cpp clang/test/OpenMP/teams_generic_loop_reduction_codegen.cpp
```

View the diff from clang-format here.

```diff
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 97f963df7a..9545a5e3bd 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -2645,11 +2645,11 @@ void CGOpenMPRuntime::emitForStaticFinish(CodeGenFunction &CGF,
   llvm::Value *Args[] = {
       emitUpdateLocation(CGF, Loc,
                          isOpenMPDistributeDirective(DKind) ||
-                          (DKind == OMPD_target_teams_loop)
+                                 (DKind == OMPD_target_teams_loop)
                              ? OMP_IDENT_WORK_DISTRIBUTE
-                             : isOpenMPLoopDirective(DKind)
-                                   ? OMP_IDENT_WORK_LOOP
-                                   : OMP_IDENT_WORK_SECTIONS),
+                             : isOpenMPLoopDirective(DKind)
+                                   ? OMP_IDENT_WORK_LOOP
+                                   : OMP_IDENT_WORK_SECTIONS),
       getThreadID(CGF, Loc)};
   auto DL = ApplyDebugLocation::CreateDefaultArtificial(CGF, Loc);
   if (isOpenMPDistributeDirective(DKind) &&
diff --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp
index 16f84cab18..229f015e04 100644
--- a/clang/lib/CodeGen/CGStmtOpenMP.cpp
+++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp
@@ -7924,9 +7924,9 @@ static void emitTargetTeamsGenericLoopRegionAsParallel(
     CGF.EmitOMPReductionClauseFinal(S, /*ReductionKind=*/OMPD_teams);
   };
   DEBUG_WITH_TYPE(TTL_CODEGEN_TYPE,
-                  CGF.CGM.emitTargetTeamsLoopCodegenStatus(
-                      TTL_CODEGEN_TYPE " as parallel for", S,
-                      CGF.CGM.getLangOpts().OpenMPIsTargetDevice));
+                  CGF.CGM.emitTargetTeamsLoopCodegenStatus(
+                      TTL_CODEGEN_TYPE " as parallel for", S,
+                      CGF.CGM.getLangOpts().OpenMPIsTargetDevice));
   emitCommonOMPTeamsDirective(CGF, S, OMPD_distribute_parallel_for,
                               CodeGenTeams);
   emitPostUpdateForReductionClause(CGF, S,
@@ -7954,9 +7954,9 @@ static void emitTargetTeamsGenericLoopRegionAsDistribute(
     CGF.EmitOMPReductionClauseFinal(S, /*ReductionKind=*/OMPD_teams);
   };
   DEBUG_WITH_TYPE(TTL_CODEGEN_TYPE,
-                  CGF.CGM.emitTargetTeamsLoopCodegenStatus(
-                      TTL_CODEGEN_TYPE " as distribute", S,
-                      CGF.CGM.getLangOpts().OpenMPIsTargetDevice));
+                  CGF.CGM.emitTargetTeamsLoopCodegenStatus(
+                      TTL_CODEGEN_TYPE " as distribute", S,
+                      CGF.CGM.getLangOpts().OpenMPIsTargetDevice));
   emitCommonOMPTeamsDirective(CGF, S, OMPD_distribute, CodeGen);
   emitPostUpdateForReductionClause(CGF, S,
                                    [](CodeGenFunction &) { return nullptr; });
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 0dd389b3a8..ae287c7c68 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -7492,9 +7492,7 @@ class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {
 public:
   TeamsLoopChecker(CodeGenModule &CGM)
       : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
-  bool teamsLoopCanBeParallelFor() const {
-    return TeamsLoopCanBeParallelFor;
-  }
+  bool teamsLoopCanBeParallelFor() const { return TeamsLoopCanBeParallelFor; }
   // Is there a nested OpenMP loop bind(parallel)
   void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
     if (D->getDirectiveKind() ==
```
llvmbot wrote:

@llvm/pr-subscribers-clang-codegen @llvm/pr-subscribers-clang

Author: David Pagan (ddpagan)

Changes

IR for 'target teams loop' is now dependent on suitability of the associated loop-nest. If a loop-nest:
- does not contain a function call, or
- the -fopenmp-assume-no-nested-parallelism flag has been specified, or
- the call is to an OpenMP API

AND
- does not contain nested loop bind(parallel) directives

then it can be emitted as 'target teams distribute parallel for', which is the current default. Otherwise, it is emitted as 'target teams distribute'.

Added debug output indicating how 'target teams loop' was emitted. Flag is -mllvm -debug-only=target-teams-loop-codegen

Added LIT tests explicitly verifying 'target teams loop' emitted as a parallel loop and a distribute loop. Updated other 'loop' related tests as needed to reflect the change in IR; these updates account for most of the changed files and additions/deletions.

---
Patch is 1.06 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/72417.diff

21 Files Affected:
- (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+6-3)
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+7-3)
- (modified) clang/lib/CodeGen/CGStmtOpenMP.cpp (+51-14)
- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+93)
- (modified) clang/lib/CodeGen/CodeGenModule.h (+8)
- (modified) clang/lib/Sema/SemaOpenMP.cpp (+8-2)
- (modified) clang/test/OpenMP/nvptx_target_teams_generic_loop_codegen.cpp (+72-42)
- (modified) clang/test/OpenMP/nvptx_target_teams_generic_loop_generic_mode_codegen.cpp (+82-315)
- (modified) clang/test/OpenMP/target_teams_generic_loop_codegen-1.cpp (+25-25)
- (modified) clang/test/OpenMP/target_teams_generic_loop_codegen.cpp (+17-21)
- (added) clang/test/OpenMP/target_teams_generic_loop_codegen_as_distribute.cpp (+1685)
- (added) clang/test/OpenMP/target_teams_generic_loop_codegen_as_parallel_for.cpp (+3998)
- (modified) clang/test/OpenMP/target_teams_generic_loop_collapse_codegen.cpp (+16-16)
- (modified) clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp (+118-600)
- (modified) clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp (+242-1200)
- (modified) clang/test/OpenMP/target_teams_generic_loop_uses_allocators_codegen.cpp (+3-3)
- (modified) clang/test/OpenMP/teams_generic_loop_codegen-1.cpp (+146-1210)
- (modified) clang/test/OpenMP/teams_generic_loop_codegen.cpp (+156-510)
- (modified) clang/test/OpenMP/teams_generic_loop_collapse_codegen.cpp (+142-678)
- (modified) clang/test/OpenMP/teams_generic_loop_private_codegen.cpp (+101-580)
- (modified) clang/test/OpenMP/teams_generic_loop_reduction_codegen.cpp (+122-689)

```diff
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b31..97f963df7ad67b4 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -2644,7 +2644,8 @@ void CGOpenMPRuntime::emitForStaticFinish(CodeGenFunction &CGF,
   // Call __kmpc_for_static_fini(ident_t *loc, kmp_int32 tid);
   llvm::Value *Args[] = {
       emitUpdateLocation(CGF, Loc,
-                         isOpenMPDistributeDirective(DKind)
+                         isOpenMPDistributeDirective(DKind) ||
+                             (DKind == OMPD_target_teams_loop)
                              ? OMP_IDENT_WORK_DISTRIBUTE
                              : isOpenMPLoopDirective(DKind)
                                    ? OMP_IDENT_WORK_LOOP
                                    : OMP_IDENT_WORK_SECTIONS),
       getThreadID(CGF, Loc)};
   auto DL = ApplyDebugLocation::CreateDefaultArtificial(CGF, Loc);
   if (isOpenMPDistributeDirective(DKind) &&
@@ -8779,7 +8780,8 @@ getNestedDistributeDirective(ASTContext &Ctx, const OMPExecutableDirective &D) {
   OpenMPDirectiveKind DKind = NestedDir->getDirectiveKind();
   switch (D.getDirectiveKind()) {
   case OMPD_target:
-    // For now, just treat 'target teams loop' as if it's distributed.
+    // For now, treat 'target' with nested 'teams loop' as if it's
+    // distributed (target teams distribute).
     if (isOpenMPDistributeDirective(DKind) || DKind == OMPD_teams_loop)
       return NestedDir;
     if (DKind == OMPD_teams) {
@@ -9263,7 +9265,8 @@ llvm::Value *CGOpenMPRuntime::emitTargetNumIterationsCall(
         SizeEmitter) {
   OpenMPDirectiveKind Kind = D.getDirectiveKind();
   const OMPExecutableDirective *TD = &D;
-  // Get nested teams distribute kind directive, if any.
+  // Get nested teams distribute kind directive, if any. For now, treat
+  // 'target_teams_loop' as if it's really a target_teams_distribute.
   if ((!isOpenMPDistributeDirective(Kind) || !isOpenMPTeamsDirective(Kind)) &&
       Kind != OMPD_target_teams_loop)
     TD = getNestedDistributeDirective(CGM.getContext(), D);
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index abecf5250f4cf96..2e1cf1ed3abf40c 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -639,14 +639,14 @@ static bool hasNestedSPMDDirective(ASTContext &Ctx,
```