[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-03-14 Thread David Pagan via cfe-commits


@@ -1537,6 +1537,12 @@ class CodeGenModule : public CodeGenTypeCache {
   void printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
                                        const Decl *D) const;
 
+  /// Under debug mode, print status of target teams loop transformation,
+  /// which should be either '#distribute' or '#parallel for'.
+  void emitTargetTeamsLoopCodegenStatus(std::string StatusMsg,
+                                        const OMPExecutableDirective &D,
+                                        bool IsDevice);
+

ddpagan wrote:

> Don't like the idea of adding new member function specifically for debugging

Agreed. Changed it to a static local function.

https://github.com/llvm/llvm-project/pull/72417
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-03-14 Thread David Pagan via cfe-commits


@@ -11311,6 +11311,10 @@ class Sema final {
 OpenMPDirectiveKind ,
 OpenMPDirectiveKind );
 
+  /// [target] teams loop is equivalent to parallel for if associated loop
+  /// nest meets certain criteria.
+  bool teamsLoopCanBeParallelFor(Stmt *AStmt);
+

ddpagan wrote:

> Do you really need to expose it in Sema or you can make it just a static 
> local function in SemaOpenMP.cpp?

Actually, no. Thanks for noticing that. I'll make the change.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-03-11 Thread David Pagan via cfe-commits

ddpagan wrote:

After some additional discussion with Alexey offline, he concluded that the 
current changes are okay, specifically for this reason:

_"Then I realized that actually it does not require building AST nodes. In this
case, this helper class should be moved to CodeGenStmt and hidden in the
anonymous namespace. But you also need to use it in CodeGenModule. In this case
it is better to use a flag in the statement, as you have it right now. I.e.,
having this analysis in Sema looks good."_


https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-03-11 Thread Alexey Bataev via cfe-commits


@@ -1537,6 +1537,12 @@ class CodeGenModule : public CodeGenTypeCache {
   void printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
                                        const Decl *D) const;
 
+  /// Under debug mode, print status of target teams loop transformation,
+  /// which should be either '#distribute' or '#parallel for'.
+  void emitTargetTeamsLoopCodegenStatus(std::string StatusMsg,
+                                        const OMPExecutableDirective &D,
+                                        bool IsDevice);
+

alexey-bataev wrote:

Don't like the idea of adding new member function specifically for debugging

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-03-11 Thread Alexey Bataev via cfe-commits


@@ -11311,6 +11311,10 @@ class Sema final {
 OpenMPDirectiveKind ,
 OpenMPDirectiveKind );
 
+  /// [target] teams loop is equivalent to parallel for if associated loop
+  /// nest meets certain criteria.
+  bool teamsLoopCanBeParallelFor(Stmt *AStmt);
+

alexey-bataev wrote:

Do you really need to expose it in Sema or you can make it just a static local 
function in SemaOpenMP.cpp?

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-02-15 Thread Alexey Bataev via cfe-commits

alexey-bataev wrote:

I already said that you don't need changes in either Sema or the Stmt class.
All functional changes are in codegen; no other changes are required.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-02-15 Thread David Pagan via cfe-commits

ddpagan wrote:

Ping.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-02-06 Thread Gheorghe-Teodor Bercea via cfe-commits


@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

doru1004 wrote:

I don't think it is possible to have the analysis in Sema and not use a flag 
here.

The two options we have are:
1. Do the analysis in Sema and have the flag and then read the flag in CG.
2. Have the analysis in CG and then there's no reason to pass anything around 
and CG can call the function when needed.

There is a 3rd hybrid way to do this where this function is moved back into CG:
```
bool Sema::teamsLoopCanBeParallelFor(Stmt *AStmt) {
  TeamsLoopChecker Checker(*this);
  Checker.Visit(AStmt);
  return Checker.teamsLoopCanBeParallelFor();
}
```

But then I don't know how you can call the TeamsLoopChecker which lives in Sema.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-01-10 Thread David Pagan via cfe-commits


@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

ddpagan wrote:

Ok. Thanks, Alexey.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-01-10 Thread Alexey Bataev via cfe-commits


@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

alexey-bataev wrote:

I'm saying that you can move this function to the codegen and call it in the 
codegen

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-01-10 Thread David Pagan via cfe-commits


@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

ddpagan wrote:

Hi Alexey - thanks for the comment.  A clarification, as I'm not sure exactly 
what you're referring to.

So currently, when OMPTargetTeamsGenericLoopDirective is created in
SemaOpenMP.cpp, teamsLoopCanBeParallelFor(AStmt) is called as an argument to
the Create call, and its result is stored in CanBeParallelFor. Later, when the
directive is seen in CodeGen/CGStmtOpenMP.cpp, the boolean CanBeParallelFor
(via the function canBeParallelFor()) is checked to determine how to emit the
directive (parallel or distribute).

Are you saying that instead of checking whether the loop can be parallel while 
we're in Sema, and saving that value when we create the target teams loop 
directive, that we should determine this through a call to the Sema function 
teamsLoopCanBeParallelFor() while in CodeGen?



https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-01-02 Thread Alexey Bataev via cfe-commits


@@ -6106,6 +6106,8 @@ class OMPTeamsGenericLoopDirective final : public OMPLoopDirective {
 class OMPTargetTeamsGenericLoopDirective final : public OMPLoopDirective {
   friend class ASTStmtReader;
   friend class OMPExecutableDirective;
+  /// true if loop directive's associated loop can be a parallel for.
+  bool CanBeParallelFor = false;

alexey-bataev wrote:

Do you still need this new field, taking into account that it does not affect
Sema anymore, only codegen? You don't need to store this flag; instead you can
call teamsLoopCanBeParallelFor() directly in codegen.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2024-01-01 Thread David Pagan via cfe-commits

ddpagan wrote:

Ping.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-12-11 Thread Alexey Bataev via cfe-commits


@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {
+public:
+  TeamsLoopChecker(CodeGenModule &CGM)
+      : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
+  bool teamsLoopCanBeParallelFor() const {
+    return TeamsLoopCanBeParallelFor;
+  }
+  // Is there a nested OpenMP loop bind(parallel)?
+  void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
+    if (D->getDirectiveKind() == llvm::omp::Directive::OMPD_loop) {
+      if (const auto *C = D->getSingleClause<OMPBindClause>())
+        if (C->getBindKind() == OMPC_BIND_parallel) {
+          TeamsLoopCanBeParallelFor = false;
+          // No need to continue visiting any more.
+          return;
+        }
+    }
+    for (const Stmt *Child : D->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCallExpr(const CallExpr *C) {
+    // Function calls inhibit parallel loop translation of 'target teams loop'
+    // unless the assume-no-nested-parallelism flag has been specified.
+    // OpenMP API runtime library calls do not inhibit parallel loop
+    // translation, regardless of the assume-no-nested-parallelism flag.
+    if (C) {
+      bool IsOpenMPAPI = false;
+      auto *FD = dyn_cast_or_null<FunctionDecl>(C->getCalleeDecl());
+      if (FD) {
+        std::string Name = FD->getNameInfo().getAsString();
+        IsOpenMPAPI = Name.find("omp_") == 0;

alexey-bataev wrote:

Ah, ok, then it should be fine

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-12-11 Thread via cfe-commits


@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {
+public:
+  TeamsLoopChecker(CodeGenModule &CGM)
+      : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
+  bool teamsLoopCanBeParallelFor() const {
+    return TeamsLoopCanBeParallelFor;
+  }
+  // Is there a nested OpenMP loop bind(parallel)?
+  void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
+    if (D->getDirectiveKind() == llvm::omp::Directive::OMPD_loop) {
+      if (const auto *C = D->getSingleClause<OMPBindClause>())
+        if (C->getBindKind() == OMPC_BIND_parallel) {
+          TeamsLoopCanBeParallelFor = false;
+          // No need to continue visiting any more.
+          return;
+        }
+    }
+    for (const Stmt *Child : D->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCallExpr(const CallExpr *C) {
+    // Function calls inhibit parallel loop translation of 'target teams loop'
+    // unless the assume-no-nested-parallelism flag has been specified.
+    // OpenMP API runtime library calls do not inhibit parallel loop
+    // translation, regardless of the assume-no-nested-parallelism flag.
+    if (C) {
+      bool IsOpenMPAPI = false;
+      auto *FD = dyn_cast_or_null<FunctionDecl>(C->getCalleeDecl());
+      if (FD) {
+        std::string Name = FD->getNameInfo().getAsString();
+        IsOpenMPAPI = Name.find("omp_") == 0;

dhruvachak wrote:

> No, it is not.

I see the following in the spec:
```
 Programs must not declare names that begin with the omp_ or ompx_ prefix, as 
these are reserved for the OpenMP implementation.
```

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-12-11 Thread Alexey Bataev via cfe-commits


@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {
+public:
+  TeamsLoopChecker(CodeGenModule &CGM)
+      : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
+  bool teamsLoopCanBeParallelFor() const {
+    return TeamsLoopCanBeParallelFor;
+  }
+  // Is there a nested OpenMP loop bind(parallel)?
+  void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
+    if (D->getDirectiveKind() == llvm::omp::Directive::OMPD_loop) {
+      if (const auto *C = D->getSingleClause<OMPBindClause>())
+        if (C->getBindKind() == OMPC_BIND_parallel) {
+          TeamsLoopCanBeParallelFor = false;
+          // No need to continue visiting any more.
+          return;
+        }
+    }
+    for (const Stmt *Child : D->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCallExpr(const CallExpr *C) {
+    // Function calls inhibit parallel loop translation of 'target teams loop'
+    // unless the assume-no-nested-parallelism flag has been specified.
+    // OpenMP API runtime library calls do not inhibit parallel loop
+    // translation, regardless of the assume-no-nested-parallelism flag.
+    if (C) {
+      bool IsOpenMPAPI = false;
+      auto *FD = dyn_cast_or_null<FunctionDecl>(C->getCalleeDecl());
+      if (FD) {
+        std::string Name = FD->getNameInfo().getAsString();
+        IsOpenMPAPI = Name.find("omp_") == 0;

alexey-bataev wrote:

No, it is not.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-12-11 Thread via cfe-commits


@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {
+public:
+  TeamsLoopChecker(CodeGenModule &CGM)
+      : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
+  bool teamsLoopCanBeParallelFor() const {
+    return TeamsLoopCanBeParallelFor;
+  }
+  // Is there a nested OpenMP loop bind(parallel)?
+  void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
+    if (D->getDirectiveKind() == llvm::omp::Directive::OMPD_loop) {
+      if (const auto *C = D->getSingleClause<OMPBindClause>())
+        if (C->getBindKind() == OMPC_BIND_parallel) {
+          TeamsLoopCanBeParallelFor = false;
+          // No need to continue visiting any more.
+          return;
+        }
+    }
+    for (const Stmt *Child : D->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCallExpr(const CallExpr *C) {
+    // Function calls inhibit parallel loop translation of 'target teams loop'
+    // unless the assume-no-nested-parallelism flag has been specified.
+    // OpenMP API runtime library calls do not inhibit parallel loop
+    // translation, regardless of the assume-no-nested-parallelism flag.
+    if (C) {
+      bool IsOpenMPAPI = false;
+      auto *FD = dyn_cast_or_null<FunctionDecl>(C->getCalleeDecl());
+      if (FD) {
+        std::string Name = FD->getNameInfo().getAsString();
+        IsOpenMPAPI = Name.find("omp_") == 0;

dhruvachak wrote:

> What if this is user function with omp_ prefix?

I think that was not allowed. Isn't omp_ a reserved prefix?

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-11-16 Thread David Pagan via cfe-commits


@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {

ddpagan wrote:

Alexey - thanks for the review comments. Good suggestion here. I'll look at 
moving the analysis.

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-11-15 Thread Alexey Bataev via cfe-commits


@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {
+public:
+  TeamsLoopChecker(CodeGenModule &CGM)
+      : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
+  bool teamsLoopCanBeParallelFor() const {
+    return TeamsLoopCanBeParallelFor;
+  }
+  // Is there a nested OpenMP loop bind(parallel)?
+  void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
+    if (D->getDirectiveKind() == llvm::omp::Directive::OMPD_loop) {
+      if (const auto *C = D->getSingleClause<OMPBindClause>())
+        if (C->getBindKind() == OMPC_BIND_parallel) {
+          TeamsLoopCanBeParallelFor = false;
+          // No need to continue visiting any more.
+          return;
+        }
+    }
+    for (const Stmt *Child : D->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCallExpr(const CallExpr *C) {
+    // Function calls inhibit parallel loop translation of 'target teams loop'
+    // unless the assume-no-nested-parallelism flag has been specified.
+    // OpenMP API runtime library calls do not inhibit parallel loop
+    // translation, regardless of the assume-no-nested-parallelism flag.
+    if (C) {
+      bool IsOpenMPAPI = false;
+      auto *FD = dyn_cast_or_null<FunctionDecl>(C->getCalleeDecl());
+      if (FD) {
+        std::string Name = FD->getNameInfo().getAsString();
+        IsOpenMPAPI = Name.find("omp_") == 0;

alexey-bataev wrote:

What if this is user function with omp_ prefix?

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-11-15 Thread Alexey Bataev via cfe-commits


@@ -7485,6 +7485,99 @@ void CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream &OS,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parallel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor<TeamsLoopChecker> {
+public:
+  TeamsLoopChecker(CodeGenModule &CGM)
+      : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
+  bool teamsLoopCanBeParallelFor() const {
+    return TeamsLoopCanBeParallelFor;
+  }
+  // Is there a nested OpenMP loop bind(parallel)?
+  void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
+    if (D->getDirectiveKind() == llvm::omp::Directive::OMPD_loop) {
+      if (const auto *C = D->getSingleClause<OMPBindClause>())
+        if (C->getBindKind() == OMPC_BIND_parallel) {
+          TeamsLoopCanBeParallelFor = false;
+          // No need to continue visiting any more.
+          return;
+        }
+    }
+    for (const Stmt *Child : D->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCallExpr(const CallExpr *C) {
+    // Function calls inhibit parallel loop translation of 'target teams loop'
+    // unless the assume-no-nested-parallelism flag has been specified.
+    // OpenMP API runtime library calls do not inhibit parallel loop
+    // translation, regardless of the assume-no-nested-parallelism flag.
+    if (C) {
+      bool IsOpenMPAPI = false;
+      auto *FD = dyn_cast_or_null<FunctionDecl>(C->getCalleeDecl());
+      if (FD) {
+        std::string Name = FD->getNameInfo().getAsString();
+        IsOpenMPAPI = Name.find("omp_") == 0;
+      }
+      TeamsLoopCanBeParallelFor =
+          IsOpenMPAPI || CGM.getLangOpts().OpenMPNoNestedParallelism;
+      if (!TeamsLoopCanBeParallelFor)
+        return;
+    }
+    for (const Stmt *Child : C->children())
+      if (Child)
+        Visit(Child);
+  }
+
+  void VisitCapturedStmt(const CapturedStmt *S) {
+    if (!S)
+      return;
+    Visit(S->getCapturedDecl()->getBody());
+  }
+
+  void VisitStmt(const Stmt *S) {
+    if (!S)
+      return;
+    for (const Stmt *Child : S->children())
+      if (Child)
+        Visit(Child);
+  }
+
+private:
+  CodeGenModule &CGM;
+  bool TeamsLoopCanBeParallelFor;
+};
+} // namespace
+
+/// Determine if 'teams loop' can be emitted using 'parallel for'.
+bool CodeGenModule::teamsLoopCanBeParallelFor(const OMPExecutableDirective &D) {
+  if (D.getDirectiveKind() != llvm::omp::Directive::OMPD_target_teams_loop)
+    return false;
+  assert(D.hasAssociatedStmt() &&
+         "Loop directive must have associated statement.");
+  TeamsLoopChecker Checker(*this);
+  Checker.Visit(D.getAssociatedStmt());
+  return Checker.teamsLoopCanBeParallelFor();
+}
+
+void CodeGenModule::emitTargetTeamsLoopCodegenStatus(
+    std::string StatusMsg, const OMPExecutableDirective &D, bool IsDevice) {
+  if (IsDevice)
+    StatusMsg += ": DEVICE";
+  else
+    StatusMsg += ": HOST";
+  SourceLocation L = D.getBeginLoc();
+  SourceManager &SM = getContext().getSourceManager();
+  PresumedLoc PLoc = SM.getPresumedLoc(L);
+  const char *FileName = PLoc.isValid() ? PLoc.getFilename() : nullptr;
+  unsigned LineNo =
+      PLoc.isValid() ? PLoc.getLine() : SM.getExpansionLineNumber(L);
+  llvm::dbgs() << StatusMsg << ": " << FileName << ": " << LineNo << "\n";
+}

alexey-bataev wrote:

This should be enabled only in debug mode

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-11-15 Thread Alexey Bataev via cfe-commits


@@ -7485,6 +7485,99 @@ void 
CodeGenModule::printPostfixForExternalizedDecl(llvm::raw_ostream ,
   }
 }
 
+namespace {
+/// A 'teams loop' with a nested 'loop bind(parallel)' or generic function
+/// call in the associated loop-nest cannot be a 'parllel for'.
+class TeamsLoopChecker final : public ConstStmtVisitor {

alexey-bataev wrote:

Better to do this analysis in Sema and mark the region appropriately

https://github.com/llvm/llvm-project/pull/72417


[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-11-15 Thread via cfe-commits

github-actions[bot] wrote:




Warning: C/C++ code formatter, clang-format found issues in your code.



You can test this locally with the following command:


```bash
git-clang-format --diff ec64af5994213cf7933e961a2b7fe07193a77b61 
8710c48fb90373ebd2afe1afa7399a6643b52c37 -- 
clang/test/OpenMP/target_teams_generic_loop_codegen_as_distribute.cpp 
clang/test/OpenMP/target_teams_generic_loop_codegen_as_parallel_for.cpp 
clang/lib/CodeGen/CGOpenMPRuntime.cpp clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp 
clang/lib/CodeGen/CGStmtOpenMP.cpp clang/lib/CodeGen/CodeGenModule.cpp 
clang/lib/CodeGen/CodeGenModule.h clang/lib/Sema/SemaOpenMP.cpp 
clang/test/OpenMP/nvptx_target_teams_generic_loop_codegen.cpp 
clang/test/OpenMP/nvptx_target_teams_generic_loop_generic_mode_codegen.cpp 
clang/test/OpenMP/target_teams_generic_loop_codegen-1.cpp 
clang/test/OpenMP/target_teams_generic_loop_codegen.cpp 
clang/test/OpenMP/target_teams_generic_loop_collapse_codegen.cpp 
clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp 
clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp 
clang/test/OpenMP/target_teams_generic_loop_uses_allocators_codegen.cpp 
clang/test/OpenMP/teams_generic_loop_codegen-1.cpp 
clang/test/OpenMP/teams_generic_loop_codegen.cpp 
clang/test/OpenMP/teams_generic_loop_collapse_codegen.cpp 
clang/test/OpenMP/teams_generic_loop_private_codegen.cpp 
clang/test/OpenMP/teams_generic_loop_reduction_codegen.cpp
```





View the diff from clang-format here.


```diff
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index 97f963df7a..9545a5e3bd 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -2645,11 +2645,11 @@ void CGOpenMPRuntime::emitForStaticFinish(CodeGenFunction &CGF,
   llvm::Value *Args[] = {
   emitUpdateLocation(CGF, Loc,
  isOpenMPDistributeDirective(DKind) ||
- (DKind == OMPD_target_teams_loop)
+ (DKind == OMPD_target_teams_loop)
  ? OMP_IDENT_WORK_DISTRIBUTE
- : isOpenMPLoopDirective(DKind)
-   ? OMP_IDENT_WORK_LOOP
-   : OMP_IDENT_WORK_SECTIONS),
+ : isOpenMPLoopDirective(DKind)
+ ? OMP_IDENT_WORK_LOOP
+ : OMP_IDENT_WORK_SECTIONS),
   getThreadID(CGF, Loc)};
   auto DL = ApplyDebugLocation::CreateDefaultArtificial(CGF, Loc);
   if (isOpenMPDistributeDirective(DKind) &&
diff --git a/clang/lib/CodeGen/CGStmtOpenMP.cpp b/clang/lib/CodeGen/CGStmtOpenMP.cpp
index 16f84cab18..229f015e04 100644
--- a/clang/lib/CodeGen/CGStmtOpenMP.cpp
+++ b/clang/lib/CodeGen/CGStmtOpenMP.cpp
@@ -7924,9 +7924,9 @@ static void emitTargetTeamsGenericLoopRegionAsParallel(
 CGF.EmitOMPReductionClauseFinal(S, /*ReductionKind=*/OMPD_teams);
   };
   DEBUG_WITH_TYPE(TTL_CODEGEN_TYPE,
-  CGF.CGM.emitTargetTeamsLoopCodegenStatus(
-  TTL_CODEGEN_TYPE " as parallel for", S,
-  CGF.CGM.getLangOpts().OpenMPIsTargetDevice));
+  CGF.CGM.emitTargetTeamsLoopCodegenStatus(
+  TTL_CODEGEN_TYPE " as parallel for", S,
+  CGF.CGM.getLangOpts().OpenMPIsTargetDevice));
   emitCommonOMPTeamsDirective(CGF, S, OMPD_distribute_parallel_for,
   CodeGenTeams);
   emitPostUpdateForReductionClause(CGF, S,
@@ -7954,9 +7954,9 @@ static void emitTargetTeamsGenericLoopRegionAsDistribute(
 CGF.EmitOMPReductionClauseFinal(S, /*ReductionKind=*/OMPD_teams);
   };
   DEBUG_WITH_TYPE(TTL_CODEGEN_TYPE,
-  CGF.CGM.emitTargetTeamsLoopCodegenStatus(
-  TTL_CODEGEN_TYPE " as distribute", S,
-  CGF.CGM.getLangOpts().OpenMPIsTargetDevice));
+  CGF.CGM.emitTargetTeamsLoopCodegenStatus(
+  TTL_CODEGEN_TYPE " as distribute", S,
+  CGF.CGM.getLangOpts().OpenMPIsTargetDevice));
   emitCommonOMPTeamsDirective(CGF, S, OMPD_distribute, CodeGen);
   emitPostUpdateForReductionClause(CGF, S,
[](CodeGenFunction &) { return nullptr; });
diff --git a/clang/lib/CodeGen/CodeGenModule.cpp b/clang/lib/CodeGen/CodeGenModule.cpp
index 0dd389b3a8..ae287c7c68 100644
--- a/clang/lib/CodeGen/CodeGenModule.cpp
+++ b/clang/lib/CodeGen/CodeGenModule.cpp
@@ -7492,9 +7492,7 @@ class TeamsLoopChecker final : public 
ConstStmtVisitor {
 public:
   TeamsLoopChecker(CodeGenModule )
   : CGM(CGM), TeamsLoopCanBeParallelFor{true} {}
-  bool teamsLoopCanBeParallelFor() const {
-return TeamsLoopCanBeParallelFor;
-  }
+  bool teamsLoopCanBeParallelFor() const { return TeamsLoopCanBeParallelFor; }
   // Is there a nested OpenMP loop bind(parallel)
   void VisitOMPExecutableDirective(const OMPExecutableDirective *D) {
 if (D->getDirectiveKind() == 

[clang] [OpenMP][CodeGen] Improved codegen for combined loop directives (PR #72417)

2023-11-15 Thread via cfe-commits

llvmbot wrote:



@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-clang

Author: David Pagan (ddpagan)


Changes

IR for 'target teams loop' is now dependent on the suitability of the
associated loop-nest.

If a loop-nest:
- does not contain a function call, or the
  -fopenmp-assume-no-nested-parallelism flag has been specified, or the
  call is to an OpenMP API, AND
- does not contain nested 'loop bind(parallel)' directives,

then it can be emitted as 'target teams distribute parallel for', which is the
current default. Otherwise, it is emitted as 'target teams distribute'.

Added debug output indicating how 'target teams loop' was emitted. The flag is
-mllvm -debug-only=target-teams-loop-codegen.

Added LIT tests explicitly verifying 'target teams loop' emitted as a parallel 
loop and a distribute loop.

Updated other 'loop' related tests as needed to reflect change in IR.
- These updates account for most of the changed files and additions/deletions.

---

Patch is 1.06 MiB, truncated to 20.00 KiB below, full version: 
https://github.com/llvm/llvm-project/pull/72417.diff


21 Files Affected:

- (modified) clang/lib/CodeGen/CGOpenMPRuntime.cpp (+6-3) 
- (modified) clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp (+7-3) 
- (modified) clang/lib/CodeGen/CGStmtOpenMP.cpp (+51-14) 
- (modified) clang/lib/CodeGen/CodeGenModule.cpp (+93) 
- (modified) clang/lib/CodeGen/CodeGenModule.h (+8) 
- (modified) clang/lib/Sema/SemaOpenMP.cpp (+8-2) 
- (modified) clang/test/OpenMP/nvptx_target_teams_generic_loop_codegen.cpp 
(+72-42) 
- (modified) 
clang/test/OpenMP/nvptx_target_teams_generic_loop_generic_mode_codegen.cpp 
(+82-315) 
- (modified) clang/test/OpenMP/target_teams_generic_loop_codegen-1.cpp (+25-25) 
- (modified) clang/test/OpenMP/target_teams_generic_loop_codegen.cpp (+17-21) 
- (added) clang/test/OpenMP/target_teams_generic_loop_codegen_as_distribute.cpp 
(+1685) 
- (added) 
clang/test/OpenMP/target_teams_generic_loop_codegen_as_parallel_for.cpp (+3998) 
- (modified) clang/test/OpenMP/target_teams_generic_loop_collapse_codegen.cpp 
(+16-16) 
- (modified) clang/test/OpenMP/target_teams_generic_loop_if_codegen.cpp 
(+118-600) 
- (modified) clang/test/OpenMP/target_teams_generic_loop_private_codegen.cpp 
(+242-1200) 
- (modified) 
clang/test/OpenMP/target_teams_generic_loop_uses_allocators_codegen.cpp (+3-3) 
- (modified) clang/test/OpenMP/teams_generic_loop_codegen-1.cpp (+146-1210) 
- (modified) clang/test/OpenMP/teams_generic_loop_codegen.cpp (+156-510) 
- (modified) clang/test/OpenMP/teams_generic_loop_collapse_codegen.cpp 
(+142-678) 
- (modified) clang/test/OpenMP/teams_generic_loop_private_codegen.cpp 
(+101-580) 
- (modified) clang/test/OpenMP/teams_generic_loop_reduction_codegen.cpp 
(+122-689) 


```diff
diff --git a/clang/lib/CodeGen/CGOpenMPRuntime.cpp b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
index d2be8141a3a4b31..97f963df7ad67b4 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntime.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntime.cpp
@@ -2644,7 +2644,8 @@ void CGOpenMPRuntime::emitForStaticFinish(CodeGenFunction &CGF,
   // Call __kmpc_for_static_fini(ident_t *loc, kmp_int32 tid);
   llvm::Value *Args[] = {
   emitUpdateLocation(CGF, Loc,
- isOpenMPDistributeDirective(DKind)
+ isOpenMPDistributeDirective(DKind) ||
+ (DKind == OMPD_target_teams_loop)
  ? OMP_IDENT_WORK_DISTRIBUTE
  : isOpenMPLoopDirective(DKind)
? OMP_IDENT_WORK_LOOP
@@ -8779,7 +8780,8 @@ getNestedDistributeDirective(ASTContext &Ctx, const OMPExecutableDirective &D) {
 OpenMPDirectiveKind DKind = NestedDir->getDirectiveKind();
 switch (D.getDirectiveKind()) {
 case OMPD_target:
-  // For now, just treat 'target teams loop' as if it's distributed.
+  // For now, treat 'target' with nested 'teams loop' as if it's
+  // distributed (target teams distribute).
   if (isOpenMPDistributeDirective(DKind) || DKind == OMPD_teams_loop)
 return NestedDir;
   if (DKind == OMPD_teams) {
@@ -9263,7 +9265,8 @@ llvm::Value *CGOpenMPRuntime::emitTargetNumIterationsCall(
 SizeEmitter) {
   OpenMPDirectiveKind Kind = D.getDirectiveKind();
   const OMPExecutableDirective *TD = &D;
-  // Get nested teams distribute kind directive, if any.
+  // Get nested teams distribute kind directive, if any. For now, treat
+  // 'target_teams_loop' as if it's really a target_teams_distribute.
   if ((!isOpenMPDistributeDirective(Kind) || !isOpenMPTeamsDirective(Kind)) &&
   Kind != OMPD_target_teams_loop)
 TD = getNestedDistributeDirective(CGM.getContext(), D);
diff --git a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
index abecf5250f4cf96..2e1cf1ed3abf40c 100644
--- a/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
+++ b/clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp
@@ -639,14 +639,14 @@ static bool hasNestedSPMDDirective(ASTContext &Ctx,