Hello John, Thanks you very much for the review! Here are my answers.
Oh, don't worry. I was just afraid that you're completely done with the clang because of swift. :-)I apologize for the outrageous delays; we’ve been a little busy.
The precondition is quite simple: there must be exactly 0 or 1 clause of the specified kind. This method returns the single clause if it exists. Otherwise, if there is no such clause, it just returns nullptr.The preconditions on this method aren’t clear. Does it *assume* that there’s only one clause or does it *check*? Right now, it seems to be assuming that it doesn’t have more than one, but checking that it doesn’t have zero, which is weird.
This method is required for clauses llike "if", "num_threads" etc. These clauses may be specified only once per directive, but also they can be skipped, if they are not required. So, directives may have 0 or 1 instance of such clauses.Elsewhere in the code, you’re conditionally checking whether the result is null. Is there a language rule that you can only have one if clause?
Just dereference the iterator here. Your assert can then increment the original iterator, and you don’t have to worry about copying the iterator in non-asserts builds.
Done, thanks.
This subclass should go in a separate, OpenMP-specific header.
Ok, moved to CGOpenMPRuntime.h
You seem to be inconsistent about the capitalization here, and on reflection,
honestly, I think this abbreviation is completely illegible. And I’m not sure
that calling it the “global thread ID” is actually meaningful when it’s clearly
not global in any real sense.
Let’s take everywhere you’ve got “Gtid” or “GTid”, including in the code
already committed, and rename this to “ThreadID”. So this would be:
const VarDecl *getThreadIDVariable() const { return ThreadIDVar; }
Ok, done.
I'll modify the comment. The type of this variable is kmp_int32, i.e. just int32.Also, this would be a great place to put a comment explaining what this variable actually is. For example, that it’s a variable of type ident_t*.
If this is an invariant of the CGOpenMPRegionInfo, shouldn’t it be asserted on in that structure, maybe in its constructor?
Moved to constructor.
You have this logic repeated several times. It would be reasonable to have a LValue getThreadIDVariableLValue(CodeGenFunction &CGF); helper on the region info. Also, you seem to vary between using the type of the variable and assuming that ident_t == int32_t. It’s probably better to trust the type of the variable consistently.
Created this function, replaced the code by function call.The type of ThreadID is int32_t, not ident_t. ident_t is a structure that has info about debug location and some internal flags, ThreadID is just a 32-bit integer. The variable is declared only in outlined parallel region. If the construct is not in parallel region (for example, there is a standalone '#omp atomic' or other non-parallel directive), this variable does not exist and we have to create a temp with type int32_t for this var.
Please add a comment describing the intended behavior of this function and how its result will be used. The rule seems to be “if we’re inside a captured statement, use the region info’s thread-ID variable and ignore the thread ID we just computed; otherwise, stash that thread ID in a temporary and return the address of that”. That is a very strange rule. Describing *why* it does that would be invaluable to people trying to maintain this in the future.
Ok, I'll add the description.
I'll look at GCXXABI and will try do my best to improve CodeGen for OpenMP, thanks.I’m not sure what rule you’re following for what goes in CGOpenMP vs. CGOpenMPRuntime. All of the code you’re emitting here is very runtime-specific. I think you could very easily abstract out a virtual interface for doing runtime-specific logic, GCXXABI-style. e.g. “emit a function for this captured statement”, “run this captured statement serially”, “run this captured statement in parallel”, etc. All of that could be in CGOpenMPRuntime.cpp, and you could keep the general logic for interpreting down statements in CGOpenMP.cpp.
It would be better for this to not be templated and take a std::function.
Will do. Best regards, Alexey Bataev ============= Software Engineer Intel Compiler Team 17.09.2014 12:36, John McCall пишет:
On Jul 29, 2014, at 10:29 PM, Alexey Bataev <[email protected]> wrote:Hi doug.gregor, hfinkel, rjmccall, fraggamuffin, ejstotzer, rsmith, Adds codegen for 'if' clause. Currently only for 'if' clause used with the 'parallel' directive. If condition evaluates to true, the code executes parallel version of the code by calling __kmpc_fork_call(loc, 1, microtask, captured_struct/*context*/), where loc - debug location, 1 - number of additional parameters after "microtask" argument, microtask - is outlined finction for the code associated with the 'parallel' directive, captured_struct - list of variables captured in this outlined function. If condition evaluates to false, the code executes serial version of the code by executing the following code: global_thread_id.addr = alloca i32 store i32 global_thread_id, global_thread_id.addr zero.addr = alloca i32 store i32 0, zero.addr __kmpc_serialized_parallel(loc, global_thread_id); microtask(global_thread_id.addr, zero.addr, captured_struct/*context*/); __kmpc_end_serialized_parallel(loc, global_thread_id); Where loc - debug location, global_thread_id - global thread id, returned by __kmpc_global_thread_num() call or passed as a first parameter in microtask() call, global_thread_id.addr - address of the variable, where stored global_thread_id value, zero.addr - implicit bound thread id (should be set to 0 for serial call), microtask() and captured_struct are the same as in parallel call. Also this patch checks if the condition is constant and if it is constant it evaluates its value and then generates either parallel version of the code (if the condition evaluates to true), or the serial version of the code (if the condition evaluates to false).I apologize for the outrageous delays; we’ve been a little busy. + /// \brief Gets single clause of the specified kind \a K associated with the + /// current directive iff there is only one clause of this kind.+const OMPClause *+OMPExecutableDirective::getSingleClause(OpenMPClauseKind K) const { The preconditions on this method aren’t clear. Does it *assume* that there’s only one clause or does it *check*? Right now, it seems to be assuming that it doesn’t have more than one, but checking that it doesn’t have zero, which is weird. Elsewhere in the code, you’re conditionally checking whether the result is null. Is there a language rule that you can only have one if clause? + auto ClauseFilter = + [=](const OMPClause *C) -> bool { return C->getClauseKind() == K; }; + OMPExecutableDirective::filtered_clause_iterator<decltype(ClauseFilter)> I( + clauses(), ClauseFilter); + + if (I) { + auto PrevI = I; Just dereference the iterator here. Your assert can then increment the original iterator, and you don’t have to worry about copying the iterator in non-asserts builds. Index: lib/CodeGen/CodeGenFunction.h =================================================================== --- lib/CodeGen/CodeGenFunction.h +++ lib/CodeGen/CodeGenFunction.h @@ -224,6 +224,8 @@ /// \brief Get the name of the capture helper. virtual StringRef getHelperName() const { return "__captured_stmt"; }+ static bool classof(const CGCapturedStmtInfo *) { return true; }+ private: /// \brief The kind of captured statement being generated. CapturedRegionKind Kind; @@ -238,6 +240,27 @@ /// \brief Captured 'this' type. FieldDecl *CXXThisFieldDecl; }; + + /// \brief API for captured statement code generation in OpenMP constructs. + class CGOpenMPRegionInfo : public CGCapturedStmtInfo { This subclass should go in a separate, OpenMP-specific header. + /// \brief Gets a variable or parameter for storing global thread id + /// inside OpenMP construct. + const VarDecl *getGTidVariable() const { return GTidVar; } You seem to be inconsistent about the capitalization here, and on reflection, honestly, I think this abbreviation is completely illegible. And I’m not sure that calling it the “global thread ID” is actually meaningful when it’s clearly not global in any real sense. Let’s take everywhere you’ve got “Gtid” or “GTid”, including in the code already committed, and rename this to “ThreadID”. So this would be: const VarDecl *getThreadIDVariable() const { return ThreadIDVar; } Also, this would be a great place to put a comment explaining what this variable actually is. For example, that it’s a variable of type ident_t*. + } else if (auto OMPRegionInfo = + dyn_cast_or_null<CodeGenFunction::CGOpenMPRegionInfo>( + CGF.CapturedStmtInfo)) { + assert(OMPRegionInfo->getGTidVariable() != nullptr && + "No GTid variable for OpenMP region.”); If this is an invariant of the CGOpenMPRegionInfo, shouldn’t it be asserted on in that structure, maybe in its constructor? + auto GTidVar = OMPRegionInfo->getGTidVariable(); + auto LVal = CGF.MakeNaturalAlignAddrLValue( + CGF.GetAddrOfLocalVar(GTidVar), + CGF.getContext().getPointerType(GTidVar->getType())); You have this logic repeated several times. It would be reasonable to have a LValue getThreadIDVariableLValue(CodeGenFunction &CGF); helper on the region info. Also, you seem to vary between using the type of the variable and assuming that ident_t == int32_t. It’s probably better to trust the type of the variable consistently. +static llvm::Value *EmitGTidAddress(CodeGenFunction &CGF, llvm::Value *GTid) { Please add a comment describing the intended behavior of this function and how its result will be used. The rule seems to be “if we’re inside a captured statement, use the region info’s thread-ID variable and ignore the thread ID we just computed; otherwise, stash that thread ID in a temporary and return the address of that”. That is a very strange rule. Describing *why* it does that would be invaluable to people trying to maintain this in the future. In general, this patch badly needs descriptive comments. static void EmitOMPSerialCall(CodeGenFunction &CGF, + const OMPParallelDirective &S, + llvm::Value *OutlinedFn, + llvm::Value *CapturedStruct) { I’m not sure what rule you’re following for what goes in CGOpenMP vs. CGOpenMPRuntime. All of the code you’re emitting here is very runtime-specific. I think you could very easily abstract out a virtual interface for doing runtime-specific logic, GCXXABI-style. e.g. “emit a function for this captured statement”, “run this captured statement serially”, “run this captured statement in parallel”, etc. All of that could be in CGOpenMPRuntime.cpp, and you could keep the general logic for interpreting down statements in CGOpenMP.cpp. +template <class CodeGenTy> +static void EmitConditionalCode(CodeGenFunction &CGF, const Expr *Cond, + CodeGenTy CodeGen) { It would be better for this to not be templated and take a std::function. John.
_______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
