Re: [PATCH] [OPENMP] CodeGen for "omp atomic read [seq_cst]" directive.

Alexey Bataev Mon, 15 Dec 2014 01:34:57 -0800

================
Comment at: lib/CodeGen/CGStmtOpenMP.cpp:591
@@ +590,3 @@
+      llvm::Value *ScalarVal;
+      switch (CGF.getEvaluationKind(X->getType())) {
+      case TEK_Scalar:
----------------
rjmccall wrote:
> ABataev wrote:
> > rjmccall wrote:
> > > ABataev wrote:
> > > > rjmccall wrote:
> > > > > I think you'll want this functionality for some of the other atomic 
> > > > > ops anyway, so you might as well make this a separate function that 
> > > > > you can call like this:
> > > > >   llvm::Value *ScalarVal = convertToScalarValue(CGF, Res, 
> > > > > X->getType(), V->getType());
> > > > > 
> > > > > It's kindof a shame that you have to redo all of this instead of just 
> > > > > relying on the implicit casts that Sema already created when it 
> > > > > analyzed the expression, but maybe this is better than messing around 
> > > > > with OpaqueValueExprs, and it works because you don't have a 
> > > > > requirement to handle custom operators.
> > > > 1. Agree, I'll rework this.
> > > > 2. Yes, I tried to implement it using some hooks/hacks in AST, but 
> > > > there was a lot of troubles with atomic ops.
> > > > 3. We will have to support some custom operators in future (in OpenMP 
> > > > 4.0 reductions introduces user-defined reductions, which must be called 
> > > > as atomic ops and I plan to use global locks for them).
> > > You can't mix locking and non-locking atomics on the same object:  
> > > they'll be atomic with respect to themselves, but not with respect to 
> > > each other.  That is, assuming that atomic_start and atomic_end aren't 
> > > implemented by halting all other OpenMP threads.
> > > 
> > > e.g. imagine that you have a *= that's implemented with a 
> > > compare-and-swap loop and a custom reduction that you've implemented with 
> > > the global lock.  The global lock doesn't keep the compare-and-swap from 
> > > firing during the execution of the custom reduction, so (1) different 
> > > parts of the custom reduction might see different values for the original 
> > > l-value and (2) the custom reduction will completely overwrite the *= 
> > > rather than appearing to execute before or after.
> > > 
> > > In fact, you have a similar problem with aggregates, where an atomic 
> > > access to an aggregate (e.g. a std::pair<float,float>) cannot be made 
> > > atomic with respect to an atomic access to a subobject unless the 
> > > aggregate will be accessed locklessly.  (You could do *some* operations 
> > > locklessly if you wrote them very carefully, e.g. reads and writes, but I 
> > > don't know of any way to do an aggregate compare-and-swap that's atomic 
> > > with a subobject compare-and-swap when the first requires a lock and the 
> > > second doesn't.)  That's mostly defined away by the current requirement 
> > > that the l-value have scalar type, except that you do support _Complex 
> > > l-values right now, and 32-bit platforms generally can't make _Complex 
> > > double lockless.  The reverse problem applies to vectors: a 16-byte 
> > > vector will generally be sufficiently aligned that a 64-bit platform can 
> > > access it locklessly, but you're implementing some accesses to subobjects 
> > > with locks: specifically, vector projections that create compound 
> > > l-values.
> > > 
> > > This is really a specification problem: OpenMP wants simple operations to 
> > > be implementable with native atomics, but that really restricts the set 
> > > of operations you can do unless you can assume at a high level that there 
> > > are never partially overlapping atomic operations.   The easiest language 
> > > solution is to say that, for the purposes of OpenMP's atomics, _Complex 
> > > and vector types don't count as scalars.  But I don't know if that would 
> > > fly — it might be a goal to loosen the restrictions about what types you 
> > > can use in atomic operations.
> > I agree with you, currently the code is not quite correct. I think we can 
> > resolve this problem by emitting OpenMP specific locks for target supported 
> > lockfree atomic operations. But in this case we don't need target atomic 
> > ops at all, we can rely on runtime library interface only.
> You mean, always grabbing a lock instead of ever using lock-free operations?  
> I agree that that's the most general solution, but it's really heavy-handed.
> 
> It's possible that you can still do user-defined reductions with a 
> compare-and-swap loop, depending on how they're defined. (Specifically: you 
> have to be allowed to try the reduction multiple times, and the reduction's 
> only allowed to access a single atomic value.  The existing specification 
> about the #pragma tries to enforce those conditions even for the simple cases 
> allowed there, so I'm optimistic that whoever is writing your spec is at 
> least thinking about these problems.)
> 
> What you really need here is a statement about subobjects.  The C11/C++11 
> atomic types are very nice because they tell you exactly at what level 
> something is atomic: you can have an std::atomic<std::pair<int,int>>, but 
> that type doesn't provide accessors for atomically accessing the individual 
> members of the pair; you can't just project out a std::atomic<int>& from the 
> first element.  That's really important, because it tells you locally that 
> all atomic accesses are going to reference the full, 64-byte aggregate, which 
> means you can agree very easily on a protocol for that access.
> 
> OpenMP doesn't tie into the type system that way, but you can still impose 
> that rule at a high level by saying that it's undefined behavior for atomic 
> accesses to an aggregate to race with atomic accesses to a subobject.  For 
> example, you can have an atomic operation on a _Complex float, and you can 
> have a simultaneous atomic operation on a float, but the fact that they're 
> simultaneous means that we can assume they don't alias.  That still lets us 
> decide on the atomic access pattern based purely on the type of the atomic 
> l-value.
John, I decided to use lock free operations on simple lvalues, but for other 
lvalues I'm going to use global lock provided by the runtime. I think in this 
case we can avoid conflicts conflicts. What do you think about this solution?


http://reviews.llvm.org/D6431

EMAIL PREFERENCES
  http://reviews.llvm.org/settings/panel/emailpreferences/



_______________________________________________
cfe-commits mailing list
cfe-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits

Re: [PATCH] [OPENMP] CodeGen for "omp atomic read [seq_cst]" directive.

Reply via email to