On Fri, Oct 7, 2011 at 3:53 PM, Jeffrey Yasskin <[email protected]> wrote:
> On Thu, Oct 6, 2011 at 6:36 PM, Eli Friedman <[email protected]> wrote:
>> Atomic ops; similar to the patch I submitted earlier, but this version
>> adds the restriction that the first operand to the __atomic_*
>> operations must be an _Atomic(T)*.  I'm sending this to double-check
>> that I haven't missed any serious issues.
>>
>> After this is committed, I'll put together a patch for
>> __atomic_is_lock_free, including the necessary machinery to come up
>> with the right answers.
>>
>
> Comments/questions inline:
>
>> Index: include/clang/Basic/Builtins.def
>> ===================================================================
>> --- include/clang/Basic/Builtins.def (revision 141335)
>> +++ include/clang/Basic/Builtins.def (working copy)
>> @@ -585,8 +585,19 @@
>>  BUILTIN(__sync_swap_8, "LLiLLiD*LLi.", "n")
>>  BUILTIN(__sync_swap_16, "LLLiLLLiD*LLLi.", "n")
>>
>> +BUILTIN(__atomic_load, "v.", "t")
>> +BUILTIN(__atomic_store, "v.", "t")
>> +BUILTIN(__atomic_exchange, "v.", "t")
>> +BUILTIN(__atomic_compare_exchange_strong, "v.", "t")
>> +BUILTIN(__atomic_compare_exchange_weak, "v.", "t")
>> +BUILTIN(__atomic_fetch_add, "v.", "t")
>> +BUILTIN(__atomic_fetch_sub, "v.", "t")
>> +BUILTIN(__atomic_fetch_and, "v.", "t")
>> +BUILTIN(__atomic_fetch_or, "v.", "t")
>> +BUILTIN(__atomic_fetch_xor, "v.", "t")
>> +BUILTIN(__atomic_thread_fence, "vi", "t")
>> +BUILTIN(__atomic_signal_fence, "vi", "t")
>
> I believe gcc is using __sync_mem_* instead of __atomic_* for their
> builtins, although they're using __atomic_* for the library that
> handles maybe-not-lock-free calls:
> http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary.  I've cc'ed Andrew
> MacLeod who's actually deciding this, in the hope that both compilers
> can use the same intrinsics.

Okay, I can wait to hear back.
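In the meantime, for concreteness, the intended usage is something like
this (untested sketch; the order arguments use the 0..5 memory_order
numbering that the codegen switch further down assumes):

  _Atomic(int) counter;

  void bump(void) {
    // 5 == memory_order_seq_cst in the current numbering.
    __atomic_fetch_add(&counter, 1, 5);
  }

  int read_counter(void) {
    // 2 == memory_order_acquire.
    return __atomic_load(&counter, 2);
  }

If the names change to __sync_mem_* or whatever gcc settles on, only
the spellings change; the shape of the calls stays the same.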
> There's also some chance we'll eventually want to provide builtins to
> access the singlethread synchronization scope (or other
> synchronization scopes we haven't thought of a need for yet).
> (http://llvm.org/docs/LangRef.html#ordering for Andrew.) Do you want
> to include that parameter in these builtins, or wait, and maybe need
> to add another set?

The C++ standard doesn't actually expose anything corresponding to
those instructions, for reasons I don't entirely understand.

>> -
>>  // Non-overloaded atomic builtins.
>>  BUILTIN(__sync_synchronize, "v.", "n")
>>  // GCC does not support these, they are a Clang extension.
>> Index: include/clang/Basic/DiagnosticSemaKinds.td
>> ===================================================================
>> --- include/clang/Basic/DiagnosticSemaKinds.td (revision 141335)
>> +++ include/clang/Basic/DiagnosticSemaKinds.td (working copy)
>> @@ -4000,6 +4000,15 @@
>>  def err_atomic_builtin_pointer_size : Error<
>>    "first argument to atomic builtin must be a pointer to 1,2,4,8 or 16 byte "
>>    "type (%0 invalid)">;
>> +def err_atomic_builtin_needs_atomic : Error<
>> +  "first argument to atomic builtin must be a pointer to atomic "
>> +  " type (%0 invalid)">;
>
> Maybe s/atomic/_Atomic/. Otherwise C++ programmers who see this are
> likely to try passing std::atomic<T> instances.

K.

>> +def err_atomic_builtin_needs_atomic_int_or_ptr : Error<
>> +  "first argument to atomic builtin must be a pointer to atomic "
>> +  " integer or pointer (%0 invalid)">;
>> +def err_atomic_builtin_logical_needs_atomic_int : Error<
>> +  "first argument to logical atomic builtin must be a pointer to atomic "
>> +  " integer (%0 invalid)">;
>>
>>  def err_deleted_function_use : Error<"attempt to use a deleted function">;
>>
>> Index: include/clang/AST/Expr.h
>> ===================================================================
>> --- include/clang/AST/Expr.h (revision 141335)
>> +++ include/clang/AST/Expr.h (working copy)
>> @@ -4157,6 +4157,128 @@
>>    // Iterators
>>    child_range children() { return child_range(&SrcExpr, &SrcExpr+1); }
>>  };
>> +
>> +/// AtomicExpr - Variadic atomic builtins: __atomic_exchange, __atomic_fetch_*,
>> +/// __atomic_load, __atomic_store, and __atomic_compare_exchange_*, for the
>> +/// similarly-named C++0x instructions. All of these instructions take one
>> +/// primary pointer and at least one memory order.
>> +class AtomicExpr : public Expr {
>> +public:
>> +  enum AtomicOp { Load, Store, CmpXchgStrong, CmpXchgWeak, Xchg,
>> +                  Add, Sub, And, Or, Xor };
>> +private:
>> +  enum { PTR, ORDER, VAL1, ORDER_FAIL, VAL2, END_EXPR };
>> +  Stmt* SubExprs[END_EXPR];
>> +  unsigned NumSubExprs;
>> +  SourceLocation BuiltinLoc, RParenLoc;
>> +  AtomicOp Op;
>> +
>> +public:
>> +  // Constructor for Load
>> +  AtomicExpr(SourceLocation BLoc, Expr *ptr, Expr *order, QualType t,
>> +             AtomicOp op, SourceLocation RP,
>> +             bool TypeDependent, bool ValueDependent)
>> +    : Expr(AtomicExprClass, t, VK_RValue, OK_Ordinary,
>> +           TypeDependent, ValueDependent,
>> +           ptr->isInstantiationDependent(),
>> +           ptr->containsUnexpandedParameterPack()),
>> +      BuiltinLoc(BLoc), RParenLoc(RP), Op(op) {
>> +    SubExprs[PTR] = ptr;
>> +    SubExprs[ORDER] = order;
>> +    NumSubExprs = 2;
>> +  }
>
> Hmm, normally I'd ask for a CreateLoad() static method that forwarded
> to an appropriate private constructor, since it may be hard to
> remember which constructor goes with which operation, but it looks
> like there's exactly one call site for each of these constructors, so
> it's probably not worth it. Maybe add an assert on 'op' instead?

An assert seems reasonable.
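Concretely, something like this for the Load constructor (sketch; the
other two constructors would get the analogous checks against their op
lists):

  // Constructor for Load
  AtomicExpr(SourceLocation BLoc, Expr *ptr, Expr *order, QualType t,
             AtomicOp op, SourceLocation RP,
             bool TypeDependent, bool ValueDependent)
    : Expr(AtomicExprClass, t, VK_RValue, OK_Ordinary,
           TypeDependent, ValueDependent,
           ptr->isInstantiationDependent(),
           ptr->containsUnexpandedParameterPack()),
      BuiltinLoc(BLoc), RParenLoc(RP), Op(op) {
    // Guard against picking the wrong constructor for the operation.
    assert(op == AtomicExpr::Load && "use this constructor only for Load");
    SubExprs[PTR] = ptr;
    SubExprs[ORDER] = order;
    NumSubExprs = 2;
  }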
>> +
>> +  // Constructor for Store, Xchg, Add, Sub, And, Or, Xor
>> +  AtomicExpr(SourceLocation BLoc, Expr *ptr, Expr *val, Expr *order,
>> +             QualType t, AtomicOp op, SourceLocation RP,
>> +             bool TypeDependent, bool ValueDependent)
>> +    : Expr(AtomicExprClass, t, VK_RValue, OK_Ordinary,
>> +           TypeDependent, ValueDependent,
>> +           (ptr->isInstantiationDependent() ||
>> +            val->isInstantiationDependent()),
>> +           (ptr->containsUnexpandedParameterPack() ||
>> +            val->containsUnexpandedParameterPack())),
>> +      BuiltinLoc(BLoc), RParenLoc(RP), Op(op) {
>> +    SubExprs[PTR] = ptr;
>> +    SubExprs[ORDER] = order;
>> +    SubExprs[VAL1] = val;
>> +    NumSubExprs = 3;
>> +  }
>> +
>> +  // Constructor for CmpXchgStrong, CmpXchgWeak
>> +  AtomicExpr(SourceLocation BLoc, Expr *ptr, Expr *val1, Expr *val2,
>> +             Expr *order, Expr *order_fail, QualType t, AtomicOp op,
>> +             SourceLocation RP, bool TypeDependent, bool ValueDependent)
>> +    : Expr(AtomicExprClass, t, VK_RValue, OK_Ordinary,
>> +           TypeDependent, ValueDependent,
>> +           (ptr->isInstantiationDependent() ||
>> +            val1->isInstantiationDependent() ||
>> +            val2->isInstantiationDependent()),
>> +           (ptr->containsUnexpandedParameterPack() ||
>> +            val1->containsUnexpandedParameterPack() ||
>> +            val2->containsUnexpandedParameterPack())),
>> +      BuiltinLoc(BLoc), RParenLoc(RP), Op(op) {
>> +    SubExprs[PTR] = ptr;
>> +    SubExprs[VAL1] = val1;
>> +    SubExprs[ORDER] = order;
>> +    SubExprs[VAL2] = val2;
>> +    SubExprs[ORDER_FAIL] = order_fail;
>> +    NumSubExprs = 5;
>> +  }
>> +
>> +  /// \brief Build an empty __builtin_choose_expr.
>
> Is this a mis-copy? If not, I don't understand the comment.

mis-copy.

>> +  explicit AtomicExpr(EmptyShell Empty) : Expr(AtomicExprClass, Empty) { }
>> +
>> +  Expr *getPtr() const { return cast<Expr>(SubExprs[PTR]); }
>> +  void setPtr(Expr *E) { SubExprs[PTR] = E; }
>> +  Expr *getOrder() const { return cast<Expr>(SubExprs[ORDER]); }
>> +  void setOrder(Expr *E) { SubExprs[ORDER] = E; }
>> +  Expr *getVal1() const { return cast<Expr>(SubExprs[VAL1]); }
>> +  void setVal1(Expr *E) { SubExprs[VAL1] = E; }
>> +  Expr *getOrderFail() const { return cast<Expr>(SubExprs[ORDER_FAIL]); }
>> +  void setOrderFail(Expr *E) { SubExprs[ORDER_FAIL] = E; }
>> +  Expr *getVal2() const { return cast<Expr>(SubExprs[VAL2]); }
>> +  void setVal2(Expr *E) { SubExprs[VAL2] = E; }
>> +
>> +  AtomicOp getOp() const { return Op; }
>> +  void setOp(AtomicOp op) { Op = op; }
>> +  unsigned getNumSubExprs() { return NumSubExprs; }
>> +  void setNumSubExprs(unsigned num) { NumSubExprs = num; }
>> +
>> +  int getOrderVal(ASTContext &Ctx) const {
>> +    return getOrder()->EvaluateAsInt(Ctx).getZExtValue();
>> +  }
>> +  int getOrderFailVal(ASTContext &Ctx) const {
>
> Want to assert that this is a cmpxchg call?

Sure.
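I.e. something like:

  int getOrderFailVal(ASTContext &Ctx) const {
    // The failure order only exists for the two compare-exchange ops.
    assert(isCmpXChg() && "failure order is only meaningful for cmpxchg");
    return getOrderFail()->EvaluateAsInt(Ctx).getZExtValue();
  }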
>> +    return getOrderFail()->EvaluateAsInt(Ctx).getZExtValue();
>> +  }
>> +  bool isVolatile() const {
>> +    return getPtr()->getType()->getPointeeType().isVolatileQualified();
>> +  }
>> +
>> +  bool isCmpXChg() const {
>> +    return getOp() == AtomicExpr::CmpXchgStrong ||
>> +           getOp() == AtomicExpr::CmpXchgWeak;
>> +  }
>> +
>> +  SourceLocation getBuiltinLoc() const { return BuiltinLoc; }
>> +  void setBuiltinLoc(SourceLocation L) { BuiltinLoc = L; }
>> +
>> +  SourceLocation getRParenLoc() const { return RParenLoc; }
>> +  void setRParenLoc(SourceLocation L) { RParenLoc = L; }
>> +
>> +  SourceRange getSourceRange() const {
>> +    return SourceRange(BuiltinLoc, RParenLoc);
>> +  }
>> +  static bool classof(const Stmt *T) {
>> +    return T->getStmtClass() == AtomicExprClass;
>> +  }
>> +  static bool classof(const AtomicExpr *) { return true; }
>> +
>> +  // Iterators
>> +  child_range children() {
>> +    return child_range(SubExprs, SubExprs+NumSubExprs);
>> +  }
>> +};
>>  }  // end namespace clang
>>
>>  #endif
>> Index: lib/Sema/SemaChecking.cpp
>> ===================================================================
>> --- lib/Sema/SemaChecking.cpp (revision 141335)
>> +++ lib/Sema/SemaChecking.cpp (working copy)
>> @@ -414,6 +437,141 @@
>>    return false;
>>  }
>>
>> +ExprResult
>> +Sema::SemaAtomicOpsOverloaded(ExprResult TheCallResult, AtomicExpr::AtomicOp Op) {
>> +  CallExpr *TheCall = (CallExpr *)TheCallResult.get();
>
> Shouldn't you use cast<CallExpr>?

Copy-paste code; I don't recall if ExprResult.get() actually returns an
Expr, but I'll change it if I can.

>> +  DeclRefExpr *DRE
>> +    =cast<DeclRefExpr>(TheCall->getCallee()->IgnoreParenCasts());
>> +  Expr *Ptr, *Order, *Val1, *Val2, *OrderFail;
>> +
>> +  unsigned NumVals = 1;
>> +  unsigned NumOrders = 1;
>
> These probably deserve a comment about the layout of the builtin call. Like:
>
> // __atomic_call(_Atomic(T)* [, U]*NumVals [, int]*NumOrders);
> // where the 'U's are determined by the call and T, and the ints
> // represent memory_order parameters.

K.
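Spelled out per operation (here 'expected' is a T*, matching the
getPointerType(ValType) case below; illustration only):

  __atomic_load(ptr, order)                                // NumVals == 0
  __atomic_store(ptr, val, order)                          // NumVals == 1
  __atomic_fetch_add(ptr, val, order)                      // NumVals == 1
  __atomic_compare_exchange_strong(ptr, expected, desired,
                                   success_order, failure_order)
                                                           // NumVals == 2,
                                                           // NumOrders == 2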
>> +  if (Op == AtomicExpr::Load) {
>> +    NumVals = 0;
>> +  } else if (Op == AtomicExpr::CmpXchgWeak || Op == AtomicExpr::CmpXchgStrong) {
>> +    NumVals = 2;
>> +    NumOrders = 2;
>> +  }
>> +
>> +  if (TheCall->getNumArgs() < NumVals+NumOrders+1) {
>> +    Diag(TheCall->getLocEnd(), diag::err_typecheck_call_too_few_args)
>> +      << 0 << NumVals+NumOrders+1 << TheCall->getNumArgs()
>> +      << TheCall->getCallee()->getSourceRange();
>> +    return ExprError();
>> +  } else if (TheCall->getNumArgs() > NumVals+NumOrders+1) {
>> +    Diag(TheCall->getArg(2)->getLocStart(),
>
> Why "getArg(2)"? Should this be getArg(NumVals+NumOrders+1)?

I guess? I'll take another look.

>> +         diag::err_typecheck_call_too_many_args)
>> +      << 0 << NumVals+NumOrders+1 << TheCall->getNumArgs()
>> +      << TheCall->getCallee()->getSourceRange();
>> +    return ExprError();
>> +  }
>> +
>> +  // Inspect the first argument of the atomic builtin.  This should always be
>> +  // a pointer type, whose element is an integral scalar or pointer type.
>
> The pointer's element should be an atomic type whose element may be
> constrained by the operation, right?

Outdated comment, yes; will fix.

>> +  Ptr = TheCall->getArg(0);
>> +  Ptr = DefaultFunctionArrayLvalueConversion(Ptr).get();
>> +  const PointerType *pointerType = Ptr->getType()->getAs<PointerType>();
>> +  if (!pointerType) {
>> +    Diag(DRE->getLocStart(), diag::err_atomic_builtin_must_be_pointer)
>> +      << Ptr->getType() << Ptr->getSourceRange();
>> +    return ExprError();
>> +  }
>> +
>> +  QualType AtomTy = pointerType->getPointeeType();
>> +  if (!AtomTy->isAtomicType()) {
>> +    Diag(DRE->getLocStart(), diag::err_atomic_builtin_needs_atomic)
>> +      << Ptr->getType() << Ptr->getSourceRange();
>> +    return ExprError();
>> +  }
>> +  QualType ValType = cast<AtomicType>(AtomTy)->getValueType();
>> +
>> +  if ((Op == AtomicExpr::Add || Op == AtomicExpr::Sub) &&
>> +      !ValType->isIntegerType() && !ValType->isAnyPointerType() &&
>> +      !ValType->isBlockPointerType()) {
>
> Your indentation's weird here.

K.

>> +  // The first argument --- the pointer --- has a fixed type; we
>> +  // deduce the types of the rest of the arguments accordingly.  Walk
>> +  // the remaining arguments, converting them to the deduced value type.
>> +  for (unsigned i = 0; i != NumVals+NumOrders; ++i) {
>> +    ExprResult Arg = TheCall->getArg(i+1);
>> +    QualType Ty;
>> +    if (i < NumVals) {
>> +      if (i == 0 && (Op == AtomicExpr::CmpXchgWeak ||
>
> It's a little confusing for 'i' to not represent the i'th parameter.

K, I can change that.

>> +                     Op == AtomicExpr::CmpXchgStrong))
>> +        Ty = Context.getPointerType(ValType.getUnqualifiedType());
>> +      else if (!ValType->isIntegerType() &&
>> +               (Op == AtomicExpr::Add || Op == AtomicExpr::Sub))
>> +        Ty = Context.getPointerDiffType();
>> +      else
>> +        Ty = ValType;
>> +    } else {
>
> Could you add a comment like "// The remaining parameters are
> memory_orders, which are represented by ints."?
>
> Something should check the constraints on memory orders, like that
> cmpxchg's failure order has to be weaker than its success order, or
> that you can't have an acquire store. Is that this code, the <atomic>
> header, or codegen? You might also comment here that it's codegen
> that maps from consume to acquire.

We can't check it in general: the inputs can be variables.  We can't
even check it in the common case, because <atomic> is forced to wrap
around these functions.  I could add code to try to catch some cases,
but I'm not sure it's worth bothering.
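If we did want the best-effort version, it would look roughly like this
(sketch; OrderExpr/OrderFailExpr stand for however the order argument
expressions end up being named, and err_atomic_op_bad_failure_order is
a made-up diagnostic):

  // Only fires when both orders fold to integer constants.
  llvm::APSInt Success, Failure;
  if (OrderExpr->isIntegerConstantExpr(Success, Context) &&
      OrderFailExpr->isIntegerConstantExpr(Failure, Context)) {
    uint64_t S = Success.getZExtValue(), F = Failure.getZExtValue();
    // Failure order can't be release (3) or acq_rel (4); comparing the
    // enumerator values numerically is only an approximation of
    // "stronger than".
    if (F == 3 || F == 4 || F > S)
      Diag(TheCall->getLocStart(), diag::err_atomic_op_bad_failure_order);
  }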
>> +      Ty = Context.IntTy;
>> +    }
>> +    InitializedEntity Entity =
>> +        InitializedEntity::InitializeParameter(Context, Ty, false);
>> +    Arg = PerformCopyInitialization(Entity, SourceLocation(), Arg);
>> +    if (Arg.isInvalid())
>> +      return true;
>> +    TheCall->setArg(i+1, Arg.get());
>> +  }
>> +
>> +  if (Op == AtomicExpr::Load) {
>> +    Order = TheCall->getArg(1);
>> +    return Owned(new (Context) AtomicExpr(TheCall->getCallee()->getLocStart(),
>> +                                          Ptr, Order, ResultType, Op,
>> +                                          TheCall->getRParenLoc(), false,
>> +                                          false));
>
> If AtomicExprs are never type- or value-dependent, why have those two
> parameters?

Good point; I'll change that.

>> +
>> +  llvm::AtomicRMWInst::BinOp Op = llvm::AtomicRMWInst::Add;
>
> This is an odd default.

I think I need to set it to something to suppress uninitialized
warnings.

>> +  switch (E->getOp()) {
>> +  case AtomicExpr::CmpXchgWeak:
>> +  case AtomicExpr::CmpXchgStrong:
>> +  case AtomicExpr::Store:
>> +  case AtomicExpr::Load:  assert(0 && "Already handled!");
>> +  case AtomicExpr::Add:   Op = llvm::AtomicRMWInst::Add;  break;
>> +  case AtomicExpr::Sub:   Op = llvm::AtomicRMWInst::Sub;  break;
>> +  case AtomicExpr::And:   Op = llvm::AtomicRMWInst::And;  break;
>> +  case AtomicExpr::Or:    Op = llvm::AtomicRMWInst::Or;   break;
>> +  case AtomicExpr::Xor:   Op = llvm::AtomicRMWInst::Xor;  break;
>> +  case AtomicExpr::Xchg:  Op = llvm::AtomicRMWInst::Xchg; break;
>> +  }
>> +  llvm::LoadInst *LoadVal1 = CGF.Builder.CreateLoad(Val1);
>> +  LoadVal1->setAlignment(Align);
>> +  llvm::AtomicRMWInst *RMWI =
>> +      CGF.Builder.CreateAtomicRMW(Op, Ptr, LoadVal1, Order);
>> +  RMWI->setVolatile(E->isVolatile());
>> +  llvm::StoreInst *StoreDest = CGF.Builder.CreateStore(RMWI, Dest);
>> +  StoreDest->setAlignment(Align);
>> +}
>> +
>> +static llvm::Value *
>> +EmitValToTemp(CodeGenFunction &CGF, Expr *E) {
>
> Please comment how this is different from EmitScalarExpr.

K.

>> +  llvm::Value *DeclPtr = CGF.CreateMemTemp(E->getType(), ".atomictmp");
>> +  CGF.EmitAnyExprToMem(E, DeclPtr, E->getType().getQualifiers(),
>> +                       /*Init*/ true);
>> +  return DeclPtr;
>> +}
>> +
>> +static RValue ConvertTempToRValue(CodeGenFunction &CGF, QualType Ty,
>> +                                  llvm::Value *Dest) {
>> +  if (Ty->isAnyComplexType())
>> +    return RValue::getComplex(CGF.LoadComplexFromAddr(Dest, false));
>> +  if (CGF.hasAggregateLLVMType(Ty))
>> +    return RValue::getAggregate(Dest);
>> +  return RValue::get(CGF.EmitLoadOfScalar(CGF.MakeAddrLValue(Dest, Ty)));
>> +}
>> +
>> +RValue CodeGenFunction::EmitAtomicExpr(AtomicExpr *E, llvm::Value *Dest) {
>> +  QualType PointeeTy = E->getPtr()->getType()->getPointeeType();
>> +  CharUnits sizeChars = getContext().getTypeSizeInChars(PointeeTy);
>> +  uint64_t Size = sizeChars.getQuantity();
>> +  CharUnits alignChars = getContext().getTypeSizeInChars(PointeeTy);
>
> Why call this twice?

The second one is actually supposed to be the alignment... I'll fix
that.
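I.e., presumably (sketch, assuming getTypeAlignInChars is the right
helper):

  QualType PointeeTy = E->getPtr()->getType()->getPointeeType();
  CharUnits sizeChars = getContext().getTypeSizeInChars(PointeeTy);
  uint64_t Size = sizeChars.getQuantity();
  // Query the alignment, not the size, for Align.
  CharUnits alignChars = getContext().getTypeAlignInChars(PointeeTy);
  unsigned Align = alignChars.getQuantity();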
>> +  unsigned Align = alignChars.getQuantity();
>> +  // FIXME: Bound on Size should not be hardcoded.
>> +  bool UseLibcall = (!llvm::isPowerOf2_64(Size) || Size > 8);
>> +
>> +  llvm::Value *Ptr, *Order, *OrderFail = 0, *Val1 = 0, *Val2 = 0;
>> +  Ptr = EmitScalarExpr(E->getPtr());
>> +  Order = EmitScalarExpr(E->getOrder());
>> +  if (E->isCmpXChg()) {
>> +    Val1 = EmitScalarExpr(E->getVal1());
>> +    Val2 = EmitValToTemp(*this, E->getVal2());
>> +    OrderFail = EmitScalarExpr(E->getOrderFail());
>> +    (void)OrderFail;  // OrderFail is unused at the moment
>> +  } else if (E->getOp() != AtomicExpr::Load) {
>> +    Val1 = EmitValToTemp(*this, E->getVal1());
>> +  }
>> +
>> +  llvm::Type *PointeeLLVMTy =
>> +      cast<llvm::PointerType>(Ptr->getType())->getElementType();
>> +
>> +  llvm::Type *InstrReturnTy = PointeeLLVMTy;
>
> s/InstrReturnTy/CallReturnTy/ or something similar, since this is not
> the return type of the cmpxchg instructions.

Sure.

>> +  if (E->isCmpXChg())
>> +    InstrReturnTy = Builder.getInt1Ty();
>> +  else if (E->getOp() == AtomicExpr::Store)
>> +    InstrReturnTy = 0;
>> +
>> +  if (InstrReturnTy && !Dest)
>> +    Dest = CreateMemTemp(E->getType(), ".atomicdst");
>> +
>> +  if (UseLibcall) {
>> +    // FIXME: Finalize what the libcalls are actually supposed to look like.
>
> Possibly link to http://gcc.gnu.org/wiki/Atomic/GCCMM/LIbrary here.

K.

>> +  if (isa<llvm::ConstantInt>(Order)) {
>> +    int ord = cast<llvm::ConstantInt>(Order)->getZExtValue();
>> +    switch (ord) {
>> +    case 0:  // memory_order_relaxed
>> +      EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> +                   llvm::Monotonic);
>> +      break;
>> +    case 1:  // memory_order_consume
>> +    case 2:  // memory_order_acquire
>> +      EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> +                   llvm::Acquire);
>> +      break;
>> +    case 3:  // memory_order_release
>> +      EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> +                   llvm::Release);
>> +      break;
>> +    case 4:  // memory_order_acq_rel
>> +      EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> +                   llvm::AcquireRelease);
>> +      break;
>> +    case 5:  // memory_order_seq_cst
>> +      EmitAtomicOp(*this, E, Dest, Ptr, Val1, Val2, Size, Align,
>> +                   llvm::SequentiallyConsistent);
>> +      break;
>> +    default:  // invalid order
>
> This should probably have triggered an error earlier in the frontend.
> If that's impossible, comment why?

We don't actually check the value of the order argument in Sema because
it can be a variable... so an assert here would be fragile.  I can add
a comment.

>> +  // Create all the relevant BB's
>> +  llvm::BasicBlock *MonotonicBB, *AcquireBB, *ReleaseBB, *AcqRelBB, *SeqCstBB;
>> +  MonotonicBB = createBasicBlock("monotonic", CurFn);
>> +  if (E->getOp() != AtomicExpr::Store)
>> +    AcquireBB = createBasicBlock("acquire", CurFn);
>> +  if (E->getOp() != AtomicExpr::Load)
>> +    ReleaseBB = createBasicBlock("release", CurFn);
>> +  if (E->getOp() != AtomicExpr::Load && E->getOp() != AtomicExpr::Store)
>> +    AcqRelBB = createBasicBlock("acqrel", CurFn);
>> +  SeqCstBB = createBasicBlock("seqcst", CurFn);
>> +  llvm::BasicBlock *ContBB = createBasicBlock("atomic.continue", CurFn);
>> +
>> +  // Create the switch and PHI for the split
>
> I don't see a PHI.

Leftover comment; I'll zap it.

>> +  Order = Builder.CreateIntCast(Order, Builder.getInt32Ty(), false);
>> +  llvm::SwitchInst *SI = Builder.CreateSwitch(Order, MonotonicBB);
>
> Is it better to make 0 be the default, or is this just the simplest
> thing that works, and it's never likely to make a difference because
> nearly all uses will inline to a constant anyway?  (Comment so the
> next person to come along doesn't have to ask that question.)

It shouldn't make a difference normally because it will constant-fold.
I'll add a comment.
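Something along these lines, say (sketch):

  // MonotonicBB is arbitrary as the switch default: in practice the
  // order argument is almost always a constant (the <atomic> header
  // passes it straight through), so the switch constant-folds to a
  // single case and the default is never taken for a valid order.
  Order = Builder.CreateIntCast(Order, Builder.getInt32Ty(), false);
  llvm::SwitchInst *SI = Builder.CreateSwitch(Order, MonotonicBB);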
-Eli
