[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
Alexander-Johnston wrote: Agreed, will add https://github.com/llvm/llvm-project/pull/180804 ___ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
Alexander-Johnston wrote: Ah yes, I only gated them in the front end. I'll update this! https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -0,0 +1,18 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.0-library %s -verify
+
+using handle_long_t = __hlsl_resource_t [[hlsl::resource_class(UAV)]] [[hlsl::contained_type(long)]];
+
+struct CustomResource {
+ handle_long_t BufferLong;
+};
Alexander-Johnston wrote:
Same reason for the custom version as above. As for the type of buffer, in this
test we're specifically interested in the width of the buffer's contained type.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -triple dxil-pc-shadermodel6.6-library %s -verify
+
+void no_arg() {
+ __builtin_hlsl_interlocked_or();
+ // expected-error@-1 {{too few arguments to function call, expected 3, have 0}}
+}
+
+void too_many_args() {
+ __builtin_hlsl_interlocked_or(0, 0, 0, 0, 0);
+ // expected-error@-1 {{too many arguments to function call, expected at most 4, have 5}}
+}
+
+void non_resource_arg() {
+ __builtin_hlsl_interlocked_or(0, 0, 0);
+ // expected-error@-1 {{used type 'int' where __hlsl_resource_t is required}}
+}
+
+void ret_no_arg() {
+ __builtin_hlsl_interlocked_or_ret_uint();
+ // expected-error@-1 {{too few arguments to function call, expected 4, have 0}}
+}
+
+void ret_too_many_args() {
+ __builtin_hlsl_interlocked_or_ret_uint(0, 0, 0, 0, 0, 0);
+ // expected-error@-1 {{too many arguments to function call, expected at most 5, have 6}}
+}
+
+void ret_non_resource_arg() {
+ __builtin_hlsl_interlocked_or_ret_uint(0, 0, 0, 0);
+ // expected-error@-1 {{used type 'int' where __hlsl_resource_t is required}}
+}
+
+// ByteAddressBuffer
+using handle_char_t = __hlsl_resource_t [[hlsl::resource_class(SRV)]] [[hlsl::raw_buffer]] [[hlsl::contained_type(char)]];
+// Buffer
+using handle_int_t = __hlsl_resource_t [[hlsl::resource_class(SRV)]] [[hlsl::contained_type(float)]];
+// RWBuffer
+using handle_float_t = __hlsl_resource_t [[hlsl::resource_class(UAV)]] [[hlsl::contained_type(float)]];
Alexander-Johnston wrote:
We have these custom resources because the builtin expects a __hlsl_resource_t
type. Passing in a buffer type like Buffer gives a type error
(`error: used type 'Buffer' where __hlsl_resource_t is required`).
The tests cover the various types of buffer to ensure that the different sema
and codegen branches are taken and that the coordinates are correctly matched.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -4332,6 +4039,140 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) {
getASTContext().UnsignedIntTy);
break;
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+if (SemaRef.checkArgCountRange(TheCall, 3, 4))
+ return true;
+const ASTContext &AST = SemaRef.getASTContext();
+
+auto checkResTy = [&](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const bool IsUAV = ResTy->getAttrs().ResourceClass == ResourceClass::UAV;
+ const bool HasElemTy = ResTy->hasContainedType();
+ const bool IsRaw = ResTy->isRaw();
+ const bool IsTexture = ResTy->isTexture();
+ const bool IsIntElem =
+ HasElemTy && (ResTy->getContainedType() == AST.IntTy ||
+ResTy->getContainedType() == AST.UnsignedIntTy);
+ const bool IsLongElem =
+ HasElemTy && (ResTy->getContainedType() == AST.LongTy ||
+ResTy->getContainedType() == AST.UnsignedLongTy);
+
+ // The resource handle must be either
Alexander-Johnston wrote:
Meant to be finished by the comments in the conditions below. I'll modify to a
series of `if`s and see if it reads better.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -3651,6 +3464,22 @@ static bool CheckSamplingBuiltin(Sema &S, CallExpr *TheCall, SampleKind Kind) {
   return false;
 }
+static bool CheckShaderModelVersion(Sema *S, CallExpr *TheCall,
+                                    VersionTuple MinimumSMVersion) {
+  bool IsDXIL = S->getASTContext().getTargetInfo().getTriple().getArch() ==
+                llvm::Triple::dxil;
+  llvm::VersionTuple SMVersion =
+      S->getASTContext().getTargetInfo().getTriple().getOSVersion();
+  if (SMVersion < MinimumSMVersion && IsDXIL) {
Alexander-Johnston wrote:
Yes, SPIR-V (and any other future targets) will need their own restrictions
implemented when we come to add them. I asked about this briefly on Discord
a while back and the consensus seemed to be not to place DXIL version
restrictions on SPIR-V (and vice versa).
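
To make the gating rule concrete, here is a minimal standalone sketch of it. `Arch`, `ShaderModel` and `failsShaderModelCheck` are hypothetical stand-ins for `llvm::Triple::ArchType`, `llvm::VersionTuple` and the check in the patch; this is not the code from the PR.

```cpp
#include <cassert>
#include <tuple>

// Hypothetical stand-ins for llvm::Triple::ArchType and llvm::VersionTuple.
enum class Arch { DXIL, SPIRV };

struct ShaderModel {
  unsigned Major;
  unsigned Minor;
};

inline bool operator<(const ShaderModel &A, const ShaderModel &B) {
  return std::tie(A.Major, A.Minor) < std::tie(B.Major, B.Minor);
}

// Returns true when the call should be rejected. Only DXIL targets are held
// to the minimum shader model; any other target passes through unrestricted
// until it gets its own rule.
inline bool failsShaderModelCheck(Arch Target, ShaderModel Have,
                                  ShaderModel Minimum) {
  assert(Minimum.Major > 0 && "minimum shader model must be set");
  return Target == Arch::DXIL && Have < Minimum;
}
```

With this shape, a SPIR-V target at an "old" version is never rejected by the DXIL rule, which matches the consensus described above.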
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -3065,6 +2999,25 @@ static bool CheckArgTypeMatches(Sema *S, Expr *Arg, QualType ExpectedType) {
   return false;
 }
+// checks for int or long regardless of sign
+static bool CheckArgTypeMatchesList(Sema *S, Expr *Arg,
+                                    llvm::SmallVector<QualType> ExpectedTypes) {
+  QualType ArgType = Arg->getType().getCanonicalType();
+  bool MatchedType = false;
+  for (const auto ExpectedType : ExpectedTypes)
+    if (ArgType == ExpectedType) {
+      MatchedType = true;
+      return false;
+    }
+  if (!MatchedType) {
+    for (const auto ExpectedType : ExpectedTypes)
+      S->Diag(Arg->getBeginLoc(), diag::err_typecheck_convert_incompatible)
+          << ArgType << ExpectedType << 1 << 0 << 0;
Alexander-Johnston wrote:
It is a little noisy, but the list is pretty limited and it felt nicer to give
the list of legal types to the user than not. What would you prefer?
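
For comparison, one possible alternative is a single message that names every accepted type at once. The sketch below is only an illustration: type names are plain strings standing in for `QualType`, and `checkTypeInList` and its message wording are hypothetical, not an existing clang diagnostic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: strings stand in for clang::QualType. Returns an empty
// string on a match, otherwise one message naming every accepted type.
inline std::string checkTypeInList(const std::string &ArgType,
                                   const std::vector<std::string> &Expected) {
  assert(!Expected.empty() && "need at least one accepted type");
  if (std::find(Expected.begin(), Expected.end(), ArgType) != Expected.end())
    return "";

  std::string Msg = "used type '" + ArgType + "' where one of ";
  for (std::size_t I = 0; I < Expected.size(); ++I) {
    if (I != 0)
      Msg += ", ";
    Msg += "'" + Expected[I] + "'";
  }
  Msg += " is required";
  return Msg;
}
```

This keeps the full list of legal types visible to the user while emitting one diagnostic per bad argument instead of one per candidate type.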
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -2017,5 +1840,87 @@ BuiltinTypeDeclBuilder::addGetDimensionsMethodForBuffer() {
.finalize();
}
+BuiltinTypeDeclBuilder &
+BuiltinTypeDeclBuilder::addInterlockedMethodsForBuffer() {
+ using PH = BuiltinTypeMethodBuilder::PlaceHolder;
+ ASTContext &AST = SemaRef.getASTContext();
+ QualType UIntTy = AST.UnsignedIntTy;
+ QualType IntTy = AST.IntTy;
+
+ BuiltinTypeMethodBuilder(*this, "InterlockedOr", AST.VoidTy)
+ .addParam("dest", UIntTy, HLSLParamModifierAttr::Keyword_in)
+ .addParam("value", UIntTy, HLSLParamModifierAttr::Keyword_in)
+ .callBuiltin("__builtin_hlsl_interlocked_or", QualType(), PH::Handle,
+ PH::_0, PH::_1)
+ .finalize();
+
+ BuiltinTypeMethodBuilder(*this, "InterlockedOr", AST.VoidTy)
+ .addParam("dest", UIntTy, HLSLParamModifierAttr::Keyword_in)
+ .addParam("value", IntTy, HLSLParamModifierAttr::Keyword_in)
+ .callBuiltin("__builtin_hlsl_interlocked_or", QualType(), PH::Handle,
+ PH::_0, PH::_1)
+ .finalize();
+
+ BuiltinTypeMethodBuilder(*this, "InterlockedOr", AST.VoidTy)
+ .addParam("dest", UIntTy, HLSLParamModifierAttr::Keyword_in)
+ .addParam("value", UIntTy, HLSLParamModifierAttr::Keyword_in)
+ .addParam("original_value", UIntTy, HLSLParamModifierAttr::Keyword_out)
+ .callBuiltin("__builtin_hlsl_interlocked_or_ret_uint", UIntTy, PH::Handle,
+ PH::_0, PH::_1, PH::_2)
+ .finalize();
+
+ return BuiltinTypeMethodBuilder(*this, "InterlockedOr", AST.VoidTy)
+ .addParam("dest", UIntTy, HLSLParamModifierAttr::Keyword_in)
+ .addParam("value", IntTy, HLSLParamModifierAttr::Keyword_in)
+ .addParam("original_value", IntTy, HLSLParamModifierAttr::Keyword_out)
+ .callBuiltin("__builtin_hlsl_interlocked_or_ret_int", IntTy, PH::Handle,
+ PH::_0, PH::_1, PH::_2)
+ .finalize();
+}
+
+BuiltinTypeDeclBuilder &
+BuiltinTypeDeclBuilder::addInterlocked64MethodsForBuffer() {
+ ASTContext &AST = SemaRef.getASTContext();
+ VersionTuple TargetVersion = AST.getTargetInfo().getTriple().getOSVersion();
+ bool IsDXIL = AST.getTargetInfo().getTriple().getArch() == llvm::Triple::dxil;
+ if (TargetVersion < VersionTuple(6, 6) && IsDXIL)
+return *this;
Alexander-Johnston wrote:
I believe that in SPIR-V the atomic functions are always available for 32- and
64-bit types regardless of target version.
When I come to add SPIR-V support I'll confirm, and if not I'll add the
appropriate restriction here too.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
+BuiltinTypeDeclBuilder &
+BuiltinTypeDeclBuilder::addInterlocked64MethodsForBuffer() {
Alexander-Johnston wrote:
It was just to make it a bit clearer that the 64-bit functions won't be added
unless we're in SM6.6 or above.
And yes, the intention was to have all the interlocked functions here, but
given how many there are and how similar they are, it might be nicest to add a
macro here when we do the subsequent ones to reduce the code size.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
https://github.com/Alexander-Johnston edited https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
+BuiltinTypeDeclBuilder &
+BuiltinTypeDeclBuilder::addInterlocked64MethodsForBuffer() {
+ ASTContext &AST = SemaRef.getASTContext();
+ VersionTuple TargetVersion = AST.getTargetInfo().getTriple().getOSVersion();
+ bool IsDXIL = AST.getTargetInfo().getTriple().getArch() == llvm::Triple::dxil;
+ if (TargetVersion < VersionTuple(6, 6) && IsDXIL)
+return *this;
+
+ using PH = BuiltinTypeMethodBuilder::PlaceHolder;
+ QualType UIntTy = AST.UnsignedIntTy;
+ QualType ULongTy = AST.UnsignedLongTy;
Alexander-Johnston wrote:
Yes, I noticed a little inconsistency about this throughout the HLSL backend,
so I checked the data model. I think I landed on Long because it was more
commonly used, but I will go back and investigate again. I'd like it to be
consistently LLP64 too.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -301,6 +300,98 @@ static Value *handleElementwiseF32ToF16(CodeGenFunction &CGF,
llvm_unreachable("Intrinsic F32ToF16 not supported by target architecture");
}
+static Value *handleInterlockedOr(CodeGenFunction &CGF, const CallExpr *E,
+ const bool HasReturn) {
+ const bool Is32Bit = CGF.getContext().getTypeSize(
+ E->getArg(E->getNumArgs() - 1)->getType()) == 32;
+ Value *HandleOp = CGF.EmitScalarExpr(E->getArg(0));
+ Value *IndexOp = CGF.EmitScalarExpr(E->getArg(1));
+ Value *StructuredBufIndexOp;
+ Value *NewValueOp;
+ Value *OldValueOp;
+ unsigned OldValueArgIdx;
+ if (E->getNumArgs() == 3) {
+// (handle, index, newValue)
+NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+ } else if (E->getNumArgs() == 4) {
+if (HasReturn) {
+ // (handle, index, newValue, oldValue)
+ NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+ OldValueArgIdx = 3;
+} else {
+ // (handle, index, index, newValue)
+ StructuredBufIndexOp = CGF.EmitScalarExpr(E->getArg(2));
+ NewValueOp = CGF.EmitScalarExpr(E->getArg(3));
+}
+ } else {
+// (handle, index, index, newValue, oldValue)
+StructuredBufIndexOp = CGF.EmitScalarExpr(E->getArg(2));
+NewValueOp = CGF.EmitScalarExpr(E->getArg(3));
+OldValueArgIdx = 4;
+ }
+
+ switch (CGF.CGM.getTarget().getTriple().getArch()) {
+ case llvm::Triple::dxil: {
+QualType HandleTy = E->getArg(0)->getType();
+const HLSLAttributedResourceType *ResourceTy =
+HandleTy->getAs<HLSLAttributedResourceType>();
+
+// AtomicBinOp has 3 coordinate params which must be handled differently
+// depending on the resource type being accessed.
+// Initially poison all the coordinates then fill as required
+Value *Poison = PoisonValue::get(CGF.Int32Ty);
+Value *C0 = Poison;
+Value *C1 = Poison;
+Value *C2 = Poison;
+if (!ResourceTy->getAttrs().RawBuffer) {
+ assert(
+ (ResourceTy->getContainedType() == CGF.getContext().IntTy ||
+ ResourceTy->getContainedType() == CGF.getContext().UnsignedIntTy ||
+ ResourceTy->getContainedType() == CGF.getContext().LongTy ||
+ ResourceTy->getContainedType() == CGF.getContext().UnsignedLongTy) &&
+ "AtomicBinOp RWBuffer must contain 32 or 64bit (unsigned) int type");
+ // RWBuffer: c0
+ C0 = IndexOp;
+
+ // RWByteAddressBuffers are output as char8_t, but as that isn't
+ // recognised by HLSL we can't use it as an attribute to define them in
+ // tests, so must also check for char ([[hlsl::contained_type(char)]])
+} else if (ResourceTy->getContainedType() == CGF.getContext().Char8Ty ||
+ ResourceTy->getContainedType() == CGF.getContext().CharTy) {
+ // RWByteAddressBuffer: c0
+ C0 = IndexOp;
+} else {
+ // RWStructuredBuffer: c0 and c1
+ C0 = IndexOp;
+ C1 = StructuredBufIndexOp;
+}
+assert(C0 != Poison && "Failed to identify coordinates for Interlocked");
+// TODO: Add coordinate logic for texture and groupshared (#186154)
+
+// atomicBinOp
+// opcode, handle, binary operation code, coordinates c0, c1, c2, new val
+llvm::Type *ReturnType = Is32Bit ? CGF.Int32Ty : CGF.Int64Ty;
+OldValueOp = CGF.Builder.CreateIntrinsic(
+ReturnType, Intrinsic::dx_interlocked_or,
+ArrayRef<Value *>{HandleOp, C0, C1, C2, NewValueOp}, nullptr,
+"hlsl.interlocked.or");
+break;
+ }
+ default:
+llvm_unreachable(
+"Interlocked intrinsic not supported by target architecture");
+ }
+
+ // Destination may or may not be provided
+ // If it is provided create a store to it
+ if (HasReturn) {
Alexander-Johnston wrote:
I wanted to do something akin to this, but I found it creates an ambiguity.
There are 4 input possibilities:
```
1. interlocked(buffer, index, newVal)                 non-structured, no out
2. interlocked(buffer, index, index, newVal)          structured, no out
3. interlocked(buffer, index, newVal, oldVal)         non-structured, out
4. interlocked(buffer, index, index, newVal, oldVal)  structured, out
```
If we only have one builtin, the second and third cases are ambiguous at the
call site. The cleanest solution I saw was to remove any risk of ambiguity by
having separate ret and no-return versions.
Happy to move the naming from `return` to `hasout`/`out` though.
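
As a rough standalone illustration of why the extra flag is needed, the sketch below maps an argument count plus an out flag to argument positions. `ArgLayout` and `classifyArgs` are hypothetical names for this sketch only; `HasOut` plays the role of choosing the ret builtin over the plain one.

```cpp
#include <cassert>
#include <optional>

// Hypothetical summary of where each operand sits in the call.
struct ArgLayout {
  std::optional<unsigned> SecondIndexIdx; // index into the struct, if any
  unsigned NewValueIdx;
  std::optional<unsigned> OldValueIdx;    // out parameter, if any
};

// NumArgs counts the handle too. Without HasOut, NumArgs == 4 could mean
// either (handle, index, index, newVal) or (handle, index, newVal, oldVal).
inline ArgLayout classifyArgs(unsigned NumArgs, bool HasOut) {
  assert(NumArgs >= 3 && NumArgs <= 5 && "unexpected argument count");
  assert((NumArgs != 3 || !HasOut) && "3 args leave no room for an out param");
  assert((NumArgs != 5 || HasOut) && "5 args imply an out param");
  if (NumArgs == 3)
    return {std::nullopt, 2, std::nullopt}; // (handle, index, newVal)
  if (NumArgs == 4 && HasOut)
    return {std::nullopt, 2, 3u};           // (handle, index, newVal, oldVal)
  if (NumArgs == 4)
    return {2u, 3, std::nullopt};           // (handle, index, index, newVal)
  return {2u, 3, 4u};                       // (handle, index, index, newVal, oldVal)
}
```

Only the four-argument case actually consults `HasOut`; the three- and five-argument shapes are already unambiguous.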
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
+// AtomicBinOp has 3 coordinate params which must be handled differently
+// depending on the resource type being accessed.
+// Initially poison all the coordinates then fill as required
+Value *Poison = PoisonValue::get(CGF.Int32Ty);
+Value *C0 = Poison;
+Value *C1 = Poison;
+Value *C2 = Poison;
+if (!ResourceTy->getAttrs().RawBuffer) {
+ assert(
+ (ResourceTy->getContainedType() == CGF.getContext().IntTy ||
+ ResourceTy->getContainedType() == CGF.getContext().UnsignedIntTy ||
+ ResourceTy->getContainedType() == CGF.getContext().LongTy ||
+ ResourceTy->getContainedType() == CGF.getContext().UnsignedLongTy) &&
+ "AtomicBinOp RWBuffer must contain 32 or 64bit (unsigned) int type");
+ // RWBuffer: c0
+ C0 = IndexOp;
+
+ // RWByteAddressBuffers are output as char8_t, but as that isn't
+ // recognised by HLSL we can't use it as an attribute to define them in
+ // tests, so must also check for char ([[hlsl::contained_type(char)]])
+} else if (ResourceTy->getContainedType() == CGF.getContext().Char8Ty ||
+ ResourceTy->getContainedType() == CGF.getContext().CharTy) {
+ // RWByteAddressBuffer: c0
+ C0 = IndexOp;
+} else {
+ // RWStructuredBuffer: c0 and c1
+ C0 = IndexOp;
+ C1 = StructuredBufIndexOp;
+}
Alexander-Johnston wrote:
C2 won't always be poison once we have textures, which is why I've organised it
this way.
Both Texture3D and Texture2DArray will use every coordinate. (They take a
3-component vector of uint where each uint is one of the coordinates,
https://godbolt.org/z/jqTdzcPPf.)
I'll add more to this with texture interlocked support in an upcoming patch
once textures are available (no work had been done towards them when I started
this, and it's already quite a big patch by itself).
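
A small standalone sketch of the coordinate-filling idea, for illustration only. `ResourceKind`, `Coords` and `fillCoords` are hypothetical names, and `std::nullopt` stands in for the poison value used for unused coordinates.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical resource kinds; the patch inspects HLSLAttributedResourceType.
enum class ResourceKind { TypedBuffer, ByteAddressBuffer, StructuredBuffer };

// std::nullopt stands in for a poison coordinate.
using Coords = std::array<std::optional<std::uint32_t>, 3>;

inline Coords fillCoords(ResourceKind Kind, std::uint32_t Index,
                         std::optional<std::uint32_t> StructOffset) {
  Coords C{};   // all three coordinates start out unset ("poison")
  C[0] = Index; // every buffer kind uses c0
  if (Kind == ResourceKind::StructuredBuffer) {
    assert(StructOffset && "structured buffers need a second index");
    C[1] = StructOffset;
  }
  // c2 stays unset for buffers. Per the discussion, Texture3D and
  // Texture2DArray would fill it once texture support lands.
  return C;
}
```

Keeping three independent slots, rather than hard-coding c2 as unused, leaves room for the texture cases without reshaping the code.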
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
+ if (E->getNumArgs() == 3) {
+// (handle, index, newValue)
+NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+ } else if (E->getNumArgs() == 4) {
+if (HasReturn) {
+ // (handle, index, newValue, oldValue)
+ NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+ OldValueArgIdx = 3;
+} else {
+ // (handle, index, index, newValue)
Alexander-Johnston wrote:
I have this case for handling the two indices generated by something like
`InterlockedOr(buf[1].z, 0);`, where the index into the buffer is the first
index and the index into the struct is the second.
We can't yet generate the freestanding `InterlockedX(buffer[index]...)`
because the current buffer implementation resolves too quickly, stripping away
the buffer and index information. But this does the initial work of allowing
us to implement those functions in the style of
```
InterlockedX(RWStructuredBuffer[index].index2, val) {
__builtin_hlsl_interlocked_X(buffer, index, index2, val);
}
```
While I'm sure it'll be clearer in the larger picture once the rest of the
implementation is here, I'll try to make it a bit clearer in isolation too.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
https://github.com/bogner unassigned https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -300,6 +300,102 @@ static Value *handleElementwiseF32ToF16(CodeGenFunction &CGF,
llvm_unreachable("Intrinsic F32ToF16 not supported by target architecture");
}
+static Value *handleInterlockedOr(CodeGenFunction &CGF, const CallExpr *E,
+ const bool HasReturn, const bool Is32Bit) {
+ Value *HandleOp = CGF.EmitScalarExpr(E->getArg(0));
+ Value *IndexOp = CGF.EmitScalarExpr(E->getArg(1));
+ Value *StructuredBufIndexOp;
+ Value *NewValueOp;
+ Value *OldValueOp;
+ unsigned OldValueArgIdx;
+ if (E->getNumArgs() == 3) {
+// (handle, index, newValue)
+NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+ } else if (E->getNumArgs() == 4) {
+if (HasReturn) {
+ // (handle, index, newValue, oldValue)
+ NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+ OldValueArgIdx = 3;
+} else {
+ // (handle, index, index, newValue)
+ StructuredBufIndexOp = CGF.EmitScalarExpr(E->getArg(2));
+ NewValueOp = CGF.EmitScalarExpr(E->getArg(3));
+}
+ } else {
+// (handle, index, index, newValue, oldValue)
+StructuredBufIndexOp = CGF.EmitScalarExpr(E->getArg(2));
+NewValueOp = CGF.EmitScalarExpr(E->getArg(3));
+OldValueArgIdx = 4;
+ }
+
+ switch (CGF.CGM.getTarget().getTriple().getArch()) {
+ case llvm::Triple::dxil: {
+QualType HandleTy = E->getArg(0)->getType();
+const HLSLAttributedResourceType *ResourceTy =
+HandleTy->getAs<HLSLAttributedResourceType>();
+
+// AtomicBinOp has 3 coordinate params which must be handled differently
+// depending on the resource type being accessed.
+// Initially undef all the coordinates then fill as required
+Value *Poison = PoisonValue::get(CGF.Int32Ty);
+Value *C0 = Poison;
+Value *C1 = Poison;
+Value *C2 = Poison;
+if (!ResourceTy->getAttrs().RawBuffer) {
+ assert(
+ (ResourceTy->getContainedType() == CGF.getContext().IntTy ||
+ ResourceTy->getContainedType() == CGF.getContext().UnsignedIntTy) &&
+ "AtomicBinOp RWBuffer must contain int or uint");
+ // RWBuffer: c0
+ C0 = IndexOp;
+
+ // RWByteAddressBuffers are output as char8_t, but as that isn't
+ // recognised by HLSL we can't use it as an attribute to define them in
+ // tests, so must also check for char ([[hlsl::contained_type(char)]])
+} else if (ResourceTy->getContainedType() == CGF.getContext().Char8Ty ||
+ ResourceTy->getContainedType() == CGF.getContext().CharTy) {
+ // RWByteAddressBuffer: c0
+ C0 = IndexOp;
+} else {
+ // RWStructuredBuffer: c0 and c1
+ C0 = IndexOp;
+ C1 = StructuredBufIndexOp;
+}
+assert(C0 != Poison && "Failed to identify coordinates for Interlocked");
+// TODO: Add coordinate logic for texture and groupshared
+
+// atomicBinOp
+// opcode, handle, binary operation code, coordinates c0, c1, c2, new val
+if (Is32Bit) {
+ Intrinsic::ID ID = Intrinsic::dx_interlocked_or;
+ OldValueOp = CGF.Builder.CreateIntrinsic(
+ /*ReturnType=*/CGF.Int32Ty, ID,
+ ArrayRef<Value *>{HandleOp, C0, C1, C2, NewValueOp}, nullptr,
+ "hlsl.interlocked.or");
+} else {
+ Intrinsic::ID ID = Intrinsic::dx_interlocked_or;
+ OldValueOp = CGF.Builder.CreateIntrinsic(
+ /*ReturnType=*/CGF.Int64Ty, ID,
+ ArrayRef<Value *>{HandleOp, C0, C1, C2, NewValueOp}, nullptr,
+ "hlsl.interlocked.or");
+}
Alexander-Johnston wrote:
Condensed.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
+assert(C0 != Poison && "Failed to identify coordinates for Interlocked");
+// TODO: Add coordinate logic for texture and groupshared
Alexander-Johnston wrote:
Filed and added https://github.com/llvm/llvm-project/issues/186154
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -0,0 +1,98 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.0-library %s \
+// RUN: -emit-llvm -disable-llvm-passes -o - -DINTERLOCKED32 | \
+// RUN: FileCheck %s --check-prefixes=CHECK-32
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.6-library %s \
+// RUN: -emit-llvm -disable-llvm-passes -o - -DINTERLOCKED64 | \
+// RUN: FileCheck %s --check-prefixes=CHECK-64
+
+RWByteAddressBuffer buf: register(u0);
+
+// CHECK: %"class.hlsl::RWByteAddressBuffer" = type { target("dx.RawBuffer", i8, 1, 0) }
+
+#ifdef INTERLOCKED32
Alexander-Johnston wrote:
I'd left these in by accident from my own bug fixing. Removed.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -4004,6 +4020,170 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) {
getASTContext().UnsignedIntTy);
break;
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+if (SemaRef.checkArgCountRange(TheCall, 3, 4))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->isRaw() && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // RWTexture<int> or RWTexture<uint> (any dimension)
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() &&
+ ResTy->getAttrs().ResourceDimension !=
+ llvm::dxil::ResourceDimension::Unknown &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ return !IsValid;
+};
+if (CheckResourceHandle(&SemaRef, TheCall, 0, checkResTy))
+ return true;
+
+if (CheckArgTypeMatches(&SemaRef, TheCall->getArg(1),
+SemaRef.getASTContext().UnsignedIntTy) ||
+CheckArgTypeMatches(&SemaRef, TheCall->getArg(2),
+SemaRef.getASTContext().UnsignedIntTy))
+ return true;
+// We will have a second index if handling a RWStructuredBuffer
+if (TheCall->getNumArgs() == 4)
+ if (CheckArgTypeMatches(&SemaRef, TheCall->getArg(3),
+ SemaRef.getASTContext().UnsignedIntTy))
+return true;
+
+TheCall->setType(SemaRef.getASTContext().VoidTy);
+break;
+ }
+ case Builtin::BI__builtin_hlsl_interlocked_or_ret: {
+if (SemaRef.checkArgCountRange(TheCall, 4, 5))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->getAttrs().RawBuffer && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->getAttrs().RawBuffer && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // TODO: Handle Texture types when implemented
+ return !IsValid;
Alexander-Johnston wrote:
Same as above
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -4004,6 +4020,170 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned
BuiltinID, CallExpr *TheCall) {
getASTContext().UnsignedIntTy);
break;
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+if (SemaRef.checkArgCountRange(TheCall, 3, 4))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->isRaw() && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // RWTexture<int> or RWTexture<uint> (any dimension)
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() &&
+ ResTy->getAttrs().ResourceDimension !=
+ llvm::dxil::ResourceDimension::Unknown &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ return !IsValid;
Alexander-Johnston wrote:
Condensed this down into a series of helpers. This also adds an `isTexture`
helper to `HLSLAttributedResourceType`, similar to `isRaw` and `isStructured`.
https://github.com/llvm/llvm-project/pull/180804
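A minimal sketch of how the repeated resource checks might collapse into named predicates. `ResourceInfo` and `isValidInterlockedResource` are hypothetical names used only for illustration: the struct models the attributes of `HLSLAttributedResourceType` with plain enums, and the member predicates mirror `isRaw` and `isTexture` in name only. This is not the code that landed in the PR.

```cpp
#include <cassert>

// Hypothetical stand-ins for the attributes carried by
// HLSLAttributedResourceType; these are NOT the real Clang types.
enum class ResourceClass { SRV, UAV, CBuffer, Sampler };
enum class Dimension { Unknown, Dim1D, Dim2D, Dim3D };
enum class ElemType { None, Int, UInt, Float };

struct ResourceInfo {
  ResourceClass Class;
  bool RawBuffer;
  Dimension Dim;
  ElemType Elem;

  bool isUAV() const { return Class == ResourceClass::UAV; }
  bool isRaw() const { return RawBuffer; }
  bool hasContainedType() const { return Elem != ElemType::None; }
  bool isTexture() const { return !RawBuffer && Dim != Dimension::Unknown; }
  bool isIntElement() const {
    return Elem == ElemType::Int || Elem == ElemType::UInt;
  }
};

// Returns true when the modelled resource would be accepted by the checks
// quoted in the review: any UAV raw buffer with a contained type, or a UAV
// typed buffer or texture whose element type is int or uint.
bool isValidInterlockedResource(const ResourceInfo &R) {
  if (!R.isUAV())
    return false;
  if (R.isRaw()) // RWByteAddressBuffer / RWStructuredBuffer
    return R.hasContainedType();
  return R.isIntElement(); // typed buffers and textures of int/uint
}
```

Each predicate is evaluated once and named, so the validity rule reads as three cases instead of three near-identical boolean chains.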
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -1325,6 +1421,18 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned
BuiltinID,
llvm::Value *Args[] = {SpecId, DefaultVal};
return Builder.CreateCall(SpecConstantFn, Args);
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+return handleInterlockedOr(*this, E, false, true);
+ }
+ case Builtin::BI__builtin_hlsl_interlocked_or64: {
+return handleInterlockedOr(*this, E, false, false);
Alexander-Johnston wrote:
They are now combined into one intrinsic for the non-returning form and one for
the returning form.
https://github.com/llvm/llvm-project/pull/180804
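With one entry point per return mode, the operand layout depends only on the argument count and on whether the old value is returned, following the argument comments in the quoted `handleAtomicBinOp` hunk. The sketch below models that decoding in isolation; `InterlockedArgLayout` and `decodeArgLayout` are hypothetical names, and the function works on argument positions rather than on a real `CallExpr`.

```cpp
#include <cassert>
#include <optional>

// Positions of each operand within the call's argument list.
// Hypothetical helper type, not part of Clang.
struct InterlockedArgLayout {
  unsigned Handle = 0;
  unsigned Index = 1;
  std::optional<unsigned> StructuredIndex; // only for structured buffers
  unsigned NewValue = 2;
  std::optional<unsigned> OldValue; // only for the returning form
};

// Decodes the layouts listed in the patch comments:
//   (handle, index, newValue)
//   (handle, index, index, newValue)
//   (handle, index, newValue, oldValue)
//   (handle, index, index, newValue, oldValue)
// Returns std::nullopt for any other argument count.
std::optional<InterlockedArgLayout> decodeArgLayout(unsigned NumArgs,
                                                    bool HasReturn) {
  const unsigned Fixed = HasReturn ? 4 : 3;
  if (NumArgs != Fixed && NumArgs != Fixed + 1)
    return std::nullopt;

  InterlockedArgLayout Layout;
  unsigned Next = 2;
  if (NumArgs == Fixed + 1)
    Layout.StructuredIndex = Next++;
  Layout.NewValue = Next++;
  if (HasReturn)
    Layout.OldValue = Next++;
  return Layout;
}
```

Deriving the positions from `HasReturn` and the count keeps the four-argument case unambiguous, which is the case the original code distinguishes with a nested `if`.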
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
github-actions[bot] wrote:
# :penguin: Linux x64 Test Results

* 167422 tests passed
* 3091 tests skipped

All executed tests passed, but another part of the build **failed**. Click on a failure below to see the details.

tools/clang/tools/extra/clangd/benchmarks/CMakeFiles/IndexBenchmark.dir/IndexBenchmark.cpp.o (Likely Already Failing)
This test is already failing at the base commit.
```
FAILED: tools/clang/tools/extra/clangd/benchmarks/CMakeFiles/IndexBenchmark.dir/IndexBenchmark.cpp.o
sccache /opt/llvm/bin/clang++ -DBENCHMARK_STATIC_DEFINE -DLLVM_BUILD_STATIC -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GLIBCXX_USE_CXX11_ABI=1 -D_GNU_SOURCE -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_EXTENSIVE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/clang/tools/extra/clangd/benchmarks -I/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/clangd/benchmarks -I/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/clangd/../include-cleaner/include -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/clang/tools/extra/clangd/../clang-tidy -I/home/gha/actions-runner/_work/llvm-project/llvm-project/clang/include -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/clang/include -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/include -I/home/gha/actions-runner/_work/llvm-project/llvm-project/llvm/include -I/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/clangd -I/home/gha/actions-runner/_work/llvm-project/llvm-project/build/tools/clang/tools/extra/clangd -I/home/gha/actions-runner/_work/llvm-project/llvm-project/third-party/benchmark/include -gmlt -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wno-pass-failed -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG -std=c++17 -UNDEBUG -fno-exceptions -funwind-tables -fno-rtti -MD -MT tools/clang/tools/extra/clangd/benchmarks/CMakeFiles/IndexBenchmark.dir/IndexBenchmark.cpp.o -MF tools/clang/tools/extra/clangd/benchmarks/CMakeFiles/IndexBenchmark.dir/IndexBenchmark.cpp.o.d -o tools/clang/tools/extra/clangd/benchmarks/CMakeFiles/IndexBenchmark.dir/IndexBenchmark.cpp.o -c /home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/clangd/benchmarks/IndexBenchmark.cpp
/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/clangd/benchmarks/IndexBenchmark.cpp:83:1: error: '__COUNTER__' is a C2y extension [-Werror,-Wc2y-extensions]
83 | BENCHMARK(memQueries);
| ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/third-party/benchmark/include/benchmark/benchmark.h:1503:3: note: expanded from macro 'BENCHMARK'
1503 | BENCHMARK_PRIVATE_DECLARE(_benchmark_) = \
| ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/third-party/benchmark/include/benchmark/benchmark.h:1498:44: note: expanded from macro 'BENCHMARK_PRIVATE_DECLARE'
1498 | static ::benchmark::internal::Benchmark* BENCHMARK_PRIVATE_NAME(n) \
| ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/third-party/benchmark/include/benchmark/benchmark.h:1484:45: note: expanded from macro 'BENCHMARK_PRIVATE_NAME'
1484 | BENCHMARK_PRIVATE_CONCAT(benchmark_uniq_, BENCHMARK_PRIVATE_UNIQUE_ID, \
| ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/third-party/benchmark/include/benchmark/benchmark.h:1475:37: note: expanded from macro 'BENCHMARK_PRIVATE_UNIQUE_ID'
1475 | #define BENCHMARK_PRIVATE_UNIQUE_ID __COUNTER__
| ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/clang-tools-extra/clangd/benchmarks/IndexBenchmark.cpp:92:1: error: '__COUNTER__' is a C2y extension [-Werror,-Wc2y-extensions]
92 | BENCHMARK(dexQueries);
| ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/third-party/benchmark/include/benchmark/benchmark.h:1503:3: note: expanded from macro 'BENCHMARK'
1503 | BENCHMARK_PRIVATE_DECLARE(_benchmark_) = \
| ^
/home/gha/actions-runner/_work/llvm-project/llvm-project/third-party/benchmark/include/benchmark/benchmark.h:1498:44: note: expanded from macro 'BENCHMARK_PRIVATE_DECLARE'
1498 | static ::benchmark::internal::Benchmark* BENCHMARK_PRIVATE_NAME(n) \
| ^
/home/gha/actions-
```
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -4004,6 +4020,170 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned
BuiltinID, CallExpr *TheCall) {
getASTContext().UnsignedIntTy);
break;
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+if (SemaRef.checkArgCountRange(TheCall, 3, 4))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->isRaw() && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // RWTexture<int> or RWTexture<uint> (any dimension)
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() &&
+ ResTy->getAttrs().ResourceDimension !=
+ llvm::dxil::ResourceDimension::Unknown &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ return !IsValid;
Alexander-Johnston wrote:
I believe there are helpers for this in the backend, but not as many in the
frontend. Steven may have added some as part of implementing Textures,
though. I'll check, and if there aren't any I'll add them.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -5379,6 +5379,30 @@ def HLSLDdyFine : LangBuiltin<"HLSL_LANG"> {
let Prototype = "void(...)";
}
+def HLSLInterlockedOr : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOrRet : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or_ret"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOr64 : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or64"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOrRet64 : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or_ret64"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
Alexander-Johnston wrote:
Ok, I'll try to condense it down into ret vs no ret and handle the different
bitness across both. I appreciate not wanting to set precedent with a 16/32/64
pattern. I think I agree it'd be better to avoid it if possible.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -5379,6 +5379,30 @@ def HLSLDdyFine : LangBuiltin<"HLSL_LANG"> {
let Prototype = "void(...)";
}
+def HLSLInterlockedOr : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOrRet : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or_ret"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOr64 : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or64"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOrRet64 : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or_ret64"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
farzonl wrote:
I understand why you did this, though: the HLSL alias also has 64 in the name
to distinguish them. My real concern is that there is a bunch of duplicated
code to handle these 4 builtins.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -0,0 +1,98 @@
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.0-library %s \
+// RUN: -emit-llvm -disable-llvm-passes -o - -DINTERLOCKED32 | \
+// RUN: FileCheck %s --check-prefixes=CHECK-32
+// RUN: %clang_cc1 -finclude-default-header -x hlsl -triple dxil-pc-shadermodel6.6-library %s \
+// RUN: -emit-llvm -disable-llvm-passes -o - -DINTERLOCKED64 | \
+// RUN: FileCheck %s --check-prefixes=CHECK-64
+
+RWByteAddressBuffer buf: register(u0);
+
+// CHECK: %"class.hlsl::RWByteAddressBuffer" = type { target("dx.RawBuffer", i8, 1, 0) }
+
+#ifdef INTERLOCKED32
farzonl wrote:
Why do you need `#ifdef`s and separate 32 and 64 CHECK lines? It feels like you
should be able to test `InterlockedOr` and `InterlockedOr64` with a single RUN
line if you make sure that shadermodel6.6 is set by default.
If you want to test that `InterlockedOr64` doesn't work for 6.5 and lower, make
that a Sema error test.
https://github.com/llvm/llvm-project/pull/180804
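A rough model of the gating that such a Sema error test would exercise, assuming 64-bit interlocked operations need shader model 6.6 (implied by the "6.5 and lower" remark above). `ShaderModel`, `atLeast`, and `checkInterlockedShaderModel` are hypothetical names, and the message text only approximates the `err_hlsl_intrinsic_in_wrong_shader_model` diagnostic added by the patch; how Clang actually renders `%0` and `%1` is not shown in this thread.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical value type for a shader model version such as 6.6.
struct ShaderModel {
  unsigned Major;
  unsigned Minor;
};

// True when Have is the same version as Need or newer.
bool atLeast(ShaderModel Have, ShaderModel Need) {
  return Have.Major > Need.Major ||
         (Have.Major == Need.Major && Have.Minor >= Need.Minor);
}

// Returns a diagnostic string when a 64-bit interlocked operation is used
// on a target older than shader model 6.6, and std::nullopt otherwise.
std::optional<std::string>
checkInterlockedShaderModel(const std::string &Name, unsigned OperandBits,
                            ShaderModel Target) {
  if (OperandBits != 64)
    return std::nullopt; // 32-bit forms are not gated in this sketch
  const ShaderModel Required{6, 6};
  if (atLeast(Target, Required))
    return std::nullopt;
  return "intrinsic " + Name + " requires shader model 6.6 or greater";
}
```

Keeping the version check in one place means the codegen test can run at a single shader model, while the rejection path is covered by a separate `-verify` test.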
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -4004,6 +4020,170 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned
BuiltinID, CallExpr *TheCall) {
getASTContext().UnsignedIntTy);
break;
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+if (SemaRef.checkArgCountRange(TheCall, 3, 4))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->isRaw() && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // RWTexture<int> or RWTexture<uint> (any dimension)
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() &&
+ ResTy->getAttrs().ResourceDimension !=
+ llvm::dxil::ResourceDimension::Unknown &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ return !IsValid;
+};
+if (CheckResourceHandle(&SemaRef, TheCall, 0, checkResTy))
+ return true;
+
+if (CheckArgTypeMatches(&SemaRef, TheCall->getArg(1),
+SemaRef.getASTContext().UnsignedIntTy) ||
+CheckArgTypeMatches(&SemaRef, TheCall->getArg(2),
+SemaRef.getASTContext().UnsignedIntTy))
+ return true;
+// We will have a second index if handling a RWStructuredBuffer
+if (TheCall->getNumArgs() == 4)
+ if (CheckArgTypeMatches(&SemaRef, TheCall->getArg(3),
+ SemaRef.getASTContext().UnsignedIntTy))
+return true;
+
+TheCall->setType(SemaRef.getASTContext().VoidTy);
+break;
+ }
+ case Builtin::BI__builtin_hlsl_interlocked_or_ret: {
+if (SemaRef.checkArgCountRange(TheCall, 4, 5))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->getAttrs().RawBuffer && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->getAttrs().RawBuffer && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // TODO: Handle Texture types when implemented
+ return !IsValid;
farzonl wrote:
Same comment here: break these into helpers so we aren't repeating so much.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -4004,6 +4020,170 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned
BuiltinID, CallExpr *TheCall) {
getASTContext().UnsignedIntTy);
break;
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+if (SemaRef.checkArgCountRange(TheCall, 3, 4))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->isRaw() && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // RWTexture<int> or RWTexture<uint> (any dimension)
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() &&
+ ResTy->getAttrs().ResourceDimension !=
+ llvm::dxil::ResourceDimension::Unknown &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ return !IsValid;
farzonl wrote:
Also, before you apply my suggestion, are you sure there are no existing helpers for these?
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -4004,6 +4020,170 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned
BuiltinID, CallExpr *TheCall) {
getASTContext().UnsignedIntTy);
break;
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+if (SemaRef.checkArgCountRange(TheCall, 3, 4))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->isRaw() && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // RWTexture<int> or RWTexture<uint> (any dimension)
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() &&
+ ResTy->getAttrs().ResourceDimension !=
+ llvm::dxil::ResourceDimension::Unknown &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ return !IsValid;
farzonl wrote:
```suggestion
const bool IsUAV = ResTy->getAttrs().ResourceClass == ResourceClass::UAV;
const bool HasElemTy = ResTy->hasContainedType();
const bool IsRaw = ResTy->isRaw();
const bool IsIntElem =
    HasElemTy && (ResTy->getContainedType() == AST.IntTy ||
                  ResTy->getContainedType() == AST.UnsignedIntTy);
const bool HasKnownDim =
    ResTy->getAttrs().ResourceDimension !=
    llvm::dxil::ResourceDimension::Unknown;
const bool IsValid =
    IsUAV && (
      (IsRaw && HasElemTy) ||              // RWByteAddressBuffer / RWStructuredBuffer
      (!IsRaw && IsIntElem) ||             // RWBuffer
      (!IsRaw && HasKnownDim && IsIntElem) // RWTexture*(int/uint)
    );
return !IsValid;
```
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
https://github.com/farzonl edited
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -4004,6 +4020,170 @@ bool SemaHLSL::CheckBuiltinFunctionCall(unsigned
BuiltinID, CallExpr *TheCall) {
getASTContext().UnsignedIntTy);
break;
}
+ case Builtin::BI__builtin_hlsl_interlocked_or: {
+if (SemaRef.checkArgCountRange(TheCall, 3, 4))
+ return true;
+auto checkResTy = [this](const HLSLAttributedResourceType *ResTy) -> bool {
+ bool IsValid = false;
+ const ASTContext &AST = SemaRef.getASTContext();
+ // The resource handle must be either
+ // RWByteAddressBuffer or RWStructuredBuffer
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ ResTy->isRaw() && ResTy->hasContainedType();
+ // RWBuffer<int> or RWBuffer<uint>
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() && ResTy->hasContainedType() &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ // RWTexture<int> or RWTexture<uint> (any dimension)
+ IsValid |= ResTy->getAttrs().ResourceClass == ResourceClass::UAV &&
+ !ResTy->isRaw() &&
+ ResTy->getAttrs().ResourceDimension !=
+ llvm::dxil::ResourceDimension::Unknown &&
+ (ResTy->getContainedType() == AST.IntTy ||
+ ResTy->getContainedType() == AST.UnsignedIntTy);
+ return !IsValid;
farzonl wrote:
There is a lot of repetition here. You can factor out the common predicates
and avoid so many repeated boolean conditions.
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -5379,6 +5379,30 @@ def HLSLDdyFine : LangBuiltin<"HLSL_LANG"> {
let Prototype = "void(...)";
}
+def HLSLInterlockedOr : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOrRet : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or_ret"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOr64 : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or64"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOrRet64 : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or_ret64"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
farzonl wrote:
OK, yeah, I think it should be fine to have one builtin that returns and one
that does not. I don't see that we do much with signed vs. unsigned, as we are
just using one DXIL op for this in the backend.
Also, bitness-specific builtins are not something we have done yet, and I'm not
sure I want this to become a pattern that gets copied.
https://github.com/llvm/llvm-project/pull/180804
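To illustrate the shape being asked for, here is a small host-side C++ model in which the operand width comes from the value type instead of from the function name. It uses `std::atomic<T>::fetch_or` purely as an analogy; `operandBits`, `interlockedOr`, and `interlockedOrRet` are hypothetical names and say nothing about how the Clang builtins are implemented.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <type_traits>

// Width is derived from the operand type, so no 32/64 suffix is needed.
template <typename T> constexpr unsigned operandBits() {
  static_assert(std::is_integral<T>::value,
                "interlocked operations take integer operands");
  static_assert(sizeof(T) == 4 || sizeof(T) == 8,
                "only 32- and 64-bit operands are modelled");
  return sizeof(T) * 8;
}

// Non-returning form: one entry point for every supported width.
template <typename T> void interlockedOr(std::atomic<T> &Dest, T Value) {
  (void)operandBits<T>(); // instantiating this rejects unsupported types
  Dest.fetch_or(Value);
}

// Returning form: also writes the previous value to Original.
template <typename T>
void interlockedOrRet(std::atomic<T> &Dest, T Value, T &Original) {
  (void)operandBits<T>();
  Original = Dest.fetch_or(Value);
}
```

Two entry points cover both widths here, which is the same reduction from four builtins to two that the review converges on.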
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -300,6 +300,122 @@ static Value *handleElementwiseF32ToF16(CodeGenFunction
&CGF,
llvm_unreachable("Intrinsic F32ToF16 not supported by target architecture");
}
+// Not sure where would be best for this to live
+// AtomicBinOp uses an i32 to determine the operation mode as follows
+enum AtomicOperationCode : uint {
+ Add = 0,
+ And = 1,
+ Or = 2,
+ Xor = 3,
+ IMin = 4,
+ IMax = 5,
+ UMin = 6,
+ UMax = 7,
+ Exchange = 8
+};
Alexander-Johnston wrote:
Yup, thank you! Moved to
https://github.com/llvm/llvm-project/pull/180804/changes#diff-ea687e2038566cc5ff5ab98f750bf66576c25a5d42e5cebd003863a8d409cd7aR320
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -300,6 +300,122 @@ static Value *handleElementwiseF32ToF16(CodeGenFunction
&CGF,
llvm_unreachable("Intrinsic F32ToF16 not supported by target architecture");
}
+// Not sure where would be best for this to live
+// AtomicBinOp uses an i32 to determine the operation mode as follows
+enum AtomicOperationCode : uint {
+ Add = 0,
+ And = 1,
+ Or = 2,
+ Xor = 3,
+ IMin = 4,
+ IMax = 5,
+ UMin = 6,
+ UMax = 7,
+ Exchange = 8
+};
+
+static Value *handleAtomicBinOp(CodeGenFunction &CGF, const CallExpr *E,
+const AtomicOperationCode OpCode,
+const bool HasReturn, const bool Is32Bit) {
+ Value *HandleOp = CGF.EmitScalarExpr(E->getArg(0));
+ Value *IndexOp = CGF.EmitScalarExpr(E->getArg(1));
+ Value *StructuredBufIndexOp;
+ Value *NewValueOp;
+ Value *OldValueOp;
+ unsigned OldValueArgIdx;
+ if (E->getNumArgs() == 3) {
+// (handle, index, newValue)
+NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+ } else if (E->getNumArgs() == 4) {
+if (HasReturn) {
+ // (handle, index, newValue, oldValue)
+ NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+ OldValueArgIdx = 3;
+} else {
+ // (handle, index, index, newValue)
+ StructuredBufIndexOp = CGF.EmitScalarExpr(E->getArg(2));
+ NewValueOp = CGF.EmitScalarExpr(E->getArg(3));
+}
+ } else {
+// (handle, index, index, newValue, oldValue)
+StructuredBufIndexOp = CGF.EmitScalarExpr(E->getArg(2));
+NewValueOp = CGF.EmitScalarExpr(E->getArg(3));
+OldValueArgIdx = 4;
+ }
+
+ switch (CGF.CGM.getTarget().getTriple().getArch()) {
+ case llvm::Triple::dxil: {
+QualType HandleTy = E->getArg(0)->getType();
+const HLSLAttributedResourceType *ResourceTy =
+HandleTy->getAs<HLSLAttributedResourceType>();
+
+// AtomicBinOp uses an i32 to determine the operation mode as follows
+// Add: 0, And: 1, Or: 2, Xor: 3, IMin: 4, IMax: 5, UMin: 6, UMax: 7,
+// Exchange: 8
+Value *ModeConstant = ConstantInt::get(CGF.Int32Ty, OpCode);
+
+// AtomicBinOp has 3 coordinate params which must be handled differently
+// depending on the resource type being accessed.
+// Initially undef all the coordinates then fill as required
+Value *Undef = UndefValue::get(CGF.Int32Ty);
+Value *C0 = Undef;
+Value *C1 = Undef;
+Value *C2 = Undef;
Alexander-Johnston wrote:
I've replaced these with Poison to try and downgrade them in the backend as you
suggest.
Maybe I'm looking in the wrong places, but I don't see a specific downgrader
anywhere in the DirectX backend. I noticed some manual downgrading in
DXILOpLowering on some resource intrinsics, but this path is already covered
for the interlocked/atomicBinOp by the `let intrinsics...` in DXIL.td. What do
you suggest?
https://github.com/llvm/llvm-project/pull/180804
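The coordinate handling under discussion can be modelled as follows. This sketch covers only the two buffer kinds whose branches are quoted in this thread (byte-address and structured); `BufferKind`, `Coord`, and `selectCoordinates` are hypothetical names, and an empty `std::optional` stands in for an LLVM poison operand.

```cpp
#include <array>
#include <cassert>
#include <optional>

// Only the two kinds whose coordinate rules appear in the quoted hunk.
enum class BufferKind { ByteAddress, Structured };

// An empty optional plays the role of a poison (unused) coordinate.
using Coord = std::optional<unsigned>;

// Fills c0..c2 for the modelled resource kinds:
//   byte-address buffer: c0 = index
//   structured buffer:   c0 = index, c1 = offset within the element
std::array<Coord, 3> selectCoordinates(BufferKind Kind, unsigned Index,
                                       Coord StructuredIndex = std::nullopt) {
  std::array<Coord, 3> C{}; // all coordinates start out unset
  switch (Kind) {
  case BufferKind::ByteAddress:
    C[0] = Index;
    break;
  case BufferKind::Structured:
    C[0] = Index;
    C[1] = StructuredIndex;
    break;
  }
  assert(C[0].has_value() && "Failed to identify coordinates for Interlocked");
  return C;
}
```

Starting from "all unset" and filling only what the resource kind needs matches the patch's approach of initialising every coordinate to the same placeholder before the branch.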
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
https://github.com/Alexander-Johnston updated
https://github.com/llvm/llvm-project/pull/180804
>From 687db9e025d2269743a7a654a65897e0ce2fec72 Mon Sep 17 00:00:00 2001
From: Alexander Johnston
Date: Tue, 10 Feb 2026 18:01:11 +
Subject: [PATCH 1/2] [HLSL][DXIL] InterlockedOr and Interlocked64 builtins
This includes the first phase of implementation of the InterlockedOr intrinsic.
This covers the usage of the intrinsic/builtin on RWByteAddressBuffers, Typed
Buffers, and Structured Buffers. Not covered are textures, groupshared memory,
and the standalone InterlockedOr(buf[index], val, ret) intrinsics.
SPIRV implementation is not covered in this commit.
---
clang/include/clang/Basic/Builtins.td | 24 +++
.../clang/Basic/DiagnosticSemaKinds.td| 3 +
clang/lib/CodeGen/CGHLSLBuiltins.cpp | 128 +
clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.cpp | 51 +
clang/lib/Sema/HLSLBuiltinTypeDeclBuilder.h | 2 +
clang/lib/Sema/HLSLExternalSemaSource.cpp | 2 +
clang/lib/Sema/SemaHLSL.cpp | 179 ++
.../builtins/Interlocked-or-builtin.hlsl | 76
.../CodeGenHLSL/builtins/Interlocked-or.hlsl | 98 ++
.../BuiltIns/interlocked-or-errors.hlsl | 84
.../BuiltIns/interlocked-or64-errors.hlsl | 74
llvm/include/llvm/IR/IntrinsicsDirectX.td | 3 +
llvm/lib/Target/DirectX/DXIL.td | 8 +
llvm/lib/Target/DirectX/DXILOpLowering.cpp| 60 ++
llvm/test/CodeGen/DirectX/interlocked-or.ll | 117
llvm/test/CodeGen/DirectX/interlocked-or64.ll | 117
16 files changed, 1026 insertions(+)
create mode 100644 clang/test/CodeGenHLSL/builtins/Interlocked-or-builtin.hlsl
create mode 100644 clang/test/CodeGenHLSL/builtins/Interlocked-or.hlsl
create mode 100644 clang/test/SemaHLSL/BuiltIns/interlocked-or-errors.hlsl
create mode 100644 clang/test/SemaHLSL/BuiltIns/interlocked-or64-errors.hlsl
create mode 100644 llvm/test/CodeGen/DirectX/interlocked-or.ll
create mode 100644 llvm/test/CodeGen/DirectX/interlocked-or64.ll
diff --git a/clang/include/clang/Basic/Builtins.td
b/clang/include/clang/Basic/Builtins.td
index 05e3af4a0e96f..374ff6470d91e 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -5379,6 +5379,30 @@ def HLSLDdyFine : LangBuiltin<"HLSL_LANG"> {
let Prototype = "void(...)";
}
+def HLSLInterlockedOr : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOrRet : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or_ret"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOr64 : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or64"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
+def HLSLInterlockedOrRet64 : LangBuiltin<"HLSL_LANG"> {
+ let Spellings = ["__builtin_hlsl_interlocked_or_ret64"];
+ let Attributes = [NoThrow, CustomTypeChecking];
+ let Prototype = "void(...)";
+}
+
// Builtins for XRay.
def XRayCustomEvent : Builtin {
let Spellings = ["__xray_customevent"];
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td
b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index f999c362307af..384611a97dee3 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -13492,6 +13492,9 @@ def err_hlsl_assign_to_global_resource: Error<
def err_hlsl_push_constant_unique
: Error<"cannot have more than one push constant block">;
+def err_hlsl_intrinsic_in_wrong_shader_model: Error<
+ "intrinsic %0 requires shader model %1 or greater">;
+
// Layout randomization diagnostics.
def err_non_designated_init_used : Error<
"a randomized struct can only be initialized with a designated initializer">;
diff --git a/clang/lib/CodeGen/CGHLSLBuiltins.cpp b/clang/lib/CodeGen/CGHLSLBuiltins.cpp
index c72eef1982e9e..39d716bea91bf 100644
--- a/clang/lib/CodeGen/CGHLSLBuiltins.cpp
+++ b/clang/lib/CodeGen/CGHLSLBuiltins.cpp
@@ -300,6 +300,122 @@ static Value *handleElementwiseF32ToF16(CodeGenFunction &CGF,
   llvm_unreachable("Intrinsic F32ToF16 not supported by target architecture");
 }
+// Not sure where would be best for this to live
+// AtomicBinOp uses an i32 to determine the operation mode as follows
+enum AtomicOperationCode : uint {
+  Add = 0,
+  And = 1,
+  Or = 2,
+  Xor = 3,
+  IMin = 4,
+  IMax = 5,
+  UMin = 6,
+  UMax = 7,
+  Exchange = 8
+};
+
+static Value *handleAtomicBinOp(CodeGenFunction &CGF, const CallExpr *E,
+                                const AtomicOperationCode OpCode,
+                                const bool HasReturn, const bool Is32Bit) {
+  Value *HandleOp = CGF.EmitScalarExpr(E->getArg(0));
+  Value
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -910,6 +910,14 @@ def GetDimensions : DXILOp<72, getDimensions> {
let stages = [Stages];
}
+def AtomicBinOp : DXILOp<78, atomicBinOp> {
+ let Doc = "performs an atomic operation on a value in memory";
farzonl wrote:
Doesn't this need a `let intrinsics = [`?
https://github.com/llvm/llvm-project/pull/180804
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
farzonl wrote:
I think `WaveActiveOp` in DXIL.td is a good example of how we should be treating the Interlocked intrinsics. On the Clang codegen side we should rely more heavily on the abstractions that LLVM intrinsics give us, and move more of the target-specific details into the backend.
https://github.com/llvm/llvm-project/blob/977d910d005c47f884ecf838e504da301b1124b9/llvm/lib/Target/DirectX/DXIL.td#L1083-L1125
https://github.com/llvm/llvm-project/pull/180804
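To make the "move target details into the backend" idea concrete, here is a minimal standalone sketch. It assumes the frontend only names a generic operation and the backend owns the mapping to the i32 operation code that AtomicBinOp expects. `GenericAtomicOp` and `dxilAtomicBinOpCode` are hypothetical names, not part of LLVM or of this patch. The numeric values are the ones listed in the patch's `AtomicOperationCode` enum.

```cpp
#include <cstdint>

// Hypothetical target-independent description of the operation, standing in
// for whatever a generic intrinsic would carry from the frontend.
enum class GenericAtomicOp {
  Add, And, Or, Xor, IMin, IMax, UMin, UMax, Exchange
};

// Hypothetical backend-side mapping to the i32 operation code used by
// AtomicBinOp. The values mirror the AtomicOperationCode enum in the patch.
constexpr std::uint32_t dxilAtomicBinOpCode(GenericAtomicOp Op) {
  switch (Op) {
  case GenericAtomicOp::Add:      return 0;
  case GenericAtomicOp::And:      return 1;
  case GenericAtomicOp::Or:       return 2;
  case GenericAtomicOp::Xor:      return 3;
  case GenericAtomicOp::IMin:     return 4;
  case GenericAtomicOp::IMax:     return 5;
  case GenericAtomicOp::UMin:     return 6;
  case GenericAtomicOp::UMax:     return 7;
  case GenericAtomicOp::Exchange: return 8;
  }
  return 0; // Not reached for valid enumerators.
}
```

With a split like this, the Clang side would never need to know the DXIL numbering. Whether that matches how `WaveActiveOp` is actually wired up is for the linked DXIL.td lines to confirm.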
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
@@ -300,6 +300,122 @@ static Value *handleElementwiseF32ToF16(CodeGenFunction &CGF,
   llvm_unreachable("Intrinsic F32ToF16 not supported by target architecture");
 }
+// Not sure where would be best for this to live
+// AtomicBinOp uses an i32 to determine the operation mode as follows
+enum AtomicOperationCode : uint {
+  Add = 0,
+  And = 1,
+  Or = 2,
+  Xor = 3,
+  IMin = 4,
+  IMax = 5,
+  UMin = 6,
+  UMax = 7,
+  Exchange = 8
+};
+
+static Value *handleAtomicBinOp(CodeGenFunction &CGF, const CallExpr *E,
+                                const AtomicOperationCode OpCode,
+                                const bool HasReturn, const bool Is32Bit) {
+  Value *HandleOp = CGF.EmitScalarExpr(E->getArg(0));
+  Value *IndexOp = CGF.EmitScalarExpr(E->getArg(1));
+  Value *StructuredBufIndexOp;
+  Value *NewValueOp;
+  Value *OldValueOp;
+  unsigned OldValueArgIdx;
+  if (E->getNumArgs() == 3) {
+    // (handle, index, newValue)
+    NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+  } else if (E->getNumArgs() == 4) {
+    if (HasReturn) {
+      // (handle, index, newValue, oldValue)
+      NewValueOp = CGF.EmitScalarExpr(E->getArg(2));
+      OldValueArgIdx = 3;
+    } else {
+      // (handle, index, index, newValue)
+      StructuredBufIndexOp = CGF.EmitScalarExpr(E->getArg(2));
+      NewValueOp = CGF.EmitScalarExpr(E->getArg(3));
+    }
+  } else {
+    // (handle, index, index, newValue, oldValue)
+    StructuredBufIndexOp = CGF.EmitScalarExpr(E->getArg(2));
+    NewValueOp = CGF.EmitScalarExpr(E->getArg(3));
+    OldValueArgIdx = 4;
+  }
+
+  switch (CGF.CGM.getTarget().getTriple().getArch()) {
+  case llvm::Triple::dxil: {
+    QualType HandleTy = E->getArg(0)->getType();
+    const HLSLAttributedResourceType *ResourceTy =
+        HandleTy->getAs<HLSLAttributedResourceType>();
+
+    // AtomicBinOp uses an i32 to determine the operation mode as follows
+    // Add: 0, And: 1, Or: 2, Xor: 3, IMin: 4, IMax: 5, UMin: 6, UMax: 7,
+    // Exchange: 8
+    Value *ModeConstant = ConstantInt::get(CGF.Int32Ty, OpCode);
+
+    // AtomicBinOp has 3 coordinate params which must be handled differently
+    // depending on the resource type being accessed.
+    // Initially undef all the coordinates then fill as required
+    Value *Undef = UndefValue::get(CGF.Int32Ty);
+    Value *C0 = Undef;
+    Value *C1 = Undef;
+    Value *C2 = Undef;
+    if (!ResourceTy->getAttrs().RawBuffer) {
+      assert(
+          (ResourceTy->getContainedType() == CGF.getContext().IntTy ||
+           ResourceTy->getContainedType() == CGF.getContext().UnsignedIntTy) &&
+          "AtomicBinOp RWBuffer must contain int or uint");
+      // RWBuffer: c0
+      C0 = IndexOp;
+
+      // RWByteAddressBuffers are output as char8_t, but as that isn't
+      // recognised by HLSL we can't use it as an attribute to define them in
+      // tests, so must also check for char ([[hlsl::contained_type(char)]])
+    } else if (ResourceTy->getContainedType() == CGF.getContext().Char8Ty ||
+               ResourceTy->getContainedType() == CGF.getContext().CharTy) {
+      // RWByteAddressBuffer: c0
+      C0 = IndexOp;
+    } else {
+      // RWStructuredBuffer: c0 and c1
+      C0 = IndexOp;
+      C1 = StructuredBufIndexOp;
+    }
+    assert(C0 != Undef && "Failed to identify coordinates for Interlocked");
+    // TODO: Add coordinate logic for texture and groupshared
+
+    // atomicBinOp
+    // opcode, handle, binary operation code, coordinates c0, c1, c2, new val
+    if (Is32Bit) {
+      Intrinsic::ID ID = Intrinsic::dx_resource_atomicbinop;
+      OldValueOp = CGF.Builder.CreateIntrinsic(
+          /*ReturnType=*/CGF.Int32Ty, ID,
+          ArrayRef{HandleOp, ModeConstant, C0, C1, C2, NewValueOp},
+          nullptr, "hlsl.interlocked.or");
+    } else {
+      Intrinsic::ID ID = Intrinsic::dx_resource_atomicbinop64;
+      OldValueOp = CGF.Builder.CreateIntrinsic(
+          /*ReturnType=*/CGF.Int64Ty, ID,
+          ArrayRef{HandleOp, ModeConstant, C0, C1, C2, NewValueOp},
+          nullptr, "hlsl.interlocked.or");
+    }
farzonl wrote:
Why can't we just define `dx_resource_atomicbinop` to take either `CGF.Int32Ty`
or `CGF.Int64Ty` and just have one intrinsic?
https://github.com/llvm/llvm-project/pull/180804
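As a rough analogy for the "one intrinsic, two widths" shape being asked about, the sketch below uses a single C++ function template that covers both 32-bit and 64-bit operands, instead of two separately named functions. It is plain standard C++ built on `std::atomic`. It is not the LLVM intrinsic definition, and `interlockedOr` is a hypothetical name. Like HLSL's InterlockedOr with an output argument, it hands back the value held before the OR.

```cpp
#include <atomic>
#include <cstdint>
#include <type_traits>

// One entry point for both widths. T selects the operand size, much as a
// type-overloaded intrinsic would; the name is hypothetical.
template <typename T>
T interlockedOr(std::atomic<T> &Dest, T Value) {
  static_assert(std::is_same_v<T, std::uint32_t> ||
                    std::is_same_v<T, std::uint64_t>,
                "only 32-bit and 64-bit unsigned operands are modelled here");
  // fetch_or atomically ORs Value into Dest and returns the previous value.
  return Dest.fetch_or(Value);
}
```

A caller picks the width through the template argument, for example `interlockedOr<std::uint64_t>(Counter, Mask)`, so the 32-bit and 64-bit paths share one definition.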
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
Alexander-Johnston wrote:
> ⚠️ undef deprecator found issues in your code. ⚠️

Currently the atomicBinOp DXIL operation expects `undef` in any coordinate that is not used. Here's an example of all the different coordinate layouts: https://godbolt.org/z/saqYb5Mzc
Given this, `undef` has to be produced somewhere. It seems easiest to produce it in Clang and keep the DXILOpLowering changes simple, rather than passing poison values through and replacing them later. I'm happy to switch to poison values and replace them during DXILOpLowering if that would be better.
https://github.com/llvm/llvm-project/pull/180804
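For readers without the godbolt link to hand, the coordinate layouts the patch handles so far can be sketched in standalone C++ as below. This is an illustration only: `ResourceKind`, `Coord` and `selectCoordinates` are hypothetical names, and `std::nullopt` stands in for an unused coordinate (the slot the patch fills with `undef`). Textures and groupshared are left out, matching the TODO in the patch.

```cpp
#include <array>
#include <cstdint>
#include <optional>

// Hypothetical stand-ins for the three resource kinds the patch distinguishes.
enum class ResourceKind { TypedBuffer, ByteAddressBuffer, StructuredBuffer };

// std::nullopt models an unused coordinate (emitted as undef in the patch).
using Coord = std::optional<std::uint32_t>;

// Returns {c0, c1, c2} for an atomic binary operation on the given resource.
// SecondIndex is only consulted for structured buffers.
inline std::array<Coord, 3> selectCoordinates(ResourceKind Kind,
                                              std::uint32_t Index,
                                              Coord SecondIndex = std::nullopt) {
  std::array<Coord, 3> Coords{}; // every coordinate starts out unused
  Coords[0] = Index;             // RWBuffer / RWByteAddressBuffer: c0 only
  if (Kind == ResourceKind::StructuredBuffer)
    Coords[1] = SecondIndex;     // RWStructuredBuffer: c0 and c1
  return Coords;
}
```

In this model c2 is never filled. The patch likewise leaves it as `undef` for every buffer type it currently supports.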
[clang] [llvm] [HLSL][DXIL] InterlockedOr and InterlockedOr64 builtins (PR #180804)
https://github.com/zwuis edited https://github.com/llvm/llvm-project/pull/180804
