jwanggit86 wrote:
@arsenm @kzhuravl If there are no more comments, I'm going to merge this PR.
https://github.com/llvm/llvm-project/pull/118907
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/118907
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/118907
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/118907
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
jwanggit86 wrote:
> Should have the git generated "Reapply..." message with reference to the
> original
I didn't notice that. Do you want me to put "Reapply " in the msg?
https://github.com/llvm/llvm-project/pull/118907
___
cfe-commits mailing list
c
jwanggit86 wrote:
[PR 94647](https://github.com/llvm/llvm-project/pull/94647) was found to cause
build problems after merge with upstream. It has since been reverted by commit
1ef9410a96c1d9669a6feaf03fcab8d0a4a13bd5. This PR fixes the tests that caused
the build problems.
https://github.com/
https://github.com/jwanggit86 closed
https://github.com/llvm/llvm-project/pull/94647
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
jwanggit86 wrote:
I plan to submit this PR next week by Nov 26. Pls let me know if you have
additional comments.
https://github.com/llvm/llvm-project/pull/94647
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mai
@@ -683,6 +706,59 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this,
UsedAssumedInformation);
}
+
+ // Returns true if FlatScratchInit is needed, i.e
@@ -683,6 +706,59 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this,
UsedAssumedInformation);
}
+
+ // Returns true if FlatScratchInit is needed, i.e
@@ -683,6 +706,59 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this,
UsedAssumedInformation);
}
+
+ // Returns true if FlatScratchInit is needed, i.e
@@ -683,6 +706,59 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this,
UsedAssumedInformation);
}
+
+ // Returns true if FlatScratchInit is needed, i.e
@@ -434,6 +434,15 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
indicatePessimisticFixpoint();
return;
}
+
+for (Instruction &I : instructions(F)) {
+ if (isa(I) &&
jwanggit86 wrote:
This test has been added.
https://gi
@@ -434,6 +434,15 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
indicatePessimisticFixpoint();
return;
}
+
+for (Instruction &I : instructions(F)) {
+ if (isa(I) &&
jwanggit86 wrote:
@arsenm Following constants to see if
@@ -678,6 +690,37 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this,
UsedAssumedInformation);
}
+
+ // Returns true if FlatScratchInit is needed, i.e
jwanggit86 wrote:
Based off wrong branch.
https://github.com/llvm/llvm-project/pull/109223
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/jwanggit86 closed
https://github.com/llvm/llvm-project/pull/109223
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -434,6 +434,15 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
indicatePessimisticFixpoint();
return;
}
+
+for (Instruction &I : instructions(F)) {
+ if (isa(I) &&
jwanggit86 wrote:
@arsenm Could you please give me an e
@@ -433,6 +433,13 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
indicatePessimisticFixpoint();
return;
}
+
+for (Instruction &I : instructions(F)) {
+ if (isa(I) || isa(I)) {
jwanggit86 wrote:
Thanks. This has been updat
@@ -677,6 +687,36 @@ struct AAAMDAttributesFunction : public AAAMDAttributes {
return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this,
UsedAssumedInformation);
}
+
+ // Returns true if FlatScratchInit is needed, i.e
@@ -1054,12 +1054,6 @@ GCNUserSGPRUsageInfo::GCNUserSGPRUsageInfo(const
Function &F,
const CallingConv::ID CC = F.getCallingConv();
const bool IsKernel =
CC == CallingConv::AMDGPU_KERNEL || CC == CallingConv::SPIR_KERNEL;
- // FIXME: Should have analysis or somethin
https://github.com/jwanggit86 closed
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
jwanggit86 wrote:
@jayfoad Do you have any more comments?
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
jwanggit86 wrote:
Added a testcase that has flat_atomic_swap, which is an atomic without return.
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe
jwanggit86 wrote:
@jayfoad @arsenm Any other comments?
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -0,0 +1,1413 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
UTC_ARGS: --version 4
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+preci
@@ -0,0 +1,1413 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
UTC_ARGS: --version 4
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+preci
@@ -2326,6 +2326,20 @@ bool
SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
}
#endif
+if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+ AMDGPU::Waitcnt Wait;
+ if (ST->hasExtendedWaitCounts())
+Wait = AMDGPU::Waitcnt(0, 0, 0,
@@ -0,0 +1,618 @@
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX90A
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -2326,6 +2326,20 @@ bool
SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
}
#endif
+if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) {
+ AMDGPU::Waitcnt Wait;
+ if (ST->hasExtendedWaitCounts())
+Wait = AMDGPU::Waitcnt(0, 0, 0,
@@ -2326,6 +2326,20 @@ bool
SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF,
}
#endif
+if (ST->isPreciseMemoryEnabled()) {
+ AMDGPU::Waitcnt Wait;
+ if (WCG == &WCGPreGFX12)
+Wait = AMDGPU::Waitcnt(0, 0, 0, 0);
jwanggit86
@@ -0,0 +1,577 @@
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX90A
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=
@@ -0,0 +1,577 @@
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX90A
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=
@@ -0,0 +1,577 @@
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s |
FileCheck %s -check-prefixes=GFX90A
+; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=
https://github.com/jwanggit86 closed
https://github.com/llvm/llvm-project/pull/79035
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/79035
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -137,6 +137,12 @@ Removed Compiler Flags
Attribute Changes in Clang
--
+- Introduced a new function attribute
``__attribute__((amdgpu_max_num_work_groups(x, y, z)))`` or
jwanggit86 wrote:
In the case of flat workgroup size, the LLV
@@ -137,6 +137,12 @@ Removed Compiler Flags
Attribute Changes in Clang
--
+- Introduced a new function attribute
``__attribute__((amdgpu_max_num_work_groups(x, y, z)))`` or
jwanggit86 wrote:
There are existing attributes that have wor
@@ -0,0 +1,77 @@
+; RUN: not llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s 2>&1 | FileCheck
--check-prefix=ERROR %s
+
+; ERROR: error: can't parse integer attribute -1 in amdgpu-max-num-work-groups
+define amdgpu_kernel void @empty_max_num_work_groups_neg_num1() #21 {
+entry:
@@ -139,6 +139,36 @@ kernel void
reqd_work_group_size_32_2_1_flat_work_group_size_16_128() {
// CHECK: define{{.*}} amdgpu_kernel void
@reqd_work_group_size_32_2_1_flat_work_group_size_16_128()
[[FLAT_WORK_GROUP_SIZE_16_128:#[0-9]+]]
}
+__attribute__((amdgpu_max_num_work_gr
@@ -194,3 +204,105 @@ __global__ void non_cexpr_waves_per_eu_2() {}
// expected-error@+1{{'amdgpu_waves_per_eu' attribute requires parameter 1 to
be an integer constant}}
__attribute__((amdgpu_waves_per_eu(2, ipow2(2
__global__ void non_cexpr_waves_per_eu_2_4() {}
+
+// ex
@@ -0,0 +1,84 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s | FileCheck %s
+
+; Attribute not specified.
+; CHECK-LABEL: {{^}}empty_no_attribute:
+define amdgpu_kernel void @empty_no_attribute() {
+entry:
+ ret void
+}
+
+; Ignore if number of work groups for x dime
@@ -0,0 +1,84 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s | FileCheck %s
+
jwanggit86 wrote:
Created a new test file to test various errors.
https://github.com/llvm/llvm-project/pull/79035
___
cfe-commi
@@ -814,6 +814,15 @@ bool shouldEmitConstantsToTextSection(const Triple &TT);
/// to integer.
int getIntegerAttribute(const Function &F, StringRef Name, int Default);
+/// \returns Unsigned Integer value requested using \p F's \p Name attribute.
+///
+/// \returns \p Default i
@@ -194,3 +204,105 @@ __global__ void non_cexpr_waves_per_eu_2() {}
// expected-error@+1{{'amdgpu_waves_per_eu' attribute requires parameter 1 to
be an integer constant}}
__attribute__((amdgpu_waves_per_eu(2, ipow2(2
__global__ void non_cexpr_waves_per_eu_2_4() {}
+
+// ex
@@ -139,6 +139,36 @@ kernel void
reqd_work_group_size_32_2_1_flat_work_group_size_16_128() {
// CHECK: define{{.*}} amdgpu_kernel void
@reqd_work_group_size_32_2_1_flat_work_group_size_16_128()
[[FLAT_WORK_GROUP_SIZE_16_128:#[0-9]+]]
}
+__attribute__((amdgpu_max_num_work_gr
@@ -356,6 +356,24 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes(
if (NumVGPR != 0)
F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR));
}
+
+ if (const auto *Attr = FD->getAttr()) {
+uint32_t X = Attr->getMaxNumWorkGroupsX()
+
@@ -137,6 +137,11 @@ Removed Compiler Flags
Attribute Changes in Clang
--
+- Introduced a new function attribute
``__attribute__((amdgpu_max_num_work_groups(x, y, z)))`` or
+ ``[[clang::amdgpu_max_num_work_groups(x, y, z)]]`` for the AMDGPU target.
T
jwanggit86 wrote:
@jayfoad After trying the patch you provided above, it appears that this
feature can indeed be done in SIInsertWaitcnt instead of SIMemoryLegalizer.
Code has been updated accordingly. Pls take a look.
https://github.com/llvm/llvm-project/pull/79236
___
jwanggit86 wrote:
@Pierre-vh Could you pls help review the backend part of this patch? Thanks!
https://github.com/llvm/llvm-project/pull/79035
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-c
jwanggit86 wrote:
@Pierre-vh Any further comments?
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
jwanggit86 wrote:
@arsenm Any comments on the LLVM side?
https://github.com/llvm/llvm-project/pull/79035
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -607,6 +607,29 @@ static void instantiateDependentAMDGPUWavesPerEUAttr(
S.addAMDGPUWavesPerEUAttr(New, Attr, MinExpr, MaxExpr);
}
+static void instantiateDependentAMDGPUMaxNumWorkGroupsAttr(
+Sema &S, const MultiLevelTemplateArgumentList &TemplateArgs,
+const AMDG
@@ -8069,6 +8069,67 @@ static void handleAMDGPUNumVGPRAttr(Sema &S, Decl *D,
const ParsedAttr &AL) {
D->addAttr(::new (S.Context) AMDGPUNumVGPRAttr(S.Context, AL, NumVGPR));
}
+// Returns true if error
+static bool
+checkAMDGPUMaxNumWorkGroupsArguments(Sema &S, Expr *XExpr,
@@ -8069,6 +8069,67 @@ static void handleAMDGPUNumVGPRAttr(Sema &S, Decl *D,
const ParsedAttr &AL) {
D->addAttr(::new (S.Context) AMDGPUNumVGPRAttr(S.Context, AL, NumVGPR));
}
+// Returns true if error
jwanggit86 wrote:
(1) Updated `ReleaseNotes.rst` (2) A
@@ -194,3 +204,87 @@ __global__ void non_cexpr_waves_per_eu_2() {}
// expected-error@+1{{'amdgpu_waves_per_eu' attribute requires parameter 1 to
be an integer constant}}
__attribute__((amdgpu_waves_per_eu(2, ipow2(2
__global__ void non_cexpr_waves_per_eu_2_4() {}
+
+// exp
jwanggit86 wrote:
Thanks for explaining the "dependent exp". At present, template arguments are
not required for the attribute. In other words, only int constants are
supported. Diagnostics and test cases have been updated.
https://github.com/llvm/llvm-project/pull/79035
__
jwanggit86 wrote:
> Also note, this is missing Clang lit tests, and doesn't seem to be correctly
> handling dependent expressions for x,y, and z.
What does it mean to "handle dependent expressions for x,y, and z"? Thanks!
https://github.com/llvm/llvm-project/pull/79035
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
jwanggit86 wrote:
Thanks! This works. Code has been updated. Pls take a look.
https://github.com/llvm/llvm-project/pull/79
https://github.com/jwanggit86 updated
https://github.com/llvm/llvm-project/pull/79236
>From 9c40b1151b0673430ff53eb121784724a5b090e5 Mon Sep 17 00:00:00 2001
From: Jun Wang
Date: Tue, 23 Jan 2024 19:19:00 -0600
Subject: [PATCH 1/7] [AMDGPU] Emit a waitcnt instruction after each memory
instruct
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
jwanggit86 wrote:
A possible fix is to check `hasFlag` after calling `handleTargetFeaturesGroup`,
and if the check is true
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
jwanggit86 wrote:
@Pierre-vh With the suggested change, the func `getAMDGPUTargetFeatures` looks
something like the follow
@@ -8069,6 +8069,26 @@ static void handleAMDGPUNumVGPRAttr(Sema &S, Decl *D,
const ParsedAttr &AL) {
D->addAttr(::new (S.Context) AMDGPUNumVGPRAttr(S.Context, AL, NumVGPR));
}
+static void handleAMDGPUMaxNumWorkGroupsAttr(Sema &S, Decl *D,
+
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
jwanggit86 wrote:
The function `hasFlag()` actually appears to be pretty expensive, and it's
going to be called every time
@@ -355,6 +356,18 @@ class SICacheControl {
MachineBasicBlock::iterator &MI) const {
return false;
}
+
+public:
+ // The following is for supporting precise memory mode. When the option
+ // amdgpu-precise-memory is enabled, an s_waitc
jwanggit86 wrote:
> Did you try to move this to SIInsertWaitCnt, as suggested?
> Did you try to move this to SIInsertWaitCnt, as suggested?
Pls see my reply on Feb 15, which is copy-pasted below.
Regarding the question about SIInsertWaitcnt, initially the code was indeed put
there (see PR h
https://github.com/jwanggit86 updated
https://github.com/llvm/llvm-project/pull/79035
>From 5c088a59bd36df40bae9a3a712f3994feded359d Mon Sep 17 00:00:00 2001
From: Jun Wang
Date: Mon, 22 Jan 2024 12:43:27 -0600
Subject: [PATCH 1/7] [AMDGPU] Adding the amdgpu-num-work-groups function
attribute
@@ -2031,6 +2031,13 @@ def AMDGPUNumVGPR : InheritableAttr {
let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">;
}
+def AMDGPUNumWorkGroups : InheritableAttr {
+ let Spellings = [Clang<"amdgpu_num_work_groups", 0>];
jwanggit86 wrote:
Cha
@@ -356,6 +356,19 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes(
if (NumVGPR != 0)
F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR));
}
+
+ if (const auto *Attr = FD->getAttr()) {
+uint32_t X = Attr->getNumWorkGroupsX();
+uint32_t Y = Attr
@@ -0,0 +1,84 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs < %s
| FileCheck %s
jwanggit86 wrote:
Removed.
https://github.com/llvm/llvm-project/pull/79035
___
cfe-commits mailing list
cf
@@ -2705,6 +2705,30 @@ An error will be given if:
}];
}
+def AMDGPUNumWorkGroupsDocs : Documentation {
+ let Category = DocCatAMDGPUAttributes;
+ let Content = [{
+The number of work groups specifies the number of work groups when the kernel
jwanggit86 wro
@@ -2705,6 +2705,30 @@ An error will be given if:
}];
}
+def AMDGPUNumWorkGroupsDocs : Documentation {
+ let Category = DocCatAMDGPUAttributes;
+ let Content = [{
+The number of work groups specifies the number of work groups when the kernel
+is dispatched.
+
+Clang suppor
@@ -2705,6 +2705,30 @@ An error will be given if:
}];
}
+def AMDGPUNumWorkGroupsDocs : Documentation {
+ let Category = DocCatAMDGPUAttributes;
+ let Content = [{
+The number of work groups specifies the number of work groups when the kernel
+is dispatched.
+
+Clang suppor
@@ -2705,6 +2705,30 @@ An error will be given if:
}];
}
+def AMDGPUNumWorkGroupsDocs : Documentation {
+ let Category = DocCatAMDGPUAttributes;
+ let Content = [{
+The number of work groups specifies the number of work groups when the kernel
+is dispatched.
+
+Clang suppor
@@ -2031,6 +2031,13 @@ def AMDGPUNumVGPR : InheritableAttr {
let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">;
}
+def AMDGPUNumWorkGroups : InheritableAttr {
+ let Spellings = [Clang<"amdgpu_num_work_groups", 0>];
+ let Args = [UnsignedArgument<"NumWor
@@ -8069,6 +8069,25 @@ static void handleAMDGPUNumVGPRAttr(Sema &S, Decl *D,
const ParsedAttr &AL) {
D->addAttr(::new (S.Context) AMDGPUNumVGPRAttr(S.Context, AL, NumVGPR));
}
+static void handleAMDGPUNumWorkGroupsAttr(Sema &S, Decl *D,
+
@@ -1108,3 +1108,8 @@ void
GCNUserSGPRUsageInfo::allocKernargPreloadSGPRs(unsigned NumSGPRs) {
unsigned GCNUserSGPRUsageInfo::getNumFreeUserSGPRs() {
return AMDGPU::getMaxNumUserSGPRs(ST) - NumUsedUserSGPRs;
}
+
+SmallVector
jwanggit86 wrote:
This function
https://github.com/jwanggit86 updated
https://github.com/llvm/llvm-project/pull/79035
>From 5c088a59bd36df40bae9a3a712f3994feded359d Mon Sep 17 00:00:00 2001
From: Jun Wang
Date: Mon, 22 Jan 2024 12:43:27 -0600
Subject: [PATCH 1/6] [AMDGPU] Adding the amdgpu-num-work-groups function
attribute
https://github.com/jwanggit86 updated
https://github.com/llvm/llvm-project/pull/79035
>From 5c088a59bd36df40bae9a3a712f3994feded359d Mon Sep 17 00:00:00 2001
From: Jun Wang
Date: Mon, 22 Jan 2024 12:43:27 -0600
Subject: [PATCH 1/5] [AMDGPU] Adding the amdgpu-num-work-groups function
attribute
@@ -356,6 +356,19 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes(
if (NumVGPR != 0)
F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR));
}
+
+ if (const auto *Attr = FD->getAttr()) {
+uint32_t X = Attr->getNumWorkGroupsX();
+uint32_t Y = Attr
https://github.com/jwanggit86 edited
https://github.com/llvm/llvm-project/pull/79236
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
@@ -2378,6 +2456,221 @@ bool
SIGfx12CacheControl::enableVolatileAndOrNonTemporal(
return Changed;
}
+bool SIGfx6CacheControl ::handleNonAtomicForPreciseMemory(
+MachineBasicBlock::iterator &MI) {
+ assert(MI->mayLoadOrStore());
+
+ MachineInstr &Inst = *MI;
+ AMDGPU:
@@ -2378,6 +2456,221 @@ bool
SIGfx12CacheControl::enableVolatileAndOrNonTemporal(
return Changed;
}
+bool SIGfx6CacheControl ::handleNonAtomicForPreciseMemory(
+MachineBasicBlock::iterator &MI) {
+ assert(MI->mayLoadOrStore());
+
+ MachineInstr &Inst = *MI;
+ AMDGPU:
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
SIAtomicAddrSpace AddrSpace, SIMemOp Op,
bool IsVolatile,
bool IsNonTemporal) const o
https://github.com/jwanggit86 updated
https://github.com/llvm/llvm-project/pull/79236
>From 9c40b1151b0673430ff53eb121784724a5b090e5 Mon Sep 17 00:00:00 2001
From: Jun Wang
Date: Tue, 23 Jan 2024 19:19:00 -0600
Subject: [PATCH 1/6] [AMDGPU] Emit a waitcnt instruction after each memory
instruct
https://github.com/jwanggit86 updated
https://github.com/llvm/llvm-project/pull/79236
>From 9c40b1151b0673430ff53eb121784724a5b090e5 Mon Sep 17 00:00:00 2001
From: Jun Wang
Date: Tue, 23 Jan 2024 19:19:00 -0600
Subject: [PATCH 1/5] [AMDGPU] Emit a waitcnt instruction after each memory
instruct
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
SIAtomicAddrSpace AddrSpace, SIMemOp Op,
bool IsVolatile,
bool IsNonTemporal) const o
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
SIAtomicAddrSpace AddrSpace, SIMemOp Op,
bool IsVolatile,
bool IsNonTemporal) const o
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
SIAtomicAddrSpace AddrSpace, SIMemOp Op,
bool IsVolatile,
bool IsNonTemporal) const o
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
jwanggit86 wrote:
Thanks for the suggestion. It looks it would work. However, I'm not sure the
extra overhead is worth it
https://github.com/jwanggit86 updated
https://github.com/llvm/llvm-project/pull/79236
>From 9c40b1151b0673430ff53eb121784724a5b090e5 Mon Sep 17 00:00:00 2001
From: Jun Wang
Date: Tue, 23 Jan 2024 19:19:00 -0600
Subject: [PATCH 1/4] [AMDGPU] Emit a waitcnt instruction after each memory
instruct
@@ -2705,6 +2705,30 @@ An error will be given if:
}];
}
+def AMDGPUNumWorkGroupsDocs : Documentation {
+ let Category = DocCatAMDGPUAttributes;
+ let Content = [{
+The number of work groups specifies the number of work groups when the kernel
+is dispatched.
+
+Clang suppor
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
bool IsNonTemporal) const override;
};
+class SIPreciseMemorySupport {
+protected:
+ const GCNSubtarget &ST;
+ const SIInstrInfo *TII = nullptr;
+
+ IsaVers
@@ -0,0 +1,362 @@
+; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+amdgpu-precise-memory-op < %s
| FileCheck %s -check-prefixes=GFX9
+; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+amdgpu-precise-memory-op < %s
| FileCheck %s -check-prefixes=GFX90A
+; RUN: llc -mtriple=amdgcn -
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode",
"Enable CU wavefront execution mode"
>;
+def FeaturePreciseMemory
jwanggit86 wrote:
As it is, we have a clang command-line option "-mamdgpu-precise-memory-op".
When specified, "+amdgpu-pre
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl {
bool IsNonTemporal) const override;
};
+class SIPreciseMemorySupport {
jwanggit86 wrote:
Merged with SICacheControl.
https://github.com/llvm
1 - 100 of 161 matches
Mail list logo