[clang] [llvm] Reapply "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)" (PR #118907)

2024-12-09 Thread Jun Wang via cfe-commits
jwanggit86 wrote: @arsenm @kzhuravl If there are no more comments, I'm going to merge this PR. https://github.com/llvm/llvm-project/pull/118907 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-

[clang] [llvm] Reapply "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)" (PR #118907)

2024-12-06 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/118907 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] Reapply "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)" (PR #118907)

2024-12-06 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/118907 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] Reapply "[AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (#94647)" (PR #118907)

2024-12-06 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/118907 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #118907)

2024-12-05 Thread Jun Wang via cfe-commits
jwanggit86 wrote: > Should have the git generated "Reapply..." message with reference to the > original I didn't notice that. Do you want me to put "Reapply " in the msg? https://github.com/llvm/llvm-project/pull/118907 ___ cfe-commits mailing list c

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #118907)

2024-12-05 Thread Jun Wang via cfe-commits
jwanggit86 wrote: [PR 94647](https://github.com/llvm/llvm-project/pull/94647) was found to cause build problems after merge with upstream. It has since been reverted by commit 1ef9410a96c1d9669a6feaf03fcab8d0a4a13bd5. This PR fixes the tests that caused the build problems. https://github.com/

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-12-04 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 closed https://github.com/llvm/llvm-project/pull/94647 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-11-21 Thread Jun Wang via cfe-commits
jwanggit86 wrote: I plan to submit this PR next week by Nov 26. Pls let me know if you have additional comments. https://github.com/llvm/llvm-project/pull/94647 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mai

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-11-01 Thread Jun Wang via cfe-commits
@@ -683,6 +706,59 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this, UsedAssumedInformation); } + + // Returns true if FlatScratchInit is needed, i.e

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-10-29 Thread Jun Wang via cfe-commits
@@ -683,6 +706,59 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this, UsedAssumedInformation); } + + // Returns true if FlatScratchInit is needed, i.e

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-10-24 Thread Jun Wang via cfe-commits
@@ -683,6 +706,59 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this, UsedAssumedInformation); } + + // Returns true if FlatScratchInit is needed, i.e

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-10-22 Thread Jun Wang via cfe-commits
@@ -683,6 +706,59 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this, UsedAssumedInformation); } + + // Returns true if FlatScratchInit is needed, i.e

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-10-07 Thread Jun Wang via cfe-commits
@@ -434,6 +434,15 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { indicatePessimisticFixpoint(); return; } + +for (Instruction &I : instructions(F)) { + if (isa(I) && jwanggit86 wrote: This test has been added. https://gi

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-18 Thread Jun Wang via cfe-commits
@@ -434,6 +434,15 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { indicatePessimisticFixpoint(); return; } + +for (Instruction &I : instructions(F)) { + if (isa(I) && jwanggit86 wrote: @arsenm Following constants to see if

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-18 Thread Jun Wang via cfe-commits
@@ -678,6 +690,37 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this, UsedAssumedInformation); } + + // Returns true if FlatScratchInit is needed, i.e

[clang] [llvm] Buffer_store_lds_dword, buffer_wbinvl1, and buffer_wbinvl1_vol should not be supported in GFX940 (PR #109223)

2024-09-18 Thread Jun Wang via cfe-commits
jwanggit86 wrote: Based off wrong branch. https://github.com/llvm/llvm-project/pull/109223 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] Buffer_store_lds_dword, buffer_wbinvl1, and buffer_wbinvl1_vol should not be supported in GFX940 (PR #109223)

2024-09-18 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 closed https://github.com/llvm/llvm-project/pull/109223 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-17 Thread Jun Wang via cfe-commits
@@ -434,6 +434,15 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { indicatePessimisticFixpoint(); return; } + +for (Instruction &I : instructions(F)) { + if (isa(I) && jwanggit86 wrote: @arsenm Could you please give me an e

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-04 Thread Jun Wang via cfe-commits
@@ -433,6 +433,13 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { indicatePessimisticFixpoint(); return; } + +for (Instruction &I : instructions(F)) { + if (isa(I) || isa(I)) { jwanggit86 wrote: Thanks. This has been updat

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-04 Thread Jun Wang via cfe-commits
@@ -677,6 +687,36 @@ struct AAAMDAttributesFunction : public AAAMDAttributes { return !A.checkForAllCallLikeInstructions(DoesNotRetrieve, *this, UsedAssumedInformation); } + + // Returns true if FlatScratchInit is needed, i.e

[clang] [llvm] [AMDGPU] Infer amdgpu-no-flat-scratch-init attribute in AMDGPUAttributor (PR #94647)

2024-09-04 Thread Jun Wang via cfe-commits
@@ -1054,12 +1054,6 @@ GCNUserSGPRUsageInfo::GCNUserSGPRUsageInfo(const Function &F, const CallingConv::ID CC = F.getCallingConv(); const bool IsKernel = CC == CallingConv::AMDGPU_KERNEL || CC == CallingConv::SPIR_KERNEL; - // FIXME: Should have analysis or somethin

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-04-10 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 closed https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-04-09 Thread Jun Wang via cfe-commits
jwanggit86 wrote: @jayfoad Do you have any more comments? https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-04-08 Thread Jun Wang via cfe-commits
jwanggit86 wrote: Added a testcase that has flat_atomic_swap, which is an atomic without return. https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-04-01 Thread Jun Wang via cfe-commits
jwanggit86 wrote: @jayfoad @arsenm Any other comments? https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-31 Thread Jun Wang via cfe-commits
@@ -0,0 +1,1413 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+preci

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-31 Thread Jun Wang via cfe-commits
@@ -0,0 +1,1413 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --version 4 +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+preci

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-29 Thread Jun Wang via cfe-commits
@@ -2326,6 +2326,20 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF, } #endif +if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) { + AMDGPU::Waitcnt Wait; + if (ST->hasExtendedWaitCounts()) +Wait = AMDGPU::Waitcnt(0, 0, 0,

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-29 Thread Jun Wang via cfe-commits
@@ -0,0 +1,618 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX90A +; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-26 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-26 Thread Jun Wang via cfe-commits
@@ -2326,6 +2326,20 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF, } #endif +if (ST->isPreciseMemoryEnabled() && Inst.mayLoadOrStore()) { + AMDGPU::Waitcnt Wait; + if (ST->hasExtendedWaitCounts()) +Wait = AMDGPU::Waitcnt(0, 0, 0,

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-26 Thread Jun Wang via cfe-commits
@@ -2326,6 +2326,20 @@ bool SIInsertWaitcnts::insertWaitcntInBlock(MachineFunction &MF, } #endif +if (ST->isPreciseMemoryEnabled()) { + AMDGPU::Waitcnt Wait; + if (WCG == &WCGPreGFX12) +Wait = AMDGPU::Waitcnt(0, 0, 0, 0); jwanggit86

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-26 Thread Jun Wang via cfe-commits
@@ -0,0 +1,577 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX90A +; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-26 Thread Jun Wang via cfe-commits
@@ -0,0 +1,577 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX90A +; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-03-26 Thread Jun Wang via cfe-commits
@@ -0,0 +1,577 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+precise-memory < %s | FileCheck %s -check-prefixes=GFX90A +; RUN: llc -mtriple=amdgcn -mcpu=gfx1010 -mattr=

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-12 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 closed https://github.com/llvm/llvm-project/pull/79035 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-06 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/79035 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-06 Thread Jun Wang via cfe-commits
@@ -137,6 +137,12 @@ Removed Compiler Flags Attribute Changes in Clang -- +- Introduced a new function attribute ``__attribute__((amdgpu_max_num_work_groups(x, y, z)))`` or jwanggit86 wrote: In the case of flat workgroup size, the LLV

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-06 Thread Jun Wang via cfe-commits
@@ -137,6 +137,12 @@ Removed Compiler Flags Attribute Changes in Clang -- +- Introduced a new function attribute ``__attribute__((amdgpu_max_num_work_groups(x, y, z)))`` or jwanggit86 wrote: There are existing attributes that have wor

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-05 Thread Jun Wang via cfe-commits
@@ -0,0 +1,77 @@ +; RUN: not llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s 2>&1 | FileCheck --check-prefix=ERROR %s + +; ERROR: error: can't parse integer attribute -1 in amdgpu-max-num-work-groups +define amdgpu_kernel void @empty_max_num_work_groups_neg_num1() #21 { +entry:

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-05 Thread Jun Wang via cfe-commits
@@ -139,6 +139,36 @@ kernel void reqd_work_group_size_32_2_1_flat_work_group_size_16_128() { // CHECK: define{{.*}} amdgpu_kernel void @reqd_work_group_size_32_2_1_flat_work_group_size_16_128() [[FLAT_WORK_GROUP_SIZE_16_128:#[0-9]+]] } +__attribute__((amdgpu_max_num_work_gr

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-05 Thread Jun Wang via cfe-commits
@@ -194,3 +204,105 @@ __global__ void non_cexpr_waves_per_eu_2() {} // expected-error@+1{{'amdgpu_waves_per_eu' attribute requires parameter 1 to be an integer constant}} __attribute__((amdgpu_waves_per_eu(2, ipow2(2 __global__ void non_cexpr_waves_per_eu_2_4() {} + +// ex

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-01 Thread Jun Wang via cfe-commits
@@ -0,0 +1,84 @@ +; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s | FileCheck %s + +; Attribute not specified. +; CHECK-LABEL: {{^}}empty_no_attribute: +define amdgpu_kernel void @empty_no_attribute() { +entry: + ret void +} + +; Ignore if number of work groups for x dime

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-01 Thread Jun Wang via cfe-commits
@@ -0,0 +1,84 @@ +; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 < %s | FileCheck %s + jwanggit86 wrote: Created a new test file to test various errors. https://github.com/llvm/llvm-project/pull/79035 ___ cfe-commi

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-01 Thread Jun Wang via cfe-commits
@@ -814,6 +814,15 @@ bool shouldEmitConstantsToTextSection(const Triple &TT); /// to integer. int getIntegerAttribute(const Function &F, StringRef Name, int Default); +/// \returns Unsigned Integer value requested using \p F's \p Name attribute. +/// +/// \returns \p Default i

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-01 Thread Jun Wang via cfe-commits
@@ -194,3 +204,105 @@ __global__ void non_cexpr_waves_per_eu_2() {} // expected-error@+1{{'amdgpu_waves_per_eu' attribute requires parameter 1 to be an integer constant}} __attribute__((amdgpu_waves_per_eu(2, ipow2(2 __global__ void non_cexpr_waves_per_eu_2_4() {} + +// ex

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-01 Thread Jun Wang via cfe-commits
@@ -139,6 +139,36 @@ kernel void reqd_work_group_size_32_2_1_flat_work_group_size_16_128() { // CHECK: define{{.*}} amdgpu_kernel void @reqd_work_group_size_32_2_1_flat_work_group_size_16_128() [[FLAT_WORK_GROUP_SIZE_16_128:#[0-9]+]] } +__attribute__((amdgpu_max_num_work_gr

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-01 Thread Jun Wang via cfe-commits
@@ -356,6 +356,24 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes( if (NumVGPR != 0) F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR)); } + + if (const auto *Attr = FD->getAttr()) { +uint32_t X = Attr->getMaxNumWorkGroupsX() +

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-03-01 Thread Jun Wang via cfe-commits
@@ -137,6 +137,11 @@ Removed Compiler Flags Attribute Changes in Clang -- +- Introduced a new function attribute ``__attribute__((amdgpu_max_num_work_groups(x, y, z)))`` or + ``[[clang::amdgpu_max_num_work_groups(x, y, z)]]`` for the AMDGPU target. T

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-29 Thread Jun Wang via cfe-commits
jwanggit86 wrote: @jayfoad After trying the patch you provided above, it appears that this feature can indeed be done in SIInsertWaitcnt instead of SIMemoryLegalizer. Code has been updated accordingly. Pls take a look. https://github.com/llvm/llvm-project/pull/79236 ___

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-27 Thread Jun Wang via cfe-commits
jwanggit86 wrote: @Pierre-vh Could you pls help review the backend part of this patch? Thanks! https://github.com/llvm/llvm-project/pull/79035 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-c

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-26 Thread Jun Wang via cfe-commits
jwanggit86 wrote: @Pierre-vh Any further comments? https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-26 Thread Jun Wang via cfe-commits
jwanggit86 wrote: @arsenm Any comments on the LLVM side? https://github.com/llvm/llvm-project/pull/79035 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-26 Thread Jun Wang via cfe-commits
@@ -607,6 +607,29 @@ static void instantiateDependentAMDGPUWavesPerEUAttr( S.addAMDGPUWavesPerEUAttr(New, Attr, MinExpr, MaxExpr); } +static void instantiateDependentAMDGPUMaxNumWorkGroupsAttr( +Sema &S, const MultiLevelTemplateArgumentList &TemplateArgs, +const AMDG

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-26 Thread Jun Wang via cfe-commits
@@ -8069,6 +8069,67 @@ static void handleAMDGPUNumVGPRAttr(Sema &S, Decl *D, const ParsedAttr &AL) { D->addAttr(::new (S.Context) AMDGPUNumVGPRAttr(S.Context, AL, NumVGPR)); } +// Returns true if error +static bool +checkAMDGPUMaxNumWorkGroupsArguments(Sema &S, Expr *XExpr,

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-26 Thread Jun Wang via cfe-commits
@@ -8069,6 +8069,67 @@ static void handleAMDGPUNumVGPRAttr(Sema &S, Decl *D, const ParsedAttr &AL) { D->addAttr(::new (S.Context) AMDGPUNumVGPRAttr(S.Context, AL, NumVGPR)); } +// Returns true if error jwanggit86 wrote: (1) Updated `ReleaseNotes.rst` (2) A

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-23 Thread Jun Wang via cfe-commits
@@ -194,3 +204,87 @@ __global__ void non_cexpr_waves_per_eu_2() {} // expected-error@+1{{'amdgpu_waves_per_eu' attribute requires parameter 1 to be an integer constant}} __attribute__((amdgpu_waves_per_eu(2, ipow2(2 __global__ void non_cexpr_waves_per_eu_2_4() {} + +// exp

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-22 Thread Jun Wang via cfe-commits
jwanggit86 wrote: Thanks for explaining the "dependent exp". At present, template arguments are not required for the attribute. In other words, only int constants are supported. Diagnostics and test cases have been updated. https://github.com/llvm/llvm-project/pull/79035 __

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-21 Thread Jun Wang via cfe-commits
jwanggit86 wrote: > Also note, this is missing Clang lit tests, and doesn't seem to be correctly > handling dependent expressions for x,y, and z. What does it mean to "handle dependent expressions for x,y, and z"? Thanks! https://github.com/llvm/llvm-project/pull/79035

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-21 Thread Jun Wang via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory jwanggit86 wrote: Thanks! This works. Code has been updated. Pls take a look. https://github.com/llvm/llvm-project/pull/79

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-21 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 updated https://github.com/llvm/llvm-project/pull/79236 >From 9c40b1151b0673430ff53eb121784724a5b090e5 Mon Sep 17 00:00:00 2001 From: Jun Wang Date: Tue, 23 Jan 2024 19:19:00 -0600 Subject: [PATCH 1/7] [AMDGPU] Emit a waitcnt instruction after each memory instruct

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-20 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-20 Thread Jun Wang via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory jwanggit86 wrote: A possible fix is to check `hasFlag` after calling `handleTargetFeaturesGroup`, and if the check is true

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-20 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-20 Thread Jun Wang via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory jwanggit86 wrote: @Pierre-vh With the suggested change, the func `getAMDGPUTargetFeatures` looks something like the follow

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-20 Thread Jun Wang via cfe-commits
@@ -8069,6 +8069,26 @@ static void handleAMDGPUNumVGPRAttr(Sema &S, Decl *D, const ParsedAttr &AL) { D->addAttr(::new (S.Context) AMDGPUNumVGPRAttr(S.Context, AL, NumVGPR)); } +static void handleAMDGPUMaxNumWorkGroupsAttr(Sema &S, Decl *D, +

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-19 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-19 Thread Jun Wang via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory jwanggit86 wrote: The function `hasFlag()` actually appears to be pretty expensive, and it's going to be called every time

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-19 Thread Jun Wang via cfe-commits
@@ -355,6 +356,18 @@ class SICacheControl { MachineBasicBlock::iterator &MI) const { return false; } + +public: + // The following is for supporting precise memory mode. When the option + // amdgpu-precise-memory is enabled, an s_waitc

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-19 Thread Jun Wang via cfe-commits
jwanggit86 wrote: > Did you try to move this to SIInsertWaitCnt, as suggested? > Did you try to move this to SIInsertWaitCnt, as suggested? Pls see my reply on Feb 15, which is copy-pasted below. Regarding the question about SIInsertWaitcnt, initially the code was indeed put there (see PR h

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-19 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 updated https://github.com/llvm/llvm-project/pull/79035 >From 5c088a59bd36df40bae9a3a712f3994feded359d Mon Sep 17 00:00:00 2001 From: Jun Wang Date: Mon, 22 Jan 2024 12:43:27 -0600 Subject: [PATCH 1/7] [AMDGPU] Adding the amdgpu-num-work-groups function attribute

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -2031,6 +2031,13 @@ def AMDGPUNumVGPR : InheritableAttr { let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">; } +def AMDGPUNumWorkGroups : InheritableAttr { + let Spellings = [Clang<"amdgpu_num_work_groups", 0>]; jwanggit86 wrote: Cha

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -356,6 +356,19 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes( if (NumVGPR != 0) F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR)); } + + if (const auto *Attr = FD->getAttr()) { +uint32_t X = Attr->getNumWorkGroupsX(); +uint32_t Y = Attr

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -0,0 +1,84 @@ +; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs < %s | FileCheck %s jwanggit86 wrote: Removed. https://github.com/llvm/llvm-project/pull/79035 ___ cfe-commits mailing list cf

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -2705,6 +2705,30 @@ An error will be given if: }]; } +def AMDGPUNumWorkGroupsDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The number of work groups specifies the number of work groups when the kernel jwanggit86 wro

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -2705,6 +2705,30 @@ An error will be given if: }]; } +def AMDGPUNumWorkGroupsDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The number of work groups specifies the number of work groups when the kernel +is dispatched. + +Clang suppor

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -2705,6 +2705,30 @@ An error will be given if: }]; } +def AMDGPUNumWorkGroupsDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The number of work groups specifies the number of work groups when the kernel +is dispatched. + +Clang suppor

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -2705,6 +2705,30 @@ An error will be given if: }]; } +def AMDGPUNumWorkGroupsDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The number of work groups specifies the number of work groups when the kernel +is dispatched. + +Clang suppor

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -2031,6 +2031,13 @@ def AMDGPUNumVGPR : InheritableAttr { let Subjects = SubjectList<[Function], ErrorDiag, "kernel functions">; } +def AMDGPUNumWorkGroups : InheritableAttr { + let Spellings = [Clang<"amdgpu_num_work_groups", 0>]; + let Args = [UnsignedArgument<"NumWor

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -8069,6 +8069,25 @@ static void handleAMDGPUNumVGPRAttr(Sema &S, Decl *D, const ParsedAttr &AL) { D->addAttr(::new (S.Context) AMDGPUNumVGPRAttr(S.Context, AL, NumVGPR)); } +static void handleAMDGPUNumWorkGroupsAttr(Sema &S, Decl *D, +

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -1108,3 +1108,8 @@ void GCNUserSGPRUsageInfo::allocKernargPreloadSGPRs(unsigned NumSGPRs) { unsigned GCNUserSGPRUsageInfo::getNumFreeUserSGPRs() { return AMDGPU::getMaxNumUserSGPRs(ST) - NumUsedUserSGPRs; } + +SmallVector jwanggit86 wrote: This function

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 updated https://github.com/llvm/llvm-project/pull/79035 >From 5c088a59bd36df40bae9a3a712f3994feded359d Mon Sep 17 00:00:00 2001 From: Jun Wang Date: Mon, 22 Jan 2024 12:43:27 -0600 Subject: [PATCH 1/6] [AMDGPU] Adding the amdgpu-num-work-groups function attribute

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 updated https://github.com/llvm/llvm-project/pull/79035 >From 5c088a59bd36df40bae9a3a712f3994feded359d Mon Sep 17 00:00:00 2001 From: Jun Wang Date: Mon, 22 Jan 2024 12:43:27 -0600 Subject: [PATCH 1/5] [AMDGPU] Adding the amdgpu-num-work-groups function attribute

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-16 Thread Jun Wang via cfe-commits
@@ -356,6 +356,19 @@ void AMDGPUTargetCodeGenInfo::setFunctionDeclAttributes( if (NumVGPR != 0) F->addFnAttr("amdgpu-num-vgpr", llvm::utostr(NumVGPR)); } + + if (const auto *Attr = FD->getAttr()) { +uint32_t X = Attr->getNumWorkGroupsX(); +uint32_t Y = Attr

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 edited https://github.com/llvm/llvm-project/pull/79236 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
@@ -2378,6 +2456,221 @@ bool SIGfx12CacheControl::enableVolatileAndOrNonTemporal( return Changed; } +bool SIGfx6CacheControl ::handleNonAtomicForPreciseMemory( +MachineBasicBlock::iterator &MI) { + assert(MI->mayLoadOrStore()); + + MachineInstr &Inst = *MI; + AMDGPU:

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
@@ -2378,6 +2456,221 @@ bool SIGfx12CacheControl::enableVolatileAndOrNonTemporal( return Changed; } +bool SIGfx6CacheControl ::handleNonAtomicForPreciseMemory( +MachineBasicBlock::iterator &MI) { + assert(MI->mayLoadOrStore()); + + MachineInstr &Inst = *MI; + AMDGPU:

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { SIAtomicAddrSpace AddrSpace, SIMemOp Op, bool IsVolatile, bool IsNonTemporal) const o

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 updated https://github.com/llvm/llvm-project/pull/79236 >From 9c40b1151b0673430ff53eb121784724a5b090e5 Mon Sep 17 00:00:00 2001 From: Jun Wang Date: Tue, 23 Jan 2024 19:19:00 -0600 Subject: [PATCH 1/6] [AMDGPU] Emit a waitcnt instruction after each memory instruct

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 updated https://github.com/llvm/llvm-project/pull/79236 >From 9c40b1151b0673430ff53eb121784724a5b090e5 Mon Sep 17 00:00:00 2001 From: Jun Wang Date: Tue, 23 Jan 2024 19:19:00 -0600 Subject: [PATCH 1/5] [AMDGPU] Emit a waitcnt instruction after each memory instruct

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { SIAtomicAddrSpace AddrSpace, SIMemOp Op, bool IsVolatile, bool IsNonTemporal) const o

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { SIAtomicAddrSpace AddrSpace, SIMemOp Op, bool IsVolatile, bool IsNonTemporal) const o

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
@@ -603,14 +626,69 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { SIAtomicAddrSpace AddrSpace, SIMemOp Op, bool IsVolatile, bool IsNonTemporal) const o

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-15 Thread Jun Wang via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory jwanggit86 wrote: Thanks for the suggestion. It looks it would work. However, I'm not sure the extra overhead is worth it

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-14 Thread Jun Wang via cfe-commits
https://github.com/jwanggit86 updated https://github.com/llvm/llvm-project/pull/79236 >From 9c40b1151b0673430ff53eb121784724a5b090e5 Mon Sep 17 00:00:00 2001 From: Jun Wang Date: Tue, 23 Jan 2024 19:19:00 -0600 Subject: [PATCH 1/4] [AMDGPU] Emit a waitcnt instruction after each memory instruct

[clang] [llvm] [AMDGPU] Adding the amdgpu-num-work-groups function attribute (PR #79035)

2024-02-13 Thread Jun Wang via cfe-commits
@@ -2705,6 +2705,30 @@ An error will be given if: }]; } +def AMDGPUNumWorkGroupsDocs : Documentation { + let Category = DocCatAMDGPUAttributes; + let Content = [{ +The number of work groups specifies the number of work groups when the kernel +is dispatched. + +Clang suppor

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-12 Thread Jun Wang via cfe-commits
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { bool IsNonTemporal) const override; }; +class SIPreciseMemorySupport { +protected: + const GCNSubtarget &ST; + const SIInstrInfo *TII = nullptr; + + IsaVers

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-12 Thread Jun Wang via cfe-commits
@@ -0,0 +1,362 @@ +; RUN: llc -mtriple=amdgcn -mcpu=gfx900 -mattr=+amdgpu-precise-memory-op < %s | FileCheck %s -check-prefixes=GFX9 +; RUN: llc -mtriple=amdgcn -mcpu=gfx90a -mattr=+amdgpu-precise-memory-op < %s | FileCheck %s -check-prefixes=GFX90A +; RUN: llc -mtriple=amdgcn -

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-12 Thread Jun Wang via cfe-commits
@@ -167,6 +167,10 @@ def FeatureCuMode : SubtargetFeature<"cumode", "Enable CU wavefront execution mode" >; +def FeaturePreciseMemory jwanggit86 wrote: As it is, we have a clang command-line option "-mamdgpu-precise-memory-op". When specified, "+amdgpu-pre

[clang] [llvm] [AMDGPU] Emit a waitcnt instruction after each memory instruction (PR #79236)

2024-02-12 Thread Jun Wang via cfe-commits
@@ -605,12 +606,197 @@ class SIGfx12CacheControl : public SIGfx11CacheControl { bool IsNonTemporal) const override; }; +class SIPreciseMemorySupport { jwanggit86 wrote: Merged with SICacheControl. https://github.com/llvm

  1   2   >