[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG5e25284dbc94: [AMDGPU] Emit module flag for all code object 
versions (authored by yaxunl).
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134355/new/

https://reviews.llvm.org/D134355

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
  clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu


Index: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu
===
--- clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu
+++ clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu
@@ -18,7 +18,7 @@
 // COMMON-LABEL: define{{.*}} amdgpu_kernel void @_Z7kernel1Pi(i32 addrspace(1)*{{.*}} %x.coerce)
 // CHECK: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
 // CHECK-NOT: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
-// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4, !amdgpu.noclobber !2
+// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4, !amdgpu.noclobber ![[MD:[0-9]+]]
 // OPT: [[INC:%.*]] = add nsw i32 [[VAL]], 1
 // OPT: store i32 [[INC]], i32 addrspace(1)* %x.coerce, align 4
 // OPT: ret void
@@ -30,7 +30,7 @@
 // COMMON-LABEL: define{{.*}} amdgpu_kernel void @_Z7kernel2Ri(i32 addrspace(1)*{{.*}} nonnull align 4 dereferenceable(4) %x.coerce)
 // CHECK: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
 // CHECK-NOT: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
-// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4, !amdgpu.noclobber !2
+// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4, !amdgpu.noclobber ![[MD]]
 // OPT: [[INC:%.*]] = add nsw i32 [[VAL]], 1
 // OPT: store i32 [[INC]], i32 addrspace(1)* %x.coerce, align 4
 // OPT: ret void
@@ -68,7 +68,7 @@
 // OPT: [[R1:%.*]] = getelementptr inbounds %struct.S, %struct.S addrspace(4)* %0, i64 0, i32 1
 // OPT: [[P1:%.*]] = load float*, float* addrspace(4)* [[R1]], align 8
 // OPT: [[G1:%.*]] ={{.*}} addrspacecast float* [[P1]] to float addrspace(1)*
-// OPT: [[V0:%.*]] = load i32, i32 addrspace(1)* [[G0]], align 4, !amdgpu.noclobber !2
+// OPT: [[V0:%.*]] = load i32, i32 addrspace(1)* [[G0]], align 4, !amdgpu.noclobber ![[MD]]
 // OPT: [[INC:%.*]] = add nsw i32 [[V0]], 1
 // OPT: store i32 [[INC]], i32 addrspace(1)* [[G0]], align 4
 // OPT: [[V1:%.*]] = load float, float addrspace(1)* [[G1]], align 4
@@ -103,7 +103,7 @@
 // OPT: [[R1:%.*]] = getelementptr inbounds %struct.T, %struct.T addrspace(4)* %0, i64 0, i32 0, i64 1
 // OPT: [[P1:%.*]] = load float*, float* addrspace(4)* [[R1]], align 8
 // OPT: [[G1:%.*]] ={{.*}} addrspacecast float* [[P1]] to float addrspace(1)*
-// OPT: [[V0:%.*]] = load float, float addrspace(1)* [[G0]], align 4, !amdgpu.noclobber !2
+// OPT: [[V0:%.*]] = load float, float addrspace(1)* [[G0]], align 4, !amdgpu.noclobber ![[MD]]
 // OPT: [[ADD0:%.*]] = fadd contract float [[V0]], 1.00e+00
 // OPT: store float [[ADD0]], float addrspace(1)* [[G0]], align 4
 // OPT: [[V1:%.*]] = load float, float addrspace(1)* [[G1]], align 4
@@ -130,7 +130,7 @@
 // COMMON-LABEL: define{{.*}} amdgpu_kernel void @_Z7kernel82SS(float addrspace(1)*{{.*}} %a.coerce)
 // CHECK: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
 // CHECK-NOT: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
-// OPT: [[VAL:%.*]] = load float, float addrspace(1)* %a.coerce, align 4, !amdgpu.noclobber !2
+// OPT: [[VAL:%.*]] = load float, float addrspace(1)* %a.coerce, align 4, !amdgpu.noclobber ![[MD]]
 // OPT: [[INC:%.*]] = fadd contract float [[VAL]], 3.00e+00
 // OPT: store float [[INC]], float addrspace(1)* %a.coerce, align 4
 // OPT: ret void
Index: clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
===
--- clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
+++ clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
@@ -1,16 +1,16 @@
 // Create module flag for code object version.
 
 // RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN:   -o - %s | FileCheck %s -check-prefix=NONE
+// RUN:   -o - %s | FileCheck %s -check-prefix=V4
 
 // RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN:   -mcode-object-version=2 -o - %s | FileCheck -check-prefix=NONE %s
+// RUN:   -mcode-object-version=2 -o - %s | FileCheck -check-prefix=V2 %s
 
 // RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN:   -mcode-object-version=3 -o - %s | FileCheck -check-prefix=NONE %s
+// RUN:   -mcode-object-version=3 -o - %s | FileCheck -check-prefix=V3 %s
 
 // RUN: %clang_cc1 

[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D134355#3809471, @b-sumner wrote:

> In D134355#3809294, @yaxunl wrote:
>
>> In D134355#3807435, @cfang wrote:
>>
>>> LGTM
>>>
>>> Should the module flag name be amdgpu_code_object_version or 
>>> amdhsa_code_object_version?
>>
>> Good question.
>>
>> @b-sumner Does the code object version affect PAL? Thanks.
>
> It should; we use the same code objects on PAL and ROCm.

Thanks. Then amdgpu_code_object_version sounds like the right name.
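
To make the naming decision concrete, here is a rough sketch of what the resulting module flag entry might look like in textual IR. The helpers `formatModuleFlag` and `codeObjectVersionFlag` are hypothetical and are not LLVM API; they only build a string. The sketch assumes the `Error` merge behavior is encoded as 1 and that the flag value is the code object version times 100, as the comment in CodeGenModule.cpp says.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not LLVM API): renders one module flag entry
// in the textual form !{i32 <behavior>, !"<name>", i32 <value>}.
std::string formatModuleFlag(unsigned behavior, const std::string &name,
                             unsigned value) {
  return "!{i32 " + std::to_string(behavior) + ", !\"" + name + "\", i32 " +
         std::to_string(value) + "}";
}

// Builds the entry for a given code object version, using the flag name
// agreed on in this thread. The value is the version times 100.
std::string codeObjectVersionFlag(unsigned version) {
  const unsigned kErrorBehavior = 1; // assumed encoding of llvm::Module::Error
  return formatModuleFlag(kErrorBehavior, "amdgpu_code_object_version",
                          version * 100);
}

// Small usage example for the default version discussed in the review.
inline void exampleUsage() {
  assert(codeObjectVersionFlag(4) ==
         "!{i32 1, !\"amdgpu_code_object_version\", i32 400}");
}
```

Because the name carries the `amdgpu` prefix rather than `amdhsa`, the same entry applies regardless of whether the code object is consumed on PAL or ROCm.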


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134355/new/

https://reviews.llvm.org/D134355

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-22 Thread Brian Sumner via Phabricator via cfe-commits
b-sumner added a comment.

In D134355#3809294, @yaxunl wrote:

> In D134355#3807435, @cfang wrote:
>
>> LGTM
>>
>> Should the module flag name be amdgpu_code_object_version or 
>> amdhsa_code_object_version?
>
> Good question.
>
> @b-sumner Does the code object version affect PAL? Thanks.

It should; we use the same code objects on PAL and ROCm.



[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-22 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done.
yaxunl added a comment.

In D134355#3807435, @cfang wrote:

> LGTM
>
> Should the module flag name be amdgpu_code_object_version or 
> amdhsa_code_object_version?

Good question.

@b-sumner Does the code object version affect PAL? Thanks.



[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-21 Thread Changpeng Fang via Phabricator via cfe-commits
cfang accepted this revision.
cfang added a comment.
This revision is now accepted and ready to land.

LGTM

Should the module flag name be amdgpu_code_object_version or 
amdhsa_code_object_version?



[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl marked an inline comment as done.
yaxunl added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:590-591
 // times 100.
-// ToDo: Enable module flag for all code object version when ROCm device
-// library is ready.
-if (getTarget().getTargetOpts().CodeObjectVersion == TargetOptions::COV_5) {
+if (getTarget().getTargetOpts().CodeObjectVersion !=
+TargetOptions::COV_None) {
   getModule().addModuleFlag(llvm::Module::Error,

arsenm wrote:
> What is the flag default value? What does it mean to target none?

The default value is COV_4. COV_None is for the device library, which must be 
linkable with bitcode built for any code object version. COV_None can only be 
specified through a clang -cc1 option, which is used when compiling the device 
library.
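
As a minimal sketch of the rule described above, the following standalone C++ models when a flag would be emitted and with what value. It does not use clang's `TargetOptions`; the `CodeObjectVersion` enum and both functions are hypothetical stand-ins, and the "value is version times 100" rule is taken from the comment in the hunk under review.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for the code object version setting; not clang's type.
enum class CodeObjectVersion { None, V2, V3, V4, V5 };

// Per the reply above, version 4 is the default when no version is requested.
constexpr CodeObjectVersion kDefaultVersion = CodeObjectVersion::V4;

// After the patch: every version except None gets a module flag whose value
// is the version times 100. None yields no flag, so a device library built
// that way can be linked with bitcode of any version.
std::optional<unsigned> moduleFlagValue(CodeObjectVersion v) {
  switch (v) {
  case CodeObjectVersion::None: return std::nullopt;
  case CodeObjectVersion::V2:   return 200u;
  case CodeObjectVersion::V3:   return 300u;
  case CodeObjectVersion::V4:   return 400u;
  case CodeObjectVersion::V5:   return 500u;
  }
  return std::nullopt; // unreachable for valid enumerators
}

// Before the patch: only version 5 produced a flag.
std::optional<unsigned> moduleFlagValueBeforePatch(CodeObjectVersion v) {
  if (v == CodeObjectVersion::V5)
    return 500u;
  return std::nullopt;
}

// Small usage example contrasting old and new behavior for the default.
inline void exampleUsage() {
  assert(!moduleFlagValueBeforePatch(kDefaultVersion).has_value());
  assert(moduleFlagValue(kDefaultVersion).value() == 400u);
}
```

The `std::optional` return keeps "no flag" distinct from any numeric value, which mirrors the role COV_None plays in the condition.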



[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: clang/lib/CodeGen/CodeGenModule.cpp:590-591
 // times 100.
-// ToDo: Enable module flag for all code object version when ROCm device
-// library is ready.
-if (getTarget().getTargetOpts().CodeObjectVersion == TargetOptions::COV_5) {
+if (getTarget().getTargetOpts().CodeObjectVersion !=
+TargetOptions::COV_None) {
   getModule().addModuleFlag(llvm::Module::Error,

What is the flag default value? What does it mean to target none?



[PATCH] D134355: [AMDGPU] Emit module flag for all code object versions

2022-09-21 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl created this revision.
yaxunl added reviewers: b-sumner, cfang.
Herald added subscribers: kosarev, kerbowa, t-tye, tpr, dstuttard, jvesely, 
kzhuravl.
Herald added a project: All.
yaxunl requested review of this revision.
Herald added a subscriber: wdng.

https://reviews.llvm.org/D134355

Files:
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
  clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu


Index: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu
===
--- clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu
+++ clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu
@@ -18,7 +18,7 @@
 // COMMON-LABEL: define{{.*}} amdgpu_kernel void @_Z7kernel1Pi(i32 addrspace(1)*{{.*}} %x.coerce)
 // CHECK: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
 // CHECK-NOT: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
-// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4, !amdgpu.noclobber !2
+// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4, !amdgpu.noclobber ![[MD:[0-9]+]]
 // OPT: [[INC:%.*]] = add nsw i32 [[VAL]], 1
 // OPT: store i32 [[INC]], i32 addrspace(1)* %x.coerce, align 4
 // OPT: ret void
@@ -30,7 +30,7 @@
 // COMMON-LABEL: define{{.*}} amdgpu_kernel void @_Z7kernel2Ri(i32 addrspace(1)*{{.*}} nonnull align 4 dereferenceable(4) %x.coerce)
 // CHECK: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
 // CHECK-NOT: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
-// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4, !amdgpu.noclobber !2
+// OPT: [[VAL:%.*]] = load i32, i32 addrspace(1)* %x.coerce, align 4, !amdgpu.noclobber ![[MD]]
 // OPT: [[INC:%.*]] = add nsw i32 [[VAL]], 1
 // OPT: store i32 [[INC]], i32 addrspace(1)* %x.coerce, align 4
 // OPT: ret void
@@ -68,7 +68,7 @@
 // OPT: [[R1:%.*]] = getelementptr inbounds %struct.S, %struct.S addrspace(4)* %0, i64 0, i32 1
 // OPT: [[P1:%.*]] = load float*, float* addrspace(4)* [[R1]], align 8
 // OPT: [[G1:%.*]] ={{.*}} addrspacecast float* [[P1]] to float addrspace(1)*
-// OPT: [[V0:%.*]] = load i32, i32 addrspace(1)* [[G0]], align 4, !amdgpu.noclobber !2
+// OPT: [[V0:%.*]] = load i32, i32 addrspace(1)* [[G0]], align 4, !amdgpu.noclobber ![[MD]]
 // OPT: [[INC:%.*]] = add nsw i32 [[V0]], 1
 // OPT: store i32 [[INC]], i32 addrspace(1)* [[G0]], align 4
 // OPT: [[V1:%.*]] = load float, float addrspace(1)* [[G1]], align 4
@@ -103,7 +103,7 @@
 // OPT: [[R1:%.*]] = getelementptr inbounds %struct.T, %struct.T addrspace(4)* %0, i64 0, i32 0, i64 1
 // OPT: [[P1:%.*]] = load float*, float* addrspace(4)* [[R1]], align 8
 // OPT: [[G1:%.*]] ={{.*}} addrspacecast float* [[P1]] to float addrspace(1)*
-// OPT: [[V0:%.*]] = load float, float addrspace(1)* [[G0]], align 4, !amdgpu.noclobber !2
+// OPT: [[V0:%.*]] = load float, float addrspace(1)* [[G0]], align 4, !amdgpu.noclobber ![[MD]]
 // OPT: [[ADD0:%.*]] = fadd contract float [[V0]], 1.00e+00
 // OPT: store float [[ADD0]], float addrspace(1)* [[G0]], align 4
 // OPT: [[V1:%.*]] = load float, float addrspace(1)* [[G1]], align 4
@@ -130,7 +130,7 @@
 // COMMON-LABEL: define{{.*}} amdgpu_kernel void @_Z7kernel82SS(float addrspace(1)*{{.*}} %a.coerce)
 // CHECK: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
 // CHECK-NOT: ={{.*}} addrspacecast [[TYPE:.*]] addrspace(1)* %{{.*}} to [[TYPE]]*
-// OPT: [[VAL:%.*]] = load float, float addrspace(1)* %a.coerce, align 4, !amdgpu.noclobber !2
+// OPT: [[VAL:%.*]] = load float, float addrspace(1)* %a.coerce, align 4, !amdgpu.noclobber ![[MD]]
 // OPT: [[INC:%.*]] = fadd contract float [[VAL]], 3.00e+00
 // OPT: store float [[INC]], float addrspace(1)* %a.coerce, align 4
 // OPT: ret void
Index: clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
===
--- clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
+++ clang/test/CodeGenCUDA/amdgpu-code-object-version.cu
@@ -1,16 +1,16 @@
 // Create module flag for code object version.
 
 // RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN:   -o - %s | FileCheck %s -check-prefix=NONE
+// RUN:   -o - %s | FileCheck %s -check-prefix=V4
 
 // RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN:   -mcode-object-version=2 -o - %s | FileCheck -check-prefix=NONE %s
+// RUN:   -mcode-object-version=2 -o - %s | FileCheck -check-prefix=V2 %s
 
 // RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN:   -mcode-object-version=3 -o - %s | FileCheck -check-prefix=NONE %s
+// RUN:   -mcode-object-version=3 -o - %s | FileCheck -check-prefix=V3 %s
 
 // RUN: %clang_cc1 -fcuda-is-device -triple amdgcn-amd-amdhsa -emit-llvm \
-// RUN:   -mcode-object-version=4 -o - %s | FileCheck