[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-07-24 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:187
 
+Value *AMDGPULateCodeGenPrepare::buildLegalLaneIntrinsic(
+IRBuilder<> , Intrinsic::ID IID, Value *Data0, Value *Data1, Value 
*Lane0,

jrbyrnes wrote:
> arsenm wrote:
> > You're not relying on this for correctness are you? This is an optimization 
> > pass, you can't lower here. You also shouldn't need to handle this in the 
> > IR, it should codegen normally 
> This is the legalization for non 32bit types -- I don't exactly know why it 
> wasn't handled via the normal codegen / selection process. @nhaehnle , I 
> believe you tried this in https://reviews.llvm.org/D86154 -- do you happen to 
> remember why we do legalization this way? If not, I'll rework the approach.
CodeGenPrepare/LateCodeGenPrepare can't be used for lowering, they're 
optimization passes. Legalization needs to be handled in the codegen


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-07-06 Thread Jeffrey Byrnes via Phabricator via cfe-commits
jrbyrnes added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:187
 
+Value *AMDGPULateCodeGenPrepare::buildLegalLaneIntrinsic(
+IRBuilder<> , Intrinsic::ID IID, Value *Data0, Value *Data1, Value 
*Lane0,

arsenm wrote:
> You're not relying on this for correctness are you? This is an optimization 
> pass, you can't lower here. You also shouldn't need to handle this in the IR, 
> it should codegen normally 
This is the legalization for non 32bit types -- I don't exactly know why it 
wasn't handled via the normal codegen / selection process. @nhaehnle , I 
believe you tried this in https://reviews.llvm.org/D86154 -- do you happen to 
remember why we do legalization this way? If not, I'll rework the approach.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-07-06 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm requested changes to this revision.
arsenm added inline comments.
This revision now requires changes to proceed.
Herald added a subscriber: wangpc.



Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:187
 
+Value *AMDGPULateCodeGenPrepare::buildLegalLaneIntrinsic(
+IRBuilder<> , Intrinsic::ID IID, Value *Data0, Value *Data1, Value 
*Lane0,

You're not relying on this for correctness are you? This is an optimization 
pass, you can't lower here. You also shouldn't need to handle this in the IR, 
it should codegen normally 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-06-23 Thread Jeffrey Byrnes via Phabricator via cfe-commits
jrbyrnes added a comment.

In D147732#4434557 , @arsenm wrote:

> I think this may not hard break mesa. I believe mesa bypasses the intrinsic 
> creation API, and just declares the string name of the intrinsic. The type 
> name mangling suffix is technically irrelevant, and as long as you use a 
> consistent type with a consistent suffix things should work out (and the null 
> suffix also works). After committing mesa should still move to adding the 
> type suffix

I can echo this sentiment.

The main issues arises when there are untyped calls to CreateIntrinsic, as the 
intrinsics are no longer defined with a type.

For {read, readfirst, write, perm}lanes, Mesa uses LLVMAddFunction and 
LLVMBuildCall2 APIs under its own ac_build_intrinsic -- these calls are all 
typed in the current implementation. Also, (as expected) the implementation 
inserts bitcasts to cast to Int32Ty before inserting these calls since only 
that version of the intrinsic currently exists. This also implies they wont 
have an issue with intrinsic / type declarations.

Unless I have missed something, I don't see why switching to type-mangling 
would cause an issue with Mesa's current implementation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-06-20 Thread Jeffrey Byrnes via Phabricator via cfe-commits
jrbyrnes updated this revision to Diff 533080.
jrbyrnes marked 5 inline comments as done.
jrbyrnes added a comment.

Address comments + enable selection of ptr types


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

Files:
  clang/include/clang/Basic/BuiltinsAMDGPU.def
  clang/lib/CodeGen/CGBuiltin.cpp
  clang/test/CodeGenOpenCL/builtins-amdgcn-gfx10.cl
  clang/test/CodeGenOpenCL/builtins-amdgcn.cl
  clang/test/SemaOpenCL/builtins-amdgcn-error-gfx10-param.cl
  llvm/include/llvm/IR/IntrinsicsAMDGPU.td
  llvm/lib/Target/AMDGPU/AMDGPUAtomicOptimizer.cpp
  llvm/lib/Target/AMDGPU/AMDGPUCodeGenPrepare.cpp
  llvm/lib/Target/AMDGPU/AMDGPUInstCombineIntrinsic.cpp
  llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp
  llvm/lib/Target/AMDGPU/AMDGPUTargetMachine.cpp
  llvm/lib/Target/AMDGPU/SIInstructions.td
  llvm/lib/Target/AMDGPU/VOP3Instructions.td
  llvm/test/Analysis/UniformityAnalysis/AMDGPU/intrinsics.ll
  llvm/test/Assembler/autoupgrade-amdgpu-intrinsics.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/atomic_optimizations_mul_one.ll
  llvm/test/CodeGen/AMDGPU/GlobalISel/inst-select-amdgcn.readfirstlane.mir
  llvm/test/CodeGen/AMDGPU/atomic_optimizations_local_pointer.ll
  llvm/test/CodeGen/AMDGPU/global-atomic-scan.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.permlane.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.readfirstlane.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll
  llvm/test/CodeGen/AMDGPU/llvm.amdgcn.writelane.ll
  llvm/test/CodeGen/AMDGPU/permlane-ptr.ll
  llvm/test/Transforms/InstCombine/AMDGPU/amdgcn-intrinsics.ll
  llvm/test/Verifier/AMDGPU/intrinsic-immarg.ll

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-06-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll:5
+declare i16 @llvm.amdgcn.readlane.i16(i16, i32) #0
+declare half @llvm.amdgcn.readlane.f16(half, i32) #0
+declare float @llvm.amdgcn.readlane.f32(float, i32) #0

arsenm wrote:
> Add bfloat and <2 x i16>, <2 x half>, <2 x bfloat> tests
Also p2, p3, p5, p6


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-06-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:209
+bool is16Bit =
+(EltType->isIntegerTy() && EltType->getIntegerBitWidth() == 16) ||
+(EltType->isHalfTy());

isIntegerTy(16). Also, just check the bitsize is 16. Might as well also handle 
bfloat



Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:301-310
+  if (Ty->isPointerTy()) {
+unsigned BitWidth = DL->getTypeSizeInBits(Ty);
+auto ResTy = Result->getType();
+if (ResTy->isVectorTy()) {
+  auto Scalarized = B.CreateBitCast(
+  Result, IntegerType::get(Mod->getContext(), BitWidth));
+  return B.CreateIntToPtr(Scalarized, Ty);

Just let pointer types pass through to codegen, we try really hard to never 
introduce ptrtoint/inttoptr



Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll:5
+declare i16 @llvm.amdgcn.readlane.i16(i16, i32) #0
+declare half @llvm.amdgcn.readlane.f16(half, i32) #0
+declare float @llvm.amdgcn.readlane.f32(float, i32) #0

Add bfloat and <2 x i16>, <2 x half>, <2 x bfloat> tests


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-06-20 Thread Nuno Lopes via Phabricator via cfe-commits
nlopes added inline comments.



Comment at: llvm/lib/Target/AMDGPU/AMDGPULateCodeGenPrepare.cpp:213
+
+Value *Result = UndefValue::get(Ty);
+for (int i = 0; i < EC; i += 1 + is16Bit) {

Please use poison wherever possible. In this case it seems it's just a 
placeholder, so it can be poison.
We're trying to get rid of poison. Thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-06-20 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

I think this may not hard break mesa. I believe mesa bypasses the intrinsic 
creation API, and just declares the string name of the intrinsic. The type name 
mangling suffix is technically irrelevant, and as long as you use a consistent 
type with a consistent suffix things should work out (and the null suffix also 
works). After committing mesa should still move to adding the type suffix




Comment at: llvm/test/CodeGen/AMDGPU/llvm.amdgcn.readlane.ll:13
 define amdgpu_kernel void @test_readlane_sreg_sreg(i32 %src0, i32 %src1) #1 {
   %readlane = call i32 @llvm.amdgcn.readlane(i32 %src0, i32 %src1)
   call void asm sideeffect "; use $0", "s"(i32 %readlane)

This is a hint mesa won't break


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-06-09 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment.

D84639  is an old version of this. It has some 
additional tests not covered here, can you copy them?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-05-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

In D147732#4343146 , @jrbyrnes wrote:

> Hi @yaxunl thanks for the comment.
>
> I looked into AutoUpgrade, and it catches upgrades of this type (unmangled -> 
> mangled) by default (via remangleIntrinsicFunction). I've included tests to 
> demonstrate. Is this the desired effect, or are you asking to blacklist these 
> intrinstics from being upgraded?

That is the desired effect. Thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D147732: [AMDGPU] Add type mangling for {read, write, readfirst, perm}lane intrinsics

2023-05-15 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment.

Please make sure we can consume old IR using unmangled llvm.amdgcn.permlanex16 
by fixing auto upgrade. Otherwise, it may cause regressions for device libs or 
existing apps. Thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D147732/new/

https://reviews.llvm.org/D147732

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits