hliao added inline comments.

================
Comment at: clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu:19
+// COMMON-LABEL: define amdgpu_kernel void @_Z7kernel1Pi(i32*{{.*}} %x)
+// OPT: [[VAL:%.*]] = load i32, i32* %x, align 4
 // OPT: [[INC:%.*]] = add nsw i32 [[VAL]], 1
----------------
arsenm wrote:
> hliao wrote:
> > arsenm wrote:
> > > This is still a regression. Fixing up AA does not solve the problem
> > > this promotion is intended to solve. Generic accesses are worse
> > > independently of the aliasing properties.
> > Do you mean FLAT load/store has a worse addressing mode than GLOBAL ones?
> Yes. The flat offsets have a smaller range, and do not have the saddr mode.
> Flat accesses also won't avoid the extra lgkmcnt wait.

I plan to add support for selecting GLOBAL loads/stores once we can confirm that the pointer can only point to the GLOBAL/CONSTANT address spaces. Do you think that's a reasonable solution?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89980/new/

https://reviews.llvm.org/D89980

_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits