[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-29 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu:12 +// RUN: llvm-link %t_0 %t_5 -o -| llvm-dis -o - | FileCheck -check-prefix=LINKED5 %s + +#include "Inputs/cuda.h" yaxunl wrote: > saiislam wrote: > >

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-29 Thread Saiyedul Islam via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGf616c3eeb43f: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5 (authored by saiislam). Changed prior to commit:

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-28 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl accepted this revision. yaxunl added a comment. LGTM. Thanks Comment at: clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu:12 +// RUN: llvm-link %t_0 %t_5 -o -| llvm-dis -o - | FileCheck -check-prefix=LINKED5 %s + +#include "Inputs/cuda.h"

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 accepted this revision. jhuber6 added a comment. This revision is now accepted and ready to land. I think it's fine now given that it's passing tests. Others feel free to comment. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-28 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 553991. saiislam marked 3 inline comments as done. saiislam added a comment. Minor fixes addressing reviewer's comment. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-28 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Just a few more nits. I think it's looking fine but I haven't tested it. Anyone else? Comment at: clang/tools/clang-linker-wrapper/ClangLinkerWrapper.cpp:406 + // pass on -mllvm options to the clang + for (const opt::Arg *Arg :

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-25 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 553413. saiislam marked 2 inline comments as done. saiislam added a comment. Changed getImplicitArgsSize to use sizeof. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-24 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h:49 -static_assert(sizeof(AMDGPUImplicitArgsTy) == 56, - "Unexpected size of implicit arguments"); +enum IMPLICITARGS : uint32_t { + COV4_SIZE = 56,

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-24 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam marked an inline comment as done. saiislam added inline comments. Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h:49 -static_assert(sizeof(AMDGPUImplicitArgsTy) == 56, - "Unexpected size of implicit arguments"); +enum

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-24 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h:49 -static_assert(sizeof(AMDGPUImplicitArgsTy) == 56, - "Unexpected size of implicit arguments"); +enum IMPLICITARGS : uint32_t { + COV4_SIZE = 56,

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-24 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:8649-8650 + // code-object-version=X needs to be passed to clang-linker-wrapper to ensure + // that it is used by lld. + if (const Arg *A =

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-24 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 553179. saiislam marked 7 inline comments as done. saiislam added a comment. Updated test case to check internalization of newly inserted global variable. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-23 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu:12 +// RUN: llvm-link %t_0 %t_5 -o -| llvm-dis -o - | FileCheck -check-prefix=LINKED5 %s + +#include "Inputs/cuda.h" need to test using clang -cc1 with -O3

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-23 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Codegen parts LGTM, questions with the driver parts Comment at: clang/lib/Driver/ToolChain.cpp:1368 if (A->getOption().matches(options::OPT_m_Group)) { - if (SameTripleAsHost) + // Pass code objection version to device toolchain + //

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-22 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 552344. saiislam marked 5 inline comments as done. saiislam added a comment. Used CreateConstInBoundsGEP1_32 for emitting GEP statements. Changed lambda function to simple fucntion body for defining the global variable. Repository: rG LLVM Github

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-21 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17067 + +Value *Iscov5 = CGF.Builder.CreateICmpSGE( +ABIVersion, Capitalization is weird, IsCOV5? Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17082-17083 +

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-21 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 552085. saiislam marked an inline comment as done. saiislam added a comment. Adressed reviewer's comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/ https://reviews.llvm.org/D139730 Files:

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-21 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam marked 4 inline comments as done. saiislam added inline comments. Comment at: clang/lib/CodeGen/Targets/AMDGPU.cpp:383 +CGM.getTarget().getTargetOpts().CodeObjectVersion, /*Size=*/32, +llvm::GlobalValue::WeakODRLinkage); +}

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/test/CodeGenCUDA/amdgpu-code-object-version-linking.cu:40-43 +__device__ void bar(int *out) +{ + *out = __builtin_amdgcn_workgroup_size_x(); +} test all the builtins? Repository: rG LLVM Github Monorepo

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-18 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17057 + Constant *Offset, *OffsetOld; + Value *DP, *DP1; + Spell out to DispatchPtr? Comment at: clang/lib/CodeGen/CodeGenModule.cpp:1206-1208 +

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-18 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam added a comment. In D139730#4597504 , @jhuber6 wrote: > Some nits. I'm assuming we're getting the code object in the backend now? > We'll need to make sure that `-Wl,--amdhsa-code-object-version` is passed to > the clang invocation inside of

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-18 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 551597. saiislam marked 4 inline comments as done. saiislam added a comment. Changed ImplitArgs implementation using struct. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-17 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Some nits. I'm assuming we're getting the code object in the backend now? We'll need to make sure that `-Wl,--amdhsa-code-object-version` is passed to the clang invocation inside of the `clang-linker-wrapper` to handle `-save-temps` mode. Comment

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-17 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17143-17145 + llvm::LoadInst *LD; + Constant *Offset, *Offset1; + Value *DP, *DP1; arsenm wrote: > Move down to define and initialize There are multiple uses of the same identifier.

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-17 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 551266. saiislam marked 6 inline comments as done. saiislam added a comment. Updated the patch as per reviewers comments. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-07 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17143-17145 + llvm::LoadInst *LD; + Constant *Offset, *Offset1; + Value *DP, *DP1; Move down to define and initialize Comment at:

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. I would suggest separating the clang/llvm part into a separate review. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/ https://reviews.llvm.org/D139730 ___ cfe-commits

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-07 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added a comment. need a lit test for the codegen of the clang builtin for cov 4/5/none and a lit test to show the branching code generated with cov none can be optimized away when linked with cov4 or cov5. Comment at: clang/lib/CodeGen/Targets/AMDGPU.cpp:383 +

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-07 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17138 +///and use its value for COV_4 or COV_5 approach. It is used for +///compiling device libraries in ABI-agnostic way. +/// Comment at:

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-07 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam added a comment. In D139730#4561622 , @arsenm wrote: > In D139730#4561619 , @arsenm wrote: > >> In D139730#4561575 , @jhuber6 >> wrote: >> >>> In

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-07 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 547751. saiislam marked 5 inline comments as done. saiislam added a comment. Removed unused cov5 implicitargs fields. Added comments about EmitAMDGPUWorkGroupSize and ABI-agnostica code emission. Adressed reviewers' comments. Repository: rG LLVM Github

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561619 , @arsenm wrote: > In D139730#4561575 , @jhuber6 wrote: > >> In D139730#4561573 , @arsenm wrote: >> >>> In D139730#4561540

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561575 , @jhuber6 wrote: > In D139730#4561573 , @arsenm wrote: > >> In D139730#4561540 , @jhuber6 >> wrote: >> >>> Could you explain

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D139730#4561573 , @arsenm wrote: > In D139730#4561540 , @jhuber6 wrote: > >> Could you explain briefly what the approach here is? I'm confused as to >> what's actually changed and how

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D139730#4561540 , @jhuber6 wrote: > Could you explain briefly what the approach here is? I'm confused as to > what's actually changed and how we're handling this difference. I thought if > this was just the definition of some

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments. Comment at: openmp/libomptarget/plugins-nextgen/amdgpu/utils/UtilitiesRTL.h:36 -// The implicit arguments of AMDGPU kernels. -struct AMDGPUImplicitArgsTy { - uint64_t OffsetX; - uint64_t OffsetY; - uint64_t OffsetZ; - uint64_t HostcallPtr; -

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Could you explain briefly what the approach here is? I'm confused as to what's actually changed and how we're handling this difference. I thought if this was just the definition of some builtin function we could just rely on the backend to figure it out. Why do we need

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments. Comment at: clang/lib/CodeGen/CGBuiltin.cpp:17146 +Value *ABIVersion; +if (ABIVersionC) { + ABIVersion = CGF.Builder.CreateAlignedLoad(CGF.Int32Ty, ABIVersionC, this must always pass Comment at:

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-08-04 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam updated this revision to Diff 547297. saiislam added a comment. Another attempt at cov5 support by using CodeGen for buitlin_amdgpu_workgroup_size. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-06-13 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. In D139730#4418630 , @arsenm wrote: > In D139730#3991628 , > @JonChesterfield wrote: > >> We currently require that library for libm, which I'm also not thrilled >> about, but at least

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-06-13 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. Herald added subscribers: jplehr, sunshaoce. In D139730#3991628 , @JonChesterfield wrote: > We currently require that library for libm, which I'm also not thrilled > about, but at least you can currently build and run openmp

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-01-18 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam marked an inline comment as done. saiislam added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7085 if (Triple.isAMDGPU()) { -handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs); +handleAMDGPUCodeObjectVersionOptions(D, C.getArgs(),

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-01-04 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam added inline comments. Comment at: openmp/libomptarget/DeviceRTL/src/Mapping.cpp:50 -uint32_t getNumHardwareThreadsInBlock() { - return __builtin_amdgcn_workgroup_size_x(); -} +uint32_t getNumHardwareThreadsInBlock() { return external_get_local_size(0); }

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2023-01-04 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam added a comment. Thanks everyone for your review and comments! I am going to address all of them in a series of smaller patches starting with D140784 . Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2022-12-13 Thread Shilei Tian via Phabricator via cfe-commits
tianshilei1992 added inline comments. Comment at: openmp/libomptarget/DeviceRTL/src/Mapping.cpp:19 #pragma omp begin declare target device_type(nohost) - +extern const uint16_t __oclc_ABI_version; #include "llvm/Frontend/OpenMP/OMPGridValues.h" jhuber6 wrote:

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2022-12-13 Thread Jon Chesterfield via Phabricator via cfe-commits
JonChesterfield added a comment. I am reluctant to add the dependency edge to rocm device libs to openmp's GPU runtime. We currently require that library for libm, which I'm also not thrilled about, but at least you can currently build and run openmp programs (that don't use libm, like much

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2022-12-09 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment. Could we elaborate on the benefits, please. Now we support two versions? Why is this helpful: > DeviceRTL uses rocm-device-libs instead of directly calling amdgcn builtins > for the functions which are affected by cov5. Comment at:

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2022-12-09 Thread Yaxun Liu via Phabricator via cfe-commits
yaxunl added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7085 if (Triple.isAMDGPU()) { -handleAMDGPUCodeObjectVersionOptions(D, Args, CmdArgs); +handleAMDGPUCodeObjectVersionOptions(D, C.getArgs(), CmdArgs, +

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2022-12-09 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. I'm not fully up-to-date, what's the main difference and advantage of the new code object version? What do all the new implicit arguments do. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:953 getAMDGPUCodeObjectVersion(getDriver(),

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2022-12-09 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. Maybe we should wait until D138389 lands and we can update both, otherwise we'd need a second patch. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D139730/new/

[PATCH] D139730: [OpenMP][DeviceRTL][AMDGPU] Support code object version 5

2022-12-09 Thread Saiyedul Islam via Phabricator via cfe-commits
saiislam created this revision. saiislam added reviewers: jdoerfert, JonChesterfield, jhuber6, yaxunl. Herald added subscribers: kosarev, kerbowa, guansong, tpr, dstuttard, jvesely, kzhuravl. Herald added a project: All. saiislam requested review of this revision. Herald added subscribers: