[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-28 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added a comment. In D153883#4456342, @tianshilei1992 wrote:
> I think it's better to just limit it to AMDGPU for now.
I rather doubt this is a good decision. Better to support it for all targets. NVPTX supports(ed) (IIRC) static allocation and

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added a comment. In D153883#4456342, @tianshilei1992 wrote:
> I think it's better to just limit it to AMDGPU for now.
> BTW, it might be worth checking if heap-to-stack will push it back to stack.
If you're really going to go for backend

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-28 Thread Shilei Tian via Phabricator via cfe-commits
tianshilei1992 added a comment. I think it's better to just limit it to AMDGPU for now. BTW, it might be worth checking if heap-to-stack will push it back to stack.
Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D153883/new/

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
+// deallocation call of __kmpc_free_shared() is emitted later.
+if (getLangOpts().OpenMP && getTarget().getTriple().isAMDGCN()) {
+ // Emit call to __kmpc_alloc_shared() instead of the

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-28 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
doru1004 added inline comments.
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
+// deallocation call of __kmpc_free_shared() is emitted later.
+if (getLangOpts().OpenMP && getTarget().getTriple().isAMDGCN()) {
+ // Emit call to __kmpc_alloc_shared() instead of the

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-28 Thread Matt Arsenault via Phabricator via cfe-commits
arsenm added inline comments.
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
+// deallocation call of __kmpc_free_shared() is emitted later.
+if (getLangOpts().OpenMP && getTarget().getTriple().isAMDGCN()) {
+ // Emit call to __kmpc_alloc_shared() instead of the

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added inline comments.
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
+// deallocation call of __kmpc_free_shared() is emitted later.
+if (getLangOpts().OpenMP && getTarget().getTriple().isAMDGCN()) {
+ // Emit call to __kmpc_alloc_shared() instead of the

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
doru1004 added inline comments.
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
+// deallocation call of __kmpc_free_shared() is emitted later.
+if (getLangOpts().OpenMP && getTarget().getTriple().isAMDGCN()) {
+ // Emit call to __kmpc_alloc_shared() instead of the

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
doru1004 updated this revision to Diff 535186. doru1004 marked 3 inline comments as done.
Repository: rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION https://reviews.llvm.org/D153883/new/
https://reviews.llvm.org/D153883
Files: clang/lib/CodeGen/CGDecl.cpp

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
doru1004 added inline comments.
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1085
 }
- for (const auto *VD : I->getSecond().EscapedVariableLengthDecls) {
-// Use actual memory size of the VLA object including the padding
doru1004 wrote:
> ABataev

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
doru1004 added inline comments.
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1085
 }
- for (const auto *VD : I->getSecond().EscapedVariableLengthDecls) {
-// Use actual memory size of the VLA object including the padding
ABataev wrote:
> jhuber6

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added inline comments.
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1085
 }
- for (const auto *VD : I->getSecond().EscapedVariableLengthDecls) {
-// Use actual memory size of the VLA object including the padding
jhuber6 wrote:
> doru1004

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added inline comments.
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1085
 }
- for (const auto *VD : I->getSecond().EscapedVariableLengthDecls) {
-// Use actual memory size of the VLA object including the padding
doru1004 wrote:
> ABataev

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
doru1004 added inline comments.
Comment at: clang/lib/CodeGen/CGOpenMPRuntimeGPU.cpp:1085
 }
- for (const auto *VD : I->getSecond().EscapedVariableLengthDecls) {
-// Use actual memory size of the VLA object including the padding
ABataev wrote:
> Why this

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Alexey Bataev via Phabricator via cfe-commits
ABataev added a comment. Add the runtime test?
Comment at: clang/lib/CodeGen/CGDecl.cpp:587
+std::pair AddrSizePair;
+KmpcAllocFree(std::pair AddrSizePair)
+: AddrSizePair(AddrSizePair) {}
Better to pass it as const reference

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Joseph Huber via Phabricator via cfe-commits
jhuber6 added a comment. So this is implementing the `stacksave` using `__kmpc_alloc_shared` instead? It makes sense since the OpenMP standard expects sharing for the stack. I wonder how this interfaces with `-fopenmp-cuda-mode`.
Comment at: clang/lib/CodeGen/CGDecl.cpp:1603
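
For readers following the thread, the idea jhuber6 describes can be modelled on the host: instead of a stack save/restore pair around a VLA, storage is obtained from an allocation call when the VLA is declared, and the matching free runs as a scope cleanup. The sketch below is only an analogy under stated assumptions, not the patch itself. `mock_alloc_shared`, `mock_free_shared`, `SharedVlaGuard`, and `sum_first_n` are hypothetical names that stand in for `__kmpc_alloc_shared`, `__kmpc_free_shared`, and the cleanup the patch registers. The element types of the address/size pair are also assumed here, since the thread preview does not show them.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <utility>

// Hypothetical host-side stand-ins for the device runtime entry points.
// They only track how many bytes are currently live.
static std::size_t g_live_bytes = 0;

void *mock_alloc_shared(std::size_t bytes) {
  g_live_bytes += bytes;
  return std::malloc(bytes);
}

void mock_free_shared(void *ptr, std::size_t bytes) {
  assert(g_live_bytes >= bytes);  // every free must match an earlier alloc
  g_live_bytes -= bytes;
  std::free(ptr);
}

// Hypothetical scope guard playing the role of the deferred cleanup:
// it remembers the (address, size) pair and frees it at scope exit.
// The pair is taken by const reference rather than by value.
class SharedVlaGuard {
public:
  explicit SharedVlaGuard(const std::pair<void *, std::size_t> &addr_size)
      : addr_size_(addr_size) {}
  ~SharedVlaGuard() { mock_free_shared(addr_size_.first, addr_size_.second); }
  SharedVlaGuard(const SharedVlaGuard &) = delete;
  SharedVlaGuard &operator=(const SharedVlaGuard &) = delete;

private:
  std::pair<void *, std::size_t> addr_size_;
};

// Models the lowering of `int vla[n];` inside an offloaded region:
// allocate at the declaration, free when the enclosing scope ends.
int sum_first_n(int n) {
  const std::size_t bytes = static_cast<std::size_t>(n) * sizeof(int);
  int *vla = static_cast<int *>(mock_alloc_shared(bytes));
  const std::pair<void *, std::size_t> addr_size(vla, bytes);
  SharedVlaGuard guard(addr_size);

  int sum = 0;
  for (int i = 0; i < n; ++i) {
    vla[i] = i + 1;
    sum += vla[i];
  }
  return sum;  // guard's destructor releases the storage here
}
```

Calling `sum_first_n(4)` fills four elements and leaves `g_live_bytes` at zero afterwards, which is the pairing property the cleanup is meant to guarantee. On a real device the point of the runtime allocation is that other threads in the team can see the storage, which a host-side `malloc` does not model.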

[PATCH] D153883: [Clang][OpenMP] Enable use of __kmpc_alloc_shared for VLAs defined in AMD GPU offloaded regions

2023-06-27 Thread Gheorghe-Teodor Bercea via Phabricator via cfe-commits
doru1004 created this revision.
doru1004 added reviewers: ronlieb, gregrodgers, carlo.bertolli, arsenm, jdoerfert, dhruvachak, ABataev.
doru1004 added a project: OpenMP.
Herald added subscribers: sunshaoce, guansong, yaxunl, jvesely.
Herald added a project: All.
doru1004 requested review of this