[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-21 Thread Johannes Doerfert via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG48d6f5240187: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive (authored by jdoerfert). Repository: rG LLVM Github Monorepo

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-20 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert updated this revision to Diff 446286. jdoerfert added a comment. Use <...> Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129536/new/ https://reviews.llvm.org/D129536 Files: clang/lib/Headers/__clang_cuda_intrinsics.h

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-20 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment. In D129536#3666884, @tra wrote: > In D129536#3666860, @jdoerfert wrote: >> The assertion is arguably not great but doesn't really matter, does it? How would I detect if they

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-20 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert updated this revision to Diff 446285. jdoerfert added a comment. Address comments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129536/new/ https://reviews.llvm.org/D129536 Files: clang/lib/Headers/__clang_cuda_intrinsics.h

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D129536#3666860, @jdoerfert wrote: > The assertion is arguably not great but doesn't really matter, does it? How would I detect if they are supported? The latest revision of the patch is fine in this regard. My comment

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-20 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment. In D129536#3666257, @tra wrote: > In D129536#3663957, @jdoerfert wrote: >> @tra, unsure about the crash. For me this passes fine (no gpu), is anything missing? > The tests

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D129536#3663957, @jdoerfert wrote: > @tra, unsure about the crash. For me this passes fine (no gpu), is anything missing? The tests in the patch are running with `-emit-llvm`, so they are not actually lowering to NVPTX and

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert added a comment. @tra, unsure about the crash. For me this passes fine (no gpu), is anything missing? Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129536/new/ https://reviews.llvm.org/D129536

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-19 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert updated this revision to Diff 445864. jdoerfert added a comment. Address comments Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D129536/new/ https://reviews.llvm.org/D129536 Files: clang/lib/Headers/__clang_cuda_intrinsics.h

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-12 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Oops. Thank you for fixing this.
Comment at: clang/test/CodeGenCUDA/shuffle_long_long.cu:52
+ long long ll = 17;
+ ull = __shfl(ull, 7, 32);
+ ll = __shfl(ll, 7, 32);
This crashes LLVM when we target sm_70 where these instructions no

[PATCH] D129536: [CUDA][FIX] Make shfl[_sync] for unsigned long long non-recursive

2022-07-11 Thread Johannes Doerfert via Phabricator via cfe-commits
jdoerfert created this revision. jdoerfert added a reviewer: tra. Herald added subscribers: mattd, bollu, yaxunl. Herald added a project: All. jdoerfert requested review of this revision. Herald added a project: clang. Herald added a subscriber: cfe-commits. A copy-paste error caused UB in the
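
For readers following the thread, here is a minimal host-side sketch of how a 64-bit shuffle overload can end up calling itself, which is the kind of problem the patch title describes. It is not the code from clang/lib/Headers/__clang_cuda_intrinsics.h. The name mock_shfl is hypothetical, and the 32-bit primitive is a stand-in that returns its input so the sketch runs without a GPU. The assumption is that the 64-bit variants are built from two 32-bit shuffles and that the unsigned variant forwards to the signed one.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a 32-bit shuffle primitive. On a GPU this would
// exchange values between lanes; here it returns its input unchanged so the
// sketch can run on the host.
inline int mock_shfl(int val, int /*src_lane*/, int /*width*/) { return val; }

// Signed 64-bit overload: split the value into two 32-bit halves, shuffle
// each half with the int overload, then reassemble.
inline long long mock_shfl(long long val, int src_lane, int width) {
  struct Bits { int a, b; };
  static_assert(sizeof(Bits) == sizeof(long long), "unexpected layout");
  Bits tmp;
  std::memcpy(&tmp, &val, sizeof(val));
  tmp.a = mock_shfl(tmp.a, src_lane, width);
  tmp.b = mock_shfl(tmp.b, src_lane, width);
  long long ret;
  std::memcpy(&ret, &tmp, sizeof(ret));
  return ret;
}

// Unsigned 64-bit overload: forward to the *signed* overload.
// If the argument were cast to unsigned long long instead, overload
// resolution would pick this same function again and it would recurse
// without end.
inline unsigned long long mock_shfl(unsigned long long val, int src_lane,
                                    int width) {
  return static_cast<unsigned long long>(
      mock_shfl(static_cast<long long>(val), src_lane, width));
}

// Mirrors the values used in the quoted test snippet (17, lane 7, width 32).
inline void check_shuffle_sketch() {
  unsigned long long ull = 17;
  long long ll = 17;
  assert(mock_shfl(ull, 7, 32) == 17ULL);
  assert(mock_shfl(ll, 7, 32) == 17LL);
}
```

The point of the sketch is only the cast in the unsigned overload: the target type of that cast decides which overload is selected, so a single wrong type there is enough to turn a forwarding wrapper into a self-call.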