[PATCH] D100394: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions

2023-05-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Hi. It looks like CUDA-11+ headers need a variant of the cp.async intrinsics that provides the optional src_size argument. I'm planning to add it to the existing intrinsics in NVPTX. This is just a heads-up in case you have existing uses of them that may need to be updated.
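
For readers unfamiliar with the optional src_size argument mentioned above: as the PTX ISA describes it, cp.async copies src_size bytes from the source and fills the rest of the cp-size destination with zeros. The sketch below is only a host-side model of that behavior in plain C++. The name `cp_async_model` is hypothetical; it is not one of the NVPTX intrinsics or Clang builtins from this patch, and it does not model the asynchronous completion of the real instruction.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical host-side model of the cp.async src_size semantics.
// This is NOT an LLVM, Clang, or CUDA API; it only illustrates the
// "copy src_size bytes, zero-fill up to cp_size" behavior.
inline void cp_async_model(void *dst, const void *src,
                           std::size_t cp_size, std::size_t src_size) {
    // PTX allows copy sizes of 4, 8, or 16 bytes for cp.async.
    assert(cp_size == 4 || cp_size == 8 || cp_size == 16);
    // A src_size larger than cp_size is undefined in PTX; reject it here.
    assert(src_size <= cp_size);

    // Copy the first src_size bytes from the source.
    std::memcpy(dst, src, src_size);
    // Zero-fill the remaining cp_size - src_size bytes of the destination.
    std::memset(static_cast<unsigned char *>(dst) + src_size, 0,
                cp_size - src_size);
}
```

Calling `cp_async_model(dst, src, 16, 4)` in this model copies the first 4 bytes and zeroes the remaining 12 bytes of the 16-byte destination.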

[PATCH] D100394: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions

2022-04-22 Thread Steffen Larsen via Phabricator via cfe-commits
steffenlarsen added a comment. In D100394#3466316, @nirvedhmeshram wrote: > Hello, I was interested in using `llvm.nvvm.cp.async.cg.shared.global.8` and `llvm.nvvm.cp.async.cg.shared.global.4` and was wondering if there is some fundamental reason

[PATCH] D100394: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions

2022-04-21 Thread Nirvedh Meshram via Phabricator via cfe-commits
nirvedhmeshram added a comment. Herald added subscribers: mattd, gchakrabarti, asavonic. Herald added a project: All. Hello, I was interested in using `llvm.nvvm.cp.async.cg.shared.global.8` and `llvm.nvvm.cp.async.cg.shared.global.4` and was wondering if there is some fundamental reason they

[PATCH] D100394: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async instructions

2021-05-17 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG02c2468864bb: [Clang][NVPTX] Add NVPTX intrinsics and builtins for CUDA PTX cp.async… (authored by nyalloc, committed by tra). Herald added a subscriber: cfe-commits. Repository: rG LLVM Github