[PATCH] D141375: [SYCL][OpenMP] Fix compilation errors for unsupported __bf16 intrinsics

2023-09-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/Sema.cpp:1978-1979 !Context.getTargetInfo().hasInt128Type()) || +(Ty->isBFloat16Type() && !Context.getTargetInfo().hasBFloat16Type() && + !LangOpts.CUDAIsDevice) || LongDoubleMismatched) {

[PATCH] D141375: [SYCL][OpenMP] Fix compilation errors for unsupported __bf16 intrinsics

2023-09-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Herald added a subscriber: jplehr. Comment at: clang/lib/Sema/Sema.cpp:1978-1979 !Context.getTargetInfo().hasInt128Type()) || +(Ty->isBFloat16Type() && !Context.getTargetInfo().hasBFloat16Type() && + !LangOpts.CUDAIsDevice) ||

[PATCH] D158778: [CUDA] Propagate __float128 support from the host.

2023-08-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D158778#4626181 , @jhuber6 wrote: > Just doing a simple example here https://godbolt.org/z/Y3E58PKMz shows that > for NVPTX we error out (as I would expect) but for AMDGPU we emit an x86 > 80-bit double. With this patch NVPTX

[PATCH] D158778: [CUDA] Propagate __float128 support from the host.

2023-08-29 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: jhuber6. tra added a comment. In D158778#4624408 , @ABataev wrote: > Just checks removal should be fine Looks like OpenMP handles long double and __float128 differently -- it always insists on using the host's FP format for

[PATCH] D158778: [CUDA] Propagate __float128 support from the host.

2023-08-28 Thread Artem Belevich via Phabricator via cfe-commits
tra added a subscriber: ABataev. tra added a comment. @ABataev This patch breaks two tests: - github.com/llvm/llvm-project/blob/main/clang/test/OpenMP/nvptx_unsupported_type_codegen.cpp - github.com/llvm/llvm-project/blob/main/clang/test/OpenMP/nvptx_unsupported_type_messages.cpp It's

[PATCH] D158778: [CUDA] Propagate __float128 support from the host.

2023-08-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Also, https://github.com/llvm/llvm-project/issues/46903 Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D158778/new/ https://reviews.llvm.org/D158778 ___ cfe-commits mailing list

[PATCH] D158778: [CUDA] Propagate __float128 support from the host.

2023-08-24 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, gchakrabarti, asavonic, kerbowa, bixia, tpr, yaxunl, jvesely. Herald added a project: All. tra edited the summary of this revision. tra published this revision for review. tra added reviewers: jlebar, yaxunl. tra added a comment. Herald

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c:9 + +// Check that -fsplit-machine-functions is passed to both x86 and cuda compilation and does not cause driver error. +// MFS2: -fsplit-machine-functions

[PATCH] D158238: Implement __builtin_fmaximum/fminimum*

2023-08-18 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 551646. tra added a comment. Fixed test RUN lines Files: clang/include/clang/Basic/Builtins.def

[PATCH] D158238: Implement __builtin_fmaximum/fminimum*

2023-08-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @fhahn who else should take a look at the patch?

[PATCH] D158238: Implement __builtin_fmaximum/fminimum*

2023-08-18 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added a subscriber: bixia. Herald added a project: All. tra updated this revision to Diff 551336. tra added a comment. tra updated this revision to Diff 551338. tra published this revision for review. tra added a reviewer: fhahn. Herald added subscribers:

[PATCH] D158226: [CUDA/NVPTX] Improve handling of memcpy for -Os compilations.

2023-08-18 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG72757343fa86: [CUDA/NVPTX] Improve handling of memcpy for -Os compilations. (authored by tra).

[PATCH] D158247: [CUDA][HIP] Fix overloading resolution in global variable initializer

2023-08-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Same reproducer but for CUDA: https://godbolt.org/z/WhjTMffnx Comment at: clang/include/clang/Sema/Sema.h:4753 + /// Otherwise, use \p D to determiine the host/device target. bool CheckCallingConvAttr(const ParsedAttr , CallingConv , +

[PATCH] D158226: [CUDA/NVPTX] Improve handling of memcpy for -Os compilations.

2023-08-17 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, gchakrabarti, asavonic, bixia, hiraditya, yaxunl. Herald added a project: All. tra published this revision for review. tra added a reviewer: alexfh. Herald added subscribers: llvm-commits, cfe-commits, wangpc, jholewinski. Herald added

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-17 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c:16 +// causes a warning. +// RUN: %clang --target=x86_64-unknown-linux-gnu -nogpulib -nogpuinc \ +// RUN: --cuda-gpu-arch=sm_70 -x cuda -fsplit-machine-functions -S %s 2>&1

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-14 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c:9 + +// Check that -fsplit-machine-functions is passed to both x86 and cuda compilation and does not cause driver error. +// MFS2: -fsplit-machine-functions

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c:9 + +// Check that -fsplit-machine-functions is passed to both x86 and cuda compilation and does not cause driver error. +// MFS2: -fsplit-machine-functions

[PATCH] D157750: Properly handle -fsplit-machine-functions for fatbinary compilation

2023-08-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Comment at: clang/test/Driver/fsplit-machine-functions-with-cuda-nvptx.c:9 + +// Check that -fsplit-machine-functions is passed to both x86 and cuda compilation and does not cause driver error. +// MFS2: -fsplit-machine-functions We will

[PATCH] D156014: [Clang][NVPTX] Permit use of the alias attribute for NVPTX targets

2023-08-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaDeclAttr.cpp:1995 } - if (S.Context.getTargetInfo().getTriple().isNVPTX()) { -S.Diag(AL.getLoc(), diag::err_alias_not_supported_on_nvptx); jhuber6 wrote: > jhuber6 wrote: > > tra wrote: > > > tra

[PATCH] D156014: [Clang][NVPTX] Permit use of the alias attribute for NVPTX targets

2023-07-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaDeclAttr.cpp:1995 } - if (S.Context.getTargetInfo().getTriple().isNVPTX()) { -S.Diag(AL.getLoc(), diag::err_alias_not_supported_on_nvptx); tra wrote: > Allowing or not `noreturn` depends on the

[PATCH] D156014: [Clang][NVPTX] Permit use of the alias attribute for NVPTX targets

2023-07-21 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaDeclAttr.cpp:1995 } - if (S.Context.getTargetInfo().getTriple().isNVPTX()) { -S.Diag(AL.getLoc(), diag::err_alias_not_supported_on_nvptx); Allowing or not `noreturn` depends on the CUDA version

[PATCH] D154559: [clang] Fix constant evaluation about static member function

2023-07-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @rsmith Richard, PTAL. This needs your language lawyering expertise.

[PATCH] D155539: [CUDA][HIP] Use the same default language std as C++

2023-07-18 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. We should probably update the documentation to say that the C++ standard version for CUDA/HIP compilation now matches the C++ default instead of the previously used c++14.

[PATCH] D154822: [clang] Support '-fgpu-default-stream=per-thread' for NVIDIA CUDA

2023-07-13 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rGf05b58a9468c: [clang] Support -fgpu-default-stream=per-thread for NVIDIA CUDA (authored by boxu-zhang, committed by tra).

[PATCH] D154822: [clang] Support '-fgpu-default-stream=per-thread' for NVIDIA CUDA

2023-07-12 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > Can anyone push this? I can help with this. How do you want your commit to be attributed? The patch currently has `boxu.zhang `. Do you want it to be changed to something else?

[PATCH] D154300: [CUDA][HIP] Fix template argument deduction

2023-07-11 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaOverload.cpp:12758-12764 +std::optional MorePreferableByCUDA = +CheckCUDAPreference(FD, Result); +// If FD has different CUDA preference than Result. +if (MorePreferableByCUDA) { +

[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/lib/Frontend/InitPreprocessor.cpp:1294 +if (!LangOpts.HIP) + Builder.defineMacro("__CLANG_CUDA_APPROX_TRANSCENDENTALS__"); +

[PATCH] D154797: [CUDA][HIP] Rename and fix `-fcuda-approx-transcendentals`

2023-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Looks good in general. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7221-7223 +bool UseApproxTranscendentals = false; +if (Args.hasFlag(options::OPT_ffast_math, options::OPT_fno_fast_math, + false)) ```

[PATCH] D154822: Support '-fgpu-default-stream=per-thread' for NVIDIA CUDA

2023-07-10 Thread Artem Belevich via Phabricator via cfe-commits
tra added a reviewer: tra. tra added a comment. Looking at CUDA headers, it appears that changing only compiler-generated-glue may be insufficient. A lot of other CUDA API calls need to be changed to `_ptsz` variant and for that we need to have `CUDA_API_PER_THREAD_DEFAULT_STREAM` defined.

[PATCH] D154077: [HIP] Fix version detection for old HIP-PATH

2023-06-29 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM in general with a minor suggestion. Comment at: clang/lib/Driver/ToolChains/AMDGPU.cpp:471 {std::string(SharePath) + "/hip/version", +

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-27 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @bkramer Ben, PTAL when you get a chance.

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-23 Thread Artem Belevich via Phabricator via cfe-commits
tra planned changes to this revision. tra added a comment. We're still missing clang-side tests for the new builtins. Now that the intrinsics use `bfloat` we also need to change builtin signatures. Or change codegen to bitcast to/from bfloat to match the types. To be continued next week.

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-23 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 534130. tra added a comment. Fixed few missed places in bf16 lowering. Changed intrinsic types to use bfloat type. Auto-upgrade the old intrinsic variants. Updated broken tests.

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-23 Thread Artem Belevich via Phabricator via cfe-commits
tra commandeered this revision. tra edited reviewers, added: kushanam; removed: tra. tra added a comment. This revision now requires review to proceed. Herald added a subscriber: bixia. I've got a few more fixes for the patch.

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-23 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. The latest patch revision still fails on a few LLVM tests: Failed Tests (3): LLVM :: CodeGen/NVPTX/bf16-instructions.ll LLVM :: CodeGen/NVPTX/f16x2-instructions.ll LLVM :: CodeGen/NVPTX/math-intrins-sm80-ptx70.ll Comment at:

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-20 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: llvm/lib/Target/NVPTX/NVPTXInstrInfo.td:559-568 -multiclass CVT_FROM_FLOAT_SM80 { -def _f32 : - NVPTXInst<(outs RC:$dst), -(ins Float32Regs:$src, CvtMode:$mode), -

[PATCH] D151361: [CUDA] bump supported CUDA version to 12.1/11.8

2023-06-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/docs/ReleaseNotes.rst:590 +- Clang now supports CUDA SDK up to 12.1 bader wrote: > @tra, could you update llvm/docs/CompileCudaWithLLVM.rst as well, please? Done in

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-15 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: llvm/lib/Target/NVPTX/NVPTXIntrinsics.td:1271-1287 -def : Pat<(int_nvvm_ff2f16x2_rn Float32Regs:$a, Float32Regs:$b), - (CVT_f16x2_f32 Float32Regs:$a, Float32Regs:$b, CvtRN)>; -def : Pat<(int_nvvm_ff2f16x2_rn_relu Float32Regs:$a,

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-13 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM with few nits. Thank you for your patience with revising the patch. Comment at: llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp:629-631 + const bool IsBFP16FP16x2NegAvailable =

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-12 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Almost there. Just few cosmetic nits remaining. Comment at: llvm/lib/Target/NVPTX/MCTargetDesc/NVPTXInstPrinter.cpp:64-69 + case 9: OS << "%h"; break; case 8: + case 10: OS << "%hh"; tra wrote: > Looks like I've forgot

[PATCH] D16559: [CUDA] Add -fcuda-allow-variadic-functions.

2023-06-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D16559#4410067 , @garymm wrote: > Could you please add this to the documentation? > Could this be made the default? It seems like nvcc does this by default. Clang already does that, though we only allow variadic functions that

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-09 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: llvm/lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp:615 // need to deal with. if (Vector.getSimpleValueType() != MVT::v2f16) return false; This needs to be updated to include v2bf16

[PATCH] D152403: [Clang][CUDA] Disable diagnostics for neon attrs for GPU-side CUDA compilation

2023-06-08 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM with a nit. Comment at: clang/lib/Sema/SemaType.cpp:8168 +IsTargetCUDAAndHostARM = +!AuxTI || AuxTI->getTriple().isAArch64() || AuxTI->getTriple().isARM(); + }

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-08 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Overall looks good with few minor nits and a couple of questions. Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:604 def int_nvvm_f # operation # variant : ClangBuiltin, DefaultAttrsIntrinsic<[llvm_i16_ty], [llvm_i16_ty,

[PATCH] D152403: [Clang][CUDA] Disable diagnostics for neon attrs for GPU-side CUDA compilation

2023-06-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaType.cpp:8168 +IsTargetCUDAAndHostARM = +!AuxTI || AuxTI->getTriple().isAArch64() || AuxTI->getTriple().isARM(); + } Should it be `AuxTI && (AuxTI->getTriple().isAArch64() ||
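
The question in this comment turns on what happens when there is no auxiliary target. A minimal sketch of the difference follows; `FakeTargetInfo`, `asWritten`, and `asSuggested` are hypothetical stand-ins rather than Clang APIs, and the suggested form assumes the truncated expression continues with the ARM check.

```cpp
#include <cassert>

// Hypothetical stand-in for an auxiliary (host) target description.
// This is not clang::TargetInfo; it only models the two triple checks.
struct FakeTargetInfo {
  bool IsAArch64;
  bool IsARM;
};

// Shape of the expression in the patch: a null pointer short-circuits to true.
bool asWritten(const FakeTargetInfo *Aux) {
  return !Aux || Aux->IsAArch64 || Aux->IsARM;
}

// Assumed shape of the reviewer's alternative: a null pointer yields false.
bool asSuggested(const FakeTargetInfo *Aux) {
  return Aux && (Aux->IsAArch64 || Aux->IsARM);
}
```

The two forms agree whenever the pointer is non-null and differ only in the null case, which is the case the review comment appears to be asking about.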

[PATCH] D152391: [Clang] Allow bitcode linking when the input is LLVM-IR

2023-06-07 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. > clang in.bc -Xclang -mlink-builtin-bitcode -Xclang libdevice.10.bc If that's something we intend to expose to the user, should we consider promoting it to a top-level driver option?

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp:615 setFP16OperationAction(Op, MVT::v2f16, Legal, Expand); - } - - for (const auto : {ISD::FADD, ISD::FMUL, ISD::FSUB, ISD::FMA}) { setBF16OperationAction(Op, MVT::bf16, Legal,

[PATCH] D99201: [HIP] Diagnose unaligned atomic for amdgpu

2023-06-06 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Driver/ToolChains/Clang.cpp:7215 +// warnings as errors. +CmdArgs.push_back("-Werror=atomic-alignment"); } Should it be done from `HIPAMDToolChain::addClangWarningOptions` ? That's where Darwin does

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: llvm/lib/Target/NVPTX/NVPTXISelLowering.cpp:615 setFP16OperationAction(Op, MVT::v2f16, Legal, Expand); - } - - for (const auto : {ISD::FADD, ISD::FMUL, ISD::FSUB, ISD::FMA}) { setBF16OperationAction(Op, MVT::bf16, Legal,

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-05 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. FYI https://reviews.llvm.org/D151601 has landed in https://github.com/llvm/llvm-project/commit/dc90f42ea7b4f6d9e643f5ad2ba663eba2f9e421.

[PATCH] D152164: [CUDA][HIP] Externalize device var in anonymous namespace

2023-06-05 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added inline comments. This revision is now accepted and ready to land. Comment at: clang/test/CodeGenCUDA/anon-ns.cu:46 + +// COMMON-DAG: @[[STR1:.*]] = {{.*}} c"[[KERN1]]\00" +// COMMON-DAG: @[[STR2:.*]] = {{.*}} c"[[KERN2]]\00"

[PATCH] D152027: [CUDA] Update Kepler(sm_3*) support info.

2023-06-02 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG0f49116e261c: [CUDA] Update Kepler(sm_3*) support info. (authored by tra).

[PATCH] D152027: [CUDA] Update Kepler(sm_3*) support info.

2023-06-02 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, carlosgalvezp, bixia, yaxunl. Herald added a project: All. tra published this revision for review. tra added a reviewer: jlebar. tra added a comment. Herald added a project: clang. Herald added a subscriber: cfe-commits. Kepler is gone!

[PATCH] D151601: [NVPTX] Coalesce register classes for {i16,f16,bf16}, {i32,v2f16,v2bf16}

2023-06-02 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. I've tested the change on a bunch of tensorflow tests and the patch didn't cause any apparent issues.

[PATCH] D144911: adding bf16 support to NVPTX

2023-06-02 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In D144911#4389187 , @manishucsd wrote: > I fail to compile this patch. Please find the compilation error below: > > [build] ./llvm-project/llvm/lib/Target/NVPTX/NVPTXInstrInfo.td:1117:40: > error: Variable not defined:

[PATCH] D151876: [NVPTX] Signed char and (unsigned)long overloads of ldg and ldu

2023-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. I'd change the patch title: - `[NVPTX]` -> `[cuda, NVPTX]` as these are clang changes, not NVPTX back-end. - `overloads ` -> `builtins` Comment at:

[PATCH] D151904: [clang-repl][CUDA] Add an unit test for interactive CUDA

2023-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/unittests/Interpreter/InteractiveCudaTest.cpp:92 + std::unique_ptr Interp = createInterpreter(); + auto Err = Interp->LoadDynamicLibrary("libcudart.so"); + if (Err) { // CUDA runtime is not installed/usable, cannot continue testing

[PATCH] D151904: [clang-repl][CUDA] Add an unit test for interactive CUDA

2023-06-01 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/unittests/Interpreter/InteractiveCudaTest.cpp:92 + std::unique_ptr Interp = createInterpreter(); + auto Err = Interp->LoadDynamicLibrary("libcudart.so"); + if (Err) { // CUDA runtime is not installed/usable, cannot continue testing

[PATCH] D151839: [LinkerWrapper] Fix static library symbol resolution

2023-05-31 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/Driver/linker-wrapper-libs.c:27 // // Check that we extract a static library defining an undefined symbol. // jhuber6 wrote: > tra wrote: > > How does this test test the functionality of the undefined symbol?

[PATCH] D151839: [LinkerWrapper] Fix static library symbol resolution

2023-05-31 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. LGTM in general. Comment at: clang/test/Driver/linker-wrapper-libs.c:27 // // Check that we extract a static library defining an undefined symbol. // How does this test test the functionality of the undefined symbol? E.g. how does it

[PATCH] D150985: [clang] Allow fp in atomic fetch max/min builtins

2023-05-31 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM with few more test nits. Comment at: clang/test/Sema/atomic-ops.c:134 int *I, const int *CI, int **P, float *D, struct S *s1, struct S *s2) {

[PATCH] D150985: [clang] Allow fp in atomic fetch max/min builtins

2023-05-31 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Sema/SemaChecking.cpp:6576-6578 if (!ValType->isFloatingType()) return false; + if (!(AllowedType & AOAVT_FP)) Collapse into a single if statement: `if (!(ValType->isFloatingType() &&
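
The suggestion to collapse the two checks can be illustrated with a small sketch. It assumes both early exits return `false`, which the truncated snippet does not confirm; `AllowedKinds`, `AK_FP`, `twoChecks`, and `oneCheck` are hypothetical names, not the identifiers used in `SemaChecking.cpp`.

```cpp
#include <cassert>

// Hypothetical bit flags standing in for the allowed-value-type mask.
enum AllowedKinds : unsigned { AK_Int = 1u << 0, AK_FP = 1u << 1 };

// Two sequential early exits, mirroring the shape quoted in the comment.
bool twoChecks(bool IsFloating, unsigned Allowed) {
  if (!IsFloating)
    return false;
  if (!(Allowed & AK_FP))
    return false;
  return true;
}

// The same logic with a single condition: reject unless both tests hold.
bool oneCheck(bool IsFloating, unsigned Allowed) {
  if (!(IsFloating && (Allowed & AK_FP)))
    return false;
  return true;
}
```

Under that assumption the two functions return the same result for every input, so the collapsed form only changes readability.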

[PATCH] D151503: [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h

2023-05-30 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG6cdc07a701ee: [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h (authored by tra).

[PATCH] D151503: [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h

2023-05-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. @qiongsiwu1 : I've updated the patch. PTAL.

[PATCH] D151503: [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h

2023-05-30 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 526697. tra added a comment. Updated according to comments. Files: clang/lib/Headers/CMakeLists.txt Index:

[PATCH] D151606: [NFC][CLANG] Fix Static Code Analyzer Concerns with bad bit right shift operation in getNVPTXLaneID()

2023-05-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. In practice we're guaranteed by GPU architecture that the warp size will always be small enough to fit in 32 bits. Also `log2_32` will never return a value larger than 32. Does this assert help with anything else other than potential undefined behavior?

[PATCH] D151349: [HIP] emit macro `__HIP_NO_IMAGE_SUPPORT`

2023-05-30 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Basic/Targets/AMDGPU.cpp:248 + auto ISAVer = llvm::AMDGPU::getIsaVersion(Opts.CPU); + HasImage = ISAVer.Major != 9 || ISAVer.Minor != 4; } My usual nit about negations: `!(ISAVer.Major == 9 && ISAVer.Minor ==
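
The nit about negations is an application of De Morgan's law: `A != 9 || B != 4` is the same predicate as `!(A == 9 && B == 4)`. A minimal sketch, using a hypothetical `IsaVersion` struct in place of the value returned by `llvm::AMDGPU::getIsaVersion`:

```cpp
#include <cassert>

// Hypothetical stand-in for the ISA version pair; not the LLVM type.
struct IsaVersion {
  unsigned Major;
  unsigned Minor;
};

// Form used in the patch: a disjunction of negated comparisons.
constexpr bool hasImageNegated(IsaVersion V) {
  return V.Major != 9 || V.Minor != 4;
}

// Form along the lines of the review comment: negate one "is 9.4" test.
constexpr bool hasImageGrouped(IsaVersion V) {
  return !(V.Major == 9 && V.Minor == 4);
}

// Compile-time spot checks that the two forms agree.
static_assert(hasImageNegated(IsaVersion{9, 4}) == hasImageGrouped(IsaVersion{9, 4}), "9.4");
static_assert(hasImageNegated(IsaVersion{9, 0}) == hasImageGrouped(IsaVersion{9, 0}), "9.0");
static_assert(hasImageNegated(IsaVersion{10, 4}) == hasImageGrouped(IsaVersion{10, 4}), "10.4");
```

Both forms are false only for version 9.4; the grouped form states that exception directly.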

[PATCH] D151503: [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h

2023-05-26 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/CMakeLists.txt:516 COMPONENT cuda-resource-headers) install( qiongsiwu1 wrote: > qiongsiwu1 wrote: > > tra wrote: > > > qiongsiwu1 wrote: > > > > Do we need an install target for

[PATCH] D151503: [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h

2023-05-26 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 526227. tra added a comment. Verified that install works correctly with individual component installations: cmake -DCOMPONENT=cuda-resource-headers -P ./cmake_install.cmake cmake -DCOMPONENT=clang-resource-headers -P ./cmake_install.cmake

[PATCH] D144911: adding bf16 support to NVPTX

2023-05-26 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Here's a rough proof-of-concept patch coalescing i16/f16/bf16 to use the same Int16Regs register class: https://reviews.llvm.org/D151601 The changes are largely mechanical, replacing `%h` -> `%rs` in the tests and eliminating special cases we previously had for

[PATCH] D144911: adding bf16 support to NVPTX

2023-05-26 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: llvm/include/llvm/IR/IntrinsicsNVVM.td:604 def int_nvvm_f # operation # variant : ClangBuiltin, DefaultAttrsIntrinsic<[llvm_i16_ty], [llvm_i16_ty, llvm_i16_ty], tra wrote: > Availability of these

[PATCH] D151503: [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h

2023-05-26 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/lib/Headers/CMakeLists.txt:516 COMPONENT cuda-resource-headers) install( qiongsiwu1 wrote: > Do we need an install target for `${cuda_wrapper_bits_files}` for the > `cuda-resource-headers` component as well? It

[PATCH] D151503: [CUDA] correctly install cuda_wrappers/bits/shared_ptr_base.h

2023-05-25 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, carlosgalvezp, bixia, yaxunl. Herald added a project: All. tra edited the summary of this revision. tra edited the summary of this revision. tra published this revision for review. tra added reviewers: qiongsiwu1, jlebar. Herald added a

[PATCH] D151362: [CUDA] Add CUDA wrappers over clang builtins for sm_90.

2023-05-25 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG5c082e7e15e3: [CUDA] Add CUDA wrappers over clang builtins for sm_90. (authored by tra).

[PATCH] D151363: [NVPTX, CUDA] barrier intrinsics and builtins for sm_90

2023-05-25 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG25708b3df6e3: [NVPTX, CUDA] barrier intrinsics and builtins for sm_90 (authored by tra).

[PATCH] D151168: [CUDA] plumb through new sm_90-specific builtins.

2023-05-25 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG0a0bae1e9f94: [CUDA] plumb through new sm_90-specific builtins. (authored by tra). Changed prior to commit:

[PATCH] D151361: [CUDA] bump supported CUDA version to 12.1/11.8

2023-05-25 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rGffb635cb2d4e: [CUDA] bump supported CUDA version to 12.1/11.8 (authored by tra).

[PATCH] D151359: [CUDA] Relax restrictions on variadics in host-side compilation.

2023-05-25 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG0ad5d40fa19f: [CUDA] Relax restrictions on variadics in host-side compilation. (authored by tra).

[PATCH] D151362: [CUDA] Add CUDA wrappers over clang builtins for sm_90.

2023-05-24 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, bixia, yaxunl. Herald added a project: All. tra updated this revision to Diff 525338. tra added a comment. tra updated this revision to Diff 525340. tra published this revision for review. tra added a reviewer: jlebar. Herald added a

[PATCH] D151363: [NVPTX, CUDA] barrier intrinsics and builtins for sm_90

2023-05-24 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 525309. tra added a comment. whitespace fix. Files: clang/include/clang/Basic/BuiltinsNVPTX.def

[PATCH] D151363: [NVPTX, CUDA] barrier intrinsics and builtins for sm_90

2023-05-24 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, gchakrabarti, asavonic, bixia, hiraditya, yaxunl. Herald added a project: All. tra updated this revision to Diff 525307. tra added a comment. tra published this revision for review. tra added a reviewer: jlebar. Herald added subscribers:

[PATCH] D151361: [CUDA] bump supported CUDA version to 12.1/11.8

2023-05-24 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, gchakrabarti, asavonic, bixia, hiraditya, yaxunl. Herald added a project: All. tra published this revision for review. tra added a reviewer: jlebar. Herald added subscribers: llvm-commits, cfe-commits, MaskRay, jholewinski. Herald added

[PATCH] D151359: [CUDA] Relax restrictions on variadics in host-side compilation.

2023-05-24 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, bixia, yaxunl. Herald added a project: All. tra published this revision for review. tra added a reviewer: jlebar. Herald added subscribers: cfe-commits, MaskRay. Herald added a project: clang. D150718

[PATCH] D151243: [CUDA] Fix wrappers for sm_80 functions

2023-05-24 Thread Artem Belevich via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes. Closed by commit rG29cb080c363d: [CUDA] Fix wrappers for sm_80 functions (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D151243/new/

[PATCH] D150985: [clang] Allow fp in atomic fetch max/min builtins

2023-05-24 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. As I said, I'm OK with the patch in principle, I just don't know what other factors I may be missing. Tests seem to be missing for c11 variants of the builtins. Comment at: clang/test/Sema/atomic-ops.c:209 + __atomic_fetch_min(D, 3,

[PATCH] D151243: [CUDA] Fix wrappers for sm_80 functions

2023-05-23 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, carlosgalvezp, bixia, yaxunl. Herald added a project: All. tra published this revision for review. tra added a reviewer: jlebar. Herald added a project: clang. Herald added a subscriber: cfe-commits. Previous implementation provided

[PATCH] D151168: [CUDA] plumb through new sm_90-specific builtins.

2023-05-22 Thread Artem Belevich via Phabricator via cfe-commits
tra created this revision. Herald added subscribers: mattd, gchakrabarti, asavonic, bixia, yaxunl. Herald added a project: All. tra added a reviewer: jlebar. tra published this revision for review. Herald added subscribers: cfe-commits, jholewinski. Herald added a project: clang. Repository: rG

[PATCH] D144911: adding bf16 support to NVPTX

2023-05-22 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: llvm/lib/Target/NVPTX/NVPTXAsmPrinter.cpp:315-318 -} else if (RC == ::BFloat16RegsRegClass) { - Ret = (9 << 28); -} else if (RC == ::BFloat16x2RegsRegClass) { - Ret = (10 << 28); There's still something

[PATCH] D150894: [CUDA] provide wrapper functions for new NVCC builtins.

2023-05-19 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG4450285bd740: [CUDA] provide wrapper functions for new NVCC builtins. (authored by tra). Repository: rG LLVM Github Monorepo CHANGES SINCE LAST

[PATCH] D150894: [CUDA] provide wrapper functions for new NVCC builtins.

2023-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 523881. tra added a comment. typo fix. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150894/new/ https://reviews.llvm.org/D150894 Files: clang/lib/Headers/__clang_cuda_intrinsics.h Index:

[PATCH] D150894: [CUDA] provide wrapper functions for new NVCC builtins.

2023-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 523879. tra added a comment. Added __nv_associate_access_property_impl() stub. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150894/new/ https://reviews.llvm.org/D150894 Files:

[PATCH] D150965: [HIP] Allow std::malloc in device function

2023-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added inline comments. Comment at: clang/test/Headers/Inputs/include/math.h:108-109 long lroundf(float __a); -int max(int __a, int __b); -int min(int __a, int __b); double modf(double __a, double *__b); yaxunl wrote: > tra wrote: > > Why were these

[PATCH] D150985: [clang] Allow fp in atomic fetch max/min builtins

2023-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. The code changes look OK to me. Whether allowing FP for clang builtins is OK -- I have no idea, especially for the c11 ones. CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150985/new/ https://reviews.llvm.org/D150985

[PATCH] D150820: [NVPTX, CUDA] added optional src_size argument to __nvvm_cp_async*

2023-05-19 Thread Artem Belevich via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds. This revision was automatically updated to reflect the committed changes. Closed by commit rG6963c61f0f6e: [NVPTX/CUDA] added an optional src_size argument to __nvvm_cp_async* (authored by tra). Repository: rG LLVM Github Monorepo

[PATCH] D150965: [HIP] Allow std::malloc in device function

2023-05-19 Thread Artem Belevich via Phabricator via cfe-commits
tra accepted this revision. tra added a comment. This revision is now accepted and ready to land. LGTM. Comment at: clang/test/Headers/Inputs/include/math.h:108-109 long lroundf(float __a); -int max(int __a, int __b); -int min(int __a, int __b); double modf(double __a,

[PATCH] D150820: [NVPTX, CUDA] added optional src_size argument to __nvvm_cp_async*

2023-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra requested review of this revision. tra added a comment. PTAL. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150820/new/ https://reviews.llvm.org/D150820

[PATCH] D150820: [NVPTX, CUDA] added optional src_size argument to __nvvm_cp_async*

2023-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 523566. tra added a comment. Instead of changing existing intrinsic, introduce a new set which takes an additional src_size argument. This should keep existing users working. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION

[PATCH] D150820: [NVPTX, CUDA] added optional src_size argument to __nvvm_cp_async*

2023-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra added a comment. Looks like the extra intrinsic argument broke MLIR. I'll need to figure out how to deal with that. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150820/new/ https://reviews.llvm.org/D150820

[PATCH] D150894: [CUDA] provide wrapper functions for new NVCC builtins.

2023-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 523472. tra added a comment. Put the wrappers behind __CUDA_ARCH__ >= 800, as these clang builtins are not available on older GPUs. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150894/new/
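
The guard described in this comment can be sketched roughly as below. This is only an illustration of the preprocessor pattern, not the code from the patch: the macro EXAMPLE_HAS_SM80_WRAPPERS and both functions are hypothetical names. It assumes the usual CUDA behavior that __CUDA_ARCH__ is defined only during device-side compilation, so on a host pass (or with a plain C++ compiler) the guarded section is simply skipped.

```cpp
#include <cassert>

// Hypothetical feature macro, for illustration only (not from the patch).
// __CUDA_ARCH__ is defined only in device-side compilation passes; 800
// corresponds to sm_80.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 800
#define EXAMPLE_HAS_SM80_WRAPPERS 1
// Wrappers that rely on sm_80-only builtins would be declared here.
#else
#define EXAMPLE_HAS_SM80_WRAPPERS 0
#endif

// Hypothetical helper: reports whether the guarded section was compiled in.
constexpr bool example_has_sm80_wrappers() {
  return EXAMPLE_HAS_SM80_WRAPPERS != 0;
}

// Hypothetical sanity check: the macro and the helper must always agree.
inline void example_check_guard_consistency() {
  assert(example_has_sm80_wrappers() == (EXAMPLE_HAS_SM80_WRAPPERS == 1));
}
```

Keeping the wrappers entirely inside the #if means that code targeting older GPUs never sees declarations that refer to builtins the compiler does not provide for that target.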

[PATCH] D150894: [CUDA] provide wrapper functions for new NVCC builtins.

2023-05-18 Thread Artem Belevich via Phabricator via cfe-commits
tra updated this revision to Diff 523466. tra added a comment. Prefix function args with `__`. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D150894/new/ https://reviews.llvm.org/D150894 Files: clang/lib/Headers/__clang_cuda_intrinsics.h
