[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-16 Thread Artem Belevich via cfe-commits
Artem-B wrote: > I'm still concerned about the (no-)aliasing guarantees. It's useful to have > two non-flat address spaces that can alias, Another example for NVIDIA GPUs would be `.param` space. According to the [PTX spec](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#gen

[clang] [lld] [llvm] [mlir] [IR] Introduce `T` to `DataLayout` to represent flat address space if a target supports it (PR #108786)

2024-09-16 Thread Artem Belevich via cfe-commits
Artem-B wrote: The description of the flat address space in `TargetTransformInfo.h` is somewhat vague and, IMO, both too specific and not precise enough: ``` The flat address space is a /// generic address space that can be used access multiple segments of memory /// with different addre

[clang] [llvm] [NVPTX] Remove nvvm.bitcast.* intrinsics (PR #107936)

2024-09-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/107936 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [NVPTX] Remove nvvm.bitcast.* intrinsics (PR #107936)

2024-09-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: It may be worth adding a note about this in the release notes. https://github.com/llvm/llvm-project/pull/107936

[clang] [HIP] Use original file path for CUID (PR #107734)

2024-09-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/107734

[clang] [CUDA/HIP] propagate -cuid to a host-only compilation. (PR #107483)

2024-09-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/107483

[clang] [CUDA/HIP] propagate -cuid to a host-only compilation. (PR #107483)

2024-09-05 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/107483 Right now we're bailing out too early, and `-cuid` does not get set for the host-only compilations. From 52a27293d1c93a7ed4dcef845f705808afa3c273 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 5 Se

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-28 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/99646

[clang] Fix amdgpu-arch for dll name on Windows (PR #101350)

2024-08-23 Thread Artem Belevich via cfe-commits
Artem-B wrote: This is very Windows-specific. @rnk -- would you have time to take a look? https://github.com/llvm/llvm-project/pull/101350

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-23 Thread Artem Belevich via cfe-commits
Artem-B wrote: I can land the patch. The buildkite failures appear to be unrelated (something in lldb tests). Let's wait till clang format checks are done. https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-22 Thread Artem Belevich via cfe-commits
Artem-B wrote: Buildkite failures are caused by lldb and are unrelated. We're good to go. https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-20 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,216 @@ +; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py UTC_ARGS: --default-march nvptx64 --version 5 +; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s --check-prefixes=SM30,CHECK +; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcp

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/99646

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-08-19 Thread Artem Belevich via cfe-commits
@@ -9017,6 +9017,10 @@ def err_global_call_not_config : Error< def err_ref_bad_target : Error< "reference to %select{__device__|__global__|__host__|__host__ __device__}0 " "%select{function|variable}1 %2 in %select{__device__|__global__|__host__|__host__ __device__}3 funct

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-08-19 Thread Artem Belevich via cfe-commits
@@ -115,20 +143,65 @@ static bool hasAttr(const Decl *D, bool IgnoreImplicitAttr) { }); } +SemaCUDA::CUDATargetContext::CUDATargetContext(SemaCUDA *S, Artem-B wrote: This could probably be moved into the header. https://github.com/llvm/llvm-projec

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-08-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/103031

[clang] [Clang][HIP] Target-dependent overload resolution in declarators and specifiers (PR #103031)

2024-08-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: Considering that we're adding another interesting quirk to how we interpret target attributes & function calls, it would be useful to run this by a language lawyer to make sure we're not missing something. @zygoloid - would you have time to take a look or

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/102969

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/104460

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-16 Thread Artem Belevich via cfe-commits
@@ -76,8 +79,75 @@ class HIPUndefinedFatBinSymbols { return GPUBinHandleSymbols; } + // Collect symbols from static libraries specified by -l options. + void processStaticLibraries() { +llvm::SmallVector LibNames; +llvm::SmallVector LibPaths; +llvm::SmallVe

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-16 Thread Artem Belevich via cfe-commits
@@ -76,8 +79,75 @@ class HIPUndefinedFatBinSymbols { return GPUBinHandleSymbols; } + // Collect symbols from static libraries specified by -l options. + void processStaticLibraries() { +llvm::SmallVector LibNames; +llvm::SmallVector LibPaths; +llvm::SmallVe

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM in general with a couple of nits. https://github.com/llvm/llvm-project/pull/104638

[clang] [HIP] search fatbin symbols for libs passed by -l (PR #104638)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/104638

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From 2a948e8803cd881937e9a121ca9fe9c4816e857e Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-16 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From da1ac9d36bd284dc607b7366ff83ba556fb64fb5 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-16 Thread Artem Belevich via cfe-commits
Artem-B wrote: Looks like it was the clang-format check that GitHub was waiting on an approval for. I've just clicked that button; let's see what it brings. The patch is good to go otherwise, IMO. https://github.com/llvm/llvm-project/pull/102969

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-15 Thread Artem Belevich via cfe-commits
@@ -7163,24 +7163,27 @@ void Sema::ProcessDeclAttributeList( } else if (const auto *A = D->getAttr()) { Diag(D->getLocation(), diag::err_opencl_kernel_attr) << A; D->setInvalidDecl(); -} else if (!D->hasAttr()) { - if (const auto *A = D->getAttr()) { -

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-15 Thread Artem Belevich via cfe-commits
@@ -7163,24 +7163,27 @@ void Sema::ProcessDeclAttributeList( } else if (const auto *A = D->getAttr()) { Diag(D->getLocation(), diag::err_opencl_kernel_attr) << A; D->setInvalidDecl(); -} else if (!D->hasAttr()) { - if (const auto *A = D->getAttr()) { -

[clang] [Clang] Fix sema checks thinking kernels aren't kernels (PR #104460)

2024-08-15 Thread Artem Belevich via cfe-commits
@@ -7163,24 +7163,27 @@ void Sema::ProcessDeclAttributeList( } else if (const auto *A = D->getAttr()) { Diag(D->getLocation(), diag::err_opencl_kernel_attr) << A; D->setInvalidDecl(); -} else if (!D->hasAttr()) { - if (const auto *A = D->getAttr()) { -

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-14 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/102969

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-14 Thread Artem Belevich via cfe-commits
@@ -722,6 +722,37 @@ let hasSideEffects = false in { defm CVT_f16x2 : CVT_FROM_FLOAT_V2_SM80<"f16x2", Int32Regs>; defm CVT_bf16x2 : CVT_FROM_FLOAT_V2_SM80<"bf16x2", Int32Regs>; + + // FP8 conversions. + multiclass CVT_TO_F8X2 { +def _f32 : + NVPTXInst<(outs Int1

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
@@ -722,6 +722,37 @@ let hasSideEffects = false in { defm CVT_f16x2 : CVT_FROM_FLOAT_V2_SM80<"f16x2", Int32Regs>; defm CVT_bf16x2 : CVT_FROM_FLOAT_V2_SM80<"bf16x2", Int32Regs>; + + // FP8 conversions. + multiclass CVT_TO_F8X2 { +def _f32 : + NVPTXInst<(outs Int1

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
@@ -968,6 +971,39 @@ __device__ void nvvm_cvt_sm80() { // CHECK: ret void } +// CHECK-LABEL: nvvm_cvt_sm89 +__device__ void nvvm_cvt_sm89() { +#if __CUDA_ARCH__ >= 890 + // CHECK_PTX81_SM89: call i16 @llvm.nvvm.ff.to.e4m3x2.rn(float 1.00e+00, float 1.00e+00) + __n

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
@@ -722,6 +722,37 @@ let hasSideEffects = false in { defm CVT_f16x2 : CVT_FROM_FLOAT_V2_SM80<"f16x2", Int32Regs>; defm CVT_bf16x2 : CVT_FROM_FLOAT_V2_SM80<"bf16x2", Int32Regs>; + + // FP8 conversions. + multiclass CVT_TO_F8X2 { +def _f32 : + NVPTXInst<(outs Int1

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with a couple of nits. https://github.com/llvm/llvm-project/pull/102969

[clang] [llvm] [NVPTX] Add conversion intrinsics from/to fp8 types (e4m3, e5m2) (PR #102969)

2024-08-13 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/102969

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From da1ac9d36bd284dc607b7366ff83ba556fb64fb5 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From da1ac9d36bd284dc607b7366ff83ba556fb64fb5 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH] [CUDA] Add a pseudo GPU sm_next which allows overrides for SM/

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/4] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-12 Thread Artem Belevich via cfe-commits
Artem-B wrote: OK, I've reworked the patch, and it appears to correctly propagate arbitrary SM/PTX versions from clang, down to the LLVM and generated PTX, and to ptxas and fatbinary command line options. PTAL. https://github.com/llvm/llvm-project/pull/100247

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/102661

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/102661 From 0f3944e1c12baa958f52c3c015a0cf5f9aeff1ed Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 9 Aug 2024 11:51:23 -0700 Subject: [PATCH 1/2] [CUDA] Emit used function list in deterministic order. Fixe

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/102661

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
@@ -950,6 +950,9 @@ void CodeGenModule::Release() { UsedArray.push_back(llvm::ConstantExpr::getPointerBitCastOrAddrSpaceCast( GetAddrOfGlobal(GD), Int8PtrTy)); } +// Sort decls by name to always emit them in deterministic order. Artem-B
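The diff comment in the entry above mentions sorting declarations by name so that they are always emitted in a deterministic order. A minimal, standalone sketch of that general idea follows; it is not the code from the patch. The helper name `sortedByName` and the use of plain strings are assumptions made for illustration only.

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical helper (not from the patch): iteration order of an unordered
// container is unspecified, so copy the names out and sort them to obtain a
// stable order before emitting anything that depends on it.
std::vector<std::string>
sortedByName(const std::unordered_set<std::string> &Names) {
  std::vector<std::string> Sorted(Names.begin(), Names.end());
  std::sort(Sorted.begin(), Sorted.end()); // lexicographic order by name
  return Sorted;
}
```

Sorting on a stable key such as the name trades a small amount of work for output that does not vary from run to run, which is usually what reproducible builds need.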

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/102661 From 0f3944e1c12baa958f52c3c015a0cf5f9aeff1ed Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 9 Aug 2024 11:51:23 -0700 Subject: [PATCH] [CUDA] Emit used function list in deterministic order. Fixes ht

[clang] [CUDA] Emit used function list in deterministic order. (PR #102661)

2024-08-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/102661 Fixes https://github.com/llvm/llvm-project/issues/101560 From 6ee0add21bd2a9b25d28640c91de2fc6dab7fa72 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 9 Aug 2024 11:51:23 -0700 Subject: [PATCH] [CUDA]

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-08 Thread Artem Belevich via cfe-commits
@@ -52,6 +53,42 @@ const char *CudaVersionToString(CudaVersion V); // Input is "Major.Minor" CudaVersion CudaStringToVersion(const llvm::Twine &S); +enum class PTXVersion { + PTX_UNKNOWN = 0, + PTX_32 = 32, + PTX_40 = 40, + PTX_41, + PTX_42, + PTX_43, + PTX_50 = 50, +

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-08 Thread Artem Belevich via cfe-commits
@@ -26,24 +27,38 @@ static cl::opt NoF16Math("nvptx-no-f16-math", cl::Hidden, cl::desc("NVPTX Specific: Disable generation of f16 math ops."), cl::init(false)); +static cl::opt +NextSM("nvptx-next-sm", cl::Hidden, + cl::desc("NVPTX

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-08 Thread Artem Belevich via cfe-commits
@@ -26,24 +27,38 @@ static cl::opt NoF16Math("nvptx-no-f16-math", cl::Hidden, cl::desc("NVPTX Specific: Disable generation of f16 math ops."), cl::init(false)); +static cl::opt +NextSM("nvptx-next-sm", cl::Hidden, + cl::desc("NVPTX

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-08-08 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/3] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-07 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,37 @@ +; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s --check-prefixes=SM30,CHECK Artem-B wrote: https://llvm.org/docs/TestingGuide.html#generating-assertions-in-regression-tests https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-07 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,37 @@ +; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s --check-prefixes=SM30,CHECK Artem-B wrote: This test should be suitable for automatic check generation -- we probably do want to see the details of what we're doing when we do 32-bit CA

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with a test nit https://github.com/llvm/llvm-project/pull/99646

[clang] [llvm] [NVPTX] Support __usAtomicCAS builtin (PR #99646)

2024-08-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/99646

[clang] [Clang] Do not emit intrinsic math functions on GPU targets (PR #98209)

2024-08-05 Thread Artem Belevich via cfe-commits
Artem-B wrote: Given that the prevalent compilation for CUDA has no standard library whatsoever, preserving libcalls may break some existing users that may be relying on library call lowering to an intrinsic that *is* implemented by the back-end. Perhaps this "no library call to intrinsic con

[clang] [NVPTX] Correctly forward the PTX feature to the nvlink wrapper (PR #100607)

2024-07-25 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/100607

[clang] [NVPTX] Correctly forward the PTX feature to the nvlink wrapper (PR #100607)

2024-07-25 Thread Artem Belevich via cfe-commits
Artem-B wrote: The patch seems to change only the test file. Should there be more changes in the patch? https://github.com/llvm/llvm-project/pull/100607

[clang] [NVPTX] Restore old va_list builtin type (PR #100438)

2024-07-24 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/100438

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-24 Thread Artem Belevich via cfe-commits
@@ -26,24 +27,38 @@ static cl::opt NoF16Math("nvptx-no-f16-math", cl::Hidden, cl::desc("NVPTX Specific: Disable generation of f16 math ops."), cl::init(false)); +static cl::opt +NextSM("nvptx-next-sm", cl::Hidden, + cl::desc("NVPTX

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/100247

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
@@ -648,6 +658,13 @@ void NVPTX::getNVPTXTargetFeatures(const Driver &D, const llvm::Triple &Triple, Features.push_back(Args.MakeArgString(PtxFeature)); return; } + // Add --cuda-next-ptx to the list of features, but carry on to add the + // default PTX feature for

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
@@ -96,6 +96,7 @@ static const OffloadArchToStringMap arch_names[] = { SM(89), // Ada Lovelace SM(90), // Hopper SM(90a), // Hopper +SM(next),// Placeholder for a n

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/100247 From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 18 Jul 2024 15:05:01 -0700 Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides for

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #97402)

2024-07-23 Thread Artem Belevich via cfe-commits
Artem-B wrote: @sergey-kozub FYI, https://github.com/llvm/llvm-project/pull/100247 should allow forward-testing CUDA w/o relying on specific GPU/PTX variant being hardcoded in clang. https://github.com/llvm/llvm-project/pull/97402

[clang] [llvm] [CUDA] Add a pseudo GPU sm_next which allows overriding for SM/PTX version. (PR #100247)

2024-07-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/100247 Sometimes users may need to use older clang with newer SM/PTX versions which clang does not know anything about, yet. --offload-arch=sm_next, combined with --cuda-next-sm=X and --cuda-next-ptx=Y allows passing

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-23 Thread Artem Belevich via cfe-commits
Artem-B wrote: @nico: > Why do we need a new binary for this, instead of having something like `clang > -cc1_nvlink` that calls a custom mode within clang? Do we have existing precedents for such built-in tools, other than cc1 itself? If the linker wrapper can be part of clang itself, it woul

[clang] [Clang] Correctly forward `--cuda-path` to the nvlink wrapper (PR #100170)

2024-07-23 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/100170

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -504,18 +511,23 @@ Expected clang(ArrayRef InputFiles, const ArgList &Args) { llvm::copy(LinkerArgs, std::back_inserter(CmdArgs)); } - // Pass on -mllvm options to the clang invocation. - for (const opt::Arg *Arg : Args.filtered(OPT_mllvm)) { -CmdArgs.push_back

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -504,18 +511,23 @@ Expected clang(ArrayRef InputFiles, const ArgList &Args) { llvm::copy(LinkerArgs, std::back_inserter(CmdArgs)); } - // Pass on -mllvm options to the clang invocation. - for (const opt::Arg *Arg : Args.filtered(OPT_mllvm)) { -CmdArgs.push_back

[clang] [LinkerWrapper] Pass all files to the device linker (PR #97573)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,64 @@ + +Clang nvlink Wrapper + + +.. contents:: + :local: + +.. _clang-nvlink-wrapper: + +Introduction + + +This tools works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose +of this wrapper is to prov

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM for the patch in general, though I can't vouch for the details of the linking process. I'll defer to @MaskRay on that. https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,778 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,64 @@ + +Clang nvlink Wrapper + + +.. contents:: + :local: + +.. _clang-nvlink-wrapper: + +Introduction + + +This tools works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose +of this wrapper is to prov

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,776 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: First batch of comments on the patch -- I only got till about the middle of ClangNVLinkWrapper.cpp. Will continue reviewing tomorrow. https://github.com/llvm/llvm-project/pull/96561

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,776 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,776 @@ +//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util --===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,64 @@ + +Clang nvlink Wrapper + + +.. contents:: + :local: + +.. _clang-nvlink-wrapper: + +Introduction + + +This tools works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose +of this wrapper is to prov

[clang] [Clang] Introduce 'clang-nvlink-wrapper' to work around 'nvlink' (PR #96561)

2024-07-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/96561

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Artem Belevich via cfe-commits
@@ -203,8 +203,12 @@ ABIArgInfo NVPTXABIInfo::classifyArgumentType(QualType Ty) const { void NVPTXABIInfo::computeInfo(CGFunctionInfo &FI) const { if (!getCXXABI().classifyReturnType(FI)) FI.getReturnInfo() = classifyReturnType(FI.getReturnType()); + + unsigned Argument

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with a minor nit. https://github.com/llvm/llvm-project/pull/96015

[clang] [libc] [llvm] [NVPTX] Implement variadic functions using IR lowering (PR #96015)

2024-07-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/96015

[clang] [llvm] [CUDA] Add support for CUDA-12.6 and sm_100 (PR #97402)

2024-07-08 Thread Artem Belevich via cfe-commits
Artem-B wrote: > This PR is redundant, closing. I think the patch was perfectly fine. Considering that other NVIDIA open-source projects already mention sm_100 (E.g. https://github.com/NVIDIA/cccl/blob/5efe53dbd71ea3e4bc4fdbb73edc001e0bf81547/libcudacxx/include/nv/detail/__target_macros#L241),

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,31 @@ +/*===-- LLVM/Offload helpers for kernel languages (CUDA/HIP) -*- c++ -*-=== + * + * Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. + * See https://llvm.org/LICENSE.txt for license information. + * SPDX-License-Identifier: Apach

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM in principle. Will kernels in TUs compiled with `-foffload-via-llvm` be interoperable with code that wants to launch them from another TU compiled w/o `-foffload-via-llvm` ? E.g.: - a.cu: `__global__ void kernel() { ... }` - b.cu: `e

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C, const JobAction &JA, CmdArgs.push_back("__clang_openmp_device_functions.h"); } + if (Args.hasArg(options::OPT_foffload_via_llvm)) { +// Add llvm_wrappers/* to our system include path. This

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
@@ -,17 +6684,26 @@ void Clang::ConstructJob(Compilation &C, const JobAction &JA, break; } } else { +if (Args.hasFlag(options::OPT_foffload_via_llvm, + options::OPT_fno_offload_via_llvm, false)) + Args.AddLastArg(CmdArgs, options::O

[clang] [llvm] [Offload][CUDA] Allow CUDA kernels to use LLVM/Offload (PR #94549)

2024-06-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/94549

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Artem Belevich via cfe-commits
Artem-B wrote: > What's the config to set this by default without any graphics? https://docs.nvidia.com/deploy/driver-persistence/index.html I usually use "nvidia-smi -i -pm ENABLED" to force the driver to be loaded permanently. As for `__nvcc_device_query`, my guess is that it just uses a

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Artem Belevich via cfe-commits
Artem-B wrote: Ooh... I think I know exactly what may be causing this. On machines where NVIDIA GPUs are used for compute only (e.g. a headless server machine), NVIDIA drivers are not always loaded by default and may not have driver persistence enabled. The drivers get loaded when GPU is acces

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-06-06 Thread Artem Belevich via cfe-commits
@@ -9013,6 +9013,12 @@ def err_cuda_ovl_target : Error< "cannot overload %select{__device__|__global__|__host__|__host__ __device__}2 function %3">; def note_cuda_ovl_candidate_target_mismatch : Note< "candidate template ignored: target attributes do not match">; +def wa

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-06-06 Thread Artem Belevich via cfe-commits
@@ -9013,6 +9013,12 @@ def err_cuda_ovl_target : Error< "cannot overload %select{__device__|__global__|__host__|__host__ __device__}2 function %3">; def note_cuda_ovl_candidate_target_mismatch : Note< "candidate template ignored: target attributes do not match">; +def wa

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-06-06 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with some wording/naming nits. https://github.com/llvm/llvm-project/pull/77359

[clang] [CUDA][HIP] warn incompatible redeclare (PR #77359)

2024-06-06 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/77359
