Artem-B wrote:
> I'm still concerned about the (no-)aliasing guarantees. It's useful to have
> two non-flat address spaces that can alias,
Another example for NVIDIA GPUs would be `.param` space. According to the [PTX
spec](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#gen
Artem-B wrote:
The description of the flat address space in `TargetTransformInfo.h` is
somewhat vague and, IMO, both too specific and not precise enough:
```
The flat address space is a
/// generic address space that can be used access multiple segments of memory
/// with different addre
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/107936
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
Artem-B wrote:
It may be worth adding a note about this in the release notes.
https://github.com/llvm/llvm-project/pull/107936
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/107734
https://github.com/Artem-B closed
https://github.com/llvm/llvm-project/pull/107483
https://github.com/Artem-B created
https://github.com/llvm/llvm-project/pull/107483
Right now we're bailing out too early, and `-cuid` does not get set for the
host-only compilations.
>From 52a27293d1c93a7ed4dcef845f705808afa3c273 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Thu, 5 Se
https://github.com/Artem-B closed
https://github.com/llvm/llvm-project/pull/99646
Artem-B wrote:
This is very Windows-specific.
@rnk -- would you have time to take a look?
https://github.com/llvm/llvm-project/pull/101350
Artem-B wrote:
I can land the patch.
The buildkite failures appear to be unrelated (something in lldb tests).
Let's wait till clang format checks are done.
https://github.com/llvm/llvm-project/pull/99646
Artem-B wrote:
Buildkite failures are caused by lldb and are unrelated. We're good to go.
https://github.com/llvm/llvm-project/pull/99646
@@ -0,0 +1,216 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
UTC_ARGS: --default-march nvptx64 --version 5
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s
--check-prefixes=SM30,CHECK
+; RUN: %if ptxas %{ llc < %s -march=nvptx64 -mcp
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/99646
@@ -9017,6 +9017,10 @@ def err_global_call_not_config : Error<
def err_ref_bad_target : Error<
"reference to %select{__device__|__global__|__host__|__host__ __device__}0 "
"%select{function|variable}1 %2 in
%select{__device__|__global__|__host__|__host__ __device__}3 funct
@@ -115,20 +143,65 @@ static bool hasAttr(const Decl *D, bool
IgnoreImplicitAttr) {
});
}
+SemaCUDA::CUDATargetContext::CUDATargetContext(SemaCUDA *S,
Artem-B wrote:
This could probably be moved into the header.
https://github.com/llvm/llvm-projec
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/103031
https://github.com/Artem-B commented:
Considering that we're adding another interesting quirk to how we interpret
target attributes & function calls, it would be useful to run this by a
language lawyer to make sure we're not missing something.
@zygoloid - would you have time to take a look or
https://github.com/Artem-B closed
https://github.com/llvm/llvm-project/pull/102969
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/104460
@@ -76,8 +79,75 @@ class HIPUndefinedFatBinSymbols {
return GPUBinHandleSymbols;
}
+ // Collect symbols from static libraries specified by -l options.
+ void processStaticLibraries() {
+llvm::SmallVector LibNames;
+llvm::SmallVector LibPaths;
+llvm::SmallVe
https://github.com/Artem-B approved this pull request.
LGTM in general with a couple of nits.
https://github.com/llvm/llvm-project/pull/104638
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/104638
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/100247
>From 2a948e8803cd881937e9a121ca9fe9c4816e857e Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Thu, 18 Jul 2024 15:05:01 -0700
Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides
for
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/100247
>From da1ac9d36bd284dc607b7366ff83ba556fb64fb5 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Thu, 18 Jul 2024 15:05:01 -0700
Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides
for
Artem-B wrote:
Looks like it was the clang-format check that GitHub was waiting on an approval
for. I've just clicked that button; let's see what it brings.
The patch is good to go otherwise, IMO.
https://github.com/llvm/llvm-project/pull/102969
@@ -7163,24 +7163,27 @@ void Sema::ProcessDeclAttributeList(
} else if (const auto *A = D->getAttr()) {
Diag(D->getLocation(), diag::err_opencl_kernel_attr) << A;
D->setInvalidDecl();
-} else if (!D->hasAttr()) {
- if (const auto *A = D->getAttr()) {
-
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/102969
@@ -722,6 +722,37 @@ let hasSideEffects = false in {
defm CVT_f16x2 : CVT_FROM_FLOAT_V2_SM80<"f16x2", Int32Regs>;
defm CVT_bf16x2 : CVT_FROM_FLOAT_V2_SM80<"bf16x2", Int32Regs>;
+
+ // FP8 conversions.
+ multiclass CVT_TO_F8X2 {
+def _f32 :
+ NVPTXInst<(outs Int1
@@ -968,6 +971,39 @@ __device__ void nvvm_cvt_sm80() {
// CHECK: ret void
}
+// CHECK-LABEL: nvvm_cvt_sm89
+__device__ void nvvm_cvt_sm89() {
+#if __CUDA_ARCH__ >= 890
+ // CHECK_PTX81_SM89: call i16 @llvm.nvvm.ff.to.e4m3x2.rn(float 1.00e+00,
float 1.00e+00)
+ __n
https://github.com/Artem-B approved this pull request.
LGTM with a couple of nits.
https://github.com/llvm/llvm-project/pull/102969
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/102969
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/100247
>From da1ac9d36bd284dc607b7366ff83ba556fb64fb5 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Thu, 18 Jul 2024 15:05:01 -0700
Subject: [PATCH] [CUDA] Add a pseudo GPU sm_next which allows overrides for
SM/
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/100247
>From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Thu, 18 Jul 2024 15:05:01 -0700
Subject: [PATCH 1/4] [CUDA] Add a pseudo GPU sm_next which allows overrides
for
Artem-B wrote:
OK, I've reworked the patch, and it appears to correctly propagate arbitrary
SM/PTX versions from clang, down to the LLVM and generated PTX, and to ptxas
and fatbinary command line options.
PTAL.
https://github.com/llvm/llvm-project/pull/100247
https://github.com/Artem-B closed
https://github.com/llvm/llvm-project/pull/102661
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/102661
>From 0f3944e1c12baa958f52c3c015a0cf5f9aeff1ed Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Fri, 9 Aug 2024 11:51:23 -0700
Subject: [PATCH 1/2] [CUDA] Emit used function list in deterministic order.
Fixe
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/102661
@@ -950,6 +950,9 @@ void CodeGenModule::Release() {
UsedArray.push_back(llvm::ConstantExpr::getPointerBitCastOrAddrSpaceCast(
GetAddrOfGlobal(GD), Int8PtrTy));
}
+// Sort decls by name to always emit them in deterministic order.
Artem-B
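The comment added in this hunk describes sorting declarations by name so that the emission order no longer depends on container iteration order. A minimal standalone sketch of that idea, assuming plain strings stand in for symbol names (`sortedNames` and the `std::unordered_set` input are hypothetical and are not taken from the patch):

```cpp
#include <algorithm>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for a set of used declarations. Iteration order of
// std::unordered_set is unspecified, so emitting directly from it could
// produce different output from run to run or from build to build.
std::vector<std::string>
sortedNames(const std::unordered_set<std::string> &Used) {
  // Copy the names out, then sort lexicographically so that every run
  // produces the same sequence regardless of hash ordering.
  std::vector<std::string> Names(Used.begin(), Used.end());
  std::sort(Names.begin(), Names.end());
  return Names;
}
```

Sorting a copy leaves the original container untouched. Whether the actual patch sorts in place or uses a different key is not visible in the truncated snippet.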
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/102661
>From 0f3944e1c12baa958f52c3c015a0cf5f9aeff1ed Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Fri, 9 Aug 2024 11:51:23 -0700
Subject: [PATCH] [CUDA] Emit used function list in deterministic order.
Fixes ht
https://github.com/Artem-B created
https://github.com/llvm/llvm-project/pull/102661
Fixes https://github.com/llvm/llvm-project/issues/101560
>From 6ee0add21bd2a9b25d28640c91de2fc6dab7fa72 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Fri, 9 Aug 2024 11:51:23 -0700
Subject: [PATCH] [CUDA]
@@ -52,6 +53,42 @@ const char *CudaVersionToString(CudaVersion V);
// Input is "Major.Minor"
CudaVersion CudaStringToVersion(const llvm::Twine &S);
+enum class PTXVersion {
+ PTX_UNKNOWN = 0,
+ PTX_32 = 32,
+ PTX_40 = 40,
+ PTX_41,
+ PTX_42,
+ PTX_43,
+ PTX_50 = 50,
+
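The enum in this hunk relies on implicit enumerator numbering: an enumerator without an initializer takes the previous value plus one, so `PTX_41` is 41 without being spelled out. A small sketch of that property, reusing only the enumerators visible above (the `ptxVersionToString` helper is hypothetical and is not part of the patch):

```cpp
#include <string>

// Only the enumerators visible in the truncated diff are reproduced here.
enum class PTXVersion {
  PTX_UNKNOWN = 0,
  PTX_32 = 32,
  PTX_40 = 40,
  PTX_41, // implicitly 41
  PTX_42, // implicitly 42
  PTX_43, // implicitly 43
  PTX_50 = 50,
};

// The numeric value encodes Major * 10 + Minor.
static_assert(static_cast<int>(PTXVersion::PTX_43) == 43,
              "implicit values continue from the last explicit one");

// Hypothetical helper: render a version as "Major.Minor".
std::string ptxVersionToString(PTXVersion V) {
  int N = static_cast<int>(V);
  return std::to_string(N / 10) + "." + std::to_string(N % 10);
}
```

Keeping the numeric value equal to the version number makes ordered comparisons between versions straightforward after a cast.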
@@ -26,24 +27,38 @@ static cl::opt
NoF16Math("nvptx-no-f16-math", cl::Hidden,
cl::desc("NVPTX Specific: Disable generation of f16 math ops."),
cl::init(false));
+static cl::opt
+NextSM("nvptx-next-sm", cl::Hidden,
+ cl::desc("NVPTX
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/100247
>From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Thu, 18 Jul 2024 15:05:01 -0700
Subject: [PATCH 1/3] [CUDA] Add a pseudo GPU sm_next which allows overrides
for
@@ -0,0 +1,37 @@
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s
--check-prefixes=SM30,CHECK
Artem-B wrote:
https://llvm.org/docs/TestingGuide.html#generating-assertions-in-regression-tests
https://github.com/llvm/llvm-project/pull/99646
@@ -0,0 +1,37 @@
+; RUN: llc < %s -march=nvptx64 -mcpu=sm_32 | FileCheck %s
--check-prefixes=SM30,CHECK
Artem-B wrote:
This test should be suitable for automatic check generation -- we probably do
want to see the details of what we're doing when we do 32-bit CA
https://github.com/Artem-B approved this pull request.
LGTM with a test nit
https://github.com/llvm/llvm-project/pull/99646
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/99646
Artem-B wrote:
Given that the prevalent compilation for CUDA has no standard library
whatsoever, preserving libcalls may break some existing users that may be
relying on library call lowering to an intrinsic that *is* implemented by the
back-end.
Perhaps this "no library call to intrinsic con
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/100607
Artem-B wrote:
The patch seems to change only the test file. Should there be more changes in
the patch?
https://github.com/llvm/llvm-project/pull/100607
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/100438
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/100247
@@ -648,6 +658,13 @@ void NVPTX::getNVPTXTargetFeatures(const Driver &D, const
llvm::Triple &Triple,
Features.push_back(Args.MakeArgString(PtxFeature));
return;
}
+ // Add --cuda-next-ptx to the list of features, but carry on to add the
+ // default PTX feature for
@@ -96,6 +96,7 @@ static const OffloadArchToStringMap arch_names[] = {
SM(89), // Ada Lovelace
SM(90), // Hopper
SM(90a), // Hopper
+SM(next),// Placeholder for a n
https://github.com/Artem-B updated
https://github.com/llvm/llvm-project/pull/100247
>From 44a1045eee71777fa916e2a8043b2f99afc96a96 Mon Sep 17 00:00:00 2001
From: Artem Belevich
Date: Thu, 18 Jul 2024 15:05:01 -0700
Subject: [PATCH 1/2] [CUDA] Add a pseudo GPU sm_next which allows overrides
for
Artem-B wrote:
@sergey-kozub FYI, https://github.com/llvm/llvm-project/pull/100247 should
allow forward-testing CUDA w/o relying on specific GPU/PTX variant being
hardcoded in clang.
https://github.com/llvm/llvm-project/pull/97402
https://github.com/Artem-B created
https://github.com/llvm/llvm-project/pull/100247
Sometimes users may need to use older clang with newer SM/PTX versions which
clang does not know anything about, yet.
--offload-arch=sm_next, combined with --cuda-next-sm=X and --cuda-next-ptx=Y
allows passing
Artem-B wrote:
@nico:
> Why do we need a new binary for this, instead of having something like `clang
> -cc1_nvlink` that calls a custom mode within clang?
Do we have existing precedents for such built-in tools, other than cc1 itself?
If the linker wrapper can be part of clang itself, it woul
https://github.com/Artem-B approved this pull request.
https://github.com/llvm/llvm-project/pull/100170
@@ -504,18 +511,23 @@ Expected clang(ArrayRef InputFiles,
const ArgList &Args) {
llvm::copy(LinkerArgs, std::back_inserter(CmdArgs));
}
- // Pass on -mllvm options to the clang invocation.
- for (const opt::Arg *Arg : Args.filtered(OPT_mllvm)) {
-CmdArgs.push_back
@@ -0,0 +1,64 @@
+
+Clang nvlink Wrapper
+
+
+.. contents::
+ :local:
+
+.. _clang-nvlink-wrapper:
+
+Introduction
+
+
+This tools works as a wrapper around the NVIDIA ``nvlink`` linker. The purpose
+of this wrapper is to prov
@@ -0,0 +1,778 @@
+//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Ap
https://github.com/Artem-B approved this pull request.
LGTM for the patch in general, though I can't vouch for the details of the
linking process. I'll defer to @MaskRay on that.
https://github.com/llvm/llvm-project/pull/96561
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/96561
@@ -0,0 +1,776 @@
+//===-- clang-nvlink-wrapper/ClangNVLinkWrapper.cpp - NVIDIA linker util
--===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Ap
https://github.com/Artem-B commented:
First batch of comments on the patch -- I only got till about the middle of
ClangNVLinkWrapper.cpp. Will continue reviewing tomorrow.
https://github.com/llvm/llvm-project/pull/96561
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/96561
@@ -203,8 +203,12 @@ ABIArgInfo NVPTXABIInfo::classifyArgumentType(QualType Ty)
const {
void NVPTXABIInfo::computeInfo(CGFunctionInfo &FI) const {
if (!getCXXABI().classifyReturnType(FI))
FI.getReturnInfo() = classifyReturnType(FI.getReturnType());
+
+ unsigned Argument
https://github.com/Artem-B approved this pull request.
LGTM with a minor nit.
https://github.com/llvm/llvm-project/pull/96015
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/96015
Artem-B wrote:
> This PR is redundant, closing.
I think the patch was perfectly fine. Considering that other NVIDIA open-source
projects already mention sm_100 (E.g.
https://github.com/NVIDIA/cccl/blob/5efe53dbd71ea3e4bc4fdbb73edc001e0bf81547/libcudacxx/include/nv/detail/__target_macros#L241),
@@ -0,0 +1,31 @@
+/*===-- LLVM/Offload helpers for kernel languages (CUDA/HIP) -*- c++ -*-===
+ *
+ * Part of the LLVM Project, under the Apache License v2.0 with LLVM
Exceptions.
+ * See https://llvm.org/LICENSE.txt for license information.
+ * SPDX-License-Identifier: Apach
https://github.com/Artem-B approved this pull request.
LGTM in principle.
Will kernels in TUs compiled with `-foffload-via-llvm` be interoperable with
code that wants to launch them from another TU compiled w/o
`-foffload-via-llvm` ?
E.g.:
- a.cu: `__global__ void kernel() { ... }`
- b.cu: `e
@@ -1125,6 +1125,22 @@ void Clang::AddPreprocessingOptions(Compilation &C,
const JobAction &JA,
CmdArgs.push_back("__clang_openmp_device_functions.h");
}
+ if (Args.hasArg(options::OPT_foffload_via_llvm)) {
+// Add llvm_wrappers/* to our system include path. This
@@ -,17 +6684,26 @@ void Clang::ConstructJob(Compilation &C, const
JobAction &JA,
break;
}
} else {
+if (Args.hasFlag(options::OPT_foffload_via_llvm,
+ options::OPT_fno_offload_via_llvm, false))
+ Args.AddLastArg(CmdArgs, options::O
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/94549
Artem-B wrote:
> What's the config to set this by default without any graphics?
https://docs.nvidia.com/deploy/driver-persistence/index.html
I usually use "nvidia-smi -i -pm ENABLED" to force the driver to
be loaded permanently.
As for `__nvcc_device_query`, my guess is that it just uses a
Artem-B wrote:
Ooh... I think I know exactly what may be causing this.
On machines where NVIDIA GPUs are used for compute only (e.g. a headless server
machine), NVIDIA drivers are not always loaded by default and may not have
driver persistence enabled. The drivers get loaded when GPU is acces
@@ -9013,6 +9013,12 @@ def err_cuda_ovl_target : Error<
"cannot overload %select{__device__|__global__|__host__|__host__
__device__}2 function %3">;
def note_cuda_ovl_candidate_target_mismatch : Note<
"candidate template ignored: target attributes do not match">;
+def wa
https://github.com/Artem-B approved this pull request.
LGTM with some wording/naming nits.
https://github.com/llvm/llvm-project/pull/77359
https://github.com/Artem-B edited
https://github.com/llvm/llvm-project/pull/77359