[clang] [Clang] Only enable builtins on aux triple if supported by language (PR #154217)

2025-08-28 Thread Artem Belevich via cfe-commits
Artem-B wrote: > So the language options are not guaranteed to be the same between the host > and device compilations? I think it's more of the case that builtins are rarely tagged as language-specific, so, as implemented, this filtering is a very blunt hammer, even if we do consistently fil

[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

2025-08-28 Thread Artem Belevich via cfe-commits
@@ -502,3 +497,27 @@ void NVPTXPassConfig::addMachineSSAOptimization() { addPass(&PeepholeOptimizerLegacyID); printAndVerify("After codegen peephole optimization pass"); } + +bool NVPTXTargetMachine::isCompatibleDataLayout( +const DataLayout &Candidate) const { + // XX

[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

2025-08-28 Thread Artem Belevich via cfe-commits
@@ -1444,15 +1444,16 @@ void clang::emitBackendOutput(CompilerInstance &CI, CodeGenOptions &CGOpts, // Verify clang's TargetInfo DataLayout against the LLVM TargetMachine's // DataLayout. - if (AsmHelper.TM) { -std::string DLDesc = M->getDataLayout().getStringReprese

[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

2025-08-28 Thread Artem Belevich via cfe-commits
@@ -368,8 +354,17 @@ void NVPTXPassConfig::addIRPasses() { // NVPTXLowerArgs is required for correctness and should be run right // before the address space inference passes. addPass(createNVPTXLowerArgsPass()); + + if (getOptLevel() != CodeGenOptLevel::None) +// NVP

[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

2025-08-28 Thread Artem Belevich via cfe-commits
@@ -66,63 +69,97 @@ char NVPTXPeephole::ID = 0; INITIALIZE_PASS(NVPTXPeephole, "nvptx-peephole", "NVPTX Peephole", false, false) -static bool isCVTAToLocalCombinationCandidate(MachineInstr &Root) { +static bool isCVTALocalCombinationCandidate(MachineInstr &Root) { auto &M

[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

2025-08-28 Thread Artem Belevich via cfe-commits
@@ -2253,22 +2253,31 @@ def trapexitinst : NVPTXInst<(outs), (ins), "trap; exit;", [(trap)]>, Requires<[ // brkpt instruction def debugtrapinst : BasicNVPTXInst<(outs), (ins), "brkpt", [(debugtrap)]>; -def SDTDynAllocaOp : - SDTypeProfile<1, 2, [SDTCisSameAs<0, 1>, SDTCisInt

[clang] [llvm] [NVPTX] Change the alloca address space in NVPTXLowerAlloca (PR #154814)

2025-08-28 Thread Artem Belevich via cfe-commits
@@ -2253,22 +2253,31 @@ def trapexitinst : NVPTXInst<(outs), (ins), "trap; exit;", [(trap)]>, Requires<[ // brkpt instruction def debugtrapinst : BasicNVPTXInst<(outs), (ins), "brkpt", [(debugtrap)]>; -def SDTDynAllocaOp : - SDTypeProfile<1, 2, [SDTCisSameAs<0, 1>, SDTCisInt

[clang] [llvm] [mlir] [NVPTX] Auto-upgrade nvvm.grid_constant to param attribute (PR #155489)

2025-08-27 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. Nice. LGTM. https://github.com/llvm/llvm-project/pull/155489 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [mlir] [NVPTX] Auto-upgrade nvvm.grid_constant to param attribute (PR #155489)

2025-08-27 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/155489

[clang] [llvm] [NVPTX] Support i256 load/store with 256-bit vector load (PR #155198)

2025-08-27 Thread Artem Belevich via cfe-commits
@@ -198,6 +198,12 @@ static bool IsPTXVectorType(MVT VT) { static std::optional> getVectorLoweringShape(EVT VectorEVT, const NVPTXSubtarget &STI, unsigned AddressSpace) { + const bool CanLowerTo256Bit = STI.has256BitVectorLoadStore(AddressSpace); + + if

[clang] [llvm] [NVPTX] Support i256 load/store with 256-bit vector load (PR #155198)

2025-08-27 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM overall, with a few nits. https://github.com/llvm/llvm-project/pull/155198

[clang] [llvm] [NVPTX] Support i256 load/store with 256-bit vector load (PR #155198)

2025-08-27 Thread Artem Belevich via cfe-commits
@@ -1506,3 +1506,98 @@ define void @local_volatile_4xdouble(ptr addrspace(5) %a, ptr addrspace(5) %b) { store volatile <4 x double> %a.load, ptr addrspace(5) %b ret void } + +define void @test_i256_global(ptr addrspace(1) %a, ptr addrspace(1) %b) { Artem-B

[clang] [llvm] [NVPTX] Support i256 load/store with 256-bit vector load (PR #155198)

2025-08-27 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/155198

[clang] [Clang] Only enable builtins on aux triple if supported by language (PR #154217)

2025-08-21 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: > This patch makes it so that builtins for the aux triple only get enabled if > they are marked as supported by the current language options. I'm not convinced that it's the right fix for the problem. * `__cpuidex`, uses `Languages = "ALL_MS_LANGUAGES"` w

[clang] [Clang] Only enable builtins on aux triple if supported by language (PR #154217)

2025-08-21 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/154217

[clang] [llvm] [Clang][CUDA] Add support for __managed__ variables in non-RDC and default RDC mode (PR #149716)

2025-08-11 Thread Artem Belevich via cfe-commits
Artem-B wrote: Is this pull request ready for review? https://github.com/llvm/llvm-project/pull/149716

[clang] [CUDA] Do not include obsolete texture_fetch_functions.h (PR #151766)

2025-08-01 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/151766 From 7ad34288652d0f236e87f0c348275bcff5469432 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 1 Aug 2025 13:09:11 -0700 Subject: [PATCH 1/2] [CUDA] Do not include obsolete texture_fetch_functions.h --

[clang] [CUDA] Do not include obsolete texture_fetch_functions.h (PR #151766)

2025-08-01 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/151766 From 7ad34288652d0f236e87f0c348275bcff5469432 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 1 Aug 2025 13:09:11 -0700 Subject: [PATCH 1/2] [CUDA] Do not include obsolete texture_fetch_functions.h --

[clang] [CUDA] Do not include obsolete texture_fetch_functions.h (PR #151766)

2025-08-01 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/151766 None From 7ad34288652d0f236e87f0c348275bcff5469432 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 1 Aug 2025 13:09:11 -0700 Subject: [PATCH] [CUDA] Do not include obsolete texture_fetch_functions.h

[clang] [CUDA] Use --image3 to construct fat binary (PR #151760)

2025-08-01 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/151760

[clang] [CUDA] Use --image3 to construct fat binary (PR #151760)

2025-08-01 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/151760 From a3359f2959d2f161be88c49e5fece6d6fc73bf6b Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Fri, 1 Aug 2025 12:23:06 -0700 Subject: [PATCH] [CUDA] Use --image3 to construct fat binary CUDA-12.9 has remov

[clang] [CUDA] Use --image3 to construct fat binary (PR #151760)

2025-08-01 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/151760 CUDA-12.9 has removed the fatbinary tool's `--image` argument that we've been using until now. `--image3` has been supported since CUDA-9, so we do not need CUDA SDK version checks. From 098ea946feb9d2f95d1a0913bd7d24cb

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-18 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM syntax/style-wise. Looks reasonable on the functionality side, but we could use a second opinion on that. https://github.com/llvm/llvm-project/pull/145131

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-18 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/145131

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits
@@ -14,14 +14,14 @@ // RUN: | FileCheck %s --check-prefix=NO-OUTPUT-ERROR // RUN: not %clang -### --target=x86_64-unknown-linux-gnu -nogpulib --offload-new-driver --offload-arch=native --amdgpu-arch-tool=%t/amdgpu_arch_fail -x hip %s 2>&1 \ // RUN: | FileCheck %s --chec

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits
@@ -951,221 +931,262 @@ static bool addSYCLDefaultTriple(Compilation &C, return true; } -void Driver::CreateOffloadingDeviceToolChains(Compilation &C, - InputList &Inputs) { - - // - // CUDA/HIP - // - // We need to generate a

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits
@@ -3441,91 +3455,25 @@ class OffloadingActionBuilder final { return true; } - ToolChains.push_back( - AssociatedOffloadKind == Action::OFK_Cuda - ? C.getSingleOffloadToolChain<Action::OFK_Cuda>() - : C.getSingleOffloadToolChain<Action::OFK_HIP>()); - -

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits
@@ -3441,91 +3455,25 @@ class OffloadingActionBuilder final { return true; } - ToolChains.push_back( - AssociatedOffloadKind == Action::OFK_Cuda - ? C.getSingleOffloadToolChain<Action::OFK_Cuda>() - : C.getSingleOffloadToolChain<Action::OFK_HIP>()); - -

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: Mostly a drive-by style/syntax review. LGTM overall, with a few nits. https://github.com/llvm/llvm-project/pull/125556

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits
@@ -4,7 +4,7 @@ // RUN: --rocm-path=%S/Inputs/rocm \ // RUN: %s 2>&1 | FileCheck -check-prefix=NOPLUS %s -// NOPLUS: error: invalid target ID 'gfx908xnack' +// NOPLUS: error: unsupported HIP gpu architecture: gfx908xnack Artem-B wrote: "HIP compilation c

[clang] [Clang] Rework creating offloading toolchains (PR #125556)

2025-07-17 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/125556

[clang] [CUDA] add wrapper header for libc++'s __utlility/declval.h (PR #148918)

2025-07-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/148918 From ea1949d13608ac948ab34d1eeb073decdd11e2a3 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Tue, 15 Jul 2025 11:10:40 -0700 Subject: [PATCH 1/2] [CUDA] add wrapper header for libc++'s __utlility/declval.

[clang] [CUDA] add wrapper header for libc++'s __utlility/declval.h (PR #148918)

2025-07-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/148918 Since #116709 more libc++ code relies on std::declval() and it broke some CUDA compilations. The new wrapper adds GPU-side overloads for the declval() helper functions which allows it to continue working when

[clang] [HIP] Add warning for -mwavefrontsize64 on gfx10+ architectures (PR #140185)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -67,6 +67,12 @@ // DUP-NOT: "-target-feature" "{{.*}}wavefrontsize64" // DUP: {{.*}}lld{{.*}} "-plugin-opt=-mattr=+cumode" +// RUN: %clang -### --target=x86_64-linux-gnu -fgpu-rdc -nogpulib \ +// RUN: -nogpuinc --offload-arch=gfx1010 --no-offload-new-driver %s \ +// RUN:

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,257 @@ +//===- llvm/Support/Jobserver.cpp - Jobserver Client Implementation ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,257 @@ +//===- llvm/Support/Jobserver.cpp - Jobserver Client Implementation ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,257 @@ +//===- llvm/Support/Jobserver.cpp - Jobserver Client Implementation ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,257 @@ +//===- llvm/Support/Jobserver.cpp - Jobserver Client Implementation ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,141 @@ +//===- llvm/Support/Jobserver.h - Jobserver Client --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,257 @@ +//===- llvm/Support/Jobserver.cpp - Jobserver Client Implementation ---===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,141 @@ +//===- llvm/Support/Jobserver.h - Jobserver Client --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,141 @@ +//===- llvm/Support/Jobserver.h - Jobserver Client --*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier: Ap

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: A few comments on syntax/style. I didn't look at the job management logic itself. https://github.com/llvm/llvm-project/pull/145131

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
@@ -1420,12 +1420,18 @@ int main(int Argc, char **Argv) { parallel::strategy = hardware_concurrency(1); if (auto *Arg = Args.getLastArg(OPT_wrapper_jobs)) { -unsigned Threads = 0; -if (!llvm::to_integer(Arg->getValue(), Threads) || Threads == 0) - reportError(

[clang] [llvm] [LLVM] Add GNU make jobserver support (PR #145131)

2025-07-14 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/145131

[clang] [Clang] Extract offloading code from static libs with 'offload-arch=' (PR #147823)

2025-07-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: > but we now assume that if the user specified --offload-arch= on the link job, > they definitely want that architecture to be used if it exists. That would be my assumption, too. Do we currently just ignore `--offload-arch=` for the linking phase? With the patch, what's expe

[clang] [llvm] [NVPTX] Consolidate and cleanup various NVPTXISD nodes (NFC) (PR #145581)

2025-06-25 Thread Artem Belevich via cfe-commits
@@ -457,3 +457,25 @@ void NVPTXInstPrinter::printCTAGroup(const MCInst *MI, int OpNum, } llvm_unreachable("Invalid cta_group in printCTAGroup"); } + +void NVPTXInstPrinter::printCallOperand(const MCInst *MI, int OpNum, +raw_ostream &

[clang] [llvm] [NVPTX] Consolidate and cleanup various NVPTXISD nodes (NFC) (PR #145581)

2025-06-25 Thread Artem Belevich via cfe-commits
@@ -457,3 +457,25 @@ void NVPTXInstPrinter::printCTAGroup(const MCInst *MI, int OpNum, } llvm_unreachable("Invalid cta_group in printCTAGroup"); } + +void NVPTXInstPrinter::printCallOperand(const MCInst *MI, int OpNum, +raw_ostream &

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-18 Thread Artem Belevich via cfe-commits
Artem-B wrote: It's a C++-11 feature. Tests still include c++98. We do not intend to keep everything working with c++98 (we already use c++11 in other headers), but we should not break it either. In this case, you can just enable the new stuff for c++11 or newer standards. https://github.com/
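The suggestion above, enabling the new code only for C++11 or newer standards, is commonly expressed with a `__cplusplus` check. The following is a minimal sketch of that pattern; all `hypothetical_*` names are illustrative assumptions and do not come from the PR or from the CUDA wrapper headers.

```cpp
#include <cassert>

// Sketch only: the hypothetical_* names are illustrative and do not come from
// the CUDA wrapper headers.

// Plain C++98 code stays unconditionally available.
inline int hypothetical_add(int a, int b) { return a + b; }

#if __cplusplus >= 201103L
// C++11-only constructs (here: constexpr) live behind the guard.
constexpr int hypothetical_double(int x) { return 2 * x; }
inline int hypothetical_feature_level() { return 11; }
#else
// In C++98 mode the new code is simply not declared.
inline int hypothetical_feature_level() { return 98; }
#endif

// Small usage check that compiles in both modes.
inline void hypothetical_self_check() {
  assert(hypothetical_add(2, 3) == 5);
#if __cplusplus >= 201103L
  assert(hypothetical_double(4) == 8);
#endif
}
```

With a guard like this, a c++98 test configuration still compiles because it never sees the C++11-only declarations.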

[clang] [CUDA][HIP] Add a __device__ version of std::__glibcxx_assert_fail() (PR #136133)

2025-06-18 Thread Artem Belevich via cfe-commits
Artem-B wrote: @jmmartinez It appears that CUDA tests are broken by this change: https://lab.llvm.org/buildbot/#/builders/69/builds/22562/steps/8/logs/stdio ``` FAILED: External/CUDA/CMakeFiles/algorithm-cuda-11.8-c++98-libstdc++-10.dir/algorithm.cu.o /buildbot/cuda-t4-0/work/clang-cuda-t4/c

[clang] Revert "Add missing intrinsics to cuda headers" (PR #144755)

2025-06-18 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/144755

[clang] Revert "Add missing intrinsics to cuda headers" (PR #144755)

2025-06-18 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/144755 Reverts llvm/llvm-project#143664 as it breaks CUDA compilation. From 2ed0932a540bb1a692fe442ab590d51674645f6c Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Wed, 18 Jun 2025 10:06:56 -0700 Subject: [PATCH

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-18 Thread Artem Belevich via cfe-commits
Artem-B wrote: It appears to be breaking CUDA tests: https://lab.llvm.org/buildbot/#/builders/69/builds/22559 I'll revert it for now and we'll try again later. ``` [29/988] Building CXX object External/CUDA/CMakeFiles/math_h-cuda-11.8-c++98-libstdc++-10.dir/math_h.cu.o FAILED: External/CUDA/

[clang] [CUDA][HIP] add options `--[no-]offload-inc` (PR #140106)

2025-06-17 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/140106

[clang] [CUDA][HIP] add options `--[no-]offload-inc` (PR #140106)

2025-06-17 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/140106

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-17 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/143664

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-13 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM with one last nit. https://github.com/llvm/llvm-project/pull/143664

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-13 Thread Artem Belevich via cfe-commits
@@ -479,7 +479,291 @@ inline __device__ unsigned __funnelshift_rc(unsigned low32, unsigned high32, return ret; } -#endif // !defined(__CUDA_ARCH__) || __CUDA_ARCH__ >= 320 +#pragma push_macro("__INTRINSIC_LOAD") +#define __INTRINSIC_LOAD(__FnName, __AsmOp, __DeclType, __Tmp
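The diff above wraps its helper macro in `#pragma push_macro`, a pragma supported by Clang, GCC, and MSVC that lets a header define a temporary macro without clobbering a same-named macro from user code. The following is a minimal sketch of the general pattern, not the code from the PR; `HYPOTHETICAL_LOAD` and the `hypothetical_*` functions are assumed names chosen for illustration.

```cpp
#include <cassert>

// Pretend this macro was defined by user code before our header is included.
#define HYPOTHETICAL_LOAD 42

// Save the user's definition, then define our own helper under the same name.
#pragma push_macro("HYPOTHETICAL_LOAD")
#undef HYPOTHETICAL_LOAD
#define HYPOTHETICAL_LOAD(FnName, Type)                                        \
  inline Type FnName(const Type *Ptr) { return *Ptr; }

// Stamp out one loader per type using the temporary helper macro.
HYPOTHETICAL_LOAD(hypothetical_load_int, int)
HYPOTHETICAL_LOAD(hypothetical_load_float, float)

// Restore whatever the user had; our helper macro no longer leaks out.
#pragma pop_macro("HYPOTHETICAL_LOAD")

// After pop_macro the name expands to the user's original value again.
inline int hypothetical_user_macro_value() { return HYPOTHETICAL_LOAD; }

inline void hypothetical_check() {
  int V = 7;
  assert(hypothetical_load_int(&V) == 7);
  assert(hypothetical_user_macro_value() == 42);
}
```

The generated functions remain usable after `pop_macro`; only the helper macro itself is scoped to the header.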

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-12 Thread Artem Belevich via cfe-commits
@@ -479,6 +479,275 @@ inline __device__ unsigned __funnelshift_rc(unsigned low32, unsigned high32, return ret; } +#define INTRINSIC_LOAD(func_name, asm_op, decl_type, internal_type, asm_type) \ Artem-B wrote: We have to be careful with the names used in th

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B requested changes to this pull request. Nice. I like this approach better. There are a few more things to polish up, but it looks good overall. https://github.com/llvm/llvm-project/pull/143664

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-12 Thread Artem Belevich via cfe-commits
@@ -479,6 +479,275 @@ inline __device__ unsigned __funnelshift_rc(unsigned low32, unsigned high32, return ret; } +#define INTRINSIC_LOAD(func_name, asm_op, decl_type, internal_type, asm_type) \ Artem-B wrote: Can we merge `INTRINSIC*` and `MINTRINSIC*` mac

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/143664

[clang] Add missing intrinsics to cuda headers (PR #143664)

2025-06-11 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/143664

[clang] [NVPTX] Enable OpenCL 3d_image_writes support (PR #143331)

2025-06-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: @svenvh appears to be the current maintainer of OpenCL in LLVM. https://github.com/llvm/llvm-project/pull/143331

[clang] [CUDA] Disallow use of address_space(N) on CUDA device variables. (PR #142857)

2025-06-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/142857

[clang] [CUDA] Disallow use of address_space(N) on CUDA device variables. (PR #142857)

2025-06-06 Thread Artem Belevich via cfe-commits
Artem-B wrote: @yxsamliu Sam, do you have any thoughts on this? https://github.com/llvm/llvm-project/pull/142857

[clang] [CUDA] Disallow use of address_space(N) on CUDA device variables. (PR #142857)

2025-06-04 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B created https://github.com/llvm/llvm-project/pull/142857 The variables have implicit host-side shadow instances, and an explicit address space attribute breaks them on the host. From e2e8da0271ae11711dbd54f6e8d9ff498f3226d4 Mon Sep 17 00:00:00 2001 From: Artem Belevich

[clang] [clang] Move opt level in clang toolchain to clang::ConstructJob start (PR #141036)

2025-05-27 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/141036

[clang] [llvm] [NVPTX] Add pm_event intrinsics (PR #141278)

2025-05-27 Thread Artem Belevich via cfe-commits
@@ -177,6 +177,7 @@ let Attributes = [NoReturn] in { } let Attributes = [NoThrow] in { def __nvvm_nanosleep : NVPTXBuiltinSMAndPTX<"void(unsigned int)", SM_70, PTX63>; + def __nvvm_pm_event_mask : NVPTXBuiltin<"void(unsigned short)">; Artem-B wrote: The ar

[clang] [llvm] [NVPTX] Add pm_event intrinsics (PR #141278)

2025-05-27 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. Builtin signature needs a fix, but LGTM otherwise. https://github.com/llvm/llvm-project/pull/141278

[clang] [llvm] [NVPTX] Add pm_event intrinsics (PR #141278)

2025-05-27 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/141278

[clang] [llvm] [mlir] Reland "[NVPTX] Unify and extend barrier{.cta} intrinsic support" (PR #141143)

2025-05-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/141143

[clang] [llvm] [mlir] [NVPTX] Unify and extend barrier{.cta} intrinsic support (PR #140615)

2025-05-19 Thread Artem Belevich via cfe-commits
@@ -1349,6 +1349,10 @@ static bool upgradeIntrinsicFunction1(Function *F, Function *&NewFn, else if (Name == "clz.ll" || Name == "popc.ll" || Name == "h2f" || Name == "swap.lo.hi.b64") Expand = true; + else if (Name == "barrier0" || Name == "b

[clang] [NVPTX] Support the OpenCL generic addrspace feature by default (PR #137940)

2025-05-19 Thread Artem Belevich via cfe-commits
@@ -170,6 +170,8 @@ class LLVM_LIBRARY_VISIBILITY NVPTXTargetInfo : public TargetInfo { Opts["cl_khr_global_int32_extended_atomics"] = true; Opts["cl_khr_local_int32_base_atomics"] = true; Opts["cl_khr_local_int32_extended_atomics"] = true; + +Opts["__opencl_c_

[clang] [llvm] [NVPTX] Add errors for incorrect CUDA addrpaces (PR #138706)

2025-05-19 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/138706

[clang] [llvm] [NVPTX] Add errors for incorrect CUDA addrpaces (PR #138706)

2025-05-19 Thread Artem Belevich via cfe-commits
@@ -2927,6 +2928,20 @@ void Verifier::visitFunction(const Function &F) { "Calling convention does not support varargs or " "perfect forwarding!", &F); +if (F.getCallingConv() == CallingConv::PTX_Kernel && +TT.getOS() == Triple::CUDA) {

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Artem Belevich via cfe-commits
@@ -5734,6 +5734,9 @@ def nobuiltininc : Flag<["-"], "nobuiltininc">, def nogpuinc : Flag<["-"], "nogpuinc">, Group, HelpText<"Do not add include paths for CUDA/HIP and" " do not include the default CUDA/HIP wrapper headers">; +def gpuinc : Flag<["-"], "gpuinc">, Group, +

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Artem Belevich via cfe-commits
@@ -5734,6 +5734,9 @@ def nobuiltininc : Flag<["-"], "nobuiltininc">, def nogpuinc : Flag<["-"], "nogpuinc">, Group, HelpText<"Do not add include paths for CUDA/HIP and" " do not include the default CUDA/HIP wrapper headers">; +def gpuinc : Flag<["-"], "gpuinc">, Group, +

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/140106

[clang] [CUDA][HIP] add option -gpuinc (PR #140106)

2025-05-15 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: Being able to override a flag is a good thing to have, IMO. There are builds where the owner of the leaf targets do not have much control over which options are set by the "default" compilation, so they need to rely on being able to override preceding opti
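The point about overriding preceding options relies on a "last flag wins" convention for paired positive/negative flags. The following is a minimal stand-alone sketch of that convention, not the clang driver's implementation; `hypothetical_last_flag_wins` is an assumed name, and the flag spellings `-gpuinc` / `-nogpuinc` are taken from the PR title and diff above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Sketch only: a stand-alone model of "last flag wins", not the clang driver's code.
// Returns true if the last occurrence among Pos/Neg in Args is Pos, false if it
// is Neg, and Default if neither flag appears at all.
inline bool hypothetical_last_flag_wins(const std::vector<std::string> &Args,
                                        const std::string &Pos,
                                        const std::string &Neg, bool Default) {
  bool Result = Default;
  for (std::vector<std::string>::const_iterator It = Args.begin();
       It != Args.end(); ++It) {
    if (*It == Pos)
      Result = true; // a later positive flag overrides earlier ones
    else if (*It == Neg)
      Result = false; // a later negative flag overrides earlier ones
  }
  return Result;
}

inline void hypothetical_flag_check() {
  std::vector<std::string> Args;
  Args.push_back("-nogpuinc"); // set by a "default" build configuration
  Args.push_back("-gpuinc");   // appended later by a leaf target
  assert(hypothetical_last_flag_wins(Args, "-gpuinc", "-nogpuinc", true));
}
```

Under this convention a leaf target can append its own flag after the defaults and have it take effect without controlling the earlier options.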

[clang] [llvm] [NVPTX] Add errors for incorrect CUDA addrpaces (PR #138706)

2025-05-13 Thread Artem Belevich via cfe-commits
@@ -1399,19 +1399,27 @@ void NVPTXAsmPrinter::emitFunctionParamList(const Function *F, raw_ostream &O) { if (PTy) { O << "\t.param .u" << PTySizeInBits << " .ptr"; +bool IsCUDA = static_cast(TM).getDrvInterface() == + NVPTX::CUDA;

[clang] [CUDA] Remove obsolete GPU-side __constexpr_* wrappers. (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B closed https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] Remove obsolete GPU-side __constexpr_* wrappers. (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits
Artem-B wrote: No wrappers -- no problems. :-) https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] Remove obsolete GPU-side __constexpr_* wrappers. (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B edited https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B updated https://github.com/llvm/llvm-project/pull/139164 From a1d60feed11174b9d2106b57ee15ff6d9bc56fa4 Mon Sep 17 00:00:00 2001 From: Artem Belevich Date: Thu, 8 May 2025 14:43:47 -0700 Subject: [PATCH] [CUDA] remove obsolete GPU-side __constexpr* wrappers libc++ no

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits
Artem-B wrote: > Right now this checks for `libc++` less than 14. Is that still relevant > following that change? That's a very good point. Looks like those `__constexpr_fmin/fmax` are gone now and we do not need them any more. https://github.com/llvm/llvm-project/pull/139164

[clang] [CUDA] fix wrapper cmath header to match #136101 (PR #139164)

2025-05-12 Thread Artem Belevich via cfe-commits
Artem-B wrote: @jhuber6 @ldionne One concern I have for this change is that it will break folks who will use older libc++ with the new Clang + wrapper headers. Is older libc++ expected to work with non-matching clang version? If the expectation is that libc++ and clang are from the same versio

[clang] [llvm] [NVPTX] Add intrinsics and clang builtins for conversions of f4x2 type (PR #139244)

2025-05-09 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/139244

[clang] [HIP] change default offload archs (PR #139281)

2025-05-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: @cgmb > I would suggest that we should either (a) change the default GPU target to > native and make the failure to detect the user’s GPU into a hard compiler > error, or (b) change the default GPU target to SPIR-V so that it works on > every machine. The thing is that the se

[clang] [HIP] change default offload archs (PR #139281)

2025-05-09 Thread Artem Belevich via cfe-commits
Artem-B wrote: @jhuber6 do you think we can use `native` instead? I think it would be a somewhat better option here. If we have to choose a GPU variant by default, we may as well choose the actual GPU, rather than a conditional choice between generic SPIR-V or an old GPU, which has the disadva

[clang] [CUDA][HIP] Fix host/device attribute of builtin (PR #138162)

2025-05-07 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. https://github.com/llvm/llvm-project/pull/138162

[clang] clang/OpenCL: Add baseline test showing broken codegen (PR #138862)

2025-05-07 Thread Artem Belevich via cfe-commits
@@ -109,3 +109,48 @@ void func2(void) { void func3(void) { float a[16][1] = {{0.}}; } + +// CL12-LABEL: define dso_local void @wrong_store_type_private_pointer_alloca( +// CL12-SAME: ) #[[ATTR0]] { +// CL12-NEXT: [[ENTRY:.*:]] +// CL12-NEXT:[[PLONG:%.*]] = alloca i64, al

[clang] clang/OpenCL: Add baseline test showing broken codegen (PR #138862)

2025-05-07 Thread Artem Belevich via cfe-commits
@@ -109,3 +109,48 @@ void func2(void) { void func3(void) { float a[16][1] = {{0.}}; } + +// CL12-LABEL: define dso_local void @wrong_store_type_private_pointer_alloca( +// CL12-SAME: ) #[[ATTR0]] { +// CL12-NEXT: [[ENTRY:.*:]] +// CL12-NEXT:[[PLONG:%.*]] = alloca i64, al

[clang] [clang][Sema] Don't warn for implicit uses of builtins in system headers (PR #138205)

2025-05-02 Thread Artem Belevich via cfe-commits
@@ -2376,9 +2376,14 @@ NamedDecl *Sema::LazilyCreateBuiltin(IdentifierInfo *II, unsigned ID, return nullptr; } + // Warn for implicit uses of header dependent libraries, + // except in system headers. if (!ForRedeclaration && (Context.BuiltinInfo.isPredefine

[clang] [clang][Sema] Don't warn for implicit uses of builtins in system headers (PR #138205)

2025-05-02 Thread Artem Belevich via cfe-commits
Artem-B wrote: OK. This makes sense. > sorry this change is so drawn out :) What matters is that you're making progress, and I appreciate your work on getting this issue sorted out the right way. https://github.com/llvm/llvm-project/pull/138205

[clang] [clang][Sema] Don't warn for implicit uses of builtins in system headers (PR #138205)

2025-05-02 Thread Artem Belevich via cfe-commits
Artem-B wrote: Something does not add up here. AFAICT, using builtins w/o explicitly declaring them is something that's done all the time. https://godbolt.org/z/ha47W53dh In that sense, we should not be needing to filter out the diagnostics coming from the system headers only. There should not

[clang] [CUDA][HIP] Fix implicit attribute of builtin (PR #138162)

2025-05-01 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,23 @@ +// expected-no-diagnostics + +// RUN: %clang_cc1 -triple x86_64-unknown-linux-gnu -aux-triple amdgcn-amd-amdhsa -fsyntax-only -verify -xhip %s +// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fsyntax-only -fcuda-is-device -verify -xhip %s + +#include "Inputs/cuda

[clang] [CUDA][HIP] Add a __device__ version of std::__glibcxx_assert_fail() (PR #136133)

2025-04-30 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B commented: LGTM in principle. Now the question is -- how do we test it? There are multiple libstdc++ library versions in the wild and we must not break any of them. We do have some testing on CUDA test bots (which I've just discovered to be silently broken for a whil

[clang] [CUDA][HIP] Add a __device__ version of std::__glibcxx_assert_fail() (PR #136133)

2025-04-30 Thread Artem Belevich via cfe-commits
@@ -0,0 +1,35 @@ +// libstdc++ uses the non-constexpr function std::__glibcxx_assert_fail() +// to trigger compilation errors when the __glibcxx_assert(cond) macro +// is used in a constexpr context. +// Compilation fails when using code from the libstdc++ (such as std::array) on

[clang] [CUDA][HIP] capture possible ODR-used var (PR #136645)

2025-04-22 Thread Artem Belevich via cfe-commits
https://github.com/Artem-B approved this pull request. LGTM w/ a nit. https://github.com/llvm/llvm-project/pull/136645
