[clang] [llvm] [Offload][CUDA] Add initial cuda_runtime.h overlay (PR #94821)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,30 @@ +// RUN: %clang++ -foffload-via-llvm --offload-arch=native %s -o %t +// RUN: %t | %fcheck-generic + +// UNSUPPORTED: aarch64-unknown-linux-gnu +// UNSUPPORTED: aarch64-unknown-linux-gnu-LTO +// UNSUPPORTED: x86_64-pc-linux-gnu +// UNSUPPORTED:

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,9 @@ +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -fclang-abi-compat=latest -triple amdgcn %s -emit-llvm -o - | FileCheck %s arsenm wrote: Why do you need -fclang-abi-compat=latest https://github.com/llvm/llvm-project/pull/94830

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,21 @@ +//===-- AMDGPUTypes.def - Metadata about AMDGPU types ---*- C++ -*-===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/94830 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -2200,6 +2206,9 @@ TypeInfo ASTContext::getTypeInfoImpl(const Type *T) const { Align = 8; \ break; #include "clang/Basic/WebAssemblyReferenceTypes.def" +case BuiltinType::AMDGPUBufferRsrc: +

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Need stacked PR that adds the make_buffer_rsrc builtin that shows its use https://github.com/llvm/llvm-project/pull/94830 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[clang] [Clang][AMDGPU] Add a new builtin type for buffer rsrc (PR #94830)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -1091,6 +1091,9 @@ enum PredefinedTypeIDs { // \brief WebAssembly reference types with auto numeration #define WASM_TYPE(Name, Id, SingletonId) PREDEF_TYPE_##Id##_ID, #include "clang/Basic/WebAssemblyReferenceTypes.def" +// \breif AMDGPU types with auto numeration

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -16055,6 +16145,90 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this intrinsic. Note that these are the semantics specified in the draft of IEEE 754-2019. +.. _i_minimumnum: + +'``llvm.minimumnum.*``' Intrinsic +^ +

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -16055,6 +16145,90 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this intrinsic. Note that these are the semantics specified in the draft of IEEE 754-2019. +.. _i_minimumnum: + +'``llvm.minimumnum.*``' Intrinsic +^ +

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -15874,6 +15874,96 @@ The returned value is completely identical to the input except for the sign bit; in particular, if the input is a NaN, then the quiet/signaling bit and payload are perfectly preserved. +.. _i_fminmax_family: + +'``llvm.min.*``' Intrinsics Comparation

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -15874,6 +15874,96 @@ The returned value is completely identical to the input except for the sign bit; in particular, if the input is a NaN, then the quiet/signaling bit and payload are perfectly preserved. +.. _i_fminmax_family: + +'``llvm.min.*``' Intrinsics Comparation

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -16055,6 +16145,90 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this intrinsic. Note that these are the semantics specified in the draft of IEEE 754-2019. +.. _i_minimumnum: + +'``llvm.minimumnum.*``' Intrinsic +^ +

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > "aggregates" here might even be unusual cases like `<4 x i8>` Vectors aren't aggregates and are more reasonable https://github.com/llvm/llvm-project/pull/94576 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > `voffset` and `soffset` are "offset that goes in VGPRs" and "offset that goes > in SGPRs", with the latter having some different bounds-checking semantics on > ... at least some of the gfx9's, IIRC. > Right, that's the problem. We need to know the parameters of the SRD in

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > 2. What I mean is that "types that work" isn't the right framing: any type > can be legalized to one or more types that work. That is, down in the isel > legalizer, if I call for, for example >```llvm >%0 = call {i64, i64, i8} @llvm.amdgcn.raw.buffer.ptr.load(ptr

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > 1. For the swizzled case, that's `struct.ptr.buffer.*`, and yeah, those will > always need builtins because LLVM can't deal in 2D addressing schemes But the raw buffer intrinsics have both the soffset and voffset parameters though? Not just the struct

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-07 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Actually, even ignoring address space 7, it feels like these builtins if you > could `raw.ptr.buffer.store` any type you liked, and then they could be > type-varying in Clang? We could either have a builtin for all the types that would work, or if we want to treat them more

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -68,6 +68,10 @@ enum class fltNonfiniteBehavior { // `fltNanEncoding` enum. We treat all NaNs as quiet, as the available // encodings do not distinguish between signalling and quiet NaN. NanOnly, + + // This behavior is present in Float6E3M2FN and Float6E2M3FN types.

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -878,6 +896,10 @@ void IEEEFloat::copySignificand(const IEEEFloat ) { for the significand. If double or longer, this is a signalling NaN, which may not be ideal. If float, this is QNaN(0). */ void IEEEFloat::makeNaN(bool SNaN, bool Negative, const APInt *fill) { +

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/94751 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang] Add timeout for GPU detection utilities (PR #94751)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -205,7 +205,7 @@ class ToolChain { /// Executes the given \p Executable and returns the stdout. llvm::Expected> - executeToolChainProgram(StringRef Executable) const; + executeToolChainProgram(StringRef Executable, unsigned Timeout = 0) const; arsenm

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -1881,6 +1890,20 @@ TEST(APFloatTest, getSmallest) { EXPECT_TRUE(test.isFiniteNonZero()); EXPECT_TRUE(test.isDenormal()); EXPECT_TRUE(test.bitwiseIsEqual(expected)); + + test = APFloat::getSmallest(APFloat::Float6E3M2FN(), false); + expected =

[clang] [llvm] [APFloat] Add APFloat support for FP6 data types (PR #94735)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -47,6 +47,10 @@ static std::string convertToString(double d, unsigned Prec, unsigned Pad, return std::string(Buffer.data(), Buffer.size()); } +static bool hasNanOrInf(APFloat::Semantics S) { + return (S != APFloat::S_Float6E3M2FN) && (S != APFloat::S_Float6E2M3FN); +}

[clang] [llvm] [clang][CodeGen] `used` globals are fake (PR #93601)

2024-06-07 Thread Matt Arsenault via cfe-commits
@@ -8642,8 +8642,11 @@ The '``llvm.used``' Global Variable The ``@llvm.used`` global is an array which has :ref:`appending linkage `. This array contains a list of pointers to named global variables, functions and aliases which may optionally -have a pointer cast formed of

[clang] [llvm] [clang][CodeGen] Global constructors/destructors are globals (PR #93914)

2024-06-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Is this redundant with #93601? https://github.com/llvm/llvm-project/pull/93914 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-06-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: Commit message also needs to be updated https://github.com/llvm/llvm-project/pull/93601 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -2922,18 +2922,19 @@ static void emitUsed(CodeGenModule , StringRef Name, if (List.empty()) return; + llvm::Type *UsedPtrTy = llvm::PointerType::getUnqual(CGM.getLLVMContext()); arsenm wrote: Best to just use get(Ctx, 0)

[clang] [llvm] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -2922,18 +2922,19 @@ static void emitUsed(CodeGenModule , StringRef Name, if (List.empty()) return; + llvm::Type *UsedPtrTy = llvm::PointerType::getUnqual(CGM.getLLVMContext()); + // Convert List to what ConstantArray needs. SmallVector UsedArray;

[clang] [llvm] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-06-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. lgtm with nit https://github.com/llvm/llvm-project/pull/93601 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-06-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/93601 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: > If we do want addrspace(7), we'll need to expose `make.buffer.rsrc` and give > it a `p7` variant probably. Yes. We probably should expose some kind of custom type instead of directly using a C address_space(7) attribute https://github.com/llvm/llvm-project/pull/94576

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,19 @@ +; RUN: not --crash llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx900 -verify-machineinstrs -o - %s 2>&1 | FileCheck %s arsenm wrote: This should also be repeated for all 3 intrinsics https://github.com/llvm/llvm-project/pull/89217

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm requested changes to this pull request. @jayfoad's testcase fails and the same test should be repeated for all 3 intrinsics https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,19 @@ +; RUN: not --crash llc -stop-after=amdgpu-isel -mtriple=amdgcn-- -mcpu=gfx900 -verify-machineinstrs -o - %s 2>&1 | FileCheck %s arsenm wrote: This is not an IR verifier test, it is a codegen test that fails the machine verifier. A machine

[clang] [Clang][AMDGPU] Use `I` to decorate imm argument for `__builtin_amdgcn_global_load_lds` (PR #94376)

2024-06-06 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/94376 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,264 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm -o - %s | FileCheck %s --check-prefixes=VERDE +// RUN:

[clang] [amdgpu] Pass variadic arguments without splitting (PR #94083)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,293 @@ +// REQUIRES: amdgpu-registered-target +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature +// RUN: %clang_cc1 -cc1 -std=c23 -triple amdgcn-amd-amdhsa -emit-llvm -O1 %s -o - | FileCheck %s + +void

[clang] [amdgpu] Pass variadic arguments without splitting (PR #94083)

2024-06-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: > @arsenm You're right about passing larger things indirectly. I'm intending to > land this as-is, with the types inlined, as that unblocks #93362. I'm nervous > that the extra pointer indirection will hit the same memory error that > tweaking codegen in that patch hits (it's a

[clang] [libc] [llvm] [AMDGPU] Implement variadic functions by IR lowering (PR #93362)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1037 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [libc] [llvm] [AMDGPU] Implement variadic functions by IR lowering (PR #93362)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1037 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-06 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,264 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: amdgpu-registered-target +// RUN: %clang_cc1 -triple amdgcn-unknown-unknown -target-cpu verde -emit-llvm -o - %s | FileCheck %s --check-prefixes=VERDE +// RUN:

[clang] [Clang][AMDGPU] Add builtins for instrinsic `llvm.amdgcn.raw.buffer.store` (PR #94576)

2024-06-06 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Is there really a good use case for this? Can you use regular stores to > addrspace(7) instead? @krzysz00 I see these regularly used via inline asm in various ML code. We need to expose these in some way to stop people from doing that > > Also, do you really need a separate

[clang] [Clang][AMDGPU] Use `I` to decorate imm argument for `__builtin_amdgcn_global_load_lds` (PR #94376)

2024-06-04 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Missing non-constant tests for each parameter? https://github.com/llvm/llvm-project/pull/94376 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [amdgpu] Pass variadic arguments without splitting (PR #94083)

2024-06-01 Thread Matt Arsenault via cfe-commits
@@ -197,12 +202,20 @@ ABIArgInfo AMDGPUABIInfo::classifyKernelArgumentType(QualType Ty) const { return ABIArgInfo::getDirect(LTy, 0, nullptr, false); } -ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty, +ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty,

[clang] [amdgpu] Pass variadic arguments without splitting (PR #94083)

2024-06-01 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,293 @@ +// REQUIRES: amdgpu-registered-target +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature +// RUN: %clang_cc1 -cc1 -std=c23 -triple amdgcn-amd-amdhsa -emit-llvm -O1 %s -o - | FileCheck %s + +void

[clang] [amdgpu] Pass variadic arguments without splitting (PR #94083)

2024-06-01 Thread Matt Arsenault via cfe-commits
@@ -197,12 +202,20 @@ ABIArgInfo AMDGPUABIInfo::classifyKernelArgumentType(QualType Ty) const { return ABIArgInfo::getDirect(LTy, 0, nullptr, false); } -ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty, +ABIArgInfo AMDGPUABIInfo::classifyArgumentType(QualType Ty,

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Matt Arsenault via cfe-commits
@@ -32,27 +32,29 @@ class StoreInst; /// These are the kinds of recurrences that we support. enum class RecurKind { - None, ///< Not a recurrence. - Add, ///< Sum of integers. - Mul, ///< Product of integers. - Or, ///< Bitwise or logical OR of

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: You should add the mentioned convergence-tokens.ll test function https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [clang][CodeGen] Global constructors/destructors are globals (PR #93914)

2024-05-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Perhaps an alternative is to tweak LangRef wording to say that that these are > always emitted as unqualified ptrs, and that their ephemeral nature implies > that their AS is meaningless? I think this is the correct way to handle it. Also we'll need a few stripPointerCasts

[clang] [llvm] [AMDGPU] Implement variadic functions by IR lowering (PR #93362)

2024-05-31 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1023 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [AMDGPU] Implement variadic functions by IR lowering (PR #93362)

2024-05-31 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,1023 @@ +//===-- ExpandVariadicsPass.cpp *- C++ -*-=// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [clang][CodeGen] Global constructors/destructors are globals (PR #93914)

2024-05-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: > The third argument here is like for llvm.used, it's a way to associate the > entry with a global or function. If the corresponding global or function is > omitted from the output then the entry will be removed. It isn't used for > anything at run time. So I think there should

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-31 Thread Matt Arsenault via cfe-commits
@@ -2928,12 +2928,13 @@ static void emitUsed(CodeGenModule , StringRef Name, for (unsigned i = 0, e = List.size(); i != e; ++i) { UsedArray[i] = llvm::ConstantExpr::getPointerBitCastOrAddrSpaceCast( -cast(&*List[i]), CGM.Int8PtrTy);

[clang] [llvm] [WIP] Expand variadic functions in IR (PR #89007)

2024-05-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: > I think the comments here are fed into #93362 successfully, will go through > the list again to check. So #93362 is the replacement, and not the sequential next piece? Can we close this one then? https://github.com/llvm/llvm-project/pull/89007

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Matt Arsenault via cfe-commits
@@ -5005,8 +5007,11 @@ void computeKnownFPClass(const Value *V, const APInt , // If either operand is not NaN, the result is not NaN. if (NeverNaN && (IID == Intrinsic::minnum || IID == Intrinsic::maxnum)) Known.knownNot(fcNan); + if (NeverNaN && +

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Matt Arsenault via cfe-commits
@@ -16049,6 +16094,84 @@ of the two arguments. -0.0 is considered to be less than +0.0 for this intrinsic. Note that these are the semantics specified in the draft of IEEE 754-2019. +.. _i_minimumnum: + +'``llvm.minimumnum.*``' Intrinsic +^ +

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/93841 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Matt Arsenault via cfe-commits
@@ -3636,6 +3648,22 @@ def Fmin : FPMathTemplate, LibBuiltin<"math.h"> { let OnlyBuiltinPrefixedAliasIsConstexpr = 1; } +def FmaximumNum : FPMathTemplate, LibBuiltin<"math.h"> { arsenm wrote: I'd prefer to split the clang changes into a separate change

[clang] [llvm] Intrinsic: introduce minimumnum and maximumnum (PR #93841)

2024-05-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: > 3. PowerPC: has some interaction with the behavior of `minnum/maxnum`: need > define `fcanonicalize`. AMDGPU has the same handling. This is to break the signaling nan handling from IEEE to the broken old glibc libm behavior. If we fix the definition to

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-31 Thread Matt Arsenault via cfe-commits
arsenm wrote: > Does this need IR autoupgrade? This type of auto upgrade is free, it just happens https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-31 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Matt Arsenault via cfe-commits
@@ -5461,8 +5461,7 @@ bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper , SmallVector PartialRes; unsigned NumParts = Size / 32; - MachineInstrBuilder Src0Parts, Src2Parts; - Src0Parts = B.buildUnmerge(PartialResTy, Src0); + MachineInstrBuilder Src0Parts =

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Matt Arsenault via cfe-commits
@@ -1208,7 +1225,7 @@ The AMDGPU backend implements the following LLVM IR intrinsics. the output. llvm.amdgcn.sdot2Provides direct access to v_dot2_i32_i16 across targets which -

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Matt Arsenault via cfe-commits
@@ -5387,6 +5387,98 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper , return true; } +// TODO: Fix pointer type handling +bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper , + MachineInstr , +

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Matt Arsenault via cfe-commits
@@ -1170,6 +1170,23 @@ The AMDGPU backend implements the following LLVM IR intrinsics. :ref:`llvm.set.fpenv` Sets the floating point environment to the specifies state. + llvm.amdgcn.readfirstlaneProvides direct access to

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-30 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: lgtm with a few nits https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-30 Thread Matt Arsenault via cfe-commits
@@ -2047,9 +2047,9 @@ void CodeGenModule::EmitCtorList(CtorList , const char *GlobalName) { llvm::Type *CtorPFTy = llvm::PointerType::get(CtorFTy, TheModule.getDataLayout().getProgramAddressSpace()); - // Get the type of a ctor entry, { i32, void ()*, i8* }.

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-30 Thread Matt Arsenault via cfe-commits
@@ -1,6 +1,8 @@ // RUN: %clang_cc1 %s -triple x86_64-apple-darwin -emit-llvm -o - | FileCheck %s +// RUN: %clang_cc1 %s -triple amdgcn-amd-amdhsa -emit-llvm -o - | FileCheck %s --check-prefix=GLOBALAS // CHECK: @llvm.used = appending global [2 x ptr] [ptr @foo, ptr @X],

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-30 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/93601 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-30 Thread Matt Arsenault via cfe-commits
@@ -2928,12 +2928,13 @@ static void emitUsed(CodeGenModule , StringRef Name, for (unsigned i = 0, e = List.size(); i != e; ++i) { UsedArray[i] = llvm::ConstantExpr::getPointerBitCastOrAddrSpaceCast( -cast(&*List[i]), CGM.Int8PtrTy);

[clang] [llvm] [Clang][Sanitizers] Add numerical sanitizer (PR #93783)

2024-05-30 Thread Matt Arsenault via cfe-commits
@@ -285,6 +285,9 @@ def SanitizeHWAddress : EnumAttr<"sanitize_hwaddress", [FnAttr]>; /// MemTagSanitizer is on. def SanitizeMemTag : EnumAttr<"sanitize_memtag", [FnAttr]>; +/// NumericalStabilitySanitizer is on. +def SanitizeNumericalStability :

[clang] [llvm] [AMDGPU][WIP] Extend permlane16, permlanex16 and permlane64 intrinsic lowering for generic types (PR #92725)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -18479,6 +18479,28 @@ Value *CodeGenFunction::EmitAMDGPUBuiltinExpr(unsigned BuiltinID, CGM.getIntrinsic(Intrinsic::amdgcn_update_dpp, Args[0]->getType()); return Builder.CreateCall(F, Args); } + case AMDGPU::BI__builtin_amdgcn_permlane16: + case

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -6086,6 +6086,63 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering , SDNode *N, DAG.getConstant(0, SL, MVT::i32), DAG.getCondCode(ISD::SETNE)); } +static SDValue lowerLaneOp(const SITargetLowering , SDNode *N, + SelectionDAG ) {

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -2928,12 +2928,13 @@ static void emitUsed(CodeGenModule , StringRef Name, for (unsigned i = 0, e = List.size(); i != e; ++i) { UsedArray[i] = llvm::ConstantExpr::getPointerBitCastOrAddrSpaceCast( -cast(&*List[i]), CGM.Int8PtrTy);

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -6086,6 +6086,63 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering , SDNode *N, DAG.getConstant(0, SL, MVT::i32), DAG.getCondCode(ISD::SETNE)); } +static SDValue lowerLaneOp(const SITargetLowering , SDNode *N, + SelectionDAG ) {

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -5387,6 +5387,124 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper , return true; } +// TODO: Fix pointer type handling +bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper , + MachineInstr , +

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -1170,6 +1170,23 @@ The AMDGPU backend implements the following LLVM IR intrinsics. :ref:`llvm.set.fpenv` Sets the floating point environment to the specifies state. + llvm.amdgcn.readfirstlaneProvides direct access to

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -6086,6 +6086,63 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering , SDNode *N, DAG.getConstant(0, SL, MVT::i32), DAG.getCondCode(ISD::SETNE)); } +static SDValue lowerLaneOp(const SITargetLowering , SDNode *N, + SelectionDAG ) {

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -5387,6 +5387,124 @@ bool AMDGPULegalizerInfo::legalizeDSAtomicFPIntrinsic(LegalizerHelper , return true; } +// TODO: Fix pointer type handling +bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper , + MachineInstr , +

[clang] [llvm] [AMDGPU] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -1170,6 +1170,23 @@ The AMDGPU backend implements the following LLVM IR intrinsics. :ref:`llvm.set.fpenv` Sets the floating point environment to the specifies state. + llvm.amdgcn.readfirstlaneProvides direct access to

[clang] [llvm] [AMDGPU] Implement variadic functions by IR lowering (PR #93362)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -103,19 +104,27 @@ void AMDGPUABIInfo::computeInfo(CGFunctionInfo ) const { if (!getCXXABI().classifyReturnType(FI)) FI.getReturnInfo() = classifyReturnType(FI.getReturnType()); + unsigned ArgumentIndex = 0; + const unsigned numFixedArguments =

[clang] [llvm] [AMDGPU] Implement variadic functions by IR lowering (PR #93362)

2024-05-29 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,180 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --function-signature + +// Simple calls to known variadic functions that are completely elided when +// optimisations are on This is a functional check that the

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -2047,9 +2047,9 @@ void CodeGenModule::EmitCtorList(CtorList , const char *GlobalName) { llvm::Type *CtorPFTy = llvm::PointerType::get(CtorFTy, TheModule.getDataLayout().getProgramAddressSpace()); - // Get the type of a ctor entry, { i32, void ()*, i8* }. + //

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -2928,12 +2928,13 @@ static void emitUsed(CodeGenModule , StringRef Name, for (unsigned i = 0, e = List.size(); i != e; ++i) { UsedArray[i] = llvm::ConstantExpr::getPointerBitCastOrAddrSpaceCast( -cast(&*List[i]), CGM.Int8PtrTy);

[clang] [clang][CodeGen] `used` globals && the payloads for global ctors & dtors are globals (PR #93601)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -2047,9 +2047,9 @@ void CodeGenModule::EmitCtorList(CtorList , const char *GlobalName) { llvm::Type *CtorPFTy = llvm::PointerType::get(CtorFTy, TheModule.getDataLayout().getProgramAddressSpace()); - // Get the type of a ctor entry, { i32, void ()*, i8* }.

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -368,7 +368,8 @@ CodeGenModule::CodeGenModule(ASTContext , IntTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getIntWidth()); IntPtrTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getMaxPointerWidth()); - Int8PtrTy =

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/88182 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/88182 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -368,7 +368,8 @@ CodeGenModule::CodeGenModule(ASTContext , IntTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getIntWidth()); IntPtrTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getMaxPointerWidth()); - Int8PtrTy =

[clang] [llvm] [mlir] [Clang][CodeGen] Start migrating away from assuming the Default AS is 0 (PR #88182)

2024-05-28 Thread Matt Arsenault via cfe-commits
@@ -368,7 +368,8 @@ CodeGenModule::CodeGenModule(ASTContext , IntTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getIntWidth()); IntPtrTy = llvm::IntegerType::get(LLVMContext, C.getTargetInfo().getMaxPointerWidth()); - Int8PtrTy =

[clang] [llvm] [AMDGPU][Clang] Add check of size for __builtin_amdgcn_global_load_lds (PR #93064)

2024-05-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm approved this pull request. https://github.com/llvm/llvm-project/pull/93064 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Should lose the [WIP] in the title https://github.com/llvm/llvm-project/pull/89217 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [llvm] [AMDGPU][Clang] Add check of size for __builtin_amdgcn_global_load_lds (PR #93064)

2024-05-23 Thread Matt Arsenault via cfe-commits
@@ -12385,4 +12385,8 @@ def err_acc_reduction_composite_type def err_acc_reduction_composite_member_type :Error< "OpenACC 'reduction' composite variable must not have non-scalar field">; def note_acc_reduction_composite_member_loc : Note<"invalid field is here">; + +//

[clang] [llvm] [AMDGPU][Clang] Add check of size for __builtin_amdgcn_global_load_lds (PR #93064)

2024-05-23 Thread Matt Arsenault via cfe-commits
@@ -0,0 +1,13 @@ +// RUN: %clang_cc1 -cl-std=CL2.0 -O0 -triple amdgcn-unknown-unknown -target-cpu gfx940 -S -verify -o - %s +// REQUIRES: amdgpu-registered-target + +typedef unsigned int u32; + +void test_global_load_lds_unsupported_size(global u32* src, local u32 *dst, u32

[clang] [clang] Introduce target-specific `Sema` components (PR #93179)

2024-05-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm edited https://github.com/llvm/llvm-project/pull/93179 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [clang] Introduce target-specific `Sema` components (PR #93179)

2024-05-23 Thread Matt Arsenault via cfe-commits
https://github.com/arsenm commented: Should update the GitHub autolabeler paths for the targets https://github.com/llvm/llvm-project/pull/93179 ___ cfe-commits mailing list cfe-commits@lists.llvm.org

[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Matt Arsenault via cfe-commits
@@ -6086,6 +6086,62 @@ static SDValue lowerBALLOTIntrinsic(const SITargetLowering , SDNode *N, DAG.getConstant(0, SL, MVT::i32), DAG.getCondCode(ISD::SETNE)); } +static SDValue lowerLaneOp(const SITargetLowering , SDNode *N, + SelectionDAG ) {

[clang] [llvm] [AMDGPU][WIP] Extend readlane, writelane and readfirstlane intrinsic lowering for generic types (PR #89217)

2024-05-23 Thread Matt Arsenault via cfe-commits
@@ -5456,43 +5444,32 @@ bool AMDGPULegalizerInfo::legalizeLaneOp(LegalizerHelper , if ((Size % 32) == 0) { SmallVector PartialRes; unsigned NumParts = Size / 32; -auto IsS16Vec = Ty.isVector() && Ty.getElementType() == S16; +bool IsS16Vec = Ty.isVector() &&

  1   2   3   4   5   6   7   8   9   10   >