[clang] [llvm] [clang-tools-extra] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2024-02-06 Thread David Sherwood via cfe-commits
david-arm wrote: > Hi! I wonder whether you have conducted any tests to determine the potential > performance increase of this pass in the SPEC2017 557xz benchmark? I > attempted to apply it to the xz benchmark, but only one copy (--copies=1) > demonstrated a significant increase (about 3%), but
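
For context on the question above, here is a minimal sketch of the kind of loop such a pass is concerned with. It assumes (the thread snippet does not show this) that the idiom of interest is a byte-by-byte comparison that stops at the first difference. The function name first_mismatch and its signature are hypothetical and chosen only for illustration.

```c
#include <stdint.h>

/* Hypothetical helper, not taken from the patch: returns the index of the
   first position in [start, limit) where a and b differ, or limit if the
   two buffers agree over the whole range. */
uint32_t first_mismatch(const uint8_t *a, const uint8_t *b,
                        uint32_t start, uint32_t limit) {
    uint32_t i = start;
    /* Scalar form: one byte load from each buffer per iteration, with an
       early exit on the first difference. */
    while (i < limit && a[i] == b[i])
        ++i;
    return i;
}
```

A loop of this shape has a data-dependent early exit, which is typically what makes it awkward for a generic vectoriser and a candidate for a dedicated transformation.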

[clang] [LTO] Fix Veclib flags correctly pass to LTO flags (PR #78749)

2024-01-24 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM as well! https://github.com/llvm/llvm-project/pull/78749 ___ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

[clang] [LTO] Fix Veclib flags correctly pass to LTO flags (PR #78749)

2024-01-22 Thread David Sherwood via cfe-commits
@@ -783,6 +783,28 @@ void tools::addLTOOptions(const ToolChain , const ArgList , "-generate-arange-section")); } + // Pass vector library arguments to LTO. + Arg *ArgVecLib = Args.getLastArg(options::OPT_fveclib); + if (ArgVecLib

[clang] [LTO] Fix Veclib flags correctly pass to LTO flags (PR #78749)

2024-01-22 Thread David Sherwood via cfe-commits
@@ -783,6 +783,28 @@ void tools::addLTOOptions(const ToolChain , const ArgList , "-generate-arange-section")); } + // Pass vector library arguments to LTO. + Arg *ArgVecLib = Args.getLastArg(options::OPT_fveclib); + if (ArgVecLib

[clang] [LTO] Fix Veclib flags correctly pass to LTO flags (PR #78749)

2024-01-22 Thread David Sherwood via cfe-commits
@@ -31,3 +31,31 @@ // RUN: %clang -fveclib=Accelerate %s -nodefaultlibs -target arm64-apple-ios8.0.0 -### 2>&1 | FileCheck --check-prefix=CHECK-LINK-NODEFAULTLIBS %s // CHECK-LINK-NODEFAULTLIBS-NOT: "-framework" "Accelerate" + + +/* Verify that the correct vector library is

[llvm] [clang-tools-extra] [clang] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)

2024-01-18 Thread David Sherwood via cfe-commits
@@ -2076,16 +2081,61 @@ class GeneratedRTChecks { LLVM_DEBUG(dbgs() << " " << C << " for " << I << "\n"); RTCheckCost += C; } -if (MemCheckBlock) +if (MemCheckBlock) { + InstructionCost MemCheckCost = 0; for (Instruction :
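
To illustrate why an outer loop matters for the cost of a runtime memory check, here is a hand-written sketch of loop versioning. It is not the vectoriser's implementation; the names ranges_disjoint, add_row_noalias and add_rows are hypothetical. The assumption is that a single overlap check covering every access of the inner loop can be evaluated once, outside the outer loop, so its cost is not paid on each outer iteration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero when the byte ranges [a, a + a_bytes) and
   [b, b + b_bytes) do not overlap. */
static int ranges_disjoint(const void *a, size_t a_bytes,
                           const void *b, size_t b_bytes) {
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    return pa + a_bytes <= pb || pb + b_bytes <= pa;
}

/* Fast path: restrict promises the compiler that row and src never alias,
   so the loop can be vectorised without further checks. */
static void add_row_noalias(int *restrict row, const int *restrict src,
                            size_t cols) {
    for (size_t c = 0; c < cols; ++c)
        row[c] += src[c];
}

void add_rows(int *dst, const int *src, size_t rows, size_t cols) {
    /* One check for the whole loop nest, evaluated before the outer loop
       instead of once per outer iteration. */
    int disjoint = ranges_disjoint(dst, rows * cols * sizeof *dst,
                                   src, cols * sizeof *src);
    for (size_t r = 0; r < rows; ++r) {
        int *row = dst + r * cols;
        if (disjoint) {
            add_row_noalias(row, src, cols);
        } else {
            /* Slow path: keeps the original element-by-element semantics
               when the buffers may overlap. */
            for (size_t c = 0; c < cols; ++c)
                row[c] += src[c];
        }
    }
}
```

When the inner trip count is small, a check repeated on every outer iteration can outweigh the benefit of the vector loop, whereas a check performed once is amortised over all the rows.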

[clang-tools-extra] [llvm] [clang] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)

2024-01-18 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/76034 From a4caa47dc8d2db75f6bb2ac3f880da4e1f6bea82 Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Tue, 19 Dec 2023 16:07:33 + Subject: [PATCH 1/6] Add tests showing runtime checks cost with low trip

[llvm] [clang-tools-extra] [clang] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)

2024-01-15 Thread David Sherwood via cfe-commits
david-arm wrote: Gentle ping! https://github.com/llvm/llvm-project/pull/76034

[llvm] [clang] [AArch64][SME] Fix multi vector cvt builtins (PR #77656)

2024-01-11 Thread David Sherwood via cfe-commits
@@ -34,118 +34,118 @@ define @multi_vector_cvt_x2_bf16( %unu ; ; FCVTZS ; -define {, } @multi_vector_cvt_x2_f32_s32( %unused, %zn0, %zn1) { -; CHECK-LABEL: multi_vector_cvt_x2_f32_s32: +define {, } @multi_vector_cvt_x2_s32_f32( %unused, %zn0, %zn1) { +;

[clang] [clang-tools-extra] [llvm] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2024-01-09 Thread David Sherwood via cfe-commits
david-arm wrote: @dyung - fix pending here https://github.com/llvm/llvm-project/pull/77467 https://github.com/llvm/llvm-project/pull/72273

[clang-tools-extra] [llvm] [clang] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2024-01-09 Thread David Sherwood via cfe-commits
david-arm wrote: @dyung - fix pending here https://github.com/llvm/llvm-project/pull/77467 https://github.com/llvm/llvm-project/pull/72273

[clang] [clang-tools-extra] [llvm] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2024-01-09 Thread David Sherwood via cfe-commits
david-arm wrote: Hi @dyung, sorry about this! It passed for me locally. It sounds like it needs a "REQUIRES: aarch64-registered-target" line somewhere then. I'll try to fix it asap. https://github.com/llvm/llvm-project/pull/72273

[llvm] [clang] [clang-tools-extra] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2024-01-09 Thread David Sherwood via cfe-commits
https://github.com/david-arm closed https://github.com/llvm/llvm-project/pull/72273

[clang] [llvm] [clang-tools-extra] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)

2024-01-08 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/76034 From a4caa47dc8d2db75f6bb2ac3f880da4e1f6bea82 Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Tue, 19 Dec 2023 16:07:33 + Subject: [PATCH 1/6] Add tests showing runtime checks cost with low trip

[clang-tools-extra] [clang] [llvm] [LoopVectorize] Refine runtime memory check costs when there is an outer loop (PR #76034)

2024-01-08 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/76034 From a4caa47dc8d2db75f6bb2ac3f880da4e1f6bea82 Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Tue, 19 Dec 2023 16:07:33 + Subject: [PATCH 1/2] Add tests showing runtime checks cost with low trip

[llvm] [clang] [clang-tools-extra] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-22 Thread David Sherwood via cfe-commits
@@ -0,0 +1,816 @@ +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang-tools-extra] [llvm] [clang] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-22 Thread David Sherwood via cfe-commits
@@ -0,0 +1,816 @@ +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang-tools-extra] [llvm] [clang] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-22 Thread David Sherwood via cfe-commits
@@ -0,0 +1,816 @@ +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [clang-tools-extra] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-22 Thread David Sherwood via cfe-commits
@@ -0,0 +1,816 @@ +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [clang-tools-extra] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-22 Thread David Sherwood via cfe-commits
@@ -0,0 +1,816 @@ +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [llvm] [clang-tools-extra] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-22 Thread David Sherwood via cfe-commits
@@ -0,0 +1,816 @@ +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[llvm] [clang-tools-extra] [clang] [Clang][SME2] Enable multi-vector loads & stores for SME2 (PR #75821)

2023-12-21 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM! A lovely patch. :) https://github.com/llvm/llvm-project/pull/75821

[clang] [Clang][SME2] Add builtins for multi-vector fp round to integral value (PR #75941)

2023-12-21 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM. Absolute perfection! https://github.com/llvm/llvm-project/pull/75941

[llvm] [clang-tools-extra] [LoopVectorize] Enable hoisting of runtime checks by default (PR #71538)

2023-12-18 Thread David Sherwood via cfe-commits
https://github.com/david-arm closed https://github.com/llvm/llvm-project/pull/71538

[clang] [llvm] [Clang][SME2] Add builtins for moving multi-vectors to/from ZA (PR #71191)

2023-12-14 Thread David Sherwood via cfe-commits
@@ -299,6 +299,44 @@ multiclass ZAAddSub { defm SVADD : ZAAddSub<"add">; defm SVSUB : ZAAddSub<"sub">; +// SME2 - MOVA + +// +// Single, 2 and 4 vector-group read/write intrinsics. +// + +multiclass ZAWrite_VG checks> { + def NAME # _VG2_H : Inst<"svwrite_hor_" # n # "_vg2",

[clang] [llvm] [Clang][SME2] Add builtins for moving multi-vectors to/from ZA (PR #71191)

2023-12-14 Thread David Sherwood via cfe-commits
https://github.com/david-arm edited https://github.com/llvm/llvm-project/pull/71191

[llvm] [clang] [Clang][SME2] Add builtins for moving multi-vectors to/from ZA (PR #71191)

2023-12-14 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM! I had one minor comment, but I won't hold up the patch for it. https://github.com/llvm/llvm-project/pull/71191

[clang-tools-extra] [clang] [llvm] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-14 Thread David Sherwood via cfe-commits
@@ -0,0 +1,726 @@ + +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[llvm] [clang] [clang-tools-extra] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-13 Thread David Sherwood via cfe-commits
@@ -0,0 +1,726 @@ + +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang-tools-extra] [clang] [llvm] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-13 Thread David Sherwood via cfe-commits
@@ -0,0 +1,839 @@ +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang-tools-extra] [clang] [llvm] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-13 Thread David Sherwood via cfe-commits
@@ -0,0 +1,839 @@ +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang] [Clang][SME2] Add multi-vector zip & unzip builtins (PR #74841)

2023-12-12 Thread David Sherwood via cfe-commits
david-arm wrote: For builtins that operate purely on SVE vectors I think we've used the convention of adding _vector_ to the test name, i.e. see acle_sme2_vector_rshl.c, etc. Should we do the same here? https://github.com/llvm/llvm-project/pull/74841

[llvm] [clang-tools-extra] [LoopVectorize] Enable hoisting of runtime checks by default (PR #71538)

2023-12-12 Thread David Sherwood via cfe-commits
david-arm wrote: Gentle ping! https://github.com/llvm/llvm-project/pull/73515 has now landed so I think this patch should be ready to go. https://github.com/llvm/llvm-project/pull/71538

[clang] [clang-tools-extra] [llvm] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-12 Thread David Sherwood via cfe-commits
@@ -0,0 +1,726 @@ + +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang-tools-extra] [llvm] [clang] [AArch64] Add an AArch64 pass for loop idiom transformations (PR #72273)

2023-12-12 Thread David Sherwood via cfe-commits
@@ -0,0 +1,726 @@ + +//===- AArch64LoopIdiomTransform.cpp - Loop idiom recognition -===// +// +// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions. +// See https://llvm.org/LICENSE.txt for license information. +// SPDX-License-Identifier:

[clang-tools-extra] [llvm] [LoopVectorize] Enable hoisting of runtime checks by default (PR #71538)

2023-12-12 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/71538 From 8a2af20a52fd851eaff1cfa7d50df8b994d0db0d Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Tue, 7 Nov 2023 13:57:17 + Subject: [PATCH 1/2] [LoopVectorize] Enable hoisting of runtime checks by

[clang-tools-extra] [clang] [llvm] [LoopVectorize] Improve algorithm for hoisting runtime checks (PR #73515)

2023-12-12 Thread David Sherwood via cfe-commits
https://github.com/david-arm closed https://github.com/llvm/llvm-project/pull/73515

[clang-tools-extra] [clang] [llvm] [LoopVectorize] Improve algorithm for hoisting runtime checks (PR #73515)

2023-12-11 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/73515 From 30251642f8c208c63f3f3097c337ef0d5bc633b5 Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Mon, 27 Nov 2023 13:43:26 + Subject: [PATCH 1/5] [LoopVectorize] Improve algorithm for hoisting runtime

[clang-tools-extra] [clang] [llvm] [LoopVectorize] Improve algorithm for hoisting runtime checks (PR #73515)

2023-12-08 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/73515 From 30251642f8c208c63f3f3097c337ef0d5bc633b5 Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Mon, 27 Nov 2023 13:43:26 + Subject: [PATCH 1/4] [LoopVectorize] Improve algorithm for hoisting runtime

[llvm] [clang] [clang-tools-extra] [LoopVectorize] Improve algorithm for hoisting runtime checks (PR #73515)

2023-12-08 Thread David Sherwood via cfe-commits
@@ -346,7 +346,9 @@ void RuntimePointerChecking::tryToCreateDiffCheck( auto *SinkStartAR = cast(SinkStartInt); const Loop *StartARLoop = SrcStartAR->getLoop(); if (StartARLoop == SinkStartAR->getLoop() && -StartARLoop == InnerLoop->getParentLoop()) { +

[clang] [llvm] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-08 Thread David Sherwood via cfe-commits
https://github.com/david-arm closed https://github.com/llvm/llvm-project/pull/73326

[llvm] [clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-07 Thread David Sherwood via cfe-commits
@@ -196,6 +196,9 @@ C Language Changes number of elements in the flexible array member. This information can improve the results of the array bound sanitizer and the ``__builtin_dynamic_object_size`` builtin. +- Enums will now be represented in TBAA metadata using their

[clang] [llvm] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-07 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/73326 From af76f6b6b3469fd0f5f24427c5a175c8d9d7c83a Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Fri, 24 Nov 2023 13:20:23 + Subject: [PATCH 1/5] [Clang] Emit TBAA info for enums in C When emitting TBAA

[llvm] [clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-07 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/73326 From af76f6b6b3469fd0f5f24427c5a175c8d9d7c83a Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Fri, 24 Nov 2023 13:20:23 + Subject: [PATCH 1/4] [Clang] Emit TBAA info for enums in C When emitting TBAA

[clang] [llvm] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-07 Thread David Sherwood via cfe-commits
david-arm wrote: > I thought the suggestion was to add a few lines to > https://github.com/llvm/llvm-project/blob/main/clang/docs/ReleaseNotes.rst Yes you're right! For some reason I got mixed up with the LangRef, but I guess adding something to the LangRef does no harm either. I'll put

[llvm] [clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-07 Thread David Sherwood via cfe-commits
david-arm wrote: > Do you think it's worth adding something to the Clang release note? Done. Hope the documentation I added makes sense! https://github.com/llvm/llvm-project/pull/73326

[llvm] [clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-07 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/73326 From af76f6b6b3469fd0f5f24427c5a175c8d9d7c83a Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Fri, 24 Nov 2023 13:20:23 + Subject: [PATCH 1/3] [Clang] Emit TBAA info for enums in C When emitting TBAA

[llvm] [clang] [clang-tools-extra] [LoopVectorize] Improve algorithm for hoisting runtime checks (PR #73515)

2023-12-07 Thread David Sherwood via cfe-commits
david-arm wrote: Gentle ping! https://github.com/llvm/llvm-project/pull/73515

[clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-05 Thread David Sherwood via cfe-commits
david-arm wrote: Hi @AaronBallman, yes the problem I found with always choosing `char` as the alias type is that LLVM will just assume that enum types alias with absolutely everything. This is a conservative approach that works fine, but it does prevent important type-based alias
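
The aliasing concern described above can be sketched with a small C function. This is an illustrative example under stated assumptions, not the test case from the pull request: the enum Color and the function classify are hypothetical. If accesses through an enum pointer are tagged like char, the store to out[i] may alias *bias, so the load of *bias has to stay inside the loop. If the enum instead carries a distinct, integer-like tag, C strict aliasing says a float load and an enum store cannot refer to the same object, and the load can be hoisted.

```c
#include <stddef.h>

typedef enum { RED, GREEN, BLUE } Color;

/* Hypothetical kernel: stores through an enum pointer while reading a
   float through another pointer on every iteration. Whether the read of
   *bias can be moved out of the loop depends on whether the compiler may
   assume the enum store does not modify it. */
void classify(Color *out, const float *in, const float *bias, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        float v = in[i] + *bias;
        out[i] = v < 0.0f ? RED : (v < 1.0f ? GREEN : BLUE);
    }
}
```

The function behaves the same either way; only the optimisation opportunities differ, which is why the conservative char choice is correct but can cost performance.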

[clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-12-04 Thread David Sherwood via cfe-commits
david-arm wrote: Gentle ping! https://github.com/llvm/llvm-project/pull/73326

[clang] [AArch64][SME2] Remove IsPreservesZA from ldr_zt builtin (PR #74303)

2023-12-04 Thread David Sherwood via cfe-commits
@@ -319,7 +319,7 @@ let TargetGuard = "sme2" in { // Spill and fill of ZT0 // let TargetGuard = "sme2" in { - def SVLDR_ZT : Inst<"svldr_zt", "viQ", "", MergeNone, "aarch64_sme_ldr_zt", [IsOverloadNone, IsStreamingCompatible, IsSharedZA, IsPreservesZA], [ImmCheck<0,

[clang] [llvm] [SME2] Add LUTI2 and LUTI4 quad Builtins and Intrinsics (PR #73317)

2023-12-01 Thread David Sherwood via cfe-commits
https://github.com/david-arm commented: This looks good to me, but I think it needs rebasing after https://github.com/llvm/llvm-project/pull/72849 landed. It also looks like @sdesmalen-arm left a comment about renaming ImmToTile - perhaps that could be done in this patch?

[llvm] [clang] [SME2] Add LUTI2 and LUTI4 quad Builtins and Intrinsics (PR #73317)

2023-12-01 Thread David Sherwood via cfe-commits
@@ -0,0 +1,280 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py + +// REQUIRES: aarch64-registered-target + +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -target-feature +sve -S -disable-O0-optnone -Werror -Wall

[llvm] [clang] [AArch64][SME2] Add ldr_zt, str_zt builtins and intrinsics (PR #72849)

2023-11-30 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM! C'est parfait! https://github.com/llvm/llvm-project/pull/72849

[clang] [llvm] [AArch64][SME2] Add ldr_zt, str_zt builtins and intrinsics (PR #72849)

2023-11-30 Thread David Sherwood via cfe-commits
david-arm wrote: It looks like a few other pull requests are changing the same code around ImmToTile. Might be good to land this smaller patch first so you can rebase the others and reduce the diffs! https://github.com/llvm/llvm-project/pull/72849

[clang] [llvm] [AArch64][SME2] Add ldr_zt, str_zt builtins and intrinsics (PR #72849)

2023-11-30 Thread David Sherwood via cfe-commits
@@ -2748,6 +2748,22 @@ AArch64TargetLowering::EmitFill(MachineInstr , MachineBasicBlock *BB) const { return BB; } +MachineBasicBlock *AArch64TargetLowering::EmitZTSpillFill(MachineInstr , + MachineBasicBlock *BB, +

[clang] [llvm] [AArch64][SME2] Add ldr_zt, str_zt builtins and intrinsics (PR #72849)

2023-11-30 Thread David Sherwood via cfe-commits
@@ -0,0 +1,51 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py + +// REQUIRES: aarch64-registered-target + +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S

[clang] [llvm] [AArch64][SME2] Add ldr_zt, str_zt builtins and intrinsics (PR #72849)

2023-11-30 Thread David Sherwood via cfe-commits
@@ -0,0 +1,51 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py + +// REQUIRES: aarch64-registered-target + +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall -emit-llvm -o - %s | opt -S

[clang] [llvm] [SME2] Add LUTI2 and LUTI4 quad Builtins and Intrinsics (PR #73317)

2023-11-30 Thread David Sherwood via cfe-commits
@@ -0,0 +1,280 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py + +// REQUIRES: aarch64-registered-target + +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -target-feature +sve -S -disable-O0-optnone -Werror -Wall

[llvm] [clang] [SME2] Add LUTI2 and LUTI4 quad Builtins and Intrinsics (PR #73317)

2023-11-30 Thread David Sherwood via cfe-commits
@@ -1859,6 +1867,34 @@ void AArch64DAGToDAGISel::SelectFrintFromVT(SDNode *N, unsigned NumVecs, SelectUnaryMultiIntrinsic(N, NumVecs, true, Opcode); } +template +void AArch64DAGToDAGISel::SelectMultiVectorLuti(SDNode *Node, +

[clang] [llvm] [SME2] Add LUTI2 and LUTI4 quad Builtins and Intrinsics (PR #73317)

2023-11-30 Thread David Sherwood via cfe-commits
@@ -1859,6 +1867,34 @@ void AArch64DAGToDAGISel::SelectFrintFromVT(SDNode *N, unsigned NumVecs, SelectUnaryMultiIntrinsic(N, NumVecs, true, Opcode); } +template david-arm wrote: Rather than create two almost identical copies of the function with a

[clang-tools-extra] [clang] [llvm] [LoopVectorize] Improve algorithm for hoisting runtime checks (PR #73515)

2023-11-30 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/73515 From 30251642f8c208c63f3f3097c337ef0d5bc633b5 Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Mon, 27 Nov 2023 13:43:26 + Subject: [PATCH 1/3] [LoopVectorize] Improve algorithm for hoisting runtime

[clang] [AArch64][SME2] Add multi-vector SEL (x2, x4) ACLE builtins & intrinsics (PR #73188)

2023-11-29 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM! Thanks for the changes. :) https://github.com/llvm/llvm-project/pull/73188

[clang] [AArch64][SME2] Add multi-vector SEL (x2, x4) ACLE builtins & intrinsics (PR #73188)

2023-11-28 Thread David Sherwood via cfe-commits
david-arm wrote: Should the file be renamed to acle_sme2_vector_selx2? This would make it consistent with the existing acle_sme2_vector_add.c file, which also has SVE-like instructions that only operate on SVE vectors.

[clang] [AArch64][SME2] Add multi-vector SEL (x2, x4) ACLE builtins & intrinsics (PR #73188)

2023-11-28 Thread David Sherwood via cfe-commits
david-arm wrote: Should the file be renamed to acle_sme2_vector_selx4? This would make it consistent with the existing acle_sme2_vector_add.c file, which also has SVE-like instructions that only operate on SVE vectors.

[clang] [AArch64][SME2] Add multi-vector SEL (x2, x4) ACLE builtins & intrinsics (PR #73188)

2023-11-28 Thread David Sherwood via cfe-commits
@@ -0,0 +1,384 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: aarch64-registered-target + +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve -target-feature +sme2 -S -disable-O0-optnone -Werror -Wall

[clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-11-27 Thread David Sherwood via cfe-commits
@@ -196,11 +196,14 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) { // Enum types are distinct types. In C++ they have "underlying types", // however they aren't related for TBAA. if (const EnumType *ETy = dyn_cast(Ty)) { +if (!Features.CPlusPlus) +

[clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-11-27 Thread David Sherwood via cfe-commits
https://github.com/david-arm updated https://github.com/llvm/llvm-project/pull/73326 From af76f6b6b3469fd0f5f24427c5a175c8d9d7c83a Mon Sep 17 00:00:00 2001 From: David Sherwood Date: Fri, 24 Nov 2023 13:20:23 + Subject: [PATCH 1/2] [Clang] Emit TBAA info for enums in C When emitting TBAA

[clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-11-24 Thread David Sherwood via cfe-commits
@@ -196,11 +196,14 @@ llvm::MDNode *CodeGenTBAA::getTypeInfoHelper(const Type *Ty) { // Enum types are distinct types. In C++ they have "underlying types", // however they aren't related for TBAA. if (const EnumType *ETy = dyn_cast(Ty)) { +if (!Features.CPlusPlus) +

[clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-11-24 Thread David Sherwood via cfe-commits
https://github.com/david-arm edited https://github.com/llvm/llvm-project/pull/73326

[clang] [Clang] Emit TBAA info for enums in C (PR #73326)

2023-11-24 Thread David Sherwood via cfe-commits
https://github.com/david-arm created https://github.com/llvm/llvm-project/pull/73326 When emitting TBAA information for enums in C code we currently just treat the data as an 'omnipotent char'. However, with C strict aliasing this means we fail to optimise certain cases. For example, in the

[clang] [llvm] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-20 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM. I think I would have preferred the patch to be split up into 3 - one for contiguous extending loads/truncating stores, one for structured loads/stores, and one for the gathers. That's why it took me so long to review this patch as

[clang] [llvm] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-15 Thread David Sherwood via cfe-commits
@@ -0,0 +1,2503 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py +// REQUIRES: aarch64-registered-target +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sve2p1 -target-feature +bf16 -S -disable-O0-optnone -Werror -Wall

[llvm] [clang] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-15 Thread David Sherwood via cfe-commits
https://github.com/david-arm edited https://github.com/llvm/llvm-project/pull/70474

[llvm] [clang] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-15 Thread David Sherwood via cfe-commits
https://github.com/david-arm commented: Wow, this is a huge patch. :) It took me a few hours to work through all the tests, and it's quite possible I've missed something. However, overall it looks fine and I can't see any major issues. I think there is one missing test, but once that's fixed

[clang] [AArch64] Cast predicate operand of SVE gather loads/scater stores to the parameter type of the intrinsic (NFC) (PR #71289)

2023-11-06 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/71289

[llvm] [clang] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-03 Thread David Sherwood via cfe-commits
@@ -2614,6 +2619,37 @@ def int_aarch64_sve_ld1_pn_x4 : SVE2p1_Load_PN_X4_Intrinsic; def int_aarch64_sve_ldnt1_pn_x2 : SVE2p1_Load_PN_X2_Intrinsic; def int_aarch64_sve_ldnt1_pn_x4 : SVE2p1_Load_PN_X4_Intrinsic; +// +// SVE2.1 - Contiguous loads to quadword (single vector) +//

[llvm] [clang] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-03 Thread David Sherwood via cfe-commits
@@ -9671,28 +9677,47 @@ Value *CodeGenFunction::EmitSVEMaskedLoad(const CallExpr *E, // The vector type that is returned may be different from the // eventual type loaded from memory. auto VectorTy = cast(ReturnTy); - auto MemoryTy =

[llvm] [clang] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-03 Thread David Sherwood via cfe-commits
@@ -9702,17 +9727,34 @@ Value *CodeGenFunction::EmitSVEMaskedStore(const CallExpr *E, auto VectorTy = cast(Ops.back()->getType()); auto MemoryTy = llvm::ScalableVectorType::get(MemEltTy, VectorTy); - Value *Predicate = EmitSVEPredicateCast(Ops[0], MemoryTy); + auto

[llvm] [clang] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-03 Thread David Sherwood via cfe-commits
@@ -9671,28 +9677,47 @@ Value *CodeGenFunction::EmitSVEMaskedLoad(const CallExpr *E, // The vector type that is returned may be different from the // eventual type loaded from memory. auto VectorTy = cast(ReturnTy); - auto MemoryTy =

[clang] [llvm] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-03 Thread David Sherwood via cfe-commits
@@ -2614,6 +2619,37 @@ def int_aarch64_sve_ld1_pn_x4 : SVE2p1_Load_PN_X4_Intrinsic; def int_aarch64_sve_ldnt1_pn_x2 : SVE2p1_Load_PN_X2_Intrinsic; def int_aarch64_sve_ldnt1_pn_x4 : SVE2p1_Load_PN_X4_Intrinsic; +// +// SVE2.1 - Contiguous loads to quadword (single vector) +//

[clang] [llvm] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-03 Thread David Sherwood via cfe-commits
@@ -9702,17 +9727,34 @@ Value *CodeGenFunction::EmitSVEMaskedStore(const CallExpr *E, auto VectorTy = cast(Ops.back()->getType()); auto MemoryTy = llvm::ScalableVectorType::get(MemEltTy, VectorTy); - Value *Predicate = EmitSVEPredicateCast(Ops[0], MemoryTy); + auto

[clang] [llvm] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-03 Thread David Sherwood via cfe-commits
https://github.com/david-arm edited https://github.com/llvm/llvm-project/pull/70474

[clang] [llvm] [AArch64][SVE2.1] Add intrinsics for quadword loads/stores with unscaled offset (PR #70474)

2023-11-03 Thread David Sherwood via cfe-commits
https://github.com/david-arm commented: Thanks for this! I've not done an exhaustive review, but I'll leave the comments I have so far. https://github.com/llvm/llvm-project/pull/70474

[clang] [AArch64][Clang] Refactor code to emit SVE & SME builtins (PR #70959)

2023-11-02 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM! https://github.com/llvm/llvm-project/pull/70959

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-27 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM! Eccelente! Thanks for the changes @kmclaughlin-arm. https://github.com/llvm/llvm-project/pull/69725

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-25 Thread David Sherwood via cfe-commits
@@ -0,0 +1,1226 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py + +// REQUIRES: aarch64-registered-target + +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -target-feature +sme-i16i64 -target-feature +sme-f64f64

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-25 Thread David Sherwood via cfe-commits
@@ -0,0 +1,418 @@ +// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py + +// REQUIRES: aarch64-registered-target + +// RUN: %clang_cc1 -triple aarch64-none-linux-gnu -target-feature +sme2 -target-feature +sme-i16i64 -target-feature +sme-f64f64

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-25 Thread David Sherwood via cfe-commits
https://github.com/david-arm commented: This looks a lot better now @kmclaughlin-arm - thanks for the changes! I just have a couple of comments about the tests that I missed previously... https://github.com/llvm/llvm-project/pull/69725

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-25 Thread David Sherwood via cfe-commits
https://github.com/david-arm edited https://github.com/llvm/llvm-project/pull/69725

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-20 Thread David Sherwood via cfe-commits
@@ -9893,24 +9888,37 @@ Value *CodeGenFunction::FormSVEBuiltinResult(Value *Call) { return Call; } -Value *CodeGenFunction::EmitAArch64SVEBuiltinExpr(unsigned BuiltinID, - const CallExpr *E) { +void

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-20 Thread David Sherwood via cfe-commits
@@ -9893,24 +9888,37 @@ Value *CodeGenFunction::FormSVEBuiltinResult(Value *Call) { return Call; } -Value *CodeGenFunction::EmitAArch64SVEBuiltinExpr(unsigned BuiltinID, - const CallExpr *E) { +void

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-20 Thread David Sherwood via cfe-commits
@@ -10272,29 +10291,13 @@ Value *CodeGenFunction::EmitAArch64SMEBuiltinExpr(unsigned BuiltinID, getContext().GetBuiltinType(BuiltinID, Error, ); david-arm wrote: Do we still need this code given we're now checking the ICE arguments in

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-20 Thread David Sherwood via cfe-commits
https://github.com/david-arm commented: I've not done an exhaustive review, but thought I'd leave the comments I have so far! https://github.com/llvm/llvm-project/pull/69725

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-20 Thread David Sherwood via cfe-commits
@@ -1016,29 +1021,24 @@ std::string Intrinsic::mangleName(ClassKind LocalCK) const { getMergeSuffix(); } -void Intrinsic::emitIntrinsic(raw_ostream , SVEEmitter ) const { +void Intrinsic::emitIntrinsic(raw_ostream , ACLEKind Kind) const { bool IsOverloaded =

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-20 Thread David Sherwood via cfe-commits
@@ -9893,24 +9888,37 @@ Value *CodeGenFunction::FormSVEBuiltinResult(Value *Call) { return Call; } -Value *CodeGenFunction::EmitAArch64SVEBuiltinExpr(unsigned BuiltinID, - const CallExpr *E) { +void

[clang] [Clang][SME2] Add multi-vector add/sub builtins (PR #69725)

2023-10-20 Thread David Sherwood via cfe-commits
https://github.com/david-arm edited https://github.com/llvm/llvm-project/pull/69725

[clang] [CXXNameMangler] Correct the mangling of SVE ACLE types within function names. (PR #69460)

2023-10-19 Thread David Sherwood via cfe-commits
https://github.com/david-arm approved this pull request. LGTM! An outstanding work of art @paulwalker-arm! https://github.com/llvm/llvm-project/pull/69460

[clang] [SVE][InstCombine] Delete redundante sel instructions with ptrue (PR #68463)

2023-10-10 Thread David Sherwood via cfe-commits
@@ -63,6 +63,20 @@ svint32_t test_svsel_s32(svbool_t pg, svint32_t op1, svint32_t op2) return SVE_ACLE_FUNC(svsel,_s32,,)(pg, op1, op2); } +// CHECK-LABEL: @test_svsel_s32_ptrue( david-arm wrote: I'm not sure if this test really adds any more value, since

[clang] [SVE][InstCombine] Delete redundante sel instructions with ptrue (PR #68463)

2023-10-10 Thread David Sherwood via cfe-commits
@@ -800,6 +800,13 @@ instCombineConvertFromSVBool(InstCombiner , IntrinsicInst ) { static std::optional instCombineSVESel(InstCombiner , IntrinsicInst ) { + // svsel(ptrue, x, y) => x + auto *OpPredicate =
