[llvm-branch-commits] [llvm] [PowerPC][GlobalMerge] Reduce TOC usage by merging internal and private global data (PR #101224)

2024-08-05 Thread Chen Zheng via llvm-branch-commits

https://github.com/chenzheng1030 approved this pull request.

LGTM. Thanks very much for enabling this pass on PPC.

https://github.com/llvm/llvm-project/pull/101224
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid inlining if ZT0 needs preserving. (#101343) (PR #101932)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/101932


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid inlining if ZT0 needs preserving. (#101343) (PR #101932)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/101932

Backport fb470db7b3a8ce6853e8bf17d235617a2fa79434

Requested by: @sdesmalen-arm

From 1bc684848a555a31d36fe0b3fe3bf263ecb66758 Mon Sep 17 00:00:00 2001
From: Sander de Smalen 
Date: Fri, 2 Aug 2024 10:29:08 +0100
Subject: [PATCH] [AArch64] Avoid inlining if ZT0 needs preserving. (#101343)

Inlining may result in different behaviour when the callee clobbers ZT0,
because normally the call-site will have code to preserve ZT0. When
inlining the function this code to preserve ZT0 will no longer be
emitted, and so the resulting behaviour of the program is changed.

(cherry picked from commit fb470db7b3a8ce6853e8bf17d235617a2fa79434)
---
 .../AArch64/AArch64TargetTransformInfo.cpp|  3 +-
 .../Inline/AArch64/sme-pstateza-attrs.ll  | 45 +++
 2 files changed, 47 insertions(+), 1 deletion(-)

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 45148449dfb82..9630b36b99348 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -254,7 +254,8 @@ bool AArch64TTIImpl::areInlineCompatible(const Function *Caller,
     return false;
 
   if (CallerAttrs.requiresLazySave(CalleeAttrs) ||
-      CallerAttrs.requiresSMChange(CalleeAttrs)) {
+      CallerAttrs.requiresSMChange(CalleeAttrs) ||
+      CallerAttrs.requiresPreservingZT0(CalleeAttrs)) {
     if (hasPossibleIncompatibleOps(Callee))
       return false;
   }
diff --git a/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll b/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
index 816492768cc0f..5e638103a2b06 100644
--- a/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
+++ b/llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll
@@ -231,6 +231,51 @@ define void @shared_za_caller_private_za_callee_call_tpidr2_restore_dont_inline(
   ret void
 }
 
+define void @nonzt0_callee() {
+; CHECK-LABEL: define void @nonzt0_callee
+; CHECK-SAME: () #[[ATTR0]] {
+; CHECK-NEXT:    call void asm sideeffect "
+; CHECK-NEXT:    call void @inlined_body()
+; CHECK-NEXT:    ret void
+;
+  call void asm sideeffect "; inlineasm", ""()
+  call void @inlined_body()
+  ret void
+}
+
+define void @shared_zt0_caller_nonzt0_callee_dont_inline() "aarch64_inout_zt0" {
+; CHECK-LABEL: define void @shared_zt0_caller_nonzt0_callee_dont_inline
+; CHECK-SAME: () #[[ATTR3:[0-9]+]] {
+; CHECK-NEXT:    call void @nonzt0_callee()
+; CHECK-NEXT:    ret void
+;
+  call void @nonzt0_callee()
+  ret void
+}
+
+define void @shared_zt0_callee() "aarch64_inout_zt0" {
+; CHECK-LABEL: define void @shared_zt0_callee
+; CHECK-SAME: () #[[ATTR3]] {
+; CHECK-NEXT:    call void asm sideeffect "
+; CHECK-NEXT:    call void @inlined_body()
+; CHECK-NEXT:    ret void
+;
+  call void asm sideeffect "; inlineasm", ""()
+  call void @inlined_body()
+  ret void
+}
+
+define void @shared_zt0_caller_shared_zt0_callee_inline() "aarch64_inout_zt0" {
+; CHECK-LABEL: define void @shared_zt0_caller_shared_zt0_callee_inline
+; CHECK-SAME: () #[[ATTR3]] {
+; CHECK-NEXT:    call void asm sideeffect "
+; CHECK-NEXT:    call void @inlined_body()
+; CHECK-NEXT:    ret void
+;
+  call void @shared_zt0_callee()
+  ret void
+}
+
 declare void @__arm_za_disable()
 declare void @__arm_tpidr2_save()
 declare void @__arm_tpidr2_restore(ptr)
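
The condition added to `areInlineCompatible` in this patch can be pictured with a small standalone model. This is only a sketch: `ToyAttrs`, its field, and both helper functions are hypothetical stand-ins rather than LLVM's actual SME attribute API, and it assumes that ZT0 needs preserving exactly when the caller has ZT0 state that the callee does not share.

```cpp
// Hypothetical model of the inlining decision; not the real LLVM classes.
struct ToyAttrs {
  bool SharesZT0; // stands in for an "aarch64_inout_zt0"-style attribute
};

// Assumption: the call site must save and restore ZT0 when the caller has
// ZT0 state and the callee does not share it.
constexpr bool requiresPreservingZT0(ToyAttrs Caller, ToyAttrs Callee) {
  return Caller.SharesZT0 && !Callee.SharesZT0;
}

// Mirrors the shape of the patched check: inlining is refused only when ZT0
// would need preserving AND the callee may contain operations that could
// clobber it (represented here by a plain boolean).
constexpr bool areInlineCompatible(ToyAttrs Caller, ToyAttrs Callee,
                                   bool CalleeMayHaveIncompatibleOps) {
  return !(requiresPreservingZT0(Caller, Callee) &&
           CalleeMayHaveIncompatibleOps);
}
```

Under these assumptions, a ZT0-sharing caller refuses to inline a non-ZT0 callee that may contain incompatible operations, while a callee that also shares ZT0 remains inlinable, which is the split the two new test functions exercise.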



[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid inlining if ZT0 needs preserving. (#101343) (PR #101932)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:

@paulwalker-arm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/101932


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid inlining if ZT0 needs preserving. (#101343) (PR #101932)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

Backport fb470db7b3a8ce6853e8bf17d235617a2fa79434

Requested by: @sdesmalen-arm

---
Full diff: https://github.com/llvm/llvm-project/pull/101932.diff


2 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+2-1) 
- (modified) llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll (+45) 






https://github.com/llvm/llvm-project/pull/101932


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid inlining if ZT0 needs preserving. (#101343) (PR #101932)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: None (llvmbot)


Changes

Backport fb470db7b3a8ce6853e8bf17d235617a2fa79434

Requested by: @sdesmalen-arm

---
Full diff: https://github.com/llvm/llvm-project/pull/101932.diff


2 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+2-1) 
- (modified) llvm/test/Transforms/Inline/AArch64/sme-pstateza-attrs.ll (+45) 






https://github.com/llvm/llvm-project/pull/101932


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid NEON dot product in streaming[-compatible] functions (#101677) (PR #101933)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/101933

Backport 12937b1bfb23cca4731fa274f3358f7286cc6784

Requested by: @sdesmalen-arm

From 6c1bac971da50207bdb2a80b0dc40b642134619b Mon Sep 17 00:00:00 2001
From: Sander de Smalen 
Date: Fri, 2 Aug 2024 15:56:52 +0100
Subject: [PATCH] [AArch64] Avoid NEON dot product in streaming[-compatible]
 functions (#101677)

The NEON dot product is not valid in streaming mode.
A follow-up patch will improve codegen for these operations.

(cherry picked from commit 12937b1bfb23cca4731fa274f3358f7286cc6784)
---
 .../Target/AArch64/AArch64ISelLowering.cpp|   3 +
 ...-streaming-mode-fixed-length-reductions.ll | 143 ++
 2 files changed, 146 insertions(+)
 create mode 100644 llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-reductions.ll

diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 6d413a09407a9..62078822c89b1 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -17719,6 +17719,9 @@ static SDValue performVecReduceAddCombineWithUADDLP(SDNode *N,
 // and generate vecreduce.add(concat_vector(DOT, DOT2, ..)).
 static SDValue performVecReduceAddCombine(SDNode *N, SelectionDAG &DAG,
                                           const AArch64Subtarget *ST) {
+  if (!ST->isNeonAvailable())
+    return SDValue();
+
   if (!ST->hasDotProd())
     return performVecReduceAddCombineWithUADDLP(N, DAG);
 
diff --git a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-reductions.ll b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-reductions.ll
new file mode 100644
index 0000000000000..00a15f4bcd639
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-reductions.ll
@@ -0,0 +1,143 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mattr=+sve < %s | FileCheck %s
+; RUN: llc -mattr=+dotprod,+sve < %s | FileCheck %s -check-prefix=DOT
+; RUN: llc -mattr=+dotprod,+sve -force-streaming-compatible < %s | FileCheck %s --check-prefix=STREAMING-SVE
+; RUN: llc -mattr=+dotprod,+sme -force-streaming < %s | FileCheck %s --check-prefix=STREAMING-SVE
+
+target triple = "aarch64-unknown-linux-gnu"
+
+define i32 @reduce_uaddv_v16i8(<32 x i8> %a) {
+; CHECK-LABEL: reduce_uaddv_v16i8:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ushll2 v2.8h, v1.16b, #0
+; CHECK-NEXT:ushll2 v3.8h, v0.16b, #0
+; CHECK-NEXT:ushll v1.8h, v1.8b, #0
+; CHECK-NEXT:ushll v0.8h, v0.8b, #0
+; CHECK-NEXT:uaddl2 v4.4s, v3.8h, v2.8h
+; CHECK-NEXT:uaddl v2.4s, v3.4h, v2.4h
+; CHECK-NEXT:uaddl2 v5.4s, v0.8h, v1.8h
+; CHECK-NEXT:uaddl v0.4s, v0.4h, v1.4h
+; CHECK-NEXT:add v1.4s, v5.4s, v4.4s
+; CHECK-NEXT:add v0.4s, v0.4s, v2.4s
+; CHECK-NEXT:add v0.4s, v0.4s, v1.4s
+; CHECK-NEXT:addv s0, v0.4s
+; CHECK-NEXT:fmov w0, s0
+; CHECK-NEXT:ret
+;
+; DOT-LABEL: reduce_uaddv_v16i8:
+; DOT:   // %bb.0:
+; DOT-NEXT:movi v2.16b, #1
+; DOT-NEXT:movi v3.2d, #0000000000000000
+; DOT-NEXT:udot v3.4s, v1.16b, v2.16b
+; DOT-NEXT:udot v3.4s, v0.16b, v2.16b
+; DOT-NEXT:addv s0, v3.4s
+; DOT-NEXT:fmov w0, s0
+; DOT-NEXT:ret
+;
+; STREAMING-SVE-LABEL: reduce_uaddv_v16i8:
+; STREAMING-SVE:   // %bb.0:
+; STREAMING-SVE-NEXT:// kill: def $q1 killed $q1 def $z1
+; STREAMING-SVE-NEXT:uunpklo z2.h, z1.b
+; STREAMING-SVE-NEXT:// kill: def $q0 killed $q0 def $z0
+; STREAMING-SVE-NEXT:uunpklo z3.h, z0.b
+; STREAMING-SVE-NEXT:ptrue p0.s, vl4
+; STREAMING-SVE-NEXT:ext z1.b, z1.b, z1.b, #8
+; STREAMING-SVE-NEXT:ext z0.b, z0.b, z0.b, #8
+; STREAMING-SVE-NEXT:uunpklo z1.h, z1.b
+; STREAMING-SVE-NEXT:uunpklo z0.h, z0.b
+; STREAMING-SVE-NEXT:uunpklo z4.s, z2.h
+; STREAMING-SVE-NEXT:ext z2.b, z2.b, z2.b, #8
+; STREAMING-SVE-NEXT:uunpklo z6.s, z3.h
+; STREAMING-SVE-NEXT:ext z3.b, z3.b, z3.b, #8
+; STREAMING-SVE-NEXT:mov z5.d, z1.d
+; STREAMING-SVE-NEXT:uunpklo z7.s, z0.h
+; STREAMING-SVE-NEXT:ext z0.b, z0.b, z0.b, #8
+; STREAMING-SVE-NEXT:uunpklo z2.s, z2.h
+; STREAMING-SVE-NEXT:uunpklo z3.s, z3.h
+; STREAMING-SVE-NEXT:add z4.s, z6.s, z4.s
+; STREAMING-SVE-NEXT:ext z5.b, z5.b, z1.b, #8
+; STREAMING-SVE-NEXT:uunpklo z1.s, z1.h
+; STREAMING-SVE-NEXT:uunpklo z0.s, z0.h
+; STREAMING-SVE-NEXT:add z2.s, z3.s, z2.s
+; STREAMING-SVE-NEXT:uunpklo z5.s, z5.h
+; STREAMING-SVE-NEXT:add z1.s, z7.s, z1.s
+; STREAMING-SVE-NEXT:add z0.s, z0.s, z5.s
+; STREAMING-SVE-NEXT:add z1.s, z4.s, z1.s
+; STREAMING-SVE-NEXT:add z0.s, z2.s, z0.s
+; STREAMING-SVE-NEXT:add z0.s, z1.s, z0.s
+; STREAMING-SVE-NEXT:uaddv d0, p0, z0.s
+; STREAMING-SVE-NEXT:fmov x0, d0
+; STREAMING-SVE-NEXT:// kill: def $w0 killed $w0 killed $x0
+; STREAMING-SVE-NEXT:ret
+  %1 = zext <32 x i8> %a to <32 x i32>
+  %2 = call i32 @llvm
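
The early return added to `performVecReduceAddCombine` amounts to a three-way choice, which the following standalone sketch models. `ToySubtarget`, `ReduceLowering`, and `pickVecReduceAddLowering` are hypothetical names introduced only for illustration; the real function operates on `SelectionDAG` nodes and returns an `SDValue`.

```cpp
// Hypothetical model of the decision order in the patched combine.
enum class ReduceLowering {
  NoCombine,  // corresponds to returning an empty SDValue()
  UADDLP,     // the fallback taken when dot product is unavailable
  DotProduct  // the NEON dot-product based combine
};

struct ToySubtarget {
  bool NeonAvailable; // false in streaming or streaming-compatible code
  bool HasDotProd;
};

// NEON availability is checked first, so a streaming[-compatible] function
// never reaches either NEON-based combine, even with +dotprod enabled.
constexpr ReduceLowering pickVecReduceAddLowering(ToySubtarget ST) {
  return !ST.NeonAvailable ? ReduceLowering::NoCombine
         : !ST.HasDotProd  ? ReduceLowering::UADDLP
                           : ReduceLowering::DotProduct;
}
```

In this model the `STREAMING-SVE` RUN lines correspond to `NeonAvailable == false`, so the combine does nothing and the reduction is left to the generic SVE lowering seen in the expected output.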

[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid NEON dot product in streaming[-compatible] functions (#101677) (PR #101933)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/101933


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid NEON dot product in streaming[-compatible] functions (#101677) (PR #101933)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:

@jroelofs What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/101933


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid NEON dot product in streaming[-compatible] functions (#101677) (PR #101933)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-backend-aarch64

Author: None (llvmbot)


Changes

Backport 12937b1bfb23cca4731fa274f3358f7286cc6784

Requested by: @sdesmalen-arm

---
Full diff: https://github.com/llvm/llvm-project/pull/101933.diff


2 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+3) 
- (added) llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-reductions.ll (+143) 


``diff
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 6d413a09407a9..62078822c89b1 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -17719,6 +17719,9 @@ static SDValue performVecReduceAddCombineWithUADDLP(SDNode *N,
 // and generate vecreduce.add(concat_vector(DOT, DOT2, ..)).
 static SDValue performVecReduceAddCombine(SDNode *N, SelectionDAG &DAG,
                                           const AArch64Subtarget *ST) {
+  if (!ST->isNeonAvailable())
+    return SDValue();
+
   if (!ST->hasDotProd())
     return performVecReduceAddCombineWithUADDLP(N, DAG);
 
diff --git a/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-reductions.ll b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-reductions.ll
new file mode 100644
index 0000000000000..00a15f4bcd639
--- /dev/null
+++ b/llvm/test/CodeGen/AArch64/sve-streaming-mode-fixed-length-reductions.ll
@@ -0,0 +1,143 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc -mattr=+sve < %s | FileCheck %s
+; RUN: llc -mattr=+dotprod,+sve < %s | FileCheck %s -check-prefix=DOT
+; RUN: llc -mattr=+dotprod,+sve -force-streaming-compatible < %s | FileCheck %s --check-prefix=STREAMING-SVE
+; RUN: llc -mattr=+dotprod,+sme -force-streaming < %s | FileCheck %s --check-prefix=STREAMING-SVE
+
+target triple = "aarch64-unknown-linux-gnu"
+
+define i32 @reduce_uaddv_v16i8(<32 x i8> %a) {
+; CHECK-LABEL: reduce_uaddv_v16i8:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:ushll2 v2.8h, v1.16b, #0
+; CHECK-NEXT:ushll2 v3.8h, v0.16b, #0
+; CHECK-NEXT:ushll v1.8h, v1.8b, #0
+; CHECK-NEXT:ushll v0.8h, v0.8b, #0
+; CHECK-NEXT:uaddl2 v4.4s, v3.8h, v2.8h
+; CHECK-NEXT:uaddl v2.4s, v3.4h, v2.4h
+; CHECK-NEXT:uaddl2 v5.4s, v0.8h, v1.8h
+; CHECK-NEXT:uaddl v0.4s, v0.4h, v1.4h
+; CHECK-NEXT:add v1.4s, v5.4s, v4.4s
+; CHECK-NEXT:add v0.4s, v0.4s, v2.4s
+; CHECK-NEXT:add v0.4s, v0.4s, v1.4s
+; CHECK-NEXT:addv s0, v0.4s
+; CHECK-NEXT:fmov w0, s0
+; CHECK-NEXT:ret
+;
+; DOT-LABEL: reduce_uaddv_v16i8:
+; DOT:   // %bb.0:
+; DOT-NEXT:movi v2.16b, #1
+; DOT-NEXT:movi v3.2d, #0000000000000000
+; DOT-NEXT:udot v3.4s, v1.16b, v2.16b
+; DOT-NEXT:udot v3.4s, v0.16b, v2.16b
+; DOT-NEXT:addv s0, v3.4s
+; DOT-NEXT:fmov w0, s0
+; DOT-NEXT:ret
+;
+; STREAMING-SVE-LABEL: reduce_uaddv_v16i8:
+; STREAMING-SVE:   // %bb.0:
+; STREAMING-SVE-NEXT:// kill: def $q1 killed $q1 def $z1
+; STREAMING-SVE-NEXT:uunpklo z2.h, z1.b
+; STREAMING-SVE-NEXT:// kill: def $q0 killed $q0 def $z0
+; STREAMING-SVE-NEXT:uunpklo z3.h, z0.b
+; STREAMING-SVE-NEXT:ptrue p0.s, vl4
+; STREAMING-SVE-NEXT:ext z1.b, z1.b, z1.b, #8
+; STREAMING-SVE-NEXT:ext z0.b, z0.b, z0.b, #8
+; STREAMING-SVE-NEXT:uunpklo z1.h, z1.b
+; STREAMING-SVE-NEXT:uunpklo z0.h, z0.b
+; STREAMING-SVE-NEXT:uunpklo z4.s, z2.h
+; STREAMING-SVE-NEXT:ext z2.b, z2.b, z2.b, #8
+; STREAMING-SVE-NEXT:uunpklo z6.s, z3.h
+; STREAMING-SVE-NEXT:ext z3.b, z3.b, z3.b, #8
+; STREAMING-SVE-NEXT:mov z5.d, z1.d
+; STREAMING-SVE-NEXT:uunpklo z7.s, z0.h
+; STREAMING-SVE-NEXT:ext z0.b, z0.b, z0.b, #8
+; STREAMING-SVE-NEXT:uunpklo z2.s, z2.h
+; STREAMING-SVE-NEXT:uunpklo z3.s, z3.h
+; STREAMING-SVE-NEXT:add z4.s, z6.s, z4.s
+; STREAMING-SVE-NEXT:ext z5.b, z5.b, z1.b, #8
+; STREAMING-SVE-NEXT:uunpklo z1.s, z1.h
+; STREAMING-SVE-NEXT:uunpklo z0.s, z0.h
+; STREAMING-SVE-NEXT:add z2.s, z3.s, z2.s
+; STREAMING-SVE-NEXT:uunpklo z5.s, z5.h
+; STREAMING-SVE-NEXT:add z1.s, z7.s, z1.s
+; STREAMING-SVE-NEXT:add z0.s, z0.s, z5.s
+; STREAMING-SVE-NEXT:add z1.s, z4.s, z1.s
+; STREAMING-SVE-NEXT:add z0.s, z2.s, z0.s
+; STREAMING-SVE-NEXT:add z0.s, z1.s, z0.s
+; STREAMING-SVE-NEXT:uaddv d0, p0, z0.s
+; STREAMING-SVE-NEXT:fmov x0, d0
+; STREAMING-SVE-NEXT:// kill: def $w0 killed $w0 killed $x0
+; STREAMING-SVE-NEXT:ret
+  %1 = zext <32 x i8> %a to <32 x i32>
+  %2 = call i32 @llvm.vector.reduce.add.v32i32(<32 x i32> %1)
+  ret i32 %2
+}
+
+define i32 @reduce_saddv_v16i8(<32 x i8> %a) {
+; CHECK-LABEL: reduce_saddv_v16i8:
+; CHECK:   // %bb.0:
+; CHECK-NEXT:sshll2 v2.8h, v1.16b, #0
+; CHECK-NEXT:sshll2 v3.8h, v0.16b, #0
+; CHECK-NEXT:sshll v1.8h, v1.8b, #0
+; CHECK-NEXT:sshll v0.8h, v0.8b, #0
+; CHECK-NEXT:saddl2 v4.4s, v3.8h, v2.8h
+; CHECK-NEXT:saddl

[llvm-branch-commits] [lld] ReleaseNotes: lld/ELF: mention CREL (PR #101905)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru approved this pull request.


https://github.com/llvm/llvm-project/pull/101905


[llvm-branch-commits] [lld] release/19.x: [ELF] Move ElfSym into Ctx. NFC (PR #101844)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

tru wrote:

I am not sure I want to accept a refactoring commit right before RC2. I 
understand it will make it much easier to accept backports in the future, but 
it does seem a bit risky to accept medium-sized refactors as part of the RC 
process.

How confident are you in these changes, @MaskRay? How long have they lived in 
`main`?

https://github.com/llvm/llvm-project/pull/101844


[llvm-branch-commits] [clang-tools-extra] release/19.x: [clang-tidy] Fix crash in modernize-use-ranges (#100427) (PR #101482)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101482

From 8d2474975f56e85f5a25610fb6291dc0f3976a3e Mon Sep 17 00:00:00 2001
From: Nathan James 
Date: Thu, 25 Jul 2024 16:25:37 +0100
Subject: [PATCH] [clang-tidy] Fix crash in modernize-use-ranges (#100427)

The crash seems to be caused by the check function not handling inline
namespaces correctly in some instances. The Replacer is now obtained from
the MatchResult differently, which should alleviate any potential issues.

Fixes #100406

(cherry picked from commit 0762db6533eda3453158c7b9b0631542c47093a8)
---
 .../clang-tidy/utils/UseRangesCheck.cpp   | 64 +--
 .../clang-tidy/utils/UseRangesCheck.h |  2 +-
 .../modernize/Inputs/use-ranges/fake_std.h| 17 +++--
 3 files changed, 43 insertions(+), 40 deletions(-)

diff --git a/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp b/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp
index e2daa5010e2ae..aba4d17ccd035 100644
--- a/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp
+++ b/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp
@@ -39,12 +39,6 @@ static constexpr const char ArgName[] = "ArgName";
 
 namespace clang::tidy::utils {
 
-static bool operator==(const UseRangesCheck::Indexes &L,
-   const UseRangesCheck::Indexes &R) {
-  return std::tie(L.BeginArg, L.EndArg, L.ReplaceArg) ==
- std::tie(R.BeginArg, R.EndArg, R.ReplaceArg);
-}
-
 static std::string getFullPrefix(ArrayRef Signature) {
   std::string Output;
   llvm::raw_string_ostream OS(Output);
@@ -54,15 +48,6 @@ static std::string getFullPrefix(ArrayRef Signature) {
   return Output;
 }
 
-static llvm::hash_code hash_value(const UseRangesCheck::Indexes &Indexes) {
-  return llvm::hash_combine(Indexes.BeginArg, Indexes.EndArg,
-Indexes.ReplaceArg);
-}
-
-static llvm::hash_code hash_value(const UseRangesCheck::Signature &Sig) {
-  return llvm::hash_combine_range(Sig.begin(), Sig.end());
-}
-
 namespace {
 
 AST_MATCHER(Expr, hasSideEffects) {
@@ -123,24 +108,26 @@ makeMatcherPair(StringRef State, const UseRangesCheck::Indexes &Indexes,
 }
 
 void UseRangesCheck::registerMatchers(MatchFinder *Finder) {
-  Replaces = getReplacerMap();
+  auto Replaces = getReplacerMap();
   ReverseDescriptor = getReverseDescriptor();
   auto BeginEndNames = getFreeBeginEndMethods();
   llvm::SmallVector BeginNames{
   llvm::make_first_range(BeginEndNames)};
   llvm::SmallVector EndNames{
   llvm::make_second_range(BeginEndNames)};
-  llvm::DenseSet> Seen;
+  Replacers.clear();
+  llvm::DenseSet SeenRepl;
   for (auto I = Replaces.begin(), E = Replaces.end(); I != E; ++I) {
-const ArrayRef &Signatures =
-I->getValue()->getReplacementSignatures();
-if (!Seen.insert(Signatures).second)
+auto Replacer = I->getValue();
+if (!SeenRepl.insert(Replacer.get()).second)
   continue;
-assert(!Signatures.empty() &&
-   llvm::all_of(Signatures, [](auto Index) { return !Index.empty(); }));
+Replacers.push_back(Replacer);
+assert(!Replacer->getReplacementSignatures().empty() &&
+   llvm::all_of(Replacer->getReplacementSignatures(),
+[](auto Index) { return !Index.empty(); }));
 std::vector Names(1, I->getKey());
 for (auto J = std::next(I); J != E; ++J)
-  if (J->getValue()->getReplacementSignatures() == Signatures)
+  if (J->getValue() == Replacer)
 Names.push_back(J->getKey());
 
 std::vector TotalMatchers;
@@ -148,7 +135,7 @@ void UseRangesCheck::registerMatchers(MatchFinder *Finder) {
 // signatures in order of length(longest to shortest). This way any
 // signature that is a subset of another signature will be matched after the
 // other.
-SmallVector SigVec(Signatures);
+SmallVector SigVec(Replacer->getReplacementSignatures());
 llvm::sort(SigVec, [](auto &L, auto &R) { return R.size() < L.size(); });
 for (const auto &Signature : SigVec) {
   std::vector Matchers;
@@ -163,7 +150,8 @@ void UseRangesCheck::registerMatchers(MatchFinder *Finder) {
 }
 Finder->addMatcher(
 callExpr(
-callee(functionDecl(hasAnyName(std::move(Names))).bind(FuncDecl)),
+callee(functionDecl(hasAnyName(std::move(Names)))
+   .bind((FuncDecl + Twine(Replacers.size() - 1).str()))),
 ast_matchers::internal::DynTypedMatcher::constructVariadic(
 ast_matchers::internal::DynTypedMatcher::VO_AnyOf,
 ASTNodeKind::getFromNodeKind(),
@@ -205,21 +193,33 @@ static void removeFunctionArgs(DiagnosticBuilder &Diag, const CallExpr &Call,
 }
 
 void UseRangesCheck::check(const MatchFinder::MatchResult &Result) {
-  const auto *Function = Result.Nodes.getNodeAs(FuncDecl);
-  std::string Qualified = "::" + Function->getQualifiedNameAsString();
-  auto Iter = Replaces.find(Qualified);
-  assert(Iter != Replaces.end());
+  Replacer *Replacer = nullptr;
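
The part of the diff visible here binds each matcher under an ID formed from `FuncDecl` plus the replacer's position in `Replacers`, so `check` no longer needs to look the replacer up by qualified name. How `check` decodes that ID is cut off above; the following is a guess at the general idea using only the standard library. `BindPrefix`, `makeBindId`, and `parseReplacerIndex` are hypothetical names, not clang-tidy API, and the prefix value is an assumption.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical prefix; the real value of FuncDecl is not shown in the diff.
constexpr std::string_view BindPrefix = "FuncDecl";

// Builds the ID under which the matcher for replacer number Index is bound.
inline std::string makeBindId(std::size_t Index) {
  return std::string(BindPrefix) + std::to_string(Index);
}

// Recovers the replacer index from a bound ID, or nullopt when the ID is not
// of the form <prefix><decimal digits>. No overflow check in this sketch.
constexpr std::optional<std::size_t> parseReplacerIndex(std::string_view Id) {
  if (Id.substr(0, BindPrefix.size()) != BindPrefix)
    return std::nullopt;
  Id.remove_prefix(BindPrefix.size());
  if (Id.empty())
    return std::nullopt;
  std::size_t Index = 0;
  for (char C : Id) {
    if (C < '0' || C > '9')
      return std::nullopt;
    Index = Index * 10 + static_cast<std::size_t>(C - '0');
  }
  return Index;
}
```

Keying the lookup on an index carried in the bound ID, rather than on the printed qualified name of the matched function, avoids depending on how inline namespaces appear in that name, which is the failure mode the commit message describes.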

[llvm-branch-commits] [clang-tools-extra] 8d24749 - [clang-tidy] Fix crash in modernize-use-ranges (#100427)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

Author: Nathan James
Date: 2024-08-05T10:17:06+02:00
New Revision: 8d2474975f56e85f5a25610fb6291dc0f3976a3e

URL: https://github.com/llvm/llvm-project/commit/8d2474975f56e85f5a25610fb6291dc0f3976a3e
DIFF: https://github.com/llvm/llvm-project/commit/8d2474975f56e85f5a25610fb6291dc0f3976a3e.diff

LOG: [clang-tidy] Fix crash in modernize-use-ranges (#100427)

The crash seems to be caused by the check function not handling inline
namespaces correctly in some instances. The Replacer is now obtained from
the MatchResult differently, which should alleviate any potential issues.

Fixes #100406

(cherry picked from commit 0762db6533eda3453158c7b9b0631542c47093a8)

Added: 


Modified: 
clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp
clang-tools-extra/clang-tidy/utils/UseRangesCheck.h
clang-tools-extra/test/clang-tidy/checkers/modernize/Inputs/use-ranges/fake_std.h

Removed: 




diff  --git a/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp 
b/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp
index e2daa5010e2ae..aba4d17ccd035 100644
--- a/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp
+++ b/clang-tools-extra/clang-tidy/utils/UseRangesCheck.cpp
@@ -39,12 +39,6 @@ static constexpr const char ArgName[] = "ArgName";
 
 namespace clang::tidy::utils {
 
-static bool operator==(const UseRangesCheck::Indexes &L,
-   const UseRangesCheck::Indexes &R) {
-  return std::tie(L.BeginArg, L.EndArg, L.ReplaceArg) ==
- std::tie(R.BeginArg, R.EndArg, R.ReplaceArg);
-}
-
 static std::string getFullPrefix(ArrayRef Signature) {
   std::string Output;
   llvm::raw_string_ostream OS(Output);
@@ -54,15 +48,6 @@ static std::string 
getFullPrefix(ArrayRef Signature) {
   return Output;
 }
 
-static llvm::hash_code hash_value(const UseRangesCheck::Indexes &Indexes) {
-  return llvm::hash_combine(Indexes.BeginArg, Indexes.EndArg,
-Indexes.ReplaceArg);
-}
-
-static llvm::hash_code hash_value(const UseRangesCheck::Signature &Sig) {
-  return llvm::hash_combine_range(Sig.begin(), Sig.end());
-}
-
 namespace {
 
 AST_MATCHER(Expr, hasSideEffects) {
@@ -123,24 +108,26 @@ makeMatcherPair(StringRef State, const 
UseRangesCheck::Indexes &Indexes,
 }
 
 void UseRangesCheck::registerMatchers(MatchFinder *Finder) {
-  Replaces = getReplacerMap();
+  auto Replaces = getReplacerMap();
   ReverseDescriptor = getReverseDescriptor();
   auto BeginEndNames = getFreeBeginEndMethods();
   llvm::SmallVector BeginNames{
   llvm::make_first_range(BeginEndNames)};
   llvm::SmallVector EndNames{
   llvm::make_second_range(BeginEndNames)};
-  llvm::DenseSet> Seen;
+  Replacers.clear();
+  llvm::DenseSet SeenRepl;
   for (auto I = Replaces.begin(), E = Replaces.end(); I != E; ++I) {
-const ArrayRef &Signatures =
-I->getValue()->getReplacementSignatures();
-if (!Seen.insert(Signatures).second)
+auto Replacer = I->getValue();
+if (!SeenRepl.insert(Replacer.get()).second)
   continue;
-assert(!Signatures.empty() &&
-   llvm::all_of(Signatures, [](auto Index) { return !Index.empty(); 
}));
+Replacers.push_back(Replacer);
+assert(!Replacer->getReplacementSignatures().empty() &&
+   llvm::all_of(Replacer->getReplacementSignatures(),
+[](auto Index) { return !Index.empty(); }));
 std::vector Names(1, I->getKey());
 for (auto J = std::next(I); J != E; ++J)
-  if (J->getValue()->getReplacementSignatures() == Signatures)
+  if (J->getValue() == Replacer)
 Names.push_back(J->getKey());
 
 std::vector TotalMatchers;
@@ -148,7 +135,7 @@ void UseRangesCheck::registerMatchers(MatchFinder *Finder) {
 // signatures in order of length(longest to shortest). This way any
 // signature that is a subset of another signature will be matched after 
the
 // other.
-SmallVector SigVec(Signatures);
+SmallVector SigVec(Replacer->getReplacementSignatures());
 llvm::sort(SigVec, [](auto &L, auto &R) { return R.size() < L.size(); });
 for (const auto &Signature : SigVec) {
   std::vector Matchers;
@@ -163,7 +150,8 @@ void UseRangesCheck::registerMatchers(MatchFinder *Finder) {
 }
 Finder->addMatcher(
 callExpr(
-callee(functionDecl(hasAnyName(std::move(Names))).bind(FuncDecl)),
+callee(functionDecl(hasAnyName(std::move(Names)))
+   .bind((FuncDecl + Twine(Replacers.size() - 1).str()))),
 ast_matchers::internal::DynTypedMatcher::constructVariadic(
 ast_matchers::internal::DynTypedMatcher::VO_AnyOf,
 ASTNodeKind::getFromNodeKind(),
@@ -205,21 +193,33 @@ static void removeFunctionArgs(DiagnosticBuilder &Diag, const CallExpr &Call,
 }
 
 void UseRangesCheck::check(const MatchFinder::MatchResult &Result) {
-  const auto *Function = Result.Nodes.getNodeAs(FuncDecl);
-  std::string Qualif

[llvm-branch-commits] [clang-tools-extra] release/19.x: [clang-tidy] Fix crash in modernize-use-ranges (#100427) (PR #101482)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101482
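
The registerMatchers hunk in this backport appears to deduplicate replacers by object identity (the raw pointer from Replacer.get()) instead of by their signature lists, so two distinct replacers with equal signatures are no longer folded together. A minimal sketch of that grouping idea follows; ToyReplacer, Group and groupByReplacer are hypothetical names for illustration, not clang-tidy APIs.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a replacer object; only identity matters here.
struct ToyReplacer {
  std::vector<std::vector<int>> signatures;
};

// One group per distinct replacer object, with every name that maps to it.
struct Group {
  std::shared_ptr<ToyReplacer> replacer;
  std::vector<std::string> names;
};

std::vector<Group> groupByReplacer(
    const std::map<std::string, std::shared_ptr<ToyReplacer>> &replaces) {
  std::vector<Group> groups;
  std::set<const ToyReplacer *> seen;  // keyed by pointer, not by signatures
  for (auto it = replaces.begin(); it != replaces.end(); ++it) {
    if (!seen.insert(it->second.get()).second)
      continue;  // this replacer object already has a group
    Group group{it->second, {it->first}};
    for (auto jt = std::next(it); jt != replaces.end(); ++jt)
      if (jt->second == it->second)  // same object, not merely equal contents
        group.names.push_back(jt->first);
    groups.push_back(std::move(group));
  }
  return groups;
}
```

The index of each group in the returned vector could then serve as a per-replacer suffix for a bind ID, in the spirit of the FuncDecl + Twine(Replacers.size() - 1) expression in the diff.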


[llvm-branch-commits] [clang-tools-extra] release/19.x: [clang-tidy] Fix crash in modernize-use-ranges (#100427) (PR #101482)

2024-08-05 Thread via llvm-branch-commits

github-actions[bot] wrote:

@njames93 (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/101482


[llvm-branch-commits] [clang] [llvm] release/19.x: Workflow fixes for building release binaries (PR #101791)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101791

>From 5c8dcabc529b4dc293dfc3fd29369ff1fe6517ea Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 26 Jul 2024 11:26:34 -0700
Subject: [PATCH 1/6] Build release binaries for multiple targets (#98431)

This adds release binary builds for the 4 platforms currently supported
by the free GitHub Action runners:

* Linux x86_64
* Windows x86_64
* Mac x86_64
* Mac AArch64

The test stages for these are known to fail, but creating and
uploading the release binaries should pass.

(cherry picked from commit 247251aee0d4314385a3fea86e31484d3d792ffb)
---
 .github/workflows/release-binaries-all.yml|  94 
 .../release-binaries-save-stage/action.yml|  38 ++
 .../release-binaries-setup-stage/action.yml   |  59 +++
 .github/workflows/release-binaries.yml| 474 --
 .github/workflows/release-tasks.yml   |  10 +
 clang/cmake/caches/Release.cmake  |   6 +-
 6 files changed, 520 insertions(+), 161 deletions(-)
 create mode 100644 .github/workflows/release-binaries-all.yml
 create mode 100644 .github/workflows/release-binaries-save-stage/action.yml
 create mode 100644 .github/workflows/release-binaries-setup-stage/action.yml

diff --git a/.github/workflows/release-binaries-all.yml b/.github/workflows/release-binaries-all.yml
new file mode 100644
index 0..73c9d96946e33
--- /dev/null
+++ b/.github/workflows/release-binaries-all.yml
@@ -0,0 +1,94 @@
+name: Release Binaries All
+
+permissions:
+  contents: read # Default everything to read-only
+
+on:
+  workflow_dispatch:
+inputs:
+  release-version:
+description: 'Release Version'
+required: true
+type: string
+  upload:
+description: 'Upload binaries to the release page'
+required: true
+default: false
+type: boolean
+
+  workflow_call:
+inputs:
+  release-version:
+description: 'Release Version'
+required: true
+type: string
+  upload:
+description: 'Upload binaries to the release page'
+required: true
+default: false
+type: boolean
+
+  pull_request:
+types:
+  - opened
+  - synchronize
+  - reopened
+  # When a PR is closed, we still start this workflow, but then skip
+  # all the jobs, which makes it effectively a no-op.  The reason to
+  # do this is that it allows us to take advantage of concurrency groups
+  # to cancel in progress CI jobs whenever the PR is closed.
+  - closed
+paths:
+  - '.github/workflows/release-binaries-all.yml'
+  - '.github/workflows/release-binaries.yml'
+  - '.github/workflows/release-binaries-setup-stage/*'
+  - '.github/workflows/release-binaries-save-stage/*'
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || 'dispatch' }}
+  cancel-in-progress: True
+
+jobs:
+  setup-variables:
+if: >-
+  (github.event_name != 'pull_request' || github.event.action != 'closed')
+runs-on: ubuntu-22.04
+outputs:
+  release-version: ${{ steps.vars.outputs.release-version }}
+  upload: ${{ steps.vars.outputs.upload }}
+steps:
+  - shell: bash
+id: vars
+run: |
+  upload="${{ inputs.upload }}"
+  release_version="${{ inputs.release-version }}"
+  if [ "${{ github.event_name }}" = "pull_request" ]; then
+upload="false"
+release_version=""
+  fi
+  echo "release-version=$release_version" >> "$GITHUB_OUTPUT"
+  echo "upload=$upload" >> "$GITHUB_OUTPUT"
+
+  release-binaries-all:
+name: Build Release Binaries
+needs:
+  - setup-variables
+permissions:
+  contents: write # For release uploads
+  id-token: write # For artifact attestations
+  attestations: write # For artifact attestations
+strategy:
+  fail-fast: false
+  matrix:
+runs-on:
+  - ubuntu-22.04
+  - windows-2022
+  - macos-13
+  - macos-14
+
+uses: ./.github/workflows/release-binaries.yml
+with:
+  release-version: "${{ needs.setup-variables.outputs.release-version }}"
+  upload: ${{ needs.setup-variables.outputs.upload == 'true'}}
+  runs-on: "${{ matrix.runs-on }}"
+
diff --git a/.github/workflows/release-binaries-save-stage/action.yml b/.github/workflows/release-binaries-save-stage/action.yml
new file mode 100644
index 0..e2f3eeadd15be
--- /dev/null
+++ b/.github/workflows/release-binaries-save-stage/action.yml
@@ -0,0 +1,38 @@
+name: Save Stage
+description: >-
+  Upload the source and binary directories from a build stage so that they
+  can be re-used in the next stage.  This action is used to split the release
+  binaries workflow into multiple stages to avoid the 6 hour timeout on
+  the GitHub hosted runners.
+inputs:
+  build-prefix:
+description: "Directory containing the build dire

[llvm-branch-commits] [clang] 5c8dcab - Build release binaries for multiple targets (#98431)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

Author: Tom Stellard
Date: 2024-08-05T10:18:00+02:00
New Revision: 5c8dcabc529b4dc293dfc3fd29369ff1fe6517ea

URL: https://github.com/llvm/llvm-project/commit/5c8dcabc529b4dc293dfc3fd29369ff1fe6517ea
DIFF: https://github.com/llvm/llvm-project/commit/5c8dcabc529b4dc293dfc3fd29369ff1fe6517ea.diff

LOG: Build release binaries for multiple targets (#98431)

This adds release binary builds for the 4 platforms currently supported
by the free GitHub Action runners:

* Linux x86_64
* Windows x86_64
* Mac x86_64
* Mac AArch64

The test stages for these are known to fail, but creating and
uploading the release binaries should pass.

(cherry picked from commit 247251aee0d4314385a3fea86e31484d3d792ffb)

Added: 
.github/workflows/release-binaries-all.yml
.github/workflows/release-binaries-save-stage/action.yml
.github/workflows/release-binaries-setup-stage/action.yml

Modified: 
.github/workflows/release-binaries.yml
.github/workflows/release-tasks.yml
clang/cmake/caches/Release.cmake

Removed: 




diff  --git a/.github/workflows/release-binaries-all.yml b/.github/workflows/release-binaries-all.yml
new file mode 100644
index 0..73c9d96946e33
--- /dev/null
+++ b/.github/workflows/release-binaries-all.yml
@@ -0,0 +1,94 @@
+name: Release Binaries All
+
+permissions:
+  contents: read # Default everything to read-only
+
+on:
+  workflow_dispatch:
+inputs:
+  release-version:
+description: 'Release Version'
+required: true
+type: string
+  upload:
+description: 'Upload binaries to the release page'
+required: true
+default: false
+type: boolean
+
+  workflow_call:
+inputs:
+  release-version:
+description: 'Release Version'
+required: true
+type: string
+  upload:
+description: 'Upload binaries to the release page'
+required: true
+default: false
+type: boolean
+
+  pull_request:
+types:
+  - opened
+  - synchronize
+  - reopened
+  # When a PR is closed, we still start this workflow, but then skip
+  # all the jobs, which makes it effectively a no-op.  The reason to
+  # do this is that it allows us to take advantage of concurrency groups
+  # to cancel in progress CI jobs whenever the PR is closed.
+  - closed
+paths:
+  - '.github/workflows/release-binaries-all.yml'
+  - '.github/workflows/release-binaries.yml'
+  - '.github/workflows/release-binaries-setup-stage/*'
+  - '.github/workflows/release-binaries-save-stage/*'
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || 'dispatch' }}
+  cancel-in-progress: True
+
+jobs:
+  setup-variables:
+if: >-
+  (github.event_name != 'pull_request' || github.event.action != 'closed')
+runs-on: ubuntu-22.04
+outputs:
+  release-version: ${{ steps.vars.outputs.release-version }}
+  upload: ${{ steps.vars.outputs.upload }}
+steps:
+  - shell: bash
+id: vars
+run: |
+  upload="${{ inputs.upload }}"
+  release_version="${{ inputs.release-version }}"
+  if [ "${{ github.event_name }}" = "pull_request" ]; then
+upload="false"
+release_version=""
+  fi
+  echo "release-version=$release_version" >> "$GITHUB_OUTPUT"
+  echo "upload=$upload" >> "$GITHUB_OUTPUT"
+
+  release-binaries-all:
+name: Build Release Binaries
+needs:
+  - setup-variables
+permissions:
+  contents: write # For release uploads
+  id-token: write # For artifact attestations
+  attestations: write # For artifact attestations
+strategy:
+  fail-fast: false
+  matrix:
+runs-on:
+  - ubuntu-22.04
+  - windows-2022
+  - macos-13
+  - macos-14
+
+uses: ./.github/workflows/release-binaries.yml
+with:
+  release-version: "${{ needs.setup-variables.outputs.release-version }}"
+  upload: ${{ needs.setup-variables.outputs.upload == 'true'}}
+  runs-on: "${{ matrix.runs-on }}"
+

diff  --git a/.github/workflows/release-binaries-save-stage/action.yml b/.github/workflows/release-binaries-save-stage/action.yml
new file mode 100644
index 0..e2f3eeadd15be
--- /dev/null
+++ b/.github/workflows/release-binaries-save-stage/action.yml
@@ -0,0 +1,38 @@
+name: Save Stage
+description: >-
+  Upload the source and binary directories from a build stage so that they
+  can be re-used in the next stage.  This action is used to split the release
+  binaries workflow into multiple stages to avoid the 6 hour timeout on
+  the GitHub hosted runners.
+inputs:
+  build-prefix:
+description: "Directory containing the build directory."
+required: true
+type: 'string'
+
+runs:
+  using: "composite"
+  steps:
+# We need to create an archive of the build directory

[llvm-branch-commits] [clang] [llvm] release/19.x: Workflow fixes for building release binaries (PR #101791)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101791


[llvm-branch-commits] [llvm] release/19.x: workflows: Re-implement the get-llvm-version action as a composite action (#101569) (PR #101793)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101793

>From b047c24071d5f12f5cd4fa397bfd144a4f735935 Mon Sep 17 00:00:00 2001
From: Tom Stellard 
Date: Fri, 2 Aug 2024 21:52:03 -0700
Subject: [PATCH] workflows: Re-implement the get-llvm-version action as a
 composite action (#101569)

The old version in the llvm/actions repo stopped working after the
version variables were moved out of llvm/CMakeLists.txt. Composite
actions are simpler and don't require JavaScript, which is why I
reimplemented it as a composite action.

This will fix the failing ABI checks on the release branch.

(cherry picked from commit 14837aff058f9a2d32b8277debe619d8eb1995a1)
---
 .github/workflows/get-llvm-version/action.yml | 26 ++
 .github/workflows/libclang-abi-tests.yml  | 16 ++--
 .github/workflows/llvm-tests.yml  | 14 +--
 llvm/utils/release/get-llvm-version.sh| 86 +++
 4 files changed, 127 insertions(+), 15 deletions(-)
 create mode 100644 .github/workflows/get-llvm-version/action.yml
 create mode 100755 llvm/utils/release/get-llvm-version.sh

diff --git a/.github/workflows/get-llvm-version/action.yml b/.github/workflows/get-llvm-version/action.yml
new file mode 100644
index 0..2218d926fc13d
--- /dev/null
+++ b/.github/workflows/get-llvm-version/action.yml
@@ -0,0 +1,26 @@
+name: Get LLVM Version
+description: >-
+  Get the LLVM version from the llvm-project source tree.  This action assumes
+  the llvm-project sources have already been checked out into GITHUB_WORKSPACE.
+
+outputs:
+  major:
+description: LLVM major version
+value: ${{ steps.version.outputs.major }}
+  minor:
+description: LLVM minor version
+value: ${{ steps.version.outputs.minor }}
+  patch:
+description: LLVM patch version
+value: ${{ steps.version.outputs.patch }}
+
+runs:
+  using: "composite"
+  steps:
+- name: Get Version
+  shell: bash
+  id: version
+  run: |
+for v in major minor patch; do
+  echo "$v=`llvm/utils/release/get-llvm-version.sh --$v`" >> $GITHUB_OUTPUT
+done
diff --git a/.github/workflows/libclang-abi-tests.yml b/.github/workflows/libclang-abi-tests.yml
index 972d21c3bcedf..9e839ff49e283 100644
--- a/.github/workflows/libclang-abi-tests.yml
+++ b/.github/workflows/libclang-abi-tests.yml
@@ -33,9 +33,9 @@ jobs:
   ABI_HEADERS: ${{ steps.vars.outputs.ABI_HEADERS }}
   ABI_LIBS: ${{ steps.vars.outputs.ABI_LIBS }}
   BASELINE_VERSION_MAJOR: ${{ steps.vars.outputs.BASELINE_VERSION_MAJOR }}
-  LLVM_VERSION_MAJOR: ${{ steps.version.outputs.LLVM_VERSION_MAJOR }}
-  LLVM_VERSION_MINOR: ${{ steps.version.outputs.LLVM_VERSION_MINOR }}
-  LLVM_VERSION_PATCH: ${{ steps.version.outputs.LLVM_VERSION_PATCH }}
+  LLVM_VERSION_MAJOR: ${{ steps.version.outputs.major }}
+  LLVM_VERSION_MINOR: ${{ steps.version.outputs.minor }}
+  LLVM_VERSION_PATCH: ${{ steps.version.outputs.patch }}
 steps:
   - name: Checkout source
 uses: actions/checkout@v4
@@ -44,14 +44,14 @@ jobs:
 
   - name: Get LLVM version
 id: version
-uses: llvm/actions/get-llvm-version@main
+uses: ./.github/workflows/get-llvm-version
 
   - name: Setup Variables
 id: vars
 run: |
   remote_repo='https://github.com/llvm/llvm-project'
-  if [ ${{ steps.version.outputs.LLVM_VERSION_PATCH }} -eq 0 ]; then
-major_version=$(( ${{ steps.version.outputs.LLVM_VERSION_MAJOR }} - 1))
+  if [ ${{ steps.version.outputs.patch }} -eq 0 ]; then
+major_version=$(( ${{ steps.version.outputs.major }} - 1))
 baseline_ref="llvmorg-$major_version.1.0"
 
 # If there is a minor release, we want to use that as the base line.
@@ -73,8 +73,8 @@ jobs:
 } >> "$GITHUB_OUTPUT"
   else
 {
-  echo "BASELINE_VERSION_MAJOR=${{ steps.version.outputs.LLVM_VERSION_MAJOR }}"
-  echo "BASELINE_REF=llvmorg-${{ steps.version.outputs.LLVM_VERSION_MAJOR }}.1.0"
+  echo "BASELINE_VERSION_MAJOR=${{ steps.version.outputs.major }}"
+  echo "BASELINE_REF=llvmorg-${{ steps.version.outputs.major }}.1.0"
   echo "ABI_HEADERS=."
   echo "ABI_LIBS=libclang.so libclang-cpp.so"
 } >> "$GITHUB_OUTPUT"
diff --git a/.github/workflows/llvm-tests.yml b/.github/workflows/llvm-tests.yml
index 64d60bc3da45e..26e644229aaa2 100644
--- a/.github/workflows/llvm-tests.yml
+++ b/.github/workflows/llvm-tests.yml
@@ -43,9 +43,9 @@ jobs:
   ABI_HEADERS: ${{ steps.vars.outputs.ABI_HEADERS }}
   BASELINE_VERSION_MAJOR: ${{ steps.vars.outputs.BASELINE_VERSION_MAJOR }}
   BASELINE_VERSION_MINOR: ${{ steps.vars.outputs.BASELINE_VERSION_MINOR }}
-  LLVM_VERSION_MAJOR: ${{ steps.version.outputs.LLVM_VERSION_MAJOR }}
-  LLVM_VERSION_MINOR: ${{ steps.version.outputs.LLVM_VERSION_MINOR }}
-  LLVM_VERSION_PATCH: 

[llvm-branch-commits] [clang] [llvm] release/19.x: Workflow fixes for building release binaries (PR #101791)

2024-08-05 Thread via llvm-branch-commits

github-actions[bot] wrote:

@tstellar (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/101791


[llvm-branch-commits] [llvm] release/19.x: workflows: Re-implement the get-llvm-version action as a composite action (#101569) (PR #101793)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101793


[llvm-branch-commits] [llvm] release/19.x: workflows: Re-implement the get-llvm-version action as a composite action (#101569) (PR #101793)

2024-08-05 Thread via llvm-branch-commits

github-actions[bot] wrote:

@tstellar (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/101793


[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_sc_memset to remove invalid instruction (#101522) (PR #101938)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned https://github.com/llvm/llvm-project/pull/101938


[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_sc_memset to remove invalid instruction (#101522) (PR #101938)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot created https://github.com/llvm/llvm-project/pull/101938

Backport d6649f2

Requested by: @kmclaughlin-arm

>From e8c807eaa338c83f82634cffdb2fa176712e101b Mon Sep 17 00:00:00 2001
From: Kerry McLaughlin 
Date: Fri, 2 Aug 2024 18:00:59 +0100
Subject: [PATCH] [AArch64][SME] Rewrite __arm_sc_memset to remove invalid
 instruction (#101522)

The implementation of __arm_sc_memset in compiler-rt contains
a Neon dup instruction which is not valid in streaming mode. This
patch rewrites the function, using an SVE mov instruction if available.

(cherry picked from commit d6649f2d4871c4535ae0519920e36100748890c4)
---
 .../lib/builtins/aarch64/sme-libc-mem-routines.S   | 10 +-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/compiler-rt/lib/builtins/aarch64/sme-libc-mem-routines.S b/compiler-rt/lib/builtins/aarch64/sme-libc-mem-routines.S
index 926ad3b1b6331..0318d9a6f1ebd 100644
--- a/compiler-rt/lib/builtins/aarch64/sme-libc-mem-routines.S
+++ b/compiler-rt/lib/builtins/aarch64/sme-libc-mem-routines.S
@@ -252,7 +252,15 @@ DEFINE_COMPILERRT_FUNCTION_ALIAS(__arm_sc_memmove, __arm_sc_memcpy)
 #define zva_val  x5
 
 DEFINE_COMPILERRT_OUTLINE_FUNCTION_UNMANGLED(__arm_sc_memset)
-dup v0.16B, valw
+#ifdef __ARM_FEATURE_SVE
+mov z0.b, valw
+#else
+bfi valw, valw, #8, #8
+bfi valw, valw, #16, #16
+bfi val, val, #32, #32
+fmov d0, val
+fmov v0.d[1], val
+#endif
 add dstend2, dstin, count
 
 cmp count, 96
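
The non-SVE fallback in the hunk above builds the fill pattern with three bfi instructions, doubling the replicated width each time (8, 16, 32, then 64 bits), before the two fmov instructions copy the 64-bit pattern into both halves of v0. A minimal C++ sketch of the same doubling idea follows; broadcastByte is a hypothetical helper for illustration and is not part of compiler-rt.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not compiler-rt code): replicate the low byte of
// `value` into all eight bytes of a 64-bit word by doubling the filled
// width at each step, mirroring the three bfi instructions in the patch.
std::uint64_t broadcastByte(std::uint64_t value) {
  std::uint64_t v = value & 0xFF;  // only the low byte determines the result
  v |= v << 8;   // like: bfi valw, valw, #8, #8    (low 16 bits filled)
  v |= v << 16;  // like: bfi valw, valw, #16, #16  (low 32 bits filled)
  v |= v << 32;  // like: bfi val, val, #32, #32    (all 64 bits filled)
  return v;
}
```

For example, broadcastByte(0xAB) yields 0xABABABABABABABAB, which is the 64-bit pattern the assembly then moves into each half of the vector register.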



[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_sc_memset to remove invalid instruction (#101522) (PR #101938)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:

@kmclaughlin-arm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/101938


[llvm-branch-commits] [llvm] release/19.x: [Metadata] Try to merge the first and last ranges. (#101860) (PR #101875)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101875

>From 1069d16a3980111c804ab15ffc4de85c42a7f22a Mon Sep 17 00:00:00 2001
From: DianQK 
Date: Sun, 4 Aug 2024 16:45:10 +0800
Subject: [PATCH] [Metadata] Try to merge the first and last ranges. (#101860)

Fixes #101859.

If we have at least 2 ranges, we have to try to merge the last and first
ones to handle the wrap range.

(cherry picked from commit 4377656f2419a8eb18c01e86929b689dcf22b5d6)
---
 llvm/lib/IR/Metadata.cpp|  5 +++--
 llvm/test/Transforms/GVN/merge-range.ll | 20 
 2 files changed, 23 insertions(+), 2 deletions(-)
 create mode 100644 llvm/test/Transforms/GVN/merge-range.ll

diff --git a/llvm/lib/IR/Metadata.cpp b/llvm/lib/IR/Metadata.cpp
index ae5f5de142328..fd2f4d184162f 100644
--- a/llvm/lib/IR/Metadata.cpp
+++ b/llvm/lib/IR/Metadata.cpp
@@ -1318,10 +1318,11 @@ MDNode *MDNode::getMostGenericRange(MDNode *A, MDNode *B) {
 ++BI;
   }
 
-  // If we have more than 2 ranges (4 endpoints) we have to try to merge
+  // We haven't handled wrap in the previous merge,
+  // if we have at least 2 ranges (4 endpoints) we have to try to merge
   // the last and first ones.
   unsigned Size = EndPoints.size();
-  if (Size > 4) {
+  if (Size > 2) {
 ConstantInt *FB = EndPoints[0];
 ConstantInt *FE = EndPoints[1];
 if (tryMergeRange(EndPoints, FB, FE)) {
diff --git a/llvm/test/Transforms/GVN/merge-range.ll b/llvm/test/Transforms/GVN/merge-range.ll
new file mode 100644
index 0..ad1fa4cae5662
--- /dev/null
+++ b/llvm/test/Transforms/GVN/merge-range.ll
@@ -0,0 +1,20 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=gvn -S < %s | FileCheck %s
+
+define i8 @foo(ptr %arg, i8 %arg1) {
+; CHECK-LABEL: define i8 @foo(
+; CHECK-SAME: ptr [[ARG:%.*]], i8 [[ARG1:%.*]]) {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:[[I:%.*]] = load i8, ptr [[ARG]], align 1, !range [[RNG0:![0-9]+]]
+; CHECK-NEXT:[[I3:%.*]] = add i8 [[I]], [[I]]
+; CHECK-NEXT:ret i8 [[I3]]
+;
+bb:
+  %i = load i8, ptr %arg, align 1, !range !{i8 127, i8 -20}
+  %i2 = load i8, ptr %arg, align 1, !range !{i8 -27, i8 -24, i8 -20, i8 -17}
+  %i3 = add i8 %i, %i2
+  ret i8 %i3
+}
+;.
+; CHECK: [[RNG0]] = !{i8 127, i8 -17}
+;.
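
The test above ends with the ranges [-27, -24) and [127, -17) still separate after the forward merge, which is exactly two ranges (four endpoints); the old Size > 4 guard skipped the wrap-around step in that case. A minimal sketch of that final step on an 8-bit ring follows, with the signed i8 endpoints written as unsigned bytes (-27 = 229, -24 = 232, -17 = 239). Range, mergeIfStartsWithin and mergeFirstIntoLast are hypothetical names, and the merge rule is a simplification, not LLVM's ConstantRange logic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <vector>

// Half-open range [lo, hi) on the 8-bit ring; it wraps when hi < lo.
// Illustrative type only, assumed non-empty and not the full set.
struct Range {
  std::uint8_t lo;
  std::uint8_t hi;
  unsigned length() const { return static_cast<std::uint8_t>(hi - lo); }
};

// Return the union of a and b if b starts inside a or exactly at a's end.
std::optional<Range> mergeIfStartsWithin(Range a, Range b) {
  unsigned offset = static_cast<std::uint8_t>(b.lo - a.lo);
  if (offset > a.length())
    return std::nullopt;
  unsigned total = std::max(a.length(), offset + b.length());
  if (total >= 256)  // would cover the whole ring; out of scope for this sketch
    return std::nullopt;
  return Range{a.lo, static_cast<std::uint8_t>(a.lo + total)};
}

// Wrap-around step: with at least two ranges, try to fold the first range
// into the last one and drop the first on success.
bool mergeFirstIntoLast(std::vector<Range> &ranges) {
  if (ranges.size() < 2)
    return false;
  Range first = ranges.front();
  Range last = ranges.back();
  std::optional<Range> merged = mergeIfStartsWithin(last, first);
  if (!merged)
    merged = mergeIfStartsWithin(first, last);
  if (!merged)
    return false;
  ranges.back() = *merged;
  ranges.erase(ranges.begin());
  return true;
}
```

Calling mergeFirstIntoLast on {{229, 232}, {127, 239}} leaves the single range {127, 239}, which corresponds to the !{i8 127, i8 -17} metadata the test expects.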



[llvm-branch-commits] [llvm] 1069d16 - [Metadata] Try to merge the first and last ranges. (#101860)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

Author: DianQK
Date: 2024-08-05T10:20:17+02:00
New Revision: 1069d16a3980111c804ab15ffc4de85c42a7f22a

URL: https://github.com/llvm/llvm-project/commit/1069d16a3980111c804ab15ffc4de85c42a7f22a
DIFF: https://github.com/llvm/llvm-project/commit/1069d16a3980111c804ab15ffc4de85c42a7f22a.diff

LOG: [Metadata] Try to merge the first and last ranges. (#101860)

Fixes #101859.

If we have at least 2 ranges, we have to try to merge the last and first
ones to handle the wrap range.

(cherry picked from commit 4377656f2419a8eb18c01e86929b689dcf22b5d6)

Added: 
llvm/test/Transforms/GVN/merge-range.ll

Modified: 
llvm/lib/IR/Metadata.cpp

Removed: 




diff  --git a/llvm/lib/IR/Metadata.cpp b/llvm/lib/IR/Metadata.cpp
index ae5f5de142328..fd2f4d184162f 100644
--- a/llvm/lib/IR/Metadata.cpp
+++ b/llvm/lib/IR/Metadata.cpp
@@ -1318,10 +1318,11 @@ MDNode *MDNode::getMostGenericRange(MDNode *A, MDNode *B) {
 ++BI;
   }
 
-  // If we have more than 2 ranges (4 endpoints) we have to try to merge
+  // We haven't handled wrap in the previous merge,
+  // if we have at least 2 ranges (4 endpoints) we have to try to merge
   // the last and first ones.
   unsigned Size = EndPoints.size();
-  if (Size > 4) {
+  if (Size > 2) {
 ConstantInt *FB = EndPoints[0];
 ConstantInt *FE = EndPoints[1];
 if (tryMergeRange(EndPoints, FB, FE)) {

diff  --git a/llvm/test/Transforms/GVN/merge-range.ll b/llvm/test/Transforms/GVN/merge-range.ll
new file mode 100644
index 0..ad1fa4cae5662
--- /dev/null
+++ b/llvm/test/Transforms/GVN/merge-range.ll
@@ -0,0 +1,20 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=gvn -S < %s | FileCheck %s
+
+define i8 @foo(ptr %arg, i8 %arg1) {
+; CHECK-LABEL: define i8 @foo(
+; CHECK-SAME: ptr [[ARG:%.*]], i8 [[ARG1:%.*]]) {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:[[I:%.*]] = load i8, ptr [[ARG]], align 1, !range [[RNG0:![0-9]+]]
+; CHECK-NEXT:[[I3:%.*]] = add i8 [[I]], [[I]]
+; CHECK-NEXT:ret i8 [[I3]]
+;
+bb:
+  %i = load i8, ptr %arg, align 1, !range !{i8 127, i8 -20}
+  %i2 = load i8, ptr %arg, align 1, !range !{i8 -27, i8 -24, i8 -20, i8 -17}
+  %i3 = add i8 %i, %i2
+  ret i8 %i3
+}
+;.
+; CHECK: [[RNG0]] = !{i8 127, i8 -17}
+;.





[llvm-branch-commits] [llvm] release/19.x: [Metadata] Try to merge the first and last ranges. (#101860) (PR #101875)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101875


[llvm-branch-commits] [llvm] release/19.x: InferAddressSpaces: Fix mishandling stores of pointers to themselves (#101877) (PR #101887)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101887

>From 06484125e0f419dcc023ff7dd007066f5a9c9a7e Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Sun, 4 Aug 2024 16:36:00 +0400
Subject: [PATCH] InferAddressSpaces: Fix mishandling stores of pointers to
 themselves (#101877)

(cherry picked from commit 3c483b887e5a32a0ddc0a52a467b31f74aad25bb)
---
 .../Transforms/Scalar/InferAddressSpaces.cpp  |  2 +-
 .../AMDGPU/store-pointer-to-self.ll   | 71 +++
 2 files changed, 72 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/InferAddressSpaces/AMDGPU/store-pointer-to-self.ll

diff --git a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
index c9be8ee00cdc7..6b9566f1ae461 100644
--- a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
+++ b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
@@ -1233,7 +1233,7 @@ bool InferAddressSpacesImpl::rewriteWithNewAddressSpaces(
 // If V is used as the pointer operand of a compatible memory operation,
 // sets the pointer operand to NewV. This replacement does not change
 // the element type, so the resultant load/store is still valid.
-CurUser->replaceUsesOfWith(V, NewV);
+U.set(NewV);
 continue;
   }
 
diff --git a/llvm/test/Transforms/InferAddressSpaces/AMDGPU/store-pointer-to-self.ll b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/store-pointer-to-self.ll
new file mode 100644
index 0..bce0e4ec1fe16
--- /dev/null
+++ b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/store-pointer-to-self.ll
@@ -0,0 +1,71 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes=infer-address-spaces %s | FileCheck %s
+
+; Make sure memory instructions where the pointer appears in both a
+; pointer and value operand work correctly.
+
+declare void @user(ptr)
+
+; Make sure only the pointer operand use of the store is replaced
+define void @store_flat_pointer_to_self() {
+; CHECK-LABEL: define void @store_flat_pointer_to_self() {
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca ptr, align 8, addrspace(5)
+; CHECK-NEXT:[[FLAT:%.*]] = addrspacecast ptr addrspace(5) [[ALLOCA]] to ptr
+; CHECK-NEXT:store ptr [[FLAT]], ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:call void @user(ptr [[FLAT]])
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca ptr, align 8, addrspace(5)
+  %flat = addrspacecast ptr addrspace(5) %alloca to ptr
+  store ptr %flat, ptr %flat, align 8
+  call void @user(ptr %flat)
+  ret void
+}
+
+; FIXME: Should be able to optimize the pointer operand to flat.
+define ptr @atomicrmw_xchg_flat_pointer_to_self() {
+; CHECK-LABEL: define ptr @atomicrmw_xchg_flat_pointer_to_self() {
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca ptr, align 8, addrspace(5)
+; CHECK-NEXT:[[FLAT:%.*]] = addrspacecast ptr addrspace(5) [[ALLOCA]] to ptr
+; CHECK-NEXT:[[XCHG:%.*]] = atomicrmw xchg ptr [[FLAT]], ptr [[FLAT]] seq_cst, align 8
+; CHECK-NEXT:call void @user(ptr [[FLAT]])
+; CHECK-NEXT:ret ptr [[XCHG]]
+;
+  %alloca = alloca ptr, align 8, addrspace(5)
+  %flat = addrspacecast ptr addrspace(5) %alloca to ptr
+  %xchg = atomicrmw xchg ptr %flat, ptr %flat seq_cst, align 8
+  call void @user(ptr %flat)
+  ret ptr %xchg
+}
+
+define { ptr, i1 } @cmpxchg_flat_pointer_new_to_self(ptr %cmp) {
+; CHECK-LABEL: define { ptr, i1 } @cmpxchg_flat_pointer_new_to_self(
+; CHECK-SAME: ptr [[CMP:%.*]]) {
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca ptr, align 8, addrspace(5)
+; CHECK-NEXT:[[FLAT:%.*]] = addrspacecast ptr addrspace(5) [[ALLOCA]] to ptr
+; CHECK-NEXT:[[CMPX:%.*]] = cmpxchg ptr [[FLAT]], ptr [[CMP]], ptr [[FLAT]] seq_cst seq_cst, align 8
+; CHECK-NEXT:call void @user(ptr [[FLAT]])
+; CHECK-NEXT:ret { ptr, i1 } [[CMPX]]
+;
+  %alloca = alloca ptr, align 8, addrspace(5)
+  %flat = addrspacecast ptr addrspace(5) %alloca to ptr
+  %cmpx = cmpxchg ptr %flat, ptr %cmp, ptr %flat seq_cst seq_cst, align 8
+  call void @user(ptr %flat)
+  ret { ptr, i1 } %cmpx
+}
+
+define { ptr, i1 } @cmpxchg_flat_pointer_cmp_to_self(ptr %new) {
+; CHECK-LABEL: define { ptr, i1 } @cmpxchg_flat_pointer_cmp_to_self(
+; CHECK-SAME: ptr [[NEW:%.*]]) {
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca ptr, align 8, addrspace(5)
+; CHECK-NEXT:[[FLAT:%.*]] = addrspacecast ptr addrspace(5) [[ALLOCA]] to ptr
+; CHECK-NEXT:[[CMPX:%.*]] = cmpxchg ptr [[FLAT]], ptr [[FLAT]], ptr [[NEW]] seq_cst seq_cst, align 8
+; CHECK-NEXT:call void @user(ptr [[FLAT]])
+; CHECK-NEXT:ret { ptr, i1 } [[CMPX]]
+;
+  %alloca = alloca ptr, align 8, addrspace(5)
+  %flat = addrspacecast ptr addrspace(5) %alloca to ptr
+  %cmpx = cmpxchg ptr %flat, ptr %flat, ptr %new seq_cst seq_cst, align 8
+  call void @user(ptr %flat)
+  ret { ptr, i1 } %cmpx
+}
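
The one-line fix above swaps CurUser->replaceUsesOfWith(V, NewV) for U.set(NewV). In a store such as "store ptr %flat, ptr %flat" the same value is both the stored value and the address, so rewriting every matching operand also changes the stored value, while rewriting a single use touches only the pointer operand. A toy model of that difference follows; ToyInst and its members are hypothetical and only mimic the shape of the LLVM calls.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for an instruction: just a list of operand names.
// Not an LLVM type; it exists only to contrast the two rewrite styles.
struct ToyInst {
  std::vector<std::string> operands;

  // Analogue of replacing all uses within a user: every match is rewritten.
  void replaceUsesOfWith(const std::string &from, const std::string &to) {
    for (std::string &op : operands)
      if (op == from)
        op = to;
  }

  // Analogue of setting a single use: exactly one operand slot is rewritten.
  void setOperand(std::size_t index, const std::string &to) {
    operands.at(index) = to;
  }
};

// Models "store ptr %flat, ptr %flat": operand 0 is the stored value,
// operand 1 is the address being stored to.
ToyInst makeSelfStore() {
  ToyInst inst;
  inst.operands = {"%flat", "%flat"};
  return inst;
}
```

With setOperand(1, "%alloca") the model ends up as {"%flat", "%alloca"}, matching the CHECK line "store ptr [[FLAT]], ptr addrspace(5) [[ALLOCA]]", whereas replaceUsesOfWith("%flat", "%alloca") would rewrite both operands.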


[llvm-branch-commits] [llvm] release/19.x: InferAddressSpaces: Fix mishandling stores of pointers to themselves (#101877) (PR #101887)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101887


[llvm-branch-commits] [llvm] 0648412 - InferAddressSpaces: Fix mishandling stores of pointers to themselves (#101877)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

Author: Matt Arsenault
Date: 2024-08-05T10:21:02+02:00
New Revision: 06484125e0f419dcc023ff7dd007066f5a9c9a7e

URL: https://github.com/llvm/llvm-project/commit/06484125e0f419dcc023ff7dd007066f5a9c9a7e
DIFF: https://github.com/llvm/llvm-project/commit/06484125e0f419dcc023ff7dd007066f5a9c9a7e.diff

LOG: InferAddressSpaces: Fix mishandling stores of pointers to themselves (#101877)

(cherry picked from commit 3c483b887e5a32a0ddc0a52a467b31f74aad25bb)
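
For readers following the one-line change in this commit, a minimal standalone sketch of why the two calls can differ may help. It does not use LLVM at all: ToyInst, setOperand and makeStoreToSelf are hypothetical names introduced only for illustration, with ToyInst::replaceUsesOfWith standing in for User::replaceUsesOfWith and setOperand standing in for Use::set. It assumes the stored value is operand 0 and the address is operand 1.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an instruction: just a list of named operands.
struct ToyInst {
  std::vector<std::string> operands;

  // Analogue of replacing uses on the whole user: every operand slot that
  // currently holds `from` is rewritten to `to`.
  void replaceUsesOfWith(const std::string &from, const std::string &to) {
    for (std::string &op : operands)
      if (op == from)
        op = to;
  }

  // Analogue of setting a single use: exactly one operand slot is rewritten.
  void setOperand(std::size_t index, const std::string &to) {
    operands.at(index) = to;
  }
};

// Models `store ptr %flat, ptr %flat`: slot 0 is the stored value and
// slot 1 is the address being stored to (an assumption of this sketch).
ToyInst makeStoreToSelf() {
  ToyInst store;
  store.operands = {"%flat", "%flat"};
  return store;
}
```

With a store of %flat to itself, the whole-user replacement rewrites both slots, so the stored value changes along with the address. The single-use replacement touches only the address slot, which is the behaviour the new store-pointer-to-self.ll test checks for.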

Added: 
llvm/test/Transforms/InferAddressSpaces/AMDGPU/store-pointer-to-self.ll

Modified: 
llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp

Removed: 




diff --git a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
index c9be8ee00cdc7..6b9566f1ae461 100644
--- a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
+++ b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
@@ -1233,7 +1233,7 @@ bool InferAddressSpacesImpl::rewriteWithNewAddressSpaces(
 // If V is used as the pointer operand of a compatible memory operation,
 // sets the pointer operand to NewV. This replacement does not change
 // the element type, so the resultant load/store is still valid.
-CurUser->replaceUsesOfWith(V, NewV);
+U.set(NewV);
 continue;
   }
 

diff --git a/llvm/test/Transforms/InferAddressSpaces/AMDGPU/store-pointer-to-self.ll b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/store-pointer-to-self.ll
new file mode 100644
index 0..bce0e4ec1fe16
--- /dev/null
+++ b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/store-pointer-to-self.ll
@@ -0,0 +1,71 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes=infer-address-spaces %s | FileCheck %s
+
+; Make sure memory instructions where the pointer appears in both a
+; pointer and value operand work correctly.
+
+declare void @user(ptr)
+
+; Make sure only the pointer operand use of the store is replaced
+define void @store_flat_pointer_to_self() {
+; CHECK-LABEL: define void @store_flat_pointer_to_self() {
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca ptr, align 8, addrspace(5)
+; CHECK-NEXT:[[FLAT:%.*]] = addrspacecast ptr addrspace(5) [[ALLOCA]] to ptr
+; CHECK-NEXT:store ptr [[FLAT]], ptr addrspace(5) [[ALLOCA]], align 8
+; CHECK-NEXT:call void @user(ptr [[FLAT]])
+; CHECK-NEXT:ret void
+;
+  %alloca = alloca ptr, align 8, addrspace(5)
+  %flat = addrspacecast ptr addrspace(5) %alloca to ptr
+  store ptr %flat, ptr %flat, align 8
+  call void @user(ptr %flat)
+  ret void
+}
+
+; FIXME: Should be able to optimize the pointer operand to flat.
+define ptr @atomicrmw_xchg_flat_pointer_to_self() {
+; CHECK-LABEL: define ptr @atomicrmw_xchg_flat_pointer_to_self() {
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca ptr, align 8, addrspace(5)
+; CHECK-NEXT:[[FLAT:%.*]] = addrspacecast ptr addrspace(5) [[ALLOCA]] to ptr
+; CHECK-NEXT:[[XCHG:%.*]] = atomicrmw xchg ptr [[FLAT]], ptr [[FLAT]] seq_cst, align 8
+; CHECK-NEXT:call void @user(ptr [[FLAT]])
+; CHECK-NEXT:ret ptr [[XCHG]]
+;
+  %alloca = alloca ptr, align 8, addrspace(5)
+  %flat = addrspacecast ptr addrspace(5) %alloca to ptr
+  %xchg = atomicrmw xchg ptr %flat, ptr %flat seq_cst, align 8
+  call void @user(ptr %flat)
+  ret ptr %xchg
+}
+
+define { ptr, i1 } @cmpxchg_flat_pointer_new_to_self(ptr %cmp) {
+; CHECK-LABEL: define { ptr, i1 } @cmpxchg_flat_pointer_new_to_self(
+; CHECK-SAME: ptr [[CMP:%.*]]) {
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca ptr, align 8, addrspace(5)
+; CHECK-NEXT:[[FLAT:%.*]] = addrspacecast ptr addrspace(5) [[ALLOCA]] to ptr
+; CHECK-NEXT:[[CMPX:%.*]] = cmpxchg ptr [[FLAT]], ptr [[CMP]], ptr [[FLAT]] seq_cst seq_cst, align 8
+; CHECK-NEXT:call void @user(ptr [[FLAT]])
+; CHECK-NEXT:ret { ptr, i1 } [[CMPX]]
+;
+  %alloca = alloca ptr, align 8, addrspace(5)
+  %flat = addrspacecast ptr addrspace(5) %alloca to ptr
+  %cmpx = cmpxchg ptr %flat, ptr %cmp, ptr %flat seq_cst seq_cst, align 8
+  call void @user(ptr %flat)
+  ret { ptr, i1 } %cmpx
+}
+
+define { ptr, i1 } @cmpxchg_flat_pointer_cmp_to_self(ptr %new) {
+; CHECK-LABEL: define { ptr, i1 } @cmpxchg_flat_pointer_cmp_to_self(
+; CHECK-SAME: ptr [[NEW:%.*]]) {
+; CHECK-NEXT:[[ALLOCA:%.*]] = alloca ptr, align 8, addrspace(5)
+; CHECK-NEXT:[[FLAT:%.*]] = addrspacecast ptr addrspace(5) [[ALLOCA]] to ptr
+; CHECK-NEXT:[[CMPX:%.*]] = cmpxchg ptr [[FLAT]], ptr [[FLAT]], ptr [[NEW]] seq_cst seq_cst, align 8
+; CHECK-NEXT:call void @user(ptr [[FLAT]])
+; CHECK-NEXT:ret { ptr, i1 } [[CMPX]]
+;
+  %alloca = alloca ptr, align 8, addrspace(5)
+  %flat = addrspacecast ptr addrspace(5) %alloca to ptr
+  %cmpx = cmpxchg ptr %flat, ptr %flat, ptr %new seq_cst seq_cst, align 8
+  call void @user(ptr %flat)
+  ret { ptr, i1 } %cmpx
+}




[llvm-branch-commits] [llvm] release/19.x: [ARM] [Windows] Use IMAGE_SYM_CLASS_STATIC for private functions (#101828) (PR #101904)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101904

>From 47ee66b9b969af834cfd9df82273ed9fe0418802 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Martin=20Storsj=C3=B6?= 
Date: Sun, 4 Aug 2024 23:20:45 +0300
Subject: [PATCH] [ARM] [Windows] Use IMAGE_SYM_CLASS_STATIC for private
 functions (#101828)

For functions with private linkage, pick
IMAGE_SYM_CLASS_STATIC rather than IMAGE_SYM_CLASS_EXTERNAL;
GlobalValue::isInternalLinkage() only checks for
InternalLinkage, while GlobalValue::isLocalLinkage() checks for both
InternalLinkage and PrivateLinkage.

This matches what the AArch64 target does, since commit
3406934e4db4bf95c230db072608ed062c13ad5b.

This activates a preexisting fix for the AArch64 target from
1e7f592a890aad860605cf5220530b3744e107ba, for the ARM target as well.

When a relocation points at a symbol, one usually can convey an offset
to the symbol by encoding it as an immediate in the instruction.
However, for the ARM and AArch64 branch instructions, the immediate
stored in the instruction is ignored by MS link.exe (and lld-link
matches this aspect). (It would be simple to extend lld-link to support
it - but such object files would be incompatible with MS link.exe.)

This was worked around by 1e7f592a890aad860605cf5220530b3744e107ba by
emitting symbols into the object file symbol table, for temporary
symbols that otherwise would have been omitted, if they have the class
IMAGE_SYM_CLASS_STATIC, in order to avoid needing an offset in the
relocated instruction.

This change gives the symbols generated from functions with the IR level
"private" linkage the right class, to activate that workaround.

This fixes https://github.com/llvm/llvm-project/issues/100101, fixing
code generation for coroutines for Windows on ARM. After the change in
f78688134026686288a8d310b493d9327753a022, coroutines generate a function
with private linkage, and calls to this function were previously broken
for this target.

(cherry picked from commit 8dd065d5bc81b0c8ab57f365bb169a5d92928f25)
---
 llvm/lib/Target/ARM/ARMAsmPrinter.cpp |  6 +-
 llvm/test/CodeGen/ARM/Windows/private-func.ll | 17 ++
 .../test/MC/ARM/Windows/branch-reloc-offset.s | 57 +++
 3 files changed, 77 insertions(+), 3 deletions(-)
 create mode 100644 llvm/test/CodeGen/ARM/Windows/private-func.ll
 create mode 100644 llvm/test/MC/ARM/Windows/branch-reloc-offset.s
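
As a rough illustration of the linkage distinction described in the commit message above, here is a small standalone sketch. It does not include any LLVM headers: the reduced Linkage enum and the helper names classBefore and classAfter are hypothetical and exist only for this example. The two predicates mirror the behaviour the message attributes to GlobalValue::isInternalLinkage() and GlobalValue::isLocalLinkage(), and the storage class values 2 and 3 are the COFF values for external and static symbols (the new test expects `.scl 3` for the private function).

```cpp
#include <cassert>

// Hypothetical, reduced model of IR linkage kinds; the real enum has more.
enum class Linkage { External, Internal, Private };

// Only InternalLinkage counts, as described for isInternalLinkage().
constexpr bool isInternalLinkage(Linkage l) { return l == Linkage::Internal; }

// Both InternalLinkage and PrivateLinkage count, as described for
// isLocalLinkage().
constexpr bool isLocalLinkage(Linkage l) {
  return l == Linkage::Internal || l == Linkage::Private;
}

// COFF symbol storage classes relevant here.
enum StorageClass { IMAGE_SYM_CLASS_EXTERNAL = 2, IMAGE_SYM_CLASS_STATIC = 3 };

// Selection as it looked before the patch (hypothetical helper name).
constexpr StorageClass classBefore(Linkage l) {
  return isInternalLinkage(l) ? IMAGE_SYM_CLASS_STATIC
                              : IMAGE_SYM_CLASS_EXTERNAL;
}

// Selection as it looks after the patch (hypothetical helper name).
constexpr StorageClass classAfter(Linkage l) {
  return isLocalLinkage(l) ? IMAGE_SYM_CLASS_STATIC
                           : IMAGE_SYM_CLASS_EXTERNAL;
}
```

In this model the only input whose result changes is Linkage::Private, which moves from the external class to the static class. Internal and external functions are classified the same way before and after.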

diff --git a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp b/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
index 642739a29d6b0..96d7074e6ef37 100644
--- a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
+++ b/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
@@ -153,9 +153,9 @@ bool ARMAsmPrinter::runOnMachineFunction(MachineFunction &MF) {
 OptimizationGoals = 0;
 
   if (Subtarget->isTargetCOFF()) {
-bool Internal = F.hasInternalLinkage();
-COFF::SymbolStorageClass Scl = Internal ? COFF::IMAGE_SYM_CLASS_STATIC
-: COFF::IMAGE_SYM_CLASS_EXTERNAL;
+bool Local = F.hasLocalLinkage();
+COFF::SymbolStorageClass Scl =
+Local ? COFF::IMAGE_SYM_CLASS_STATIC : COFF::IMAGE_SYM_CLASS_EXTERNAL;
 int Type = COFF::IMAGE_SYM_DTYPE_FUNCTION << COFF::SCT_COMPLEX_TYPE_SHIFT;
 
 OutStreamer->beginCOFFSymbolDef(CurrentFnSym);
diff --git a/llvm/test/CodeGen/ARM/Windows/private-func.ll b/llvm/test/CodeGen/ARM/Windows/private-func.ll
new file mode 100644
index 0..2d030ae3fabbb
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/Windows/private-func.ll
@@ -0,0 +1,17 @@
+; RUN: llc -mtriple thumbv7-windows -filetype asm -o - %s | FileCheck %s
+
+define dso_local void @func1() {
+entry:
+  call void @func2()
+  ret void
+}
+
+define private void @func2() {
+entry:
+  ret void
+}
+
+; CHECK:  .def.Lfunc2;
+; CHECK-NEXT: .scl3;
+; CHECK-NEXT: .type   32;
+; CHECK-NEXT: .endef
diff --git a/llvm/test/MC/ARM/Windows/branch-reloc-offset.s b/llvm/test/MC/ARM/Windows/branch-reloc-offset.s
new file mode 100644
index 0..2e70a723ccf78
--- /dev/null
+++ b/llvm/test/MC/ARM/Windows/branch-reloc-offset.s
@@ -0,0 +1,57 @@
+// RUN: llvm-mc -triple thumbv7-windows-gnu -filetype obj %s -o - | llvm-objdump -D -r - | FileCheck %s
+
+.text
+main:
+nop
+b .Ltarget
+b .Lother_target
+
+// A private label target in the same section
+.def .Ltarget
+.scl 3
+.type 32
+.endef
+.p2align 2
+.Ltarget:
+bx lr
+
+// A private label target in another section
+.section "other", "xr"
+nop
+nop
+nop
+nop
+nop
+nop
+nop
+nop
+.def .Lother_target
+.scl 3
+.type 32
+.endef
+.p2align 2
+.Lother_target:
+bx lr
+
+// Check that both branches have a relocation with a zero offset.
+//
+// CHECK:  :
+// CHECK:0: bf00  nop
+// CHECK:2: f000 b800 b.w 0x6   @ imm = #0x0
+// CHECK: 0002:  IMAGE_REL_ARM_BRANCH24T  .Ltarget
+// CHECK:6: f000 b800 b.w 0xa   

[llvm-branch-commits] [llvm] 47ee66b - [ARM] [Windows] Use IMAGE_SYM_CLASS_STATIC for private functions (#101828)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

Author: Martin Storsjö
Date: 2024-08-05T10:22:05+02:00
New Revision: 47ee66b9b969af834cfd9df82273ed9fe0418802

URL: 
https://github.com/llvm/llvm-project/commit/47ee66b9b969af834cfd9df82273ed9fe0418802
DIFF: 
https://github.com/llvm/llvm-project/commit/47ee66b9b969af834cfd9df82273ed9fe0418802.diff

LOG: [ARM] [Windows] Use IMAGE_SYM_CLASS_STATIC for private functions (#101828)

For functions with private linkage, pick
IMAGE_SYM_CLASS_STATIC rather than IMAGE_SYM_CLASS_EXTERNAL;
GlobalValue::isInternalLinkage() only checks for
InternalLinkage, while GlobalValue::isLocalLinkage() checks for both
InternalLinkage and PrivateLinkage.

This matches what the AArch64 target does, since commit
3406934e4db4bf95c230db072608ed062c13ad5b.

This activates a preexisting fix for the AArch64 target from
1e7f592a890aad860605cf5220530b3744e107ba, for the ARM target as well.

When a relocation points at a symbol, one usually can convey an offset
to the symbol by encoding it as an immediate in the instruction.
However, for the ARM and AArch64 branch instructions, the immediate
stored in the instruction is ignored by MS link.exe (and lld-link
matches this aspect). (It would be simple to extend lld-link to support
it - but such object files would be incompatible with MS link.exe.)

This was worked around by 1e7f592a890aad860605cf5220530b3744e107ba by
emitting symbols into the object file symbol table, for temporary
symbols that otherwise would have been omitted, if they have the class
IMAGE_SYM_CLASS_STATIC, in order to avoid needing an offset in the
relocated instruction.

This change gives the symbols generated from functions with the IR level
"private" linkage the right class, to activate that workaround.

This fixes https://github.com/llvm/llvm-project/issues/100101, fixing
code generation for coroutines for Windows on ARM. After the change in
f78688134026686288a8d310b493d9327753a022, coroutines generate a function
with private linkage, and calls to this function were previously broken
for this target.

(cherry picked from commit 8dd065d5bc81b0c8ab57f365bb169a5d92928f25)

Added: 
llvm/test/CodeGen/ARM/Windows/private-func.ll
llvm/test/MC/ARM/Windows/branch-reloc-offset.s

Modified: 
llvm/lib/Target/ARM/ARMAsmPrinter.cpp

Removed: 




diff --git a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp b/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
index 642739a29d6b0..96d7074e6ef37 100644
--- a/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
+++ b/llvm/lib/Target/ARM/ARMAsmPrinter.cpp
@@ -153,9 +153,9 @@ bool ARMAsmPrinter::runOnMachineFunction(MachineFunction &MF) {
 OptimizationGoals = 0;
 
   if (Subtarget->isTargetCOFF()) {
-bool Internal = F.hasInternalLinkage();
-COFF::SymbolStorageClass Scl = Internal ? COFF::IMAGE_SYM_CLASS_STATIC
-: COFF::IMAGE_SYM_CLASS_EXTERNAL;
+bool Local = F.hasLocalLinkage();
+COFF::SymbolStorageClass Scl =
+Local ? COFF::IMAGE_SYM_CLASS_STATIC : COFF::IMAGE_SYM_CLASS_EXTERNAL;
 int Type = COFF::IMAGE_SYM_DTYPE_FUNCTION << COFF::SCT_COMPLEX_TYPE_SHIFT;
 
 OutStreamer->beginCOFFSymbolDef(CurrentFnSym);

diff --git a/llvm/test/CodeGen/ARM/Windows/private-func.ll b/llvm/test/CodeGen/ARM/Windows/private-func.ll
new file mode 100644
index 0..2d030ae3fabbb
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/Windows/private-func.ll
@@ -0,0 +1,17 @@
+; RUN: llc -mtriple thumbv7-windows -filetype asm -o - %s | FileCheck %s
+
+define dso_local void @func1() {
+entry:
+  call void @func2()
+  ret void
+}
+
+define private void @func2() {
+entry:
+  ret void
+}
+
+; CHECK:  .def.Lfunc2;
+; CHECK-NEXT: .scl3;
+; CHECK-NEXT: .type   32;
+; CHECK-NEXT: .endef

diff --git a/llvm/test/MC/ARM/Windows/branch-reloc-offset.s b/llvm/test/MC/ARM/Windows/branch-reloc-offset.s
new file mode 100644
index 0..2e70a723ccf78
--- /dev/null
+++ b/llvm/test/MC/ARM/Windows/branch-reloc-offset.s
@@ -0,0 +1,57 @@
+// RUN: llvm-mc -triple thumbv7-windows-gnu -filetype obj %s -o - | llvm-objdump -D -r - | FileCheck %s
+
+.text
+main:
+nop
+b .Ltarget
+b .Lother_target
+
+// A private label target in the same section
+.def .Ltarget
+.scl 3
+.type 32
+.endef
+.p2align 2
+.Ltarget:
+bx lr
+
+// A private label target in another section
+.section "other", "xr"
+nop
+nop
+nop
+nop
+nop
+nop
+nop
+nop
+.def .Lother_target
+.scl 3
+.type 32
+.endef
+.p2align 2
+.Lother_target:
+bx lr
+
+// Check that both branches have a relocation with a zero offset.
+//
+// CHECK:  :
+// CHECK:0: bf00  nop
+// CHECK:2: f000 b800 b.w 0x6   @ imm = #0x0
+// CHECK: 0002:  IMAGE_REL_ARM_BRANCH24T  .Ltarget
+// CHECK:6: f000 b800 b.w 0xa   @ imm = #0x0
+// CHECK:

[llvm-branch-commits] [llvm] release/19.x: [ARM] [Windows] Use IMAGE_SYM_CLASS_STATIC for private functions (#101828) (PR #101904)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101904


[llvm-branch-commits] [compiler-rt] release/19.x: Forward declare OSSpinLockLock on MacOS since it's not shipped on the system. (#101392) (PR #101432)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101432

>From 2193e4f88c5a96462a41528912ddedcb405ea29c Mon Sep 17 00:00:00 2001
From: Amara Emerson 
Date: Wed, 31 Jul 2024 16:51:45 -0700
Subject: [PATCH] Forward declare OSSpinLockLock on MacOS since it's not
 shipped on the system. (#101392)

Fixes build errors on some SDKs.

rdar://132607572
(cherry picked from commit 3a4c7cc56c07b2db9010c2228fc7cb2a43dd9b2d)
---
 compiler-rt/lib/rtsan/rtsan_interceptors.cpp | 12 
 1 file changed, 12 insertions(+)

diff --git a/compiler-rt/lib/rtsan/rtsan_interceptors.cpp b/compiler-rt/lib/rtsan/rtsan_interceptors.cpp
index 4d5423ec629d2..b63040446e53c 100644
--- a/compiler-rt/lib/rtsan/rtsan_interceptors.cpp
+++ b/compiler-rt/lib/rtsan/rtsan_interceptors.cpp
@@ -21,6 +21,18 @@
 #include "rtsan/rtsan_context.h"
 
 #if SANITIZER_APPLE
+
+#if TARGET_OS_MAC
+// On MacOS OSSpinLockLock is deprecated and no longer present in the headers,
+// but the symbol still exists on the system. Forward declare here so we
+// don't get compilation errors.
+#include 
+extern "C" {
+typedef int32_t OSSpinLock;
+void OSSpinLockLock(volatile OSSpinLock *__lock);
+}
+#endif
+
 #include 
 #include 
 #endif



[llvm-branch-commits] [compiler-rt] 2193e4f - Forward declare OSSpinLockLock on MacOS since it's not shipped on the system. (#101392)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

Author: Amara Emerson
Date: 2024-08-05T10:23:35+02:00
New Revision: 2193e4f88c5a96462a41528912ddedcb405ea29c

URL: 
https://github.com/llvm/llvm-project/commit/2193e4f88c5a96462a41528912ddedcb405ea29c
DIFF: 
https://github.com/llvm/llvm-project/commit/2193e4f88c5a96462a41528912ddedcb405ea29c.diff

LOG: Forward declare OSSpinLockLock on MacOS since it's not shipped on the system. (#101392)

Fixes build errors on some SDKs.

rdar://132607572
(cherry picked from commit 3a4c7cc56c07b2db9010c2228fc7cb2a43dd9b2d)

Added: 


Modified: 
compiler-rt/lib/rtsan/rtsan_interceptors.cpp

Removed: 




diff --git a/compiler-rt/lib/rtsan/rtsan_interceptors.cpp b/compiler-rt/lib/rtsan/rtsan_interceptors.cpp
index 4d5423ec629d2..b63040446e53c 100644
--- a/compiler-rt/lib/rtsan/rtsan_interceptors.cpp
+++ b/compiler-rt/lib/rtsan/rtsan_interceptors.cpp
@@ -21,6 +21,18 @@
 #include "rtsan/rtsan_context.h"
 
 #if SANITIZER_APPLE
+
+#if TARGET_OS_MAC
+// On MacOS OSSpinLockLock is deprecated and no longer present in the headers,
+// but the symbol still exists on the system. Forward declare here so we
+// don't get compilation errors.
+#include 
+extern "C" {
+typedef int32_t OSSpinLock;
+void OSSpinLockLock(volatile OSSpinLock *__lock);
+}
+#endif
+
 #include 
 #include 
 #endif





[llvm-branch-commits] [compiler-rt] release/19.x: Forward declare OSSpinLockLock on MacOS since it's not shipped on the system. (#101392) (PR #101432)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101432


[llvm-branch-commits] [lld] ReleaseNotes: lld/ELF: mention CREL (PR #101905)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru updated https://github.com/llvm/llvm-project/pull/101905

>From 283443371b8c2e40750e56d86c05f711a89d786c Mon Sep 17 00:00:00 2001
From: Fangrui Song 
Date: Sun, 4 Aug 2024 13:48:22 -0700
Subject: [PATCH] ReleaseNotes: lld/ELF: mention CREL

---
 lld/docs/ReleaseNotes.rst | 8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index 09081e421e905..98fddcd7bf7f2 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -26,6 +26,12 @@ Non-comprehensive list of changes in this release
 ELF Improvements
 
 
+* Experimental CREL relocations with explicit addends are now supported using the
+  temporary section type code 0x4020 (``clang -c -Wa,--crel,--allow-experimental-crel``).
+  LLVM will change the code and break compatibility (Clang and lld of different
+  versions are not guaranteed to cooperate, unlike other features). CREL with
+  implicit addends are not supported.
+  (`#98115 `_)
 * ``EI_OSABI`` in the output is now inferred from input object files.
   (`#97144 `_)
 * ``--compress-sections ={none,zlib,zstd}[:level]`` is added to compress
@@ -88,7 +94,7 @@ ELF Improvements
   (`#94099 `_)
   Non-alloc orphan sections are now placed at the end.
   (`#94519 `_)
-* R_X86_64_REX_GOTPCRELX of the addq form is no longer incorrectly optimized when the address is larger than 0x8000.
+* ``R_X86_64_REX_GOTPCRELX`` of the addq form is no longer incorrectly optimized when the address is larger than 0x8000.
 
 Breaking changes
 



[llvm-branch-commits] [lld] 2834433 - ReleaseNotes: lld/ELF: mention CREL

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

Author: Fangrui Song
Date: 2024-08-05T10:26:12+02:00
New Revision: 283443371b8c2e40750e56d86c05f711a89d786c

URL: 
https://github.com/llvm/llvm-project/commit/283443371b8c2e40750e56d86c05f711a89d786c
DIFF: 
https://github.com/llvm/llvm-project/commit/283443371b8c2e40750e56d86c05f711a89d786c.diff

LOG: ReleaseNotes: lld/ELF: mention CREL

Added: 


Modified: 
lld/docs/ReleaseNotes.rst

Removed: 




diff --git a/lld/docs/ReleaseNotes.rst b/lld/docs/ReleaseNotes.rst
index 09081e421e905..98fddcd7bf7f2 100644
--- a/lld/docs/ReleaseNotes.rst
+++ b/lld/docs/ReleaseNotes.rst
@@ -26,6 +26,12 @@ Non-comprehensive list of changes in this release
 ELF Improvements
 
 
+* Experimental CREL relocations with explicit addends are now supported using the
+  temporary section type code 0x4020 (``clang -c -Wa,--crel,--allow-experimental-crel``).
+  LLVM will change the code and break compatibility (Clang and lld of different
+  versions are not guaranteed to cooperate, unlike other features). CREL with
+  implicit addends are not supported.
+  (`#98115 `_)
 * ``EI_OSABI`` in the output is now inferred from input object files.
   (`#97144 `_)
 * ``--compress-sections ={none,zlib,zstd}[:level]`` is added to compress
@@ -88,7 +94,7 @@ ELF Improvements
   (`#94099 `_)
   Non-alloc orphan sections are now placed at the end.
   (`#94519 `_)
-* R_X86_64_REX_GOTPCRELX of the addq form is no longer incorrectly optimized when the address is larger than 0x8000.
+* ``R_X86_64_REX_GOTPCRELX`` of the addq form is no longer incorrectly optimized when the address is larger than 0x8000.
 
 Breaking changes
 





[llvm-branch-commits] [lld] ReleaseNotes: lld/ELF: mention CREL (PR #101905)

2024-08-05 Thread Tobias Hieta via llvm-branch-commits

https://github.com/tru closed https://github.com/llvm/llvm-project/pull/101905


[llvm-branch-commits] [llvm] release/19.x: [Metadata] Try to merge the first and last ranges. (#101860) (PR #101875)

2024-08-05 Thread via llvm-branch-commits

github-actions[bot] wrote:

@DianQK (or anyone else): if you would like to add a note about this fix to the
release notes (completely optional), please reply to this comment with a one- or
two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/101875


[llvm-branch-commits] [llvm] release/19.x: InferAddressSpaces: Fix mishandling stores of pointers to themselves (#101877) (PR #101887)

2024-08-05 Thread via llvm-branch-commits

github-actions[bot] wrote:

@jhuber6 (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/101887


[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_sc_memset to remove invalid instruction (#101522) (PR #101938)

2024-08-05 Thread Kerry McLaughlin via llvm-branch-commits

kmclaughlin-arm wrote:

I think this should be merged into the release branch, as the __arm_sc_memset
routine it fixes cannot be run in streaming mode without this change.

https://github.com/llvm/llvm-project/pull/101938


[llvm-branch-commits] [compiler-rt] release/19.x: [AArch64][SME] Rewrite __arm_sc_memset to remove invalid instruction (#101522) (PR #101938)

2024-08-05 Thread Kerry McLaughlin via llvm-branch-commits

https://github.com/kmclaughlin-arm approved this pull request.


https://github.com/llvm/llvm-project/pull/101938


[llvm-branch-commits] [llvm] release/19.x: [ARM] [Windows] Use IMAGE_SYM_CLASS_STATIC for private functions (#101828) (PR #101904)

2024-08-05 Thread via llvm-branch-commits

github-actions[bot] wrote:

@mstorsjo (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/101904


[llvm-branch-commits] [llvm] AMDGPU: Handle folding frame indexes into s_add_i32 (PR #101694)

2024-08-05 Thread Christudasan Devadasan via llvm-branch-commits


@@ -0,0 +1,930 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx803 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW32 %s
+
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW32 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW32 %s
+
+---
+name: s_add_i32__inline_imm__fi_offset0
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 32, alignment: 16 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 12, implicit-def dead $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 12, implicit-def dead $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; FLATSCRW64: renamable $sgpr7 = S_ADD_I32 $sgpr32, 12, implicit-def dead $scc
+; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW32-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; FLATSCRW32: renamable $sgpr7 = S_ADD_I32 $sgpr32, 12, implicit-def dead $scc
+; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
+renamable $sgpr7 = S_ADD_I32 12, %stack.0, implicit-def dead $scc
+SI_RETURN implicit $sgpr7
+
+...
+
+---
+name: s_add_i32__fi_offset0__inline_imm
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 32, alignment: 16 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 12, $sgpr7, implicit-def dead $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 12, $sgpr7, implicit-def dead $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; FLATSCRW64: renamable $sgpr7 = S_ADD_I32 12, $sgpr32, implicit-def dead $scc
+; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW32-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; FLATSCRW32: renamable $sgpr7 = S_ADD_I32 12, $sgpr32, implicit-def dead $scc
+; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
+renamable $sgpr7 = S_ADD_I32 %stack.0, 12, implicit-def dead $scc
+SI_RETURN implicit $sgpr7
+
+...
+
+---
+name: s_add_i32__inline_imm___fi_offset_inline_imm
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 16, alignment: 16 }
+  - { id: 1, size: 24, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__inline_imm___fi_offset_inline_imm
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 28, implicit-def $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__inline_imm___fi_offset_inline_imm
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 28, implicit-def $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__inline_imm___

[llvm-branch-commits] [compiler-rt] release/19.x: Forward declare OSSpinLockLock on MacOS since it's not shipped on the system. (#101392) (PR #101432)

2024-08-05 Thread via llvm-branch-commits

github-actions[bot] wrote:

@aemerson (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/101432


[llvm-branch-commits] [lld] ReleaseNotes: lld/ELF: mention CREL (PR #101905)

2024-08-05 Thread via llvm-branch-commits

github-actions[bot] wrote:

@MaskRay (or anyone else): if you would like to add a note about this fix to
the release notes (completely optional), please reply to this comment with a
one- or two-sentence description of the fix. When you are done, please add the
release:note label to this PR.

https://github.com/llvm/llvm-project/pull/101905


[llvm-branch-commits] [llvm] AMDGPU: Handle folding frame indexes into s_add_i32 (PR #101694)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits


@@ -0,0 +1,930 @@
+# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx700 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx803 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx900 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1010 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=MUBUFW32 %s
+
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx940 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW64 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1100 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW32 %s
+# RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx1200 -verify-machineinstrs -run-pass=prologepilog %s -o - | FileCheck -check-prefix=FLATSCRW32 %s
+
+---
+name: s_add_i32__inline_imm__fi_offset0
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 32, alignment: 16 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 12, implicit-def dead $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 12, implicit-def dead $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; FLATSCRW64: renamable $sgpr7 = S_ADD_I32 $sgpr32, 12, implicit-def dead $scc
+; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW32-LABEL: name: s_add_i32__inline_imm__fi_offset0
+; FLATSCRW32: renamable $sgpr7 = S_ADD_I32 $sgpr32, 12, implicit-def dead $scc
+; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
+renamable $sgpr7 = S_ADD_I32 12, %stack.0, implicit-def dead $scc
+SI_RETURN implicit $sgpr7
+
+...
+
+---
+name: s_add_i32__fi_offset0__inline_imm
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 32, alignment: 16 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 12, $sgpr7, implicit-def dead $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 12, $sgpr7, implicit-def dead $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; FLATSCRW64: renamable $sgpr7 = S_ADD_I32 12, $sgpr32, implicit-def dead $scc
+; FLATSCRW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW32-LABEL: name: s_add_i32__fi_offset0__inline_imm
+; FLATSCRW32: renamable $sgpr7 = S_ADD_I32 12, $sgpr32, implicit-def dead $scc
+; FLATSCRW32-NEXT: SI_RETURN implicit $sgpr7
+renamable $sgpr7 = S_ADD_I32 %stack.0, 12, implicit-def dead $scc
+SI_RETURN implicit $sgpr7
+
+...
+
+---
+name: s_add_i32__inline_imm___fi_offset_inline_imm
+tracksRegLiveness: true
+stack:
+  - { id: 0, size: 16, alignment: 16 }
+  - { id: 1, size: 24, alignment: 4 }
+machineFunctionInfo:
+  scratchRSrcReg:  '$sgpr0_sgpr1_sgpr2_sgpr3'
+  frameOffsetReg:  '$sgpr33'
+  stackPtrOffsetReg: '$sgpr32'
+body: |
+  bb.0:
+; MUBUFW64-LABEL: name: s_add_i32__inline_imm___fi_offset_inline_imm
+; MUBUFW64: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 6, implicit-def dead $scc
+; MUBUFW64-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 28, implicit-def $scc
+; MUBUFW64-NEXT: SI_RETURN implicit $sgpr7
+;
+; MUBUFW32-LABEL: name: s_add_i32__inline_imm___fi_offset_inline_imm
+; MUBUFW32: renamable $sgpr7 = S_LSHR_B32 $sgpr32, 5, implicit-def dead $scc
+; MUBUFW32-NEXT: renamable $sgpr7 = S_ADD_I32 $sgpr7, 28, implicit-def $scc
+; MUBUFW32-NEXT: SI_RETURN implicit $sgpr7
+;
+; FLATSCRW64-LABEL: name: s_add_i32__inline_imm___

[llvm-branch-commits] [flang] [flang] Lower omp.workshare to other omp constructs (PR #101446)

2024-08-05 Thread Tom Eccles via llvm-branch-commits


@@ -0,0 +1,259 @@
+//===- LowerWorkshare.cpp - special cases for bufferization ---===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===--===//
+// Lower omp workshare construct.
+//===--===//
+
+#include "flang/Optimizer/Dialect/FIROps.h"
+#include "flang/Optimizer/Dialect/FIRType.h"
+#include "flang/Optimizer/OpenMP/Passes.h"
+#include "mlir/Dialect/OpenMP/OpenMPDialect.h"
+#include "mlir/IR/BuiltinOps.h"
+#include "mlir/IR/IRMapping.h"
+#include "mlir/IR/OpDefinition.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/Support/LLVM.h"
+#include "mlir/Transforms/GreedyPatternRewriteDriver.h"
+#include "llvm/ADT/STLExtras.h"
+#include "llvm/ADT/iterator_range.h"
+
+#include 
+
+namespace flangomp {
+#define GEN_PASS_DEF_LOWERWORKSHARE
+#include "flang/Optimizer/OpenMP/Passes.h.inc"
+} // namespace flangomp
+
+#define DEBUG_TYPE "lower-workshare"
+
+using namespace mlir;
+
+namespace flangomp {
+bool shouldUseWorkshareLowering(Operation *op) {
+  auto workshare = dyn_cast(op->getParentOp());
+  if (!workshare)
+return false;
+  return workshare->getParentOfType();
+}
+} // namespace flangomp
+
+namespace {
+
+struct SingleRegion {
+  Block::iterator begin, end;
+};
+
+static bool isSupportedByFirAlloca(Type ty) {
+  return !isa(ty);
+}
+
+static bool isSafeToParallelize(Operation *op) {
+  if (isa(op))
+return true;
+
+  llvm::SmallVector<MemoryEffects::EffectInstance> effects;
+  MemoryEffectOpInterface interface = dyn_cast<MemoryEffectOpInterface>(op);
+  if (!interface) {
+return false;
+  }
+  interface.getEffects(effects);
+  if (effects.empty())
+return true;
+
+  return false;
+}
+
+/// Lowers workshare to a sequence of single-thread regions and parallel loops
+///
+/// For example:
+///
+/// omp.workshare {
+///   %a = fir.allocmem
+///   omp.wsloop {}
+///   fir.call Assign %b %a
+///   fir.freemem %a
+/// }
+///
+/// becomes
+///
+/// omp.single {
+///   %a = fir.allocmem
+///   fir.store %a %tmp
+/// }
+/// %a_reloaded = fir.load %tmp
+/// omp.wsloop {}
+/// omp.single {
+///   fir.call Assign %b %a_reloaded
+///   fir.freemem %a_reloaded
+/// }
+///
+/// Note that we allocate temporary memory for values in omp.single's which need
+/// to be accessed in all threads in the closest omp.parallel
+///
+/// TODO currently we need to be able to access the encompassing omp.parallel so
+/// that we can allocate temporaries accessible by all threads outside of it.
+/// In case we do not find it, we fall back to converting the omp.workshare to
+/// omp.single.
+/// To better handle this we should probably enable yielding values out of an
+/// omp.single which will be supported by the omp runtime.
+void lowerWorkshare(mlir::omp::WorkshareOp wsOp) {
+  assert(wsOp.getRegion().getBlocks().size() == 1);
+
+  Location loc = wsOp->getLoc();
+
+  omp::ParallelOp parallelOp = wsOp->getParentOfType<omp::ParallelOp>();
+  if (!parallelOp) {
+wsOp.emitWarning("cannot handle workshare, converting to single");
+Operation *terminator = wsOp.getRegion().front().getTerminator();
+wsOp->getBlock()->getOperations().splice(
+wsOp->getIterator(), wsOp.getRegion().front().getOperations());
+terminator->erase();
+return;
+  }
+
+  OpBuilder allocBuilder(parallelOp);
+  OpBuilder rootBuilder(wsOp);
+  IRMapping rootMapping;
+
+  omp::SingleOp singleOp = nullptr;
+
+  auto mapReloadedValue = [&](Value v, OpBuilder singleBuilder,
+  IRMapping singleMapping) {
+if (auto reloaded = rootMapping.lookupOrNull(v))
+  return;
+Type llvmPtrTy = LLVM::LLVMPointerType::get(allocBuilder.getContext());
+Type ty = v.getType();
+Value alloc, reloaded;
+if (isSupportedByFirAlloca(ty)) {
+  alloc = allocBuilder.create(loc, ty);
+  singleBuilder.create(loc, singleMapping.lookup(v), alloc);

tblah wrote:

Ahh you are correct for the allocatable array case. If the fir.freemem or 
runtime destroy call is in a single region this will be fine.
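
The splitting described in the doc comment on `lowerWorkshare` in the patch above can be sketched outside MLIR as a plain partitioning step. This is only a hypothetical model, not the pass's implementation: `Op`, `Region` and `partitionWorkshare` are illustrative names, and the `parallelizable` flag stands in for operations such as `omp.wsloop` that every thread should execute. The sketch also leaves out the `fir.store`/`fir.load` round trip through a temporary that the real lowering uses to make values from an `omp.single` visible to all threads.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for one operation in the omp.workshare body.
struct Op {
  std::string name;
  bool parallelizable; // true for ops all threads run, e.g. omp.wsloop
};

// Hypothetical output unit: either an omp.single wrapping sequential ops,
// or a single parallel op left at the top level.
struct Region {
  bool isSingle;
  std::vector<std::string> ops;
};

// Group maximal runs of non-parallelizable ops into "single" regions and
// leave each parallelizable op between them as its own region.
std::vector<Region> partitionWorkshare(const std::vector<Op> &body) {
  std::vector<Region> out;
  for (const Op &op : body) {
    if (op.parallelizable) {
      out.push_back({false, {op.name}});
    } else if (!out.empty() && out.back().isSingle) {
      out.back().ops.push_back(op.name);
    } else {
      out.push_back({true, {op.name}});
    }
  }
  return out;
}
```

Applied to the example from the doc comment (`fir.allocmem`, `omp.wsloop`, `fir.call`, `fir.freemem`), this model yields three regions: a single region holding the allocation, the loop on its own, and a single region holding the call and the `fir.freemem`, which is the case the comment above refers to.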

https://github.com/llvm/llvm-project/pull/101446


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid inlining if ZT0 needs preserving. (#101343) (PR #101932)

2024-08-05 Thread Paul Walker via llvm-branch-commits

https://github.com/paulwalker-arm approved this pull request.


https://github.com/llvm/llvm-project/pull/101932


[llvm-branch-commits] [llvm] AMDGPU: Handle folding frame indexes into s_add_i32 (PR #101694)

2024-08-05 Thread Christudasan Devadasan via llvm-branch-commits

https://github.com/cdevadas approved this pull request.


https://github.com/llvm/llvm-project/pull/101694


[llvm-branch-commits] [llvm] release/19.x: [LLVM][TTI][SME] Allow optional auto-vectorisation for streaming functions. (#101679) (PR #101959)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/101959

Backport 7775a48

Requested by: @paulwalker-arm

>From ffe77522d32f4502318d43e9bca6b6ee968d5be7 Mon Sep 17 00:00:00 2001
From: Paul Walker 
Date: Mon, 5 Aug 2024 11:25:44 +0100
Subject: [PATCH] [LLVM][TTI][SME] Allow optional auto-vectorisation for
 streaming functions. (#101679)

The command line option enable-scalable-autovec-in-streaming-mode is
used to enable scalable vectors but the same check is missing from
enableScalableVectorization, which is blocking auto-vectorisation.

(cherry picked from commit 7775a4882d7105fde7f7a81f3c72567d39afce45)
---
 .../AArch64/AArch64TargetTransformInfo.cpp|  5 ++
 .../AArch64/AArch64TargetTransformInfo.h  |  2 +-
 .../AArch64/streaming-vectorization.ll| 56 +++
 3 files changed, 62 insertions(+), 1 deletion(-)
 create mode 100644 
llvm/test/Transforms/LoopVectorize/AArch64/streaming-vectorization.ll

diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp 
b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
index 45148449dfb82..b77c534f09f41 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
@@ -2295,6 +2295,11 @@ std::optional 
AArch64TTIImpl::simplifyDemandedVectorEltsIntrinsic(
   return std::nullopt;
 }
 
+bool AArch64TTIImpl::enableScalableVectorization() const {
+  return ST->isSVEAvailable() || (ST->isSVEorStreamingSVEAvailable() &&
+  EnableScalableAutovecInStreamingMode);
+}
+
 TypeSize
 AArch64TTIImpl::getRegisterBitWidth(TargetTransformInfo::RegisterKind K) const 
{
   switch (K) {
diff --git a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h 
b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
index a9189fd53f40b..4a6457d7a7dbf 100644
--- a/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
+++ b/llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h
@@ -381,7 +381,7 @@ class AArch64TTIImpl : public 
BasicTTIImplBase {
 return ST->isSVEorStreamingSVEAvailable();
   }
 
-  bool enableScalableVectorization() const { return ST->isSVEAvailable(); }
+  bool enableScalableVectorization() const;
 
   bool isLegalToVectorizeReduction(const RecurrenceDescriptor &RdxDesc,
ElementCount VF) const;
diff --git 
a/llvm/test/Transforms/LoopVectorize/AArch64/streaming-vectorization.ll 
b/llvm/test/Transforms/LoopVectorize/AArch64/streaming-vectorization.ll
new file mode 100644
index 0..924d4bfb7836a
--- /dev/null
+++ b/llvm/test/Transforms/LoopVectorize/AArch64/streaming-vectorization.ll
@@ -0,0 +1,56 @@
+; REQUIRES: asserts
+; RUN: opt -S -passes=loop-vectorize -debug-only=loop-vectorize < %s 2>&1 | FileCheck %s --check-prefixes=CHECK,NOVEC
+; RUN: opt -S -passes=loop-vectorize -debug-only=loop-vectorize -enable-scalable-autovec-in-streaming-mode < %s 2>&1 | FileCheck %s --check-prefixes=CHECK,VEC
+
+target triple = "aarch64-unknown-linux-gnu"
+
+define void @normal_function(ptr %a, ptr %b, ptr %c) #0 {
+; CHECK: LV: Checking a loop in 'normal_function'
+; CHECK: LV: Scalable vectorization is available
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
+  %arrayidx = getelementptr inbounds i32, ptr %c, i64 %iv
+  %0 = load i32, ptr %arrayidx, align 4
+  %arrayidx2 = getelementptr inbounds i8, ptr %b, i64 %iv
+  %1 = load i8, ptr %arrayidx2, align 4
+  %zext = zext i8 %1 to i32
+  %add = add nsw i32 %zext, %0
+  %arrayidx5 = getelementptr inbounds i32, ptr %a, i64 %iv
+  store i32 %add, ptr %arrayidx5, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond.not = icmp eq i64 %iv.next, 1024
+  br i1 %exitcond.not, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+define void @streaming_function(ptr %a, ptr %b, ptr %c) #0 "aarch64_pstate_sm_enabled" {
+; CHECK: LV: Checking a loop in 'streaming_function'
+; VEC: LV: Scalable vectorization is available
+; NOVEC: LV: Scalable vectorization is explicitly disabled
+entry:
+  br label %loop
+
+loop:
+  %iv = phi i64 [ 0, %entry ], [ %iv.next, %loop ]
+  %arrayidx = getelementptr inbounds i32, ptr %c, i64 %iv
+  %0 = load i32, ptr %arrayidx, align 4
+  %arrayidx2 = getelementptr inbounds i8, ptr %b, i64 %iv
+  %1 = load i8, ptr %arrayidx2, align 4
+  %zext = zext i8 %1 to i32
+  %add = add nsw i32 %zext, %0
+  %arrayidx5 = getelementptr inbounds i32, ptr %a, i64 %iv
+  store i32 %add, ptr %arrayidx5, align 4
+  %iv.next = add nuw nsw i64 %iv, 1
+  %exitcond.not = icmp eq i64 %iv.next, 1024
+  br i1 %exitcond.not, label %exit, label %loop
+
+exit:
+  ret void
+}
+
+attributes #0 = { vscale_range(1, 16) "target-features"="+sve,+sme" }
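
The predicate introduced by this patch can be read as a small truth table. The sketch below models it in standalone C++ under stated assumptions: `Subtarget` is a hypothetical struct whose two fields stand in for `ST->isSVEAvailable()` and `ST->isSVEorStreamingSVEAvailable()`, and the `enableInStreamingMode` parameter stands in for the `enable-scalable-autovec-in-streaming-mode` option. It is not the LLVM code itself.

```cpp
// Hypothetical stand-in for the two subtarget queries used in the patch.
struct Subtarget {
  bool sveAvailable;            // models ST->isSVEAvailable()
  bool sveOrStreamingAvailable; // models ST->isSVEorStreamingSVEAvailable()
};

// Same boolean structure as the new enableScalableVectorization():
// plain SVE always qualifies; streaming SVE qualifies only when opted in.
constexpr bool enableScalableVectorization(Subtarget st,
                                           bool enableInStreamingMode) {
  return st.sveAvailable ||
         (st.sveOrStreamingAvailable && enableInStreamingMode);
}
```

Under this model, the `normal_function` case in the test corresponds to `sveAvailable == true`, so the option does not matter, while `streaming_function` corresponds to only `sveOrStreamingAvailable == true`, so the result follows the option. That matches the VEC and NOVEC check lines.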



[llvm-branch-commits] [llvm] release/19.x: [LLVM][TTI][SME] Allow optional auto-vectorisation for streaming functions. (#101679) (PR #101959)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/101959


[llvm-branch-commits] [llvm] release/19.x: [LLVM][TTI][SME] Allow optional auto-vectorisation for streaming functions. (#101679) (PR #101959)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:

@sdesmalen-arm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/101959


[llvm-branch-commits] [llvm] release/19.x: [LLVM][TTI][SME] Allow optional auto-vectorisation for streaming functions. (#101679) (PR #101959)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: None (llvmbot)


Changes

Backport 7775a48

Requested by: @paulwalker-arm

---
Full diff: https://github.com/llvm/llvm-project/pull/101959.diff


3 Files Affected:

- (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp (+5) 
- (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h (+1-1) 
- (added) llvm/test/Transforms/LoopVectorize/AArch64/streaming-vectorization.ll 
(+56) 






https://github.com/llvm/llvm-project/pull/101959


[llvm-branch-commits] [llvm] release/19.x: [LLVM][TTI][SME] Allow optional auto-vectorisation for streaming functions. (#101679) (PR #101959)

2024-08-05 Thread Sander de Smalen via llvm-branch-commits

https://github.com/sdesmalen-arm approved this pull request.


https://github.com/llvm/llvm-project/pull/101959


[llvm-branch-commits] [clang] release/19.x: [Driver] Restrict Ofast deprecation help message to Clang (#101682) (PR #101963)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/101963

Backport e60ee1f

Requested by: @kiranchandramohan

>From f4e81a5e3db7f1f4a7042b0c7c17a614e0563f92 Mon Sep 17 00:00:00 2001
From: Kiran Chandramohan 
Date: Mon, 5 Aug 2024 12:43:37 +0100
Subject: [PATCH] [Driver] Restrict Ofast deprecation help message to Clang
 (#101682)

The discussion about this in Flang
(https://discourse.llvm.org/t/rfc-deprecate-ofast-in-flang/80243) has
not concluded hence restricting the deprecation only to Clang.

(cherry picked from commit e60ee1f2d70bdb0ac87b09ae685d669d8543b7bd)
---
 clang/include/clang/Driver/Options.td | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

diff --git a/clang/include/clang/Driver/Options.td 
b/clang/include/clang/Driver/Options.td
index 359a698ea87dd..014a2bd85fdc6 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -932,8 +932,9 @@ def O_flag : Flag<["-"], "O">, Visibility<[ClangOption, 
CC1Option, FC1Option]>,
   Alias, AliasArgs<["1"]>;
 def Ofast : Joined<["-"], "Ofast">, Group,
   Visibility<[ClangOption, CC1Option, FlangOption]>,
-  HelpText<"Deprecated; use '-O3 -ffast-math' for the same behavior,"
-  " or '-O3' to enable only conforming optimizations">;
+  HelpTextForVariants<[ClangOption, CC1Option],
+  "Deprecated; use '-O3 -ffast-math' for the same behavior,"
+  " or '-O3' to enable only conforming optimizations">;
 def P : Flag<["-"], "P">,
   Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
   Group,



[llvm-branch-commits] [clang] release/19.x: [Driver] Restrict Ofast deprecation help message to Clang (#101682) (PR #101963)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/101963


[llvm-branch-commits] [clang] release/19.x: [Driver] Restrict Ofast deprecation help message to Clang (#101682) (PR #101963)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:

@banach-space What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/101963


[llvm-branch-commits] [clang] release/19.x: [Driver] Restrict Ofast deprecation help message to Clang (#101682) (PR #101963)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (llvmbot)


Changes

Backport e60ee1f

Requested by: @kiranchandramohan

---
Full diff: https://github.com/llvm/llvm-project/pull/101963.diff


1 File Affected:

- (modified) clang/include/clang/Driver/Options.td (+3-2) 






https://github.com/llvm/llvm-project/pull/101963


[llvm-branch-commits] [clang] release/19.x: [Driver] Restrict Ofast deprecation help message to Clang (#101682) (PR #101963)

2024-08-05 Thread Aaron Ballman via llvm-branch-commits

https://github.com/AaronBallman approved this pull request.

LGTM for a backport

https://github.com/llvm/llvm-project/pull/101963


[llvm-branch-commits] [clang] release/19.x: [Clang] Define __cpp_pack_indexing (#101956) (PR #101965)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/101965
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [clang] release/19.x: [Clang] Define __cpp_pack_indexing (#101956) (PR #101965)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/101965

Backport c65afad9c58474a784633314e945c874ed06584a

Requested by: @Sirraide

>From 68217f5d2765f7774e40526d7f9960f42d067dca Mon Sep 17 00:00:00 2001
From: Sirraide 
Date: Mon, 5 Aug 2024 14:02:15 +0200
Subject: [PATCH] [Clang] Define __cpp_pack_indexing (#101956)

Following the discussion on #101448 this defines
`__cpp_pack_indexing`. Since pack indexing is currently
supported in all language modes, the feature test macro
is also defined in all language modes.

(cherry picked from commit c65afad9c58474a784633314e945c874ed06584a)
---
 clang/docs/LanguageExtensions.rst   | 1 +
 clang/lib/Frontend/InitPreprocessor.cpp | 1 +
 clang/test/Lexer/cxx-features.cpp   | 4 
 3 files changed, 6 insertions(+)

diff --git a/clang/docs/LanguageExtensions.rst 
b/clang/docs/LanguageExtensions.rst
index 81784c75081ba..1c4a6ecca2142 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -1503,6 +1503,7 @@ Conditional ``explicit`` 
__cpp_conditional_explicit   C+
 ``static operator()``                        __cpp_static_call_operator       C++23         C++03
 Attributes on Lambda-Expressions                                              C++23         C++11
 Attributes on Structured Bindings            __cpp_structured_bindings        C++26         C++03
+Pack Indexing                                __cpp_pack_indexing              C++26         C++03
 ``= delete ("should have a reason");``       __cpp_deleted_function           C++26         C++03
 -------------------------------------------- -------------------------------- ------------- -------------
 Designated initializers (N494)                                                C99           C89
diff --git a/clang/lib/Frontend/InitPreprocessor.cpp 
b/clang/lib/Frontend/InitPreprocessor.cpp
index 920ddf7e59913..3ed7243deba8a 100644
--- a/clang/lib/Frontend/InitPreprocessor.cpp
+++ b/clang/lib/Frontend/InitPreprocessor.cpp
@@ -763,6 +763,7 @@ static void InitializeCPlusPlusFeatureTestMacros(const 
LangOptions &LangOpts,
   Builder.defineMacro("__cpp_placeholder_variables", "202306L");
 
   // C++26 features supported in earlier language modes.
+  Builder.defineMacro("__cpp_pack_indexing", "202311L");
   Builder.defineMacro("__cpp_deleted_function", "202403L");
 
   if (LangOpts.Char8)
diff --git a/clang/test/Lexer/cxx-features.cpp 
b/clang/test/Lexer/cxx-features.cpp
index 4c2aa3ae2c544..08b732132228b 100644
--- a/clang/test/Lexer/cxx-features.cpp
+++ b/clang/test/Lexer/cxx-features.cpp
@@ -38,6 +38,10 @@
 #error "wrong value for __cpp_deleted_function"
 #endif
 
+#if check(pack_indexing, 202311, 202311, 202311, 202311, 202311, 202311, 202311)
+#error "wrong value for __cpp_pack_indexing"
+#endif
+
 #if check(placeholder_variables, 202306, 202306, 202306, 202306, 202306, 202306, 202306)
 #error "wrong value for __cpp_placeholder_variables"
 #endif
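
As a rough illustration of what the new macro enables on the user side, the sketch below selects between pack indexing and a conventional fallback depending on whether `__cpp_pack_indexing` is defined. The helper name `first` is hypothetical and not part of the patch. The value `202311L` is the one the patch defines.

```cpp
// Return the first argument of a pack. Uses pack indexing when the compiler
// advertises it through the feature-test macro, and an ordinary head/tail
// signature otherwise.
#if defined(__cpp_pack_indexing) && __cpp_pack_indexing >= 202311L
template <typename... Ts>
constexpr auto first(Ts... args) {
  return args...[0]; // pack indexing: pick element 0 of the pack
}
#else
template <typename T, typename... Ts>
constexpr T first(T head, Ts...) {
  return head; // fallback for compilers without pack indexing
}
#endif
```

Because the patch defines the macro in all language modes, a Clang build containing it would take the first branch even before C++26, where pack indexing is accepted as an extension and may produce an extension warning.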



[llvm-branch-commits] [clang] release/19.x: [Clang] Define __cpp_pack_indexing (#101956) (PR #101965)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:

@cor3ntin What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/101965


[llvm-branch-commits] [clang] release/19.x: [Clang] Define __cpp_pack_indexing (#101956) (PR #101965)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (llvmbot)


Changes

Backport c65afad9c58474a784633314e945c874ed06584a

Requested by: @Sirraide

---
Full diff: https://github.com/llvm/llvm-project/pull/101965.diff


3 Files Affected:

- (modified) clang/docs/LanguageExtensions.rst (+1) 
- (modified) clang/lib/Frontend/InitPreprocessor.cpp (+1) 
- (modified) clang/test/Lexer/cxx-features.cpp (+4) 






https://github.com/llvm/llvm-project/pull/101965


[llvm-branch-commits] [clang] release/19.x: [Clang] SFINAE on mismatching pack length during constraint satisfaction checking (#101879) (PR #101967)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot created 
https://github.com/llvm/llvm-project/pull/101967

Backport da380b2

Requested by: @cor3ntin

>From 1541d2dd615b4462a0fbdafdd6847fcb7bd15eb8 Mon Sep 17 00:00:00 2001
From: cor3ntin 
Date: Mon, 5 Aug 2024 14:22:07 +0200
Subject: [PATCH] [Clang] SFINAE on mismatching pack length during constraint
 satisfaction checking (#101879)

If a fold expanded constraint would expand packs of different size, it
is not a valid pack expansion and it is not satisfied. This should not
produce an error.

Fixes #99430

(cherry picked from commit da380b26e4748ade5a8dba85b7df5e1c4eded8bc)
---
 clang/lib/Sema/SemaConcept.cpp  |  4 
 clang/test/SemaCXX/cxx2c-fold-exprs.cpp | 30 +
 2 files changed, 34 insertions(+)

diff --git a/clang/lib/Sema/SemaConcept.cpp b/clang/lib/Sema/SemaConcept.cpp
index 9e16b67284be4..c34d32002b5ad 100644
--- a/clang/lib/Sema/SemaConcept.cpp
+++ b/clang/lib/Sema/SemaConcept.cpp
@@ -531,6 +531,10 @@ static ExprResult calculateConstraintSatisfaction(
 
 std::optional
 EvaluateFoldExpandedConstraintSize(const CXXFoldExpr *FE) const {
+
+  // We should ignore errors in the presence of packs of different size.
+  Sema::SFINAETrap Trap(S);
+
   Expr *Pattern = FE->getPattern();
 
   SmallVector Unexpanded;
diff --git a/clang/test/SemaCXX/cxx2c-fold-exprs.cpp b/clang/test/SemaCXX/cxx2c-fold-exprs.cpp
index 1e0bc7bcfb4e7..0674135aac483 100644
--- a/clang/test/SemaCXX/cxx2c-fold-exprs.cpp
+++ b/clang/test/SemaCXX/cxx2c-fold-exprs.cpp
@@ -275,3 +275,33 @@ static_assert(S::g() == 2); // expected-error {{call to 'g' is ambiguo
 
 
 }
+
+namespace GH99430 {
+
+template 
+using _Synth_three_way_result = int;
+
+template 
+class tuple;
+
+template 
+struct tuple_element;
+
+template 
+struct _Three_way_comparison_result_with_tuple_like {
+  using type = int;
+};
+
+template 
+  requires(requires {
+typename _Synth_three_way_result<_TTypes, tuple_element<_Indices>>;
+  } && ...)
+
+struct _Three_way_comparison_result_with_tuple_like, 
_Indices...>{
+using type = long;
+};
+
+static_assert(__is_same_as(_Three_way_comparison_result_with_tuple_like,
 0, 1>::type, int));
+static_assert(__is_same_as(_Three_way_comparison_result_with_tuple_like,
 0>::type, long));
+
+}
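
The commit message above states that expanding packs of different size should leave the constraint unsatisfied rather than produce an error. The same principle can be sketched with classic C++17 partial-specialization SFINAE. This is an analogy only, not the C++26 fold-expanded constraint path the patch touches, and `types` and `zippable` are hypothetical names:

```cpp
#include <type_traits>
#include <utility>

// Hypothetical type list used only for this illustration.
template <class... Ts> struct types {};

// Primary template: chosen whenever the specialization below is not viable.
template <class A, class B, class = void>
struct zippable : std::false_type {};

// std::pair<Ts, Us>... expands two packs in lockstep. If their lengths differ,
// substitution fails and this specialization is silently discarded.
template <class... Ts, class... Us>
struct zippable<types<Ts...>, types<Us...>, std::void_t<std::pair<Ts, Us>...>>
    : std::true_type {};

static_assert(zippable<types<int, long>, types<char, short>>::value,
              "equal lengths: specialization is used");
static_assert(!zippable<types<int>, types<char, short>>::value,
              "different lengths: falls back to the primary template");
```

The second assertion is the interesting one: the length mismatch selects the fallback instead of stopping compilation.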



[llvm-branch-commits] [clang] release/19.x: [Clang] SFINAE on mismatching pack length during constraint satisfaction checking (#101879) (PR #101967)

2024-08-05 Thread via llvm-branch-commits

https://github.com/llvmbot milestoned 
https://github.com/llvm/llvm-project/pull/101967


[llvm-branch-commits] [clang] release/19.x: [Clang] SFINAE on mismatching pack length during constraint satisfaction checking (#101879) (PR #101967)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:

@AaronBallman What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/101967


[llvm-branch-commits] [clang] release/19.x: [Clang] SFINAE on mismatching pack length during constraint satisfaction checking (#101879) (PR #101967)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-clang

Author: None (llvmbot)


Changes

Backport da380b2

Requested by: @cor3ntin

---
Full diff: https://github.com/llvm/llvm-project/pull/101967.diff


2 Files Affected:

- (modified) clang/lib/Sema/SemaConcept.cpp (+4) 
- (modified) clang/test/SemaCXX/cxx2c-fold-exprs.cpp (+30) 






https://github.com/llvm/llvm-project/pull/101967


[llvm-branch-commits] [clang] release/19.x: [Clang] Define __cpp_pack_indexing (#101956) (PR #101965)

2024-08-05 Thread via llvm-branch-commits

cor3ntin wrote:

LGTM

https://github.com/llvm/llvm-project/pull/101965


[llvm-branch-commits] [clang] release/19.x: [Clang] SFINAE on mismatching pack length during constraint satisfaction checking (#101879) (PR #101967)

2024-08-05 Thread Aaron Ballman via llvm-branch-commits

https://github.com/AaronBallman approved this pull request.

LGTM!

https://github.com/llvm/llvm-project/pull/101967


[llvm-branch-commits] [llvm] release/19.x: [AArch64] Avoid NEON dot product in streaming[-compatible] functions (#101677) (PR #101933)

2024-08-05 Thread Jon Roelofs via llvm-branch-commits

https://github.com/jroelofs approved this pull request.


https://github.com/llvm/llvm-project/pull/101933


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle masked load and store intrinsics (PR #102007)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/102007

None

>From 0cd0fd76cf0c4bd5139f5ae138c32da5c0c154c4 Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 5 Aug 2024 19:49:31 +0400
Subject: [PATCH] InferAddressSpaces: Handle masked load and store intrinsics

---
 .../Transforms/Scalar/InferAddressSpaces.cpp  | 16 +++--
 .../AMDGPU/masked-load-store.ll   | 68 +++
 2 files changed, 77 insertions(+), 7 deletions(-)
 create mode 100644 llvm/test/Transforms/InferAddressSpaces/AMDGPU/masked-load-store.ll

diff --git a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
index 87b885447cc02..2ddf24be67702 100644
--- a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
+++ b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
@@ -369,13 +369,13 @@ bool InferAddressSpacesImpl::rewriteIntrinsicOperands(IntrinsicInst *II,
   Value *OldV,
   Value *NewV) const {
   Module *M = II->getParent()->getParent()->getParent();
-
-  switch (II->getIntrinsicID()) {
-  case Intrinsic::objectsize: {
+  Intrinsic::ID IID = II->getIntrinsicID();
+  switch (IID) {
+  case Intrinsic::objectsize:
+  case Intrinsic::masked_load: {
 Type *DestTy = II->getType();
 Type *SrcTy = NewV->getType();
-Function *NewDecl =
-Intrinsic::getDeclaration(M, II->getIntrinsicID(), {DestTy, SrcTy});
+Function *NewDecl = Intrinsic::getDeclaration(M, IID, {DestTy, SrcTy});
 II->setArgOperand(0, NewV);
 II->setCalledFunction(NewDecl);
 return true;
@@ -386,12 +386,12 @@ bool InferAddressSpacesImpl::rewriteIntrinsicOperands(IntrinsicInst *II,
   case Intrinsic::masked_gather: {
 Type *RetTy = II->getType();
 Type *NewPtrTy = NewV->getType();
-Function *NewDecl =
-Intrinsic::getDeclaration(M, II->getIntrinsicID(), {RetTy, NewPtrTy});
+Function *NewDecl = Intrinsic::getDeclaration(M, IID, {RetTy, NewPtrTy});
 II->setArgOperand(0, NewV);
 II->setCalledFunction(NewDecl);
 return true;
   }
+  case Intrinsic::masked_store:
   case Intrinsic::masked_scatter: {
 Type *ValueTy = II->getOperand(0)->getType();
 Type *NewPtrTy = NewV->getType();
@@ -429,11 +429,13 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
  PostorderStack, Visited);
 break;
+  case Intrinsic::masked_load:
   case Intrinsic::masked_gather:
   case Intrinsic::prefetch:
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
  PostorderStack, Visited);
 break;
+  case Intrinsic::masked_store:
   case Intrinsic::masked_scatter:
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(1),
  PostorderStack, Visited);
diff --git a/llvm/test/Transforms/InferAddressSpaces/AMDGPU/masked-load-store.ll b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/masked-load-store.ll
new file mode 100644
index 0..e14dfd055cbe8
--- /dev/null
+++ b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/masked-load-store.ll
@@ -0,0 +1,68 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes=infer-address-spaces %s | FileCheck %s
+
+define <32 x i32> @masked_load_v32i32_global_to_flat(ptr addrspace(1) %ptr, <32 x i1> %mask) {
+; CHECK-LABEL: define <32 x i32> @masked_load_v32i32_global_to_flat(
+; CHECK-SAME: ptr addrspace(1) [[PTR:%.*]], <32 x i1> [[MASK:%.*]]) {
+; CHECK-NEXT:[[LOAD:%.*]] = call <32 x i32> @llvm.masked.load.v32i32.p1(ptr addrspace(1) [[PTR]], i32 8, <32 x i1> [[MASK]], <32 x i32> zeroinitializer)
+; CHECK-NEXT:ret <32 x i32> [[LOAD]]
+;
+  %cast = addrspacecast ptr addrspace(1) %ptr to ptr
+  %load = call <32 x i32> @llvm.masked.load.v32i32.p0(ptr %cast, i32 8, <32 x i1> %mask, <32 x i32> zeroinitializer)
+  ret <32 x i32> %load
+}
+define <32 x i32> @masked_load_v32i32_local_to_flat(ptr addrspace(3) %ptr, <32 x i1> %mask) {
+; CHECK-LABEL: define <32 x i32> @masked_load_v32i32_local_to_flat(
+; CHECK-SAME: ptr addrspace(3) [[PTR:%.*]], <32 x i1> [[MASK:%.*]]) {
+; CHECK-NEXT:[[LOAD:%.*]] = call <32 x i32> @llvm.masked.load.v32i32.p3(ptr addrspace(3) [[PTR]], i32 8, <32 x i1> [[MASK]], <32 x i32> zeroinitializer)
+; CHECK-NEXT:ret <32 x i32> [[LOAD]]
+;
+  %cast = addrspacecast ptr addrspace(3) %ptr to ptr
+  %load = call <32 x i32> @llvm.masked.load.v32i32.p0(ptr %cast, i32 8, <32 x i1> %mask, <32 x i32> zeroinitializer)
+  ret <32 x i32> %load
+}
+
+define <32 x i32> @masked_load_v32i32_private_to_flat(ptr addrspace(5) %ptr, <32 x i1> %mask) {
+; CHECK-LABEL: define <32 x i32> @masked_load_v32i32_private_to_flat

[llvm-branch-commits] [llvm] InferAddressSpaces: Handle masked load and store intrinsics (PR #102007)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/102007?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#102007** 👈
* **#101982**
* `main`

This stack of pull requests is managed by Graphite. Learn more about
stacking: https://stacking.dev/?utm_source=stack-comment

Join @arsenm and the rest of your teammates on Graphite: https://graphite.dev?utm-source=stack-comment

https://github.com/llvm/llvm-project/pull/102007


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle masked load and store intrinsics (PR #102007)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/102007
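
The patch in this thread groups `masked_load` with the intrinsics whose pointer is argument 0 and `masked_store` with those whose pointer is argument 1. A minimal standalone sketch of that mapping follows. `ToyIntrinsic` and `pointerOperandIndex` are hypothetical names for illustration, not LLVM API, and the real pass switches on `Intrinsic::ID`:

```cpp
// Toy enumeration standing in for the intrinsic IDs named in the diff.
enum class ToyIntrinsic {
  MaskedLoad,
  MaskedGather,
  Prefetch,
  MaskedStore,
  MaskedScatter,
  Other
};

// Returns the index of the pointer argument that the pass would inspect,
// or -1 for intrinsics this sketch does not model.
constexpr int pointerOperandIndex(ToyIntrinsic K) {
  switch (K) {
  case ToyIntrinsic::MaskedLoad:
  case ToyIntrinsic::MaskedGather:
  case ToyIntrinsic::Prefetch:
    return 0; // loads and prefetch take the pointer first
  case ToyIntrinsic::MaskedStore:
  case ToyIntrinsic::MaskedScatter:
    return 1; // stores take the stored value first, then the pointer
  case ToyIntrinsic::Other:
    return -1;
  }
  return -1;
}

static_assert(pointerOperandIndex(ToyIntrinsic::MaskedLoad) == 0, "");
static_assert(pointerOperandIndex(ToyIntrinsic::MaskedStore) == 1, "");
```

Sharing a `case` label with an existing intrinsic, as the patch does, keeps the operand index in one place per group.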


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle masked load and store intrinsics (PR #102007)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Matt Arsenault (arsenm)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/102007.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp (+9-7) 
- (added) llvm/test/Transforms/InferAddressSpaces/AMDGPU/masked-load-store.ll (+68) 


``diff
diff --git a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
index 87b885447cc02..2ddf24be67702 100644
--- a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
+++ b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
@@ -369,13 +369,13 @@ bool InferAddressSpacesImpl::rewriteIntrinsicOperands(IntrinsicInst *II,
   Value *OldV,
   Value *NewV) const {
   Module *M = II->getParent()->getParent()->getParent();
-
-  switch (II->getIntrinsicID()) {
-  case Intrinsic::objectsize: {
+  Intrinsic::ID IID = II->getIntrinsicID();
+  switch (IID) {
+  case Intrinsic::objectsize:
+  case Intrinsic::masked_load: {
 Type *DestTy = II->getType();
 Type *SrcTy = NewV->getType();
-Function *NewDecl =
-Intrinsic::getDeclaration(M, II->getIntrinsicID(), {DestTy, SrcTy});
+Function *NewDecl = Intrinsic::getDeclaration(M, IID, {DestTy, SrcTy});
 II->setArgOperand(0, NewV);
 II->setCalledFunction(NewDecl);
 return true;
@@ -386,12 +386,12 @@ bool InferAddressSpacesImpl::rewriteIntrinsicOperands(IntrinsicInst *II,
   case Intrinsic::masked_gather: {
 Type *RetTy = II->getType();
 Type *NewPtrTy = NewV->getType();
-Function *NewDecl =
-Intrinsic::getDeclaration(M, II->getIntrinsicID(), {RetTy, NewPtrTy});
+Function *NewDecl = Intrinsic::getDeclaration(M, IID, {RetTy, NewPtrTy});
 II->setArgOperand(0, NewV);
 II->setCalledFunction(NewDecl);
 return true;
   }
+  case Intrinsic::masked_store:
   case Intrinsic::masked_scatter: {
 Type *ValueTy = II->getOperand(0)->getType();
 Type *NewPtrTy = NewV->getType();
@@ -429,11 +429,13 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
  PostorderStack, Visited);
 break;
+  case Intrinsic::masked_load:
   case Intrinsic::masked_gather:
   case Intrinsic::prefetch:
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
  PostorderStack, Visited);
 break;
+  case Intrinsic::masked_store:
   case Intrinsic::masked_scatter:
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(1),
  PostorderStack, Visited);
diff --git a/llvm/test/Transforms/InferAddressSpaces/AMDGPU/masked-load-store.ll b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/masked-load-store.ll
new file mode 100644
index 0..e14dfd055cbe8
--- /dev/null
+++ b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/masked-load-store.ll
@@ -0,0 +1,68 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes=infer-address-spaces %s | FileCheck %s
+
+define <32 x i32> @masked_load_v32i32_global_to_flat(ptr addrspace(1) %ptr, <32 x i1> %mask) {
+; CHECK-LABEL: define <32 x i32> @masked_load_v32i32_global_to_flat(
+; CHECK-SAME: ptr addrspace(1) [[PTR:%.*]], <32 x i1> [[MASK:%.*]]) {
+; CHECK-NEXT:[[LOAD:%.*]] = call <32 x i32> @llvm.masked.load.v32i32.p1(ptr addrspace(1) [[PTR]], i32 8, <32 x i1> [[MASK]], <32 x i32> zeroinitializer)
+; CHECK-NEXT:ret <32 x i32> [[LOAD]]
+;
+  %cast = addrspacecast ptr addrspace(1) %ptr to ptr
+  %load = call <32 x i32> @llvm.masked.load.v32i32.p0(ptr %cast, i32 8, <32 x i1> %mask, <32 x i32> zeroinitializer)
+  ret <32 x i32> %load
+}
+define <32 x i32> @masked_load_v32i32_local_to_flat(ptr addrspace(3) %ptr, <32 x i1> %mask) {
+; CHECK-LABEL: define <32 x i32> @masked_load_v32i32_local_to_flat(
+; CHECK-SAME: ptr addrspace(3) [[PTR:%.*]], <32 x i1> [[MASK:%.*]]) {
+; CHECK-NEXT:[[LOAD:%.*]] = call <32 x i32> @llvm.masked.load.v32i32.p3(ptr addrspace(3) [[PTR]], i32 8, <32 x i1> [[MASK]], <32 x i32> zeroinitializer)
+; CHECK-NEXT:ret <32 x i32> [[LOAD]]
+;
+  %cast = addrspacecast ptr addrspace(3) %ptr to ptr
+  %load = call <32 x i32> @llvm.masked.load.v32i32.p0(ptr %cast, i32 8, <32 x i1> %mask, <32 x i32> zeroinitializer)
+  ret <32 x i32> %load
+}
+
+define <32 x i32> @masked_load_v32i32_private_to_flat(ptr addrspace(5) %ptr, <32 x i1> %mask) {
+; CHECK-LABEL: define <32 x i32> @masked_load_v32i32_private_to_flat(
+; CHECK-SAME: ptr addrspace(5) [[PTR:%.*]], <32 x i1> [[MASK:%.*]]) {
+; CHECK-NEXT:[[LOAD:%.*]] = call <32 x i32> 
@llvm.masked.load.v32i32.p5(ptr addrspace(5) [[PTR]], i32 8, <32 x i1> 
[[MASK]], <32 

[llvm-branch-commits] [libcxx] [libc++][bit] Improves rotate functions. (#98032) (PR #101892)

2024-08-05 Thread Louis Dionne via llvm-branch-commits

https://github.com/ldionne approved this pull request.


https://github.com/llvm/llvm-project/pull/101892


[llvm-branch-commits] [llvm] [AMDGPU][SILoadStoreOptimizer] Include constrained buffer load variants (PR #101619)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm approved this pull request.


https://github.com/llvm/llvm-project/pull/101619


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm created 
https://github.com/llvm/llvm-project/pull/102010

None

>From 2dde2add87950b6016cbb1d6c6aa979731378a1b Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 5 Aug 2024 20:27:07 +0400
Subject: [PATCH] InferAddressSpaces: Handle llvm.is.constant

---
 .../Transforms/Scalar/InferAddressSpaces.cpp  |  4 ++-
 .../InferAddressSpaces/AMDGPU/is.constant.ll  | 35 +++
 2 files changed, 38 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll

diff --git a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
index 2ddf24be67702..67d4e7b5ee146 100644
--- a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
+++ b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
@@ -401,7 +401,8 @@ bool InferAddressSpacesImpl::rewriteIntrinsicOperands(IntrinsicInst *II,
 II->setCalledFunction(NewDecl);
 return true;
   }
-  case Intrinsic::prefetch: {
+  case Intrinsic::prefetch:
+  case Intrinsic::is_constant: {
 Function *NewDecl =
 Intrinsic::getDeclaration(M, II->getIntrinsicID(), {NewV->getType()});
 II->setArgOperand(0, NewV);
@@ -426,6 +427,7 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
   switch (IID) {
   case Intrinsic::ptrmask:
   case Intrinsic::objectsize:
+  case Intrinsic::is_constant:
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
  PostorderStack, Visited);
 break;
diff --git a/llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll
new file mode 100644
index 0..767681a281792
--- /dev/null
+++ b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll
@@ -0,0 +1,35 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes=infer-address-spaces %s | FileCheck %s
+
+define i1 @is_constant_global_to_flat(ptr addrspace(1) %ptr) {
+; CHECK-LABEL: define i1 @is_constant_global_to_flat(
+; CHECK-SAME: ptr addrspace(1) [[PTR:%.*]]) {
+; CHECK-NEXT:[[RET:%.*]] = call i1 @llvm.is.constant.p1(ptr addrspace(1) [[PTR]])
+; CHECK-NEXT:ret i1 [[RET]]
+;
+  %cast = addrspacecast ptr addrspace(1) %ptr to ptr
+  %ret = call i1 @llvm.is.constant.p0(ptr %cast)
+  ret i1 %ret
+}
+
+define i1 @is_constant_local_to_flat(ptr addrspace(3) %ptr) {
+; CHECK-LABEL: define i1 @is_constant_local_to_flat(
+; CHECK-SAME: ptr addrspace(3) [[PTR:%.*]]) {
+; CHECK-NEXT:[[RET:%.*]] = call i1 @llvm.is.constant.p3(ptr addrspace(3) [[PTR]])
+; CHECK-NEXT:ret i1 [[RET]]
+;
+  %cast = addrspacecast ptr addrspace(3) %ptr to ptr
+  %ret = call i1 @llvm.is.constant.p0(ptr %cast)
+  ret i1 %ret
+}
+
+define i1 @is_constant_private_to_flat(ptr addrspace(5) %ptr) {
+; CHECK-LABEL: define i1 @is_constant_private_to_flat(
+; CHECK-SAME: ptr addrspace(5) [[PTR:%.*]]) {
+; CHECK-NEXT:[[RET:%.*]] = call i1 @llvm.is.constant.p5(ptr addrspace(5) [[PTR]])
+; CHECK-NEXT:ret i1 [[RET]]
+;
+  %cast = addrspacecast ptr addrspace(5) %ptr to ptr
+  %ret = call i1 @llvm.is.constant.p0(ptr %cast)
+  ret i1 %ret
+}

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm ready_for_review 
https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits

arsenm wrote:

> [!WARNING]
> This pull request is not mergeable via GitHub because a downstack PR is
> open. Once all requirements are satisfied, merge this PR as a stack on
> Graphite: https://app.graphite.dev/github/pr/llvm/llvm-project/102010?utm_source=stack-comment-downstack-mergeability-warning
> Learn more: https://graphite.dev/docs/merge-pull-requests

* **#102010** 👈
* **#102007**
* **#101982**
* `main`

This stack of pull requests is managed by Graphite. Learn more about
stacking: https://stacking.dev/?utm_source=stack-comment

Join @arsenm and the rest of your teammates on Graphite: https://graphite.dev?utm-source=stack-comment

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread via llvm-branch-commits

llvmbot wrote:




@llvm/pr-subscribers-llvm-transforms

Author: Matt Arsenault (arsenm)


Changes



---
Full diff: https://github.com/llvm/llvm-project/pull/102010.diff


2 Files Affected:

- (modified) llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp (+3-1) 
- (added) llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll (+35) 






https://github.com/llvm/llvm-project/pull/102010
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec commented:

Add some tests where argument is not a pointer?

https://github.com/llvm/llvm-project/pull/102010
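
The request for non-pointer tests matters because `llvm.is.constant` is overloaded on its operand type, as the `.p1`, `.v2p5` and `.i32` suffixes in this thread's tests show, so the pass must skip operands that carry no address space. A toy model of that decision, using hypothetical names (`ToyValue`, `rewrittenAddrSpace`) rather than LLVM API and assuming address space 0 is the flat one:

```cpp
#include <optional>

// Toy stand-in for an IR value; none of these names are LLVM API.
struct ToyValue {
  bool IsPointer;             // pointer or vector of pointers
  unsigned AddrSpace;         // only meaningful when IsPointer is true
  const ToyValue *CastSource; // non-null if produced by an addrspacecast
};

// Assumption for this sketch: the flat/generic address space is 0.
constexpr unsigned FlatAddrSpace = 0;

// Returns the more specific address space the operand could be rewritten to,
// or std::nullopt when the operand must be left untouched.
constexpr std::optional<unsigned> rewrittenAddrSpace(const ToyValue &Op) {
  if (!Op.IsPointer)
    return std::nullopt; // e.g. an i32 operand: nothing to infer
  if (Op.AddrSpace == FlatAddrSpace && Op.CastSource != nullptr)
    return Op.CastSource->AddrSpace; // look through the cast to flat
  return std::nullopt;
}

// Sample operands mirroring the shapes exercised by the tests.
constexpr ToyValue GlobalPtr{true, 1, nullptr};
constexpr ToyValue FlatCast{true, 0, &GlobalPtr};
constexpr ToyValue IntValue{false, 0, nullptr};

static_assert(rewrittenAddrSpace(FlatCast).value_or(0) == 1, "");
static_assert(!rewrittenAddrSpace(IntValue).has_value(), "");
```

The early return for non-pointers plays the role that a type check on the operand plays in the pass itself.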


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle masked load and store intrinsics (PR #102007)

2024-08-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request.

LGTM

https://github.com/llvm/llvm-project/pull/102007


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits

https://github.com/arsenm updated 
https://github.com/llvm/llvm-project/pull/102010

>From 227f60ec2454ade7df9715c84f6d83840872c18d Mon Sep 17 00:00:00 2001
From: Matt Arsenault 
Date: Mon, 5 Aug 2024 20:27:07 +0400
Subject: [PATCH] InferAddressSpaces: Handle llvm.is.constant

---
 .../Transforms/Scalar/InferAddressSpaces.cpp  | 12 +++-
 .../InferAddressSpaces/AMDGPU/is.constant.ll  | 56 +++
 2 files changed, 67 insertions(+), 1 deletion(-)
 create mode 100644 llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll

diff --git a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
index 2ddf24be67702..0bb7517d784a3 100644
--- a/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
+++ b/llvm/lib/Transforms/Scalar/InferAddressSpaces.cpp
@@ -401,7 +401,8 @@ bool InferAddressSpacesImpl::rewriteIntrinsicOperands(IntrinsicInst *II,
 II->setCalledFunction(NewDecl);
 return true;
   }
-  case Intrinsic::prefetch: {
+  case Intrinsic::prefetch:
+  case Intrinsic::is_constant: {
 Function *NewDecl =
 Intrinsic::getDeclaration(M, II->getIntrinsicID(), {NewV->getType()});
 II->setArgOperand(0, NewV);
@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
 appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
  PostorderStack, Visited);
 break;
+  case Intrinsic::is_constant: {
+Value *Ptr = II->getArgOperand(0);
+if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+  appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+   Visited);
+}
+
+break;
+  }
   case Intrinsic::masked_load:
   case Intrinsic::masked_gather:
   case Intrinsic::prefetch:
diff --git a/llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll
new file mode 100644
index 0..a761f59b4312f
--- /dev/null
+++ b/llvm/test/Transforms/InferAddressSpaces/AMDGPU/is.constant.ll
@@ -0,0 +1,56 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -mtriple=amdgcn-amd-amdhsa -passes=infer-address-spaces %s | FileCheck %s
+
+define i1 @is_constant_global_to_flat(ptr addrspace(1) %ptr) {
+; CHECK-LABEL: define i1 @is_constant_global_to_flat(
+; CHECK-SAME: ptr addrspace(1) [[PTR:%.*]]) {
+; CHECK-NEXT:[[RET:%.*]] = call i1 @llvm.is.constant.p1(ptr addrspace(1) [[PTR]])
+; CHECK-NEXT:ret i1 [[RET]]
+;
+  %cast = addrspacecast ptr addrspace(1) %ptr to ptr
+  %ret = call i1 @llvm.is.constant.p0(ptr %cast)
+  ret i1 %ret
+}
+
+define i1 @is_constant_local_to_flat(ptr addrspace(3) %ptr) {
+; CHECK-LABEL: define i1 @is_constant_local_to_flat(
+; CHECK-SAME: ptr addrspace(3) [[PTR:%.*]]) {
+; CHECK-NEXT:    [[RET:%.*]] = call i1 @llvm.is.constant.p3(ptr addrspace(3) [[PTR]])
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %cast = addrspacecast ptr addrspace(3) %ptr to ptr
+  %ret = call i1 @llvm.is.constant.p0(ptr %cast)
+  ret i1 %ret
+}
+
+define i1 @is_constant_private_to_flat(ptr addrspace(5) %ptr) {
+; CHECK-LABEL: define i1 @is_constant_private_to_flat(
+; CHECK-SAME: ptr addrspace(5) [[PTR:%.*]]) {
+; CHECK-NEXT:    [[RET:%.*]] = call i1 @llvm.is.constant.p5(ptr addrspace(5) [[PTR]])
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %cast = addrspacecast ptr addrspace(5) %ptr to ptr
+  %ret = call i1 @llvm.is.constant.p0(ptr %cast)
+  ret i1 %ret
+}
+
+define i1 @is_constant_private_to_flat_v2(<2 x ptr addrspace(5)> %ptr) {
+; CHECK-LABEL: define i1 @is_constant_private_to_flat_v2(
+; CHECK-SAME: <2 x ptr addrspace(5)> [[PTR:%.*]]) {
+; CHECK-NEXT:    [[RET:%.*]] = call i1 @llvm.is.constant.v2p5(<2 x ptr addrspace(5)> [[PTR]])
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %cast = addrspacecast <2 x ptr addrspace(5)> %ptr to <2 x ptr>
+  %ret = call i1 @llvm.is.constant.v2p0(<2 x ptr> %cast)
+  ret i1 %ret
+}
+
+define i1 @is_constant_i32(i32 %val) {
+; CHECK-LABEL: define i1 @is_constant_i32(
+; CHECK-SAME: i32 [[VAL:%.*]]) {
+; CHECK-NEXT:    [[RET:%.*]] = call i1 @llvm.is.constant.i32(i32 [[VAL]])
+; CHECK-NEXT:    ret i1 [[RET]]
+;
+  %ret = call i1 @llvm.is.constant.i32(i32 %val)
+  ret i1 %ret
+}
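
The CHECK lines in the test above show two things: a pointer or pointer-vector operand gets the intrinsic re-declared for the inferred address space (`p0` becomes `p1`, `p3`, `p5`, or `v2p5`), while the `i32` case is left alone because of the `isPtrOrPtrVectorTy()` guard. The following is only a stand-alone sketch of that decision, not the pass itself. `SimpleType`, `intTy`, `ptrTy`, `ptrVecTy`, `isConstantName`, and `rewriteIsConstant` are hypothetical names invented for illustration and do not exist in LLVM.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for an LLVM type. This is NOT llvm::Type.
struct SimpleType {
  enum Kind { Integer, Pointer, PointerVector };
  Kind TheKind;
  unsigned Bits;      // used by Integer
  unsigned AddrSpace; // used by Pointer and PointerVector
  unsigned NumElts;   // used by PointerVector
};

SimpleType intTy(unsigned Bits) { return {SimpleType::Integer, Bits, 0, 0}; }
SimpleType ptrTy(unsigned AS) { return {SimpleType::Pointer, 0, AS, 0}; }
SimpleType ptrVecTy(unsigned N, unsigned AS) {
  assert(N > 0 && "a vector needs at least one element");
  return {SimpleType::PointerVector, 0, AS, N};
}

// Models the question the patch asks through isPtrOrPtrVectorTy().
bool isPtrOrPtrVector(const SimpleType &T) {
  return T.TheKind == SimpleType::Pointer ||
         T.TheKind == SimpleType::PointerVector;
}

// Builds the overloaded intrinsic name as it appears in the CHECK lines.
std::string isConstantName(const SimpleType &T) {
  const std::string Base = "llvm.is.constant.";
  switch (T.TheKind) {
  case SimpleType::Integer:
    return Base + "i" + std::to_string(T.Bits);
  case SimpleType::Pointer:
    return Base + "p" + std::to_string(T.AddrSpace);
  case SimpleType::PointerVector:
    return Base + "v" + std::to_string(T.NumElts) + "p" +
           std::to_string(T.AddrSpace);
  }
  return Base; // not reached for valid kinds
}

// Returns true and sets NewName when the operand is pointer-like and can be
// retargeted to InferredAS. Returns false (NewName untouched) otherwise,
// which corresponds to the i32 test case staying unchanged.
bool rewriteIsConstant(const SimpleType &Operand, unsigned InferredAS,
                       std::string &NewName) {
  if (!isPtrOrPtrVector(Operand))
    return false;
  SimpleType NewTy = Operand;
  NewTy.AddrSpace = InferredAS;
  NewName = isConstantName(NewTy);
  return true;
}
```

For example, `rewriteIsConstant(ptrTy(0), 1, Name)` models the `is_constant_global_to_flat` case and `rewriteIsConstant(intTy(32), 1, Name)` models `is_constant_i32`.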

___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Thorsten Schütt via llvm-branch-commits


@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
     appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
                                                  PostorderStack, Visited);
     break;
+  case Intrinsic::is_constant: {
+    Value *Ptr = II->getArgOperand(0);
+    if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+      appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+                                                   Visited);
+    }

tschuett wrote:

unnecessary braces?

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Stanislav Mekhanoshin via llvm-branch-commits

https://github.com/rampitec approved this pull request.

LGTM modulo braces comment.

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits


@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
     appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
                                                  PostorderStack, Visited);
     break;
+  case Intrinsic::is_constant: {
+    Value *Ptr = II->getArgOperand(0);
+    if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+      appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+                                                   Visited);
+    }

arsenm wrote:

Multiple lines 

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Thorsten Schütt via llvm-branch-commits


@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
     appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
                                                  PostorderStack, Visited);
     break;
+  case Intrinsic::is_constant: {
+    Value *Ptr = II->getArgOperand(0);
+    if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+      appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+                                                   Visited);
+    }

tschuett wrote:

multiple lines is not listed as an excuse in the coding style

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits


@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
     appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
                                                  PostorderStack, Visited);
     break;
+  case Intrinsic::is_constant: {
+    Value *Ptr = II->getArgOperand(0);
+    if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+      appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+                                                   Visited);
+    }

arsenm wrote:

"However, braces should be used in cases where the omission of braces harm the 
readability and maintainability of the code."

and in my judgment this covers any situation with multiple lines 

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Thorsten Schütt via llvm-branch-commits


@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
     appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
                                                  PostorderStack, Visited);
     break;
+  case Intrinsic::is_constant: {
+    Value *Ptr = II->getArgOperand(0);
+    if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+      appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+                                                   Visited);
+    }

tschuett wrote:

How could it hurt readability?

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] [BOLT] Add profile density computation (PR #101094)

2024-08-05 Thread via llvm-branch-commits


@@ -1441,6 +1458,18 @@ Error PrintProgramStats::runOnFunctions(BinaryContext &BC) {
       StaleSampleCount += SampleCount;
       ++NumAllStaleFunctions;
     }
+
+    if (opts::ShowDensity) {
+      uint64_t Instructions = Function.getInputInstructionCount();
+      // In case of BOLT split functions registered in BAT, samples are
+      // automatically attributed to the main fragment. Add instructions from
+      // all fragments.
+      if (IsHotParentOfBOLTSplitFunction)
+        for (const BinaryFunction *Fragment : Function.getFragments())
+          Instructions += Fragment->getInputInstructionCount();
+      double Density = (double)1.0 * SampleCount / Instructions;

WenleiHe wrote:

If the sample count comes from branch counts, is size measured in instruction 
count a good denominator?

For compiler CSSPGO, we use the sum of block execution counts (the sum of 
counts on probes) divided by the number of probes (a proxy for the number of 
blocks) as density. If we use instruction count as the denominator, then the 
numerator should probably be the total dynamic instruction count, which is 
different from the raw branch count but can be derived from branches.
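
To make the three ratios in this comment concrete, here is a small stand-alone sketch. It is not BOLT or CSSPGO code: the `Block` struct and all three function names are hypothetical, and the "dynamic instruction count" is assumed to be the sum over blocks of execution count times block size, which is one plausible reading of "derived from branches".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-block record: how often the block ran and how many
// instructions it contains.
struct Block {
  uint64_t ExecCount;
  uint64_t NumInstrs;
};

// Ratio used in the quoted hunk: raw sample count over static instructions.
double samplesPerInstruction(uint64_t SampleCount, uint64_t Instructions) {
  assert(Instructions != 0 && "function must contain instructions");
  return static_cast<double>(SampleCount) / Instructions;
}

// CSSPGO-style ratio described above: sum of block execution counts over
// the number of blocks (probes).
double countPerBlock(const std::vector<Block> &Blocks) {
  assert(!Blocks.empty() && "function must contain blocks");
  uint64_t Sum = 0;
  for (const Block &B : Blocks)
    Sum += B.ExecCount;
  return static_cast<double>(Sum) / Blocks.size();
}

// Suggested alternative: total dynamic instruction count over static
// instruction count (assumed here to be ExecCount * NumInstrs per block).
double dynamicInstrsPerInstruction(const std::vector<Block> &Blocks) {
  uint64_t Dynamic = 0, Static = 0;
  for (const Block &B : Blocks) {
    Dynamic += B.ExecCount * B.NumInstrs;
    Static += B.NumInstrs;
  }
  assert(Static != 0 && "function must contain instructions");
  return static_cast<double>(Dynamic) / Static;
}
```

With a hot 2-instruction block executed 100 times and a cold 8-instruction block executed 10 times, the three ratios differ noticeably, because only the last one weights each block's count by its size.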

https://github.com/llvm/llvm-project/pull/101094


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Matt Arsenault via llvm-branch-commits


@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
     appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
                                                  PostorderStack, Visited);
     break;
+  case Intrinsic::is_constant: {
+    Value *Ptr = II->getArgOperand(0);
+    if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+      appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+                                                   Visited);
+    }

arsenm wrote:

Because unlike a compiler I read by lines, and not statements. 

The length of that section of the style guide is absurd. It should never be 
wrong to include braces. That's a lot simpler than parsing out whatever that is.

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B approved this pull request.


https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle llvm.is.constant (PR #102010)

2024-08-05 Thread Artem Belevich via llvm-branch-commits


@@ -429,6 +430,15 @@ void InferAddressSpacesImpl::collectRewritableIntrinsicOperands(
     appendsFlatAddressExpressionToPostorderStack(II->getArgOperand(0),
                                                  PostorderStack, Visited);
     break;
+  case Intrinsic::is_constant: {
+    Value *Ptr = II->getArgOperand(0);
+    if (Ptr->getType()->isPtrOrPtrVectorTy()) {
+      appendsFlatAddressExpressionToPostorderStack(Ptr, PostorderStack,
+                                                   Visited);
+    }

Artem-B wrote:

> It should never be wrong to include braces.

Having to deal with both Google style, which demands braces everywhere, and 
LLVM style, which does not want them, my personal choice is "whatever the style 
guide says". It may not always be a perfect choice, but it's not worth anyone's 
time to argue over specific instances where the right choice is ambiguous or a 
matter of personal preference. I wish we could delegate braces/no-braces 
decisions to clang-format, too, but I don't think it currently handles that.

I'd stick with the style guide defaults and either have the braces removed or 
a comment added to the body. Perhaps making the function name shorter and 
avoiding line-wrapping would address your readability concerns about 
braces/no-braces here, too.
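
The three options mentioned above (drop the braces, keep them with a comment in the body, or shorten the callee so the call fits on one line) can be compared side by side. This is only an illustrative sketch with stand-in types: `Worklist`, `appendFlatAddressExpressionToWorklist`, `pushFlatExpr`, and the three `collect*` functions are hypothetical names, not the helpers in InferAddressSpaces.cpp.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the pass's postorder stack.
using Worklist = std::vector<std::string>;

// Hypothetical stand-in with a deliberately long name, so the call wraps.
void appendFlatAddressExpressionToWorklist(const std::string &Ptr,
                                           Worklist &Stack) {
  assert(!Ptr.empty() && "operand name must not be empty");
  Stack.push_back(Ptr);
}

// Hypothetical shorter alias, so the call fits on one line.
void pushFlatExpr(const std::string &Ptr, Worklist &Stack) {
  appendFlatAddressExpressionToWorklist(Ptr, Stack);
}

// Option 1: braces removed, the single statement wraps across two lines.
void collectNoBraces(const std::string &Ptr, bool IsPtrLike, Worklist &Stack) {
  if (IsPtrLike)
    appendFlatAddressExpressionToWorklist(Ptr,
                                          Stack);
}

// Option 2: braces kept, with a comment in the body explaining the guard.
void collectWithComment(const std::string &Ptr, bool IsPtrLike,
                        Worklist &Stack) {
  if (IsPtrLike) {
    // The intrinsic is overloaded on any type; only pointer-like operands
    // take part in address space inference.
    appendFlatAddressExpressionToWorklist(Ptr,
                                          Stack);
  }
}

// Option 3: a shorter callee name avoids the line wrap entirely.
void collectShortName(const std::string &Ptr, bool IsPtrLike, Worklist &Stack) {
  if (IsPtrLike)
    pushFlatExpr(Ptr, Stack);
}
```

All three variants behave identically; the disagreement in this thread is purely about which one reads best.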

https://github.com/llvm/llvm-project/pull/102010


[llvm-branch-commits] [compiler-rt] release/19.x: [InstrProf] Remove duplicate definition of IntPtrT (PR #101061)

2024-08-05 Thread Mingming Liu via llvm-branch-commits

minglotus-6 wrote:

@tru thanks for the fix! 

Sorry, I haven't been regularly checking the PRs that list me as a reviewer, 
but I really should.

https://github.com/llvm/llvm-project/pull/101061


[llvm-branch-commits] [llvm] InferAddressSpaces: Handle masked load and store intrinsics (PR #102007)

2024-08-05 Thread Artem Belevich via llvm-branch-commits

https://github.com/Artem-B approved this pull request.


https://github.com/llvm/llvm-project/pull/102007

