[llvm-branch-commits] [mlir] [mlir][Interfaces][NFC] `ValueBoundsConstraintSet`: Delete dead code (PR #86098)

2024-03-21 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue approved this pull request.

LGTM :+1: 

https://github.com/llvm/llvm-project/pull/86098
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [mlir] [mlir][Interfaces][NFC] `ValueBoundsConstraintSet`: Pass stop condition in the constructor (PR #86099)

2024-03-21 Thread Benjamin Maxwell via llvm-branch-commits


@@ -316,6 +317,9 @@ class ValueBoundsConstraintSet {
 
   /// Builder for constructing affine expressions.
   Builder builder;
+
+  /// The current stop condition function.
+  StopConditionFn stopCondition = nullptr;

MacDue wrote:

Just wondering if this should be a `std::function` instead? `function_ref` 
being non-owning could lead to some surprises, e.g. by doing 
`ValueBoundsConstraintSet cstr(..., /*stopCondition=*/[&]{ ... })`.
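
For context, `llvm::function_ref` only refers to a callable and does not keep it alive, so a stored `function_ref` member can dangle once the temporary lambda passed to the constructor is destroyed. Below is a minimal sketch of the owning alternative using `std::function`. `ConstraintSetSketch` and `makeWithLimit` are hypothetical names for illustration, not the MLIR API.

```cpp
#include <cassert>
#include <functional>
#include <utility>

// Hypothetical stand-in for a constraint set; not the real MLIR class.
class ConstraintSetSketch {
public:
  using StopConditionFn = std::function<bool(int)>;

  explicit ConstraintSetSketch(StopConditionFn stop)
      : stopCondition(std::move(stop)) {}

  bool shouldStop(int depth) const { return stopCondition(depth); }

private:
  // std::function owns a copy of the callable, so it stays valid for the
  // lifetime of this object even if the caller passed a temporary lambda.
  StopConditionFn stopCondition;
};

// The lambda below is a temporary; the member above keeps its own copy.
inline ConstraintSetSketch makeWithLimit(int limit) {
  return ConstraintSetSketch([limit](int depth) { return depth >= limit; });
}
```

The trade-off is that `std::function` may allocate and is heavier to copy, which is presumably why a non-owning reference type was used for the parameter in the first place.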

https://github.com/llvm/llvm-project/pull/86099


[llvm-branch-commits] [mlir] [mlir][Interfaces][NFC] `ValueBoundsConstraintSet`: Pass stop condition in the constructor (PR #86099)

2024-03-22 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue approved this pull request.


https://github.com/llvm/llvm-project/pull/86099


[llvm-branch-commits] [mlir] [mlir][Interfaces] `ValueBoundsOpInterface`: Fix typo (PR #87976)

2024-04-08 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue approved this pull request.


https://github.com/llvm/llvm-project/pull/87976


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -1416,14 +1466,14 @@ void VPlanTransforms::addActiveLaneMask(
   auto *FoundWidenCanonicalIVUser =
   find_if(Plan.getCanonicalIV()->users(),
   [](VPUser *U) { return isa<VPWidenCanonicalIVRecipe>(U); });
-  assert(FoundWidenCanonicalIVUser &&
+  assert(FoundWidenCanonicalIVUser && *FoundWidenCanonicalIVUser &&

MacDue wrote:


This looks a little odd. Doesn't `find_if` return an iterator? 
```suggestion
  auto IVUsers = Plan.getCanonicalIV()->users();
  /// ...
  assert(FoundWidenCanonicalIVUser != IVUsers.end() &&
         "Must have widened canonical IV when tail folding!");
```
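
`llvm::find_if` is a range wrapper around `std::find_if` and, like it, returns an iterator, so "not found" is signalled by the end iterator rather than by a null value. A minimal standard-library sketch of the pattern (`containsEven` is a hypothetical helper):

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Returns true if any element is even. The result of find_if is compared
// against end(); testing the iterator itself for "truthiness" would not
// express "found".
inline bool containsEven(const std::vector<int> &values) {
  auto it = std::find_if(values.begin(), values.end(),
                         [](int v) { return v % 2 == 0; });
  return it != values.end();
}
```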

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -77,9 +77,13 @@ struct VPlanTransforms {
   /// creation) and instead it is handled using active-lane-mask. \p
   /// DataAndControlFlowWithoutRuntimeCheck implies \p
   /// UseActiveLaneMaskForControlFlow.
+  /// RTChecks refers to the pointer pairs that need aliasing elements to be
+  /// masked off each loop iteration.

MacDue wrote:

No docs for PSE?

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -3073,6 +3075,56 @@ struct VPWidenStoreEVLRecipe final : public VPWidenMemoryRecipe {
   }
 };
 
+// Given a pointer A that is being stored to, and pointer B that is being
+// read from, both with unknown lengths, create a mask that disables
+// elements which could overlap across a loop iteration. For example, if A
+// is X and B is X + 2 with VF being 4, only the final two elements of the
+// loaded vector can be stored since they don't overlap with the stored
+// vector. %b.vec = load %b ; = [s, t, u, v]
+// [...]
+// store %a, %b.vec ; only u and v can be stored as their addresses don't
+// overlap with %a + (VF - 1)

MacDue wrote:

Is this specifically RAW? That is, something like:
```
store A[x]
load A[x + 2] 
```
Perhaps I'm muddled on what "final two elements" means, but isn't it the first two 
elements of the store that are valid (so it won't overwrite the elements for the load)?
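
One way to make the question concrete is a scalar model of the example above. The sketch below is not the semantics of the intrinsic in the PR. It assumes a loop that stores to `A[i]` and loads from `A[i + distance]` in each iteration, with the vector store issued before the vector load. Under that assumption a store lane is safe only if it does not write an element that the same vector load still needs to read unmodified, which holds for lanes `i < distance`. `safeLanes`, `distance`, and `vf` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumed model: lane i is safe when i < distance (distance in elements
// between the stored-to and loaded-from addresses, taken to be positive).
inline std::vector<bool> safeLanes(std::size_t distance, std::size_t vf) {
  std::vector<bool> mask(vf);
  for (std::size_t lane = 0; lane < vf; ++lane)
    mask[lane] = lane < distance;
  return mask;
}
```

With `distance = 2` and `vf = 4` this model enables the first two lanes, which matches the reading in the question rather than the "final two elements" wording of the quoted comment.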



https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -427,6 +428,29 @@ Value *VPInstruction::generate(VPTransformState &State) {
{PredTy, ScalarTC->getType()},
{VIVElem0, ScalarTC}, nullptr, Name);
   }
+  // Count the number of bits set in each lane and reduce the result to a scalar
+  case VPInstruction::PopCount: {
+Value *Op = State.get(getOperand(0));
+auto *VT = Op->getType();

MacDue wrote:

nit: Spell out type if it's not present on the RHS.
```suggestion
Type *VT = Op->getType();
```

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -1300,14 +1301,38 @@ static VPActiveLaneMaskPHIRecipe *addVPLaneMaskPhiAndUpdateExitBranch(
   cast<VPInstruction>(CanonicalIVPHI->getBackedgeValue());
   // TODO: Check if dropping the flags is needed if
   // !DataAndControlFlowWithoutRuntimeCheck.
+  VPValue *IncVal = CanonicalIVIncrement->getOperand(1);
+  assert(IncVal != CanonicalIVPHI && "Unexpected operand order");
+
   CanonicalIVIncrement->dropPoisonGeneratingFlags();
   DebugLoc DL = CanonicalIVIncrement->getDebugLoc();
+
   // We can't use StartV directly in the ActiveLaneMask VPInstruction, since
   // we have to take unrolling into account. Each part needs to start at
   //   Part * VF
   auto *VecPreheader = Plan.getVectorPreheader();
   VPBuilder Builder(VecPreheader);
 
+  // Create an alias mask for each possibly-aliasing pointer pair. If there
+  // are multiple they are combined together with ANDs.
+  VPValue *AliasMask = nullptr;
+
+  for (auto C : RTChecks) {
+// FIXME: How to pass this info back?
+//HasAliasMask = true;

MacDue wrote:

This FIXME is a little unclear. Does it mean `HasAliasMask` should be set here 
but it's not?

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -195,6 +195,13 @@ enum class TailFoldingStyle {
   DataWithEVL,
 };
 
+enum class RTCheckStyle {
+  /// Branch to scalar loop if checks fails at runtime.
+  ScalarFallback,
+  /// Form a mask based on elements which won't be a WAR or RAW hazard

MacDue wrote:

ultra nit: One of these comments ends with a full-stop and the other does not.
```suggestion
  /// Branch to scalar loop if checks fails at runtime.
  ScalarFallback,
  /// Form a mask based on elements which won't be a WAR or RAW hazard.
```

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue edited 
https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -1331,14 +1356,37 @@ static VPActiveLaneMaskPHIRecipe *addVPLaneMaskPhiAndUpdateExitBranch(
   "index.part.next");
 
   // Create the active lane mask instruction in the VPlan preheader.
-  auto *EntryALM =
+  VPValue *Mask =
   Builder.createNaryOp(VPInstruction::ActiveLaneMask, {EntryIncrement, TC},
DL, "active.lane.mask.entry");
 
   // Now create the ActiveLaneMaskPhi recipe in the main loop using the
   // preheader ActiveLaneMask instruction.
-  auto *LaneMaskPhi = new VPActiveLaneMaskPHIRecipe(EntryALM, DebugLoc());
+  auto *LaneMaskPhi = new VPActiveLaneMaskPHIRecipe(Mask, DebugLoc());
   LaneMaskPhi->insertAfter(CanonicalIVPHI);
+  VPValue *LaneMask = LaneMaskPhi;
+  if (AliasMask) {
+// Increment phi by correct amount.
+Builder.setInsertPoint(CanonicalIVIncrement);
+
+VPValue *IncrementBy = Builder.createNaryOp(VPInstruction::PopCount,
+{AliasMask}, DL, "popcount");
+Type *IVType = CanonicalIVPHI->getScalarType();
+
+if (IVType->getScalarSizeInBits() < 64) {
+  auto *Cast =
+  new VPScalarCastRecipe(Instruction::Trunc, IncrementBy, IVType);
+  Cast->insertAfter(IncrementBy->getDefiningRecipe());
+  IncrementBy = Cast;
+}
+CanonicalIVIncrement->setOperand(1, IncrementBy);
+
+// And the alias mask so the iteration only processes non-aliasing lanes
+Builder.setInsertPoint(CanonicalIVPHI->getParent(),
+   CanonicalIVPHI->getParent()->getFirstNonPhi());
+LaneMask = Builder.createNaryOp(Instruction::BinaryOps::And,
+{LaneMaskPhi, AliasMask}, DL);

MacDue wrote:

Do we know this AND won't be all-false?
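
To illustrate the concern: in the quoted hunk the induction variable is advanced by `popcount(AliasMask)` and the lanes processed are `LaneMaskPhi & AliasMask`. A small model with `std::bitset` shows the case being asked about, where the combined mask has no active lanes. The fixed `VF` of 4 and the helper names are assumptions for illustration only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t VF = 4; // hypothetical fixed vectorization factor

// Lanes actually processed: the AND of the two masks, as in the diff.
inline std::bitset<VF> combineMasks(std::bitset<VF> laneMask,
                                    std::bitset<VF> aliasMask) {
  return laneMask & aliasMask;
}

// Amount the induction variable advances by: popcount of the alias mask.
inline std::size_t incrementFor(std::bitset<VF> aliasMask) {
  return aliasMask.count();
}

// An iteration does useful work only if some lane is active and the
// induction variable moves forward.
inline bool makesProgress(std::bitset<VF> laneMask, std::bitset<VF> aliasMask) {
  return combineMasks(laneMask, aliasMask).any() && incrementFor(aliasMask) > 0;
}
```

In this model, disjoint masks (for example `1100` and `0011`) give an all-false result, so an iteration would process nothing; whether that can occur in the real transform is exactly what the question asks.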

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -3235,6 +3263,36 @@ void VPWidenPointerInductionRecipe::print(raw_ostream &O, const Twine &Indent,
 }
 #endif
 
+void VPAliasLaneMaskRecipe::execute(VPTransformState &State) {
+  IRBuilderBase Builder = State.Builder;
+  Value *SinkValue = State.get(getSinkValue(), true);
+  Value *SourceValue = State.get(getSourceValue(), true);
+
+  auto *Type = SinkValue->getType();
+  Value *AliasMask = Builder.CreateIntrinsic(
+  Intrinsic::experimental_get_alias_lane_mask,
+  {VectorType::get(Builder.getInt1Ty(), State.VF), Type,
+   Builder.getInt64Ty()},
+  {SourceValue, SinkValue, Builder.getInt64(getAccessedElementSize()),
+   Builder.getInt1(WriteAfterRead)},
+  nullptr, "alias.lane.mask");
+  State.set(this, AliasMask, /*IsScalar=*/false);
+}
+
+#if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
+void VPAliasLaneMaskRecipe::print(raw_ostream &O, const Twine &Indent,
+  VPSlotTracker &SlotTracker) const {
+  O << Indent << "EMIT ";
+  getVPSingleValue()->printAsOperand(O, SlotTracker);
+  O << " = alias lane mask ";

MacDue wrote:

nit: These seem more commonly printed in all caps with hyphens. 
```suggestion
  O << " = ALIAS-LANE-MASK ";
```

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -952,7 +952,6 @@ void VPlan::prepareToExecute(Value *TripCountV, Value *VectorTripCountV,
 
   IRBuilder<> Builder(State.CFG.PrevBB->getTerminator());
   // FIXME: Model VF * UF computation completely in VPlan.
-  assert(VFxUF.getNumUsers() && "VFxUF expected to always have users");

MacDue wrote:

How does removing this assert relate to these changes? 

https://github.com/llvm/llvm-project/pull/100579
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -3235,6 +3263,36 @@ void VPWidenPointerInductionRecipe::print(raw_ostream &O, const Twine &Indent,
 }
 #endif
 
+void VPAliasLaneMaskRecipe::execute(VPTransformState &State) {
+  IRBuilderBase Builder = State.Builder;
+  Value *SinkValue = State.get(getSinkValue(), true);
+  Value *SourceValue = State.get(getSourceValue(), true);
+
+  auto *Type = SinkValue->getType();

MacDue wrote:

nit:
```suggestion
  Type *PtrType = SinkValue->getType();
```

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue commented:

A bunch of little comments (mostly just nitpicks from a pass over the PR) 
:slightly_smiling_face: 

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] [LV] Mask off possibly aliasing vector lanes (PR #100579)

2024-12-09 Thread Benjamin Maxwell via llvm-branch-commits


@@ -418,7 +418,13 @@ class LoopVectorizationPlanner {
   /// Build VPlans for the specified \p UserVF and \p UserIC if they are
   /// non-zero or all applicable candidate VFs otherwise. If vectorization and
   /// interleaving should be avoided up-front, no plans are generated.
-  void plan(ElementCount UserVF, unsigned UserIC);
+  /// RTChecks is a list of pointer pairs that should be checked for aliasing,
+  /// setting HasAliasMask to true in the case that an alias mask is generated

MacDue wrote:

Outdated comment? Is this `DiffChecks` now?

https://github.com/llvm/llvm-project/pull/100579


[llvm-branch-commits] [llvm] release/20.x: [AArch64][SME] [AArch64][SME] Spill p-regs as z-regs when streaming hazards are possible (PR #126503)

2025-02-10 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue edited 
https://github.com/llvm/llvm-project/pull/126503


[llvm-branch-commits] [llvm] release/20.x: [AArch64][SME] [AArch64][SME] Spill p-regs as z-regs when streaming hazards are possible (PR #126503)

2025-02-10 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue milestoned 
https://github.com/llvm/llvm-project/pull/126503


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-22 Thread Benjamin Maxwell via llvm-branch-commits


@@ -253,38 +253,38 @@ define i64 @not_dotp_i8_to_i64_has_neon_dotprod(ptr readonly %a, ptr readonly %b
 ; CHECK-MAXBW-SAME: ptr readonly [[A:%.*]], ptr readonly [[B:%.*]]) #[[ATTR1:[0-9]+]] {
 ; CHECK-MAXBW-NEXT:  entry:
 ; CHECK-MAXBW-NEXT:    [[TMP0:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:    [[TMP1:%.*]] = mul i64 [[TMP0]], 8
+; CHECK-MAXBW-NEXT:    [[TMP1:%.*]] = mul i64 [[TMP0]], 16
 ; CHECK-MAXBW-NEXT:    br i1 false, label [[SCALAR_PH:%.*]], label [[VECTOR_PH:%.*]]
 ; CHECK-MAXBW:   vector.ph:
 ; CHECK-MAXBW-NEXT:    [[TMP2:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:    [[TMP3:%.*]] = mul i64 [[TMP2]], 8
+; CHECK-MAXBW-NEXT:    [[TMP3:%.*]] = mul i64 [[TMP2]], 16
 ; CHECK-MAXBW-NEXT:    [[N_MOD_VF:%.*]] = urem i64 1024, [[TMP3]]
 ; CHECK-MAXBW-NEXT:    [[N_VEC:%.*]] = sub i64 1024, [[N_MOD_VF]]
 ; CHECK-MAXBW-NEXT:    [[TMP4:%.*]] = call i64 @llvm.vscale.i64()
-; CHECK-MAXBW-NEXT:    [[TMP5:%.*]] = mul i64 [[TMP4]], 8
+; CHECK-MAXBW-NEXT:    [[TMP5:%.*]] = mul i64 [[TMP4]], 16
 ; CHECK-MAXBW-NEXT:    [[TMP6:%.*]] = getelementptr i8, ptr [[A]], i64 [[N_VEC]]
 ; CHECK-MAXBW-NEXT:    [[TMP7:%.*]] = getelementptr i8, ptr [[B]], i64 [[N_VEC]]
 ; CHECK-MAXBW-NEXT:    br label [[VECTOR_BODY:%.*]]
 ; CHECK-MAXBW:   vector.body:
 ; CHECK-MAXBW-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, [[VECTOR_PH]] ], [ [[INDEX_NEXT:%.*]], [[VECTOR_BODY]] ]
-; CHECK-MAXBW-NEXT:    [[VEC_PHI:%.*]] = phi <vscale x 8 x i64> [ zeroinitializer, [[VECTOR_PH]] ], [ [[TMP15:%.*]], [[VECTOR_BODY]] ]
+; CHECK-MAXBW-NEXT:    [[VEC_PHI:%.*]] = phi <vscale x 2 x i64> [ zeroinitializer, [[VECTOR_PH]] ], [ [[PARTIAL_REDUCE:%.*]], [[VECTOR_BODY]] ]
 ; CHECK-MAXBW-NEXT:    [[TMP8:%.*]] = add i64 [[INDEX]], 0
 ; CHECK-MAXBW-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, ptr [[A]], i64 [[TMP8]]
 ; CHECK-MAXBW-NEXT:    [[TMP9:%.*]] = add i64 [[INDEX]], 0
 ; CHECK-MAXBW-NEXT:    [[NEXT_GEP1:%.*]] = getelementptr i8, ptr [[B]], i64 [[TMP9]]
 ; CHECK-MAXBW-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr [[NEXT_GEP]], i32 0
-; CHECK-MAXBW-NEXT:    [[WIDE_LOAD:%.*]] = load <vscale x 8 x i8>, ptr [[TMP10]], align 1
-; CHECK-MAXBW-NEXT:    [[TMP11:%.*]] = zext <vscale x 8 x i8> [[WIDE_LOAD]] to <vscale x 8 x i64>
+; CHECK-MAXBW-NEXT:    [[WIDE_LOAD:%.*]] = load <vscale x 16 x i8>, ptr [[TMP10]], align 1
 ; CHECK-MAXBW-NEXT:    [[TMP12:%.*]] = getelementptr i8, ptr [[NEXT_GEP1]], i32 0
-; CHECK-MAXBW-NEXT:    [[WIDE_LOAD2:%.*]] = load <vscale x 8 x i8>, ptr [[TMP12]], align 1
-; CHECK-MAXBW-NEXT:    [[TMP13:%.*]] = zext <vscale x 8 x i8> [[WIDE_LOAD2]] to <vscale x 8 x i64>
-; CHECK-MAXBW-NEXT:    [[TMP14:%.*]] = mul nuw nsw <vscale x 8 x i64> [[TMP13]], [[TMP11]]
-; CHECK-MAXBW-NEXT:    [[TMP15]] = add <vscale x 8 x i64> [[TMP14]], [[VEC_PHI]]
+; CHECK-MAXBW-NEXT:    [[WIDE_LOAD2:%.*]] = load <vscale x 16 x i8>, ptr [[TMP12]], align 1
+; CHECK-MAXBW-NEXT:    [[TMP15:%.*]] = zext <vscale x 16 x i8> [[WIDE_LOAD2]] to <vscale x 16 x i64>
+; CHECK-MAXBW-NEXT:    [[TMP13:%.*]] = zext <vscale x 16 x i8> [[WIDE_LOAD]] to <vscale x 16 x i64>
+; CHECK-MAXBW-NEXT:    [[TMP14:%.*]] = mul nuw nsw <vscale x 16 x i64> [[TMP15]], [[TMP13]]
+; CHECK-MAXBW-NEXT:    [[PARTIAL_REDUCE]] = call <vscale x 2 x i64> @llvm.experimental.vector.partial.reduce.add.nxv2i64.nxv16i64(<vscale x 2 x i64> [[VEC_PHI]], <vscale x 16 x i64> [[TMP14]])

MacDue wrote:

This test is called "not_dotp" but now looks like it's dotp 
:slightly_smiling_face: IIRC this won't map directly to a dot product instruction 
(as `nxv16i64` to `nxv2i64` is not supported at the moment). 

https://github.com/llvm/llvm-project/pull/136173


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-22 Thread Benjamin Maxwell via llvm-branch-commits


@@ -2376,6 +2327,59 @@ class VPReductionRecipe : public VPRecipeWithIRFlags {
   }
 };
 
+/// A recipe for forming partial reductions. In the loop, an accumulator and
+/// vector operand are added together and passed to the next iteration as the
+/// next accumulator. After the loop body, the accumulator is reduced to a
+/// scalar value.
+class VPPartialReductionRecipe : public VPReductionRecipe {

MacDue wrote:

Should the `classof` for `VPReductionRecipe` now include 
`VPPartialReductionRecipe`? 
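
For context, LLVM-style RTTI (`isa`/`dyn_cast`) dispatches through a static `classof` on the target class, so a base class's `classof` has to accept the kind of every subclass or `isa<Base>` will reject derived objects. A self-contained sketch with hypothetical types (`RecipeSketch`, `ReductionSketch`, `isaSketch`; these are not the VPlan classes):

```cpp
#include <cassert>

struct RecipeSketch {
  enum Kind { K_Widen, K_Reduction, K_PartialReduction };
  explicit RecipeSketch(Kind k) : kind(k) {}
  virtual ~RecipeSketch() = default;
  Kind getKind() const { return kind; }

private:
  Kind kind;
};

struct ReductionSketch : RecipeSketch {
  explicit ReductionSketch(Kind k = K_Reduction) : RecipeSketch(k) {}
  // Accepts its own kind and the kind of the subclass below; leaving the
  // second check out would make isaSketch<ReductionSketch> fail for
  // PartialReductionSketch objects.
  static bool classof(const RecipeSketch *r) {
    return r->getKind() == K_Reduction || r->getKind() == K_PartialReduction;
  }
};

struct PartialReductionSketch : ReductionSketch {
  PartialReductionSketch() : ReductionSketch(K_PartialReduction) {}
  static bool classof(const RecipeSketch *r) {
    return r->getKind() == K_PartialReduction;
  }
};

// Minimal stand-in for llvm::isa, for illustration only.
template <typename To> bool isaSketch(const RecipeSketch *r) {
  return To::classof(r);
}
```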

https://github.com/llvm/llvm-project/pull/136173


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-22 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue edited 
https://github.com/llvm/llvm-project/pull/136173
___
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits


[llvm-branch-commits] [llvm] release/20.x: [AArch64][SME] Prevent spills of ZT0 when ZA is not enabled (PR #137683)

2025-04-28 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue created 
https://github.com/llvm/llvm-project/pull/137683

This cherry-picks https://github.com/llvm/llvm-project/pull/132722 and 
https://github.com/llvm/llvm-project/pull/136726 (the latter is based on the 
former). 

These patches are needed to prevent invalid codegen as attempting to store ZT0 
without ZA enabled results in a SIGILL.
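
The first cherry-picked patch adds one more condition to `requiresPreservingZT0`, so that ZT0 is not spilled around SME ABI routines such as `__arm_tpidr2_save`. A standalone model of that predicate is sketched here. `SMEAttrsSketch` and its boolean fields are hypothetical simplifications; the real `SMEAttrs` class stores these properties in a bitmask.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for SMEAttrs.
struct SMEAttrsSketch {
  bool newZT0 = false;
  bool sharesZT0 = false;
  bool agnosticZA = false;
  bool smeABIRoutine = false; // e.g. a call to __arm_tpidr2_save

  bool hasZT0State() const { return newZT0 || sharesZT0; }

  // Mirrors the shape of the patched condition: a caller with ZT0 state
  // preserves ZT0 around a call unless the callee shares ZT0, has an
  // agnostic-ZA interface, or is an SME ABI routine.
  bool requiresPreservingZT0(const SMEAttrsSketch &callee) const {
    return hasZT0State() && !callee.sharesZT0 && !callee.agnosticZA &&
           !callee.smeABIRoutine;
  }
};
```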

From c2e81b014aebc262b4db59eb7fbdde2b1376a39a Mon Sep 17 00:00:00 2001
From: Benjamin Maxwell 
Date: Tue, 25 Mar 2025 10:09:25 +
Subject: [PATCH 1/2] [AArch64][SME2] Don't preserve ZT0 around SME ABI
 routines (#132722)

This caused ZT0 to be preserved around `__arm_tpidr2_save` in functions
with "aarch64_new_zt0". The block in which `__arm_tpidr2_save` is called
is added by the SMEABIPass and may be reachable in cases where ZA has
not been enabled* (so using `str zt0` is invalid).

* (when za_save_buffer is null and num_za_save_slices is zero)
---
 .../AArch64/Utils/AArch64SMEAttributes.h  |  3 +-
 .../AArch64/sme-disable-gisel-fisel.ll|  9 +--
 llvm/test/CodeGen/AArch64/sme-zt0-state.ll| 61 +--
 3 files changed, 46 insertions(+), 27 deletions(-)

diff --git a/llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h b/llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h
index fb093da70c46b..a3ebf764a6e0c 100644
--- a/llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h
+++ b/llvm/lib/Target/AArch64/Utils/AArch64SMEAttributes.h
@@ -133,7 +133,8 @@ class SMEAttrs {
   bool hasZT0State() const { return isNewZT0() || sharesZT0(); }
   bool requiresPreservingZT0(const SMEAttrs &Callee) const {
 return hasZT0State() && !Callee.sharesZT0() &&
-   !Callee.hasAgnosticZAInterface();
+   !Callee.hasAgnosticZAInterface() &&
+   !(Callee.Bitmask & SME_ABI_Routine);
   }
   bool requiresDisablingZABeforeCall(const SMEAttrs &Callee) const {
 return hasZT0State() && !hasZAState() && Callee.hasPrivateZAInterface() &&
diff --git a/llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll b/llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
index 33d08beae2ca7..4a52bf27a7591 100644
--- a/llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
+++ b/llvm/test/CodeGen/AArch64/sme-disable-gisel-fisel.ll
@@ -475,16 +475,12 @@ declare double @zt0_shared_callee(double) "aarch64_inout_zt0"
 define double  @zt0_new_caller_to_zt0_shared_callee(double %x) nounwind noinline optnone "aarch64_new_zt0" {
 ; CHECK-COMMON-LABEL: zt0_new_caller_to_zt0_shared_callee:
 ; CHECK-COMMON:   // %bb.0: // %prelude
-; CHECK-COMMON-NEXT:sub sp, sp, #80
-; CHECK-COMMON-NEXT:str x30, [sp, #64] // 8-byte Folded Spill
+; CHECK-COMMON-NEXT:str x30, [sp, #-16]! // 8-byte Folded Spill
 ; CHECK-COMMON-NEXT:mrs x8, TPIDR2_EL0
 ; CHECK-COMMON-NEXT:cbz x8, .LBB13_2
 ; CHECK-COMMON-NEXT:b .LBB13_1
 ; CHECK-COMMON-NEXT:  .LBB13_1: // %save.za
-; CHECK-COMMON-NEXT:mov x8, sp
-; CHECK-COMMON-NEXT:str zt0, [x8]
 ; CHECK-COMMON-NEXT:bl __arm_tpidr2_save
-; CHECK-COMMON-NEXT:ldr zt0, [x8]
 ; CHECK-COMMON-NEXT:msr TPIDR2_EL0, xzr
 ; CHECK-COMMON-NEXT:b .LBB13_2
 ; CHECK-COMMON-NEXT:  .LBB13_2: // %entry
@@ -495,8 +491,7 @@ define double  @zt0_new_caller_to_zt0_shared_callee(double %x) nounwind noinline
 ; CHECK-COMMON-NEXT:fmov d1, x8
 ; CHECK-COMMON-NEXT:fadd d0, d0, d1
 ; CHECK-COMMON-NEXT:smstop za
-; CHECK-COMMON-NEXT:ldr x30, [sp, #64] // 8-byte Folded Reload
-; CHECK-COMMON-NEXT:add sp, sp, #80
+; CHECK-COMMON-NEXT:ldr x30, [sp], #16 // 8-byte Folded Reload
 ; CHECK-COMMON-NEXT:ret
 entry:
   %call = call double @zt0_shared_callee(double %x)
diff --git a/llvm/test/CodeGen/AArch64/sme-zt0-state.ll b/llvm/test/CodeGen/AArch64/sme-zt0-state.ll
index 312537630e77a..500fff4eb20db 100644
--- a/llvm/test/CodeGen/AArch64/sme-zt0-state.ll
+++ b/llvm/test/CodeGen/AArch64/sme-zt0-state.ll
@@ -112,7 +112,7 @@ define void @za_zt0_shared_caller_za_zt0_shared_callee() "aarch64_inout_za" "aar
   ret void;
 }
 
-; New-ZA Callee
+; New-ZT0 Callee
 
 ; Expect spill & fill of ZT0 around call
 ; Expect smstop/smstart za around call
@@ -134,6 +134,39 @@ define void @zt0_in_caller_zt0_new_callee() "aarch64_in_zt0" nounwind {
   ret void;
 }
 
+; New-ZT0 Callee
+
+; Expect commit of lazy-save if ZA is dormant
+; Expect smstart ZA & clear ZT0
+; Expect spill & fill of ZT0 around call
+; Before return, expect smstop ZA
+define void @zt0_new_caller_zt0_new_callee() "aarch64_new_zt0" nounwind {
+; CHECK-LABEL: zt0_new_caller_zt0_new_callee:
+; CHECK:   // %bb.0: // %prelude
+; CHECK-NEXT:sub sp, sp, #80
+; CHECK-NEXT:stp x30, x19, [sp, #64] // 16-byte Folded Spill
+; CHECK-NEXT:mrs x8, TPIDR2_EL0
+; CHECK-NEXT:cbz x8, .LBB6_2
+; CHECK-NEXT:  // %bb.1: // %save.za
+; CHECK-NEXT:bl __arm_tpidr2_save
+; CHECK-NEXT:msr TPIDR2_EL0, xzr
+; CHECK-NEXT:  .LBB6_2:
+; CHECK-NEXT:smstart za
+; CHECK-NEXT:zero { zt0 }
+; CHECK-NEXT:mov x19, sp
+; CHECK-NEXT:

[llvm-branch-commits] [llvm] release/20.x: [AArch64][SME] Prevent spills of ZT0 when ZA is not enabled (PR #137683)

2025-04-28 Thread Benjamin Maxwell via llvm-branch-commits

https://github.com/MacDue milestoned 
https://github.com/llvm/llvm-project/pull/137683


[llvm-branch-commits] [llvm] release/20.x: [AArch64][SME] Prevent spills of ZT0 when ZA is not enabled (PR #137683)

2025-04-30 Thread Benjamin Maxwell via llvm-branch-commits

MacDue wrote:

@sdesmalen-arm What do you think about merging this PR to the release branch?

https://github.com/llvm/llvm-project/pull/137683


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-05-01 Thread Benjamin Maxwell via llvm-branch-commits


@@ -4923,9 +4923,7 @@ InstructionCost AArch64TTIImpl::getPartialReductionCost(
 return Invalid;
   break;
 case 16:
-  if (AccumEVT == MVT::i64)
-Cost *= 2;
-  else if (AccumEVT != MVT::i32)
+  if (AccumEVT != MVT::i32)

MacDue wrote:

If we allow this case, make sure to rename the test from "not_dotp" to "dotp".

https://github.com/llvm/llvm-project/pull/136173


[llvm-branch-commits] [llvm] [LoopVectorizer] Bundle partial reductions inside VPMulAccumulateReductionRecipe (PR #136173)

2025-04-24 Thread Benjamin Maxwell via llvm-branch-commits


@@ -4923,9 +4923,7 @@ InstructionCost AArch64TTIImpl::getPartialReductionCost(
 return Invalid;
   break;
 case 16:
-  if (AccumEVT == MVT::i64)
-Cost *= 2;
-  else if (AccumEVT != MVT::i32)
+  if (AccumEVT != MVT::i32)

MacDue wrote:

It's due to: 
https://github.com/llvm/llvm-project/pull/136173#discussion_r2053920360

https://github.com/llvm/llvm-project/pull/136173


[llvm-branch-commits] [llvm] release/20.x: [SDAG] Ensure load is included in output chain of sincos expansion (#140525) (PR #140703)

2025-05-20 Thread Benjamin Maxwell via llvm-branch-commits

MacDue wrote:

Not sure why the bot is asking me (I think it's fine, but I requested the 
backport). 

cc @arsenm, @RKSimon 
 

https://github.com/llvm/llvm-project/pull/140703