================
@@ -8974,11 +8982,104 @@ void LoopVectorizationPlanner::attachRuntimeChecks(
     assert((!CM.OptForSize ||
             CM.Hints->getForce() == LoopVectorizeHints::FK_Enabled) &&
            "Cannot SCEV check stride or overflow when optimizing for size");
-    VPlanTransforms::attachCheckBlock(Plan, SCEVCheckCond, SCEVCheckBlock,
+    VPlanTransforms::attachCheckBlock(Plan, Plan.getOrAddLiveIn(SCEVCheckCond),
+                                      Plan.createVPIRBasicBlock(SCEVCheckBlock),
                                       HasBranchWeights);
   }
   const auto &[MemCheckCond, MemCheckBlock] = RTChecks.getMemRuntimeChecks();
   if (MemCheckBlock && MemCheckBlock->hasNPredecessors(0)) {
+    VPValue *MemCheckCondVPV = Plan.getOrAddLiveIn(MemCheckCond);
+    VPBasicBlock *MemCheckBlockVP = Plan.createVPIRBasicBlock(MemCheckBlock);
+    std::optional<ArrayRef<PointerDiffInfo>> ChecksOpt =
+        CM.Legal->getRuntimePointerChecking()->getDiffChecks();
+
+    // Create a mask enabling safe elements for each iteration.
+    if (CM.getRTCheckStyle(TTI) == RTCheckStyle::UseSafeEltsMask &&
+        ChecksOpt.has_value() && ChecksOpt->size() > 0) {
+      ArrayRef<PointerDiffInfo> Checks = *ChecksOpt;
+      VPRegionBlock *LoopRegion = Plan.getVectorLoopRegion();
+      VPBasicBlock *LoopBody = LoopRegion->getEntryBasicBlock();
+      VPBuilder Builder(MemCheckBlockVP);
+
+      /// Create a mask for each possibly-aliasing pointer pair, ANDing them if
+      /// there's more than one pair.
+      VPValue *AliasMask = nullptr;
+      for (PointerDiffInfo Check : Checks) {
+        VPValue *Sink =
+            vputils::getOrCreateVPValueForSCEVExpr(Plan, Check.SinkStart);
+        VPValue *Src =
+            vputils::getOrCreateVPValueForSCEVExpr(Plan, Check.SrcStart);
+
+        Type *PtrType = PointerType::getUnqual(Plan.getContext());
+        Sink = Builder.createScalarCast(Instruction::CastOps::IntToPtr, Sink,
+                                        PtrType, DebugLoc());
+        Src = Builder.createScalarCast(Instruction::CastOps::IntToPtr, Src,
+                                       PtrType, DebugLoc());
+
+        SmallVector<VPValue *, 3> Ops{
+            Src, Sink,
+            Plan.getConstantInt(IntegerType::getInt64Ty(Plan.getContext()),
+                                Check.AccessSize)};
+        VPWidenIntrinsicRecipe *M = new VPWidenIntrinsicRecipe(
+            Check.WriteAfterRead ? Intrinsic::loop_dependence_war_mask
+                                 : Intrinsic::loop_dependence_raw_mask,
+            Ops, IntegerType::getInt1Ty(Plan.getContext()));
+        MemCheckBlockVP->appendRecipe(M);
+        if (AliasMask)
+          AliasMask = Builder.createAnd(AliasMask, M);
+        else
+          AliasMask = M;
+      }
+      assert(AliasMask && "Expected an alias mask to have been created");
+
+      // Replace uses of the loop body's active lane mask phi with an AND of the
+      // phi and the alias mask.
+      for (VPRecipeBase &R : *LoopBody) {
+        auto *MaskPhi = dyn_cast<VPActiveLaneMaskPHIRecipe>(&R);
----------------
sdesmalen-arm wrote:

I believe the transform is currently incorrect. When there is no active lane 
mask, it would create an unpredicated vector loop that handles e.g. VF=16 lanes 
in a loop, even when the result of the alias.mask would say that only 3 lanes 
could be safely handled, for example. It would then increment the IV by 3 
elements, but that doesn't mean only 3 lanes are handled each iteration. 
Without predication, it still handles 16 lanes each iteration.

I think there are two options here:
1) if there is no active lane mask in the loop, we could bail out to the scalar 
loop if the number of lanes < VF
2) request the use of an active lane mask in the loop for data when there is an 
alias mask required and the target supports the use of an active lane mask.

I wouldn't mind taking approach 1 first, so that we can already use the whilerw 
instructions for the alias checks in the check block, rather than a bunch of 
scalar instructions, and then follow this up by option 2.

https://github.com/llvm/llvm-project/pull/100579
_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

Reply via email to