[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)

Fabian Ritter via llvm-branch-commits Thu, 05 Jun 2025 00:17:06 -0700

================
@@ -2627,6 +2629,93 @@ SDValue DAGCombiner::foldSubToAvg(SDNode *N, const SDLoc 
&DL) {
   return SDValue();
 }
 
+/// Try to fold a pointer arithmetic node.
+/// This needs to be done separately from normal addition, because pointer
+/// addition is not commutative.
+SDValue DAGCombiner::visitPTRADD(SDNode *N) {
+  SDValue N0 = N->getOperand(0);
+  SDValue N1 = N->getOperand(1);
+  EVT PtrVT = N0.getValueType();
+  EVT IntVT = N1.getValueType();
+  SDLoc DL(N);
+
+  // This is already ensured by an assert in SelectionDAG::getNode(). Several
+  // combines here depend on this assumption.
+  assert(PtrVT == IntVT &&
+         "PTRADD with different operand types is not supported");
+
+  // fold (ptradd undef, y) -> undef
+  if (N0.isUndef())
+    return N0;
+
+  // fold (ptradd x, undef) -> undef
+  if (N1.isUndef())
+    return DAG.getUNDEF(PtrVT);
+
+  // fold (ptradd x, 0) -> x
+  if (isNullConstant(N1))
+    return N0;
+
+  // fold (ptradd 0, x) -> x
+  if (isNullConstant(N0))
+    return N1;
+
+  if (N0.getOpcode() == ISD::PTRADD &&
+      !reassociationCanBreakAddressingModePattern(ISD::PTRADD, DL, N, N0, N1)) 
{
+    SDValue X = N0.getOperand(0);
+    SDValue Y = N0.getOperand(1);
+    SDValue Z = N1;
+    bool N0OneUse = N0.hasOneUse();
+    bool YIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Y);
+    bool ZIsConstant = DAG.isConstantIntBuildVectorOrConstantInt(Z);
+
+    // (ptradd (ptradd x, y), z) -> (ptradd x, (add y, z)) if:
+    //   * y is a constant and (ptradd x, y) has one use; or
+    //   * y and z are both constants.
+    if ((YIsConstant && N0OneUse) || (YIsConstant && ZIsConstant)) {
+      SDNodeFlags Flags;
+      // If both additions in the original were NUW, the new ones are as well.
+      if (N->getFlags().hasNoUnsignedWrap() &&
+          N0->getFlags().hasNoUnsignedWrap())
+        Flags |= SDNodeFlags::NoUnsignedWrap;
+      SDValue Add = DAG.getNode(ISD::ADD, DL, IntVT, {Y, Z}, Flags);
+      AddToWorklist(Add.getNode());
+      return DAG.getMemBasePlusOffset(X, Add, DL, Flags);
+    }
+
+    // TODO: There is another possible fold here that was proven useful.
+    // It would be this:
+    //
+    // (ptradd (ptradd x, y), z) -> (ptradd (ptradd x, z), y) if:
+    //   * (ptradd x, y) has one use; and
+    //   * y is a constant; and
+    //   * z is not a constant.
+    //
+    // In some cases, specifically in AArch64's FEAT_CPA, it exposes the
+    // opportunity to select more complex instructions such as SUBPT and
+    // MSUBPT. However, a hypothetical corner case has been found that we could
+    // not avoid. Consider this (pseudo-POSIX C):
+    //
+    // char *foo(char *x, int z) {return (x + LARGE_CONSTANT) + z;}
+    // char *p = mmap(LARGE_CONSTANT);
+    // char *q = foo(p, -LARGE_CONSTANT);
+    //
+    // Then x + LARGE_CONSTANT is one-past-the-end, so valid, and a
+    // further + z takes it back to the start of the mapping, so valid,
+    // regardless of the address mmap gave back. However, if mmap gives you an
+    // address < LARGE_CONSTANT (ignoring high bits), x - LARGE_CONSTANT will
+    // borrow from the high bits (with the subsequent + z carrying back into
+    // the high bits to give you a well-defined pointer) and thus trip
+    // FEAT_CPA's pointer corruption checks.
+    //
+    // We leave this fold as an opportunity for future work, addressing the
+    // corner case for FEAT_CPA, as well as reconciling the solution with the
+    // more general application of pointer arithmetic in other future targets.
----------------
ritter-x2a wrote:


My vague idea of handling this properly in the future would be to
- have an `inbounds` flag on `PTRADD` nodes (see #131862),
- have backends generate instructions that break for out-of-bounds arithmetic 
only when the `inbounds` flag is present, and
- give targets an option to request that transformations that would generate 
`PTRADD`s without `inbounds` flag are not applied.

I think that would give things like the CPA implementation a better semantic 
footing, since otherwise they would just be miscompiling the IR's 
`getelementptr`s without `inbounds` flags. However, at least the last point 
above is currently not on my critical path, so I'm open to adding the comment 
here or moving the other transform here.

https://github.com/llvm/llvm-project/pull/142739
_______________________________________________
llvm-branch-commits mailing list
llvm-branch-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits

[llvm-branch-commits] [llvm] [AMDGPU][SDAG] Add ISD::PTRADD DAG combines (PR #142739)

Reply via email to