================
@@ -2037,13 +2037,36 @@ bool AMDGPUDAGToDAGISel::SelectGlobalSAddr(SDNode *N,
SDValue Addr,
LHS = Addr.getOperand(0);
if (!LHS->isDivergent()) {
- // add (i64 sgpr), (*_extend (i32 vgpr))
RHS = Addr.getOperand(1);
- ScaleOffset = SelectScaleOffset(N, RHS, Subtarget->hasSignedGVSOffset());
+
if (SDValue ExtRHS = matchExtFromI32orI32(
RHS, Subtarget->hasSignedGVSOffset(), CurDAG)) {
+ // add (i64 sgpr), (*_extend (scale (i32 vgpr)))
SAddr = LHS;
VOffset = ExtRHS;
+ if (NeedIOffset && !ImmOffset &&
+ CurDAG->isBaseWithConstantOffset(ExtRHS)) {
+ // add (i64 sgpr), (*_extend (add (scale (i32 vgpr)), (i32 imm)))
----------------
arsenm wrote:
alive2 will be correct, assuming you wrote the IR that actually matches the
hardware addressing mode. The tricky part is being sure that it matches what
the hardware actually does. Your proof matches my understanding of what the
hardware does. So yes, the overflow on 32-bit is a problem and this addressing
calculation should be reassociated earlier
https://github.com/llvm/llvm-project/pull/178608
_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits