https://github.com/c-rhodes updated https://github.com/llvm/llvm-project/pull/178197
>From e3519250215c0feb7bf1c48d95019a76aafc299d Mon Sep 17 00:00:00 2001
From: Simon Tatham <[email protected]>
Date: Mon, 26 Jan 2026 09:28:38 +0000
Subject: [PATCH] [ARM] Count register copies when estimating function size
 (#175763)

`EstimateFunctionSizeInBytes`, in `ARMFrameLowering.cpp`, provides an
early estimate of the compiled size of a function, in a context that
wants to overestimate rather than underestimate. In some cases it was
underestimating severely, by over 20%.

The discrepancy was entirely accounted for by the fact that `COPY`
operations were not being counted at all, even though each one (or at
least each one that survives any post-regalloc optimizations) takes 2
bytes in Thumb or 4 in Arm.

This could lead to a compile failure, if the underestimated function
size led frame lowering to not stack LR, but later,
`ARMConstantIslandsPass` needed to insert an intra-function branch long
enough to require a `bl` instruction, needing LR to have been stacked.

The result of `EstimateFunctionSizeInBytes` was not directly available
for testing, so I added an `LLVM_DEBUG` at the end of the function. That
way, the test file doesn't need to try to make a >2048 byte function
estimated at <2048 bytes; it just needs to exhibit a function with a
single `COPY` and make sure it's counted.

At the moment, `EstimateFunctionSizeInBytes` is only used at all in
Thumb-1 compilations, to decide whether the function is large enough to
justify stacking LR as a precaution. However, the subroutine
`ARMBaseInstrInfo::getInstSizeInBytes` which counts each individual
`MachineInstr` is called from other contexts too, so I've made it return
a sensible answer for `COPY` nodes in both of Arm and Thumb.

(cherry picked from commit 0921542e3b0557e926af846a414676a6a5d0e43c)
---
 llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp     |  5 +++
 llvm/lib/Target/ARM/ARMFrameLowering.cpp     |  2 ++
 llvm/test/CodeGen/ARM/estimate-size-copy.mir | 37 ++++++++++++++++++++
 3 files changed, 44 insertions(+)
 create mode 100644 llvm/test/CodeGen/ARM/estimate-size-copy.mir

diff --git a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
index 402a4e30fe3ca..1c8ab6afb9095 100644
--- a/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
+++ b/llvm/lib/Target/ARM/ARMBaseInstrInfo.cpp
@@ -619,6 +619,11 @@ unsigned ARMBaseInstrInfo::getInstSizeInBytes(const MachineInstr &MI) const {
     return MCID.getSize();
   case TargetOpcode::BUNDLE:
     return getInstBundleLength(MI);
+  case TargetOpcode::COPY:
+    if (!MF->getInfo<ARMFunctionInfo>()->isThumbFunction())
+      return 4;
+    else
+      return 2;
   case ARM::CONSTPOOL_ENTRY:
   case ARM::JUMPTABLE_INSTS:
   case ARM::JUMPTABLE_ADDRS:
diff --git a/llvm/lib/Target/ARM/ARMFrameLowering.cpp b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
index 2fc64894b0d34..a0b975072bd8e 100644
--- a/llvm/lib/Target/ARM/ARMFrameLowering.cpp
+++ b/llvm/lib/Target/ARM/ARMFrameLowering.cpp
@@ -2335,6 +2335,8 @@ static unsigned EstimateFunctionSizeInBytes(const MachineFunction &MF,
     for (auto &Table: MF.getJumpTableInfo()->getJumpTables())
       FnSize += Table.MBBs.size() * 4;
   FnSize += MF.getConstantPool()->getConstants().size() * 4;
+  LLVM_DEBUG(dbgs() << "Estimated function size for " << MF.getName() << " = "
+                    << FnSize << " bytes\n");
   return FnSize;
 }
 
diff --git a/llvm/test/CodeGen/ARM/estimate-size-copy.mir b/llvm/test/CodeGen/ARM/estimate-size-copy.mir
new file mode 100644
index 0000000000000..154117d1743d1
--- /dev/null
+++ b/llvm/test/CodeGen/ARM/estimate-size-copy.mir
@@ -0,0 +1,37 @@
+# REQUIRES: asserts
+#
+# RUN: llc -mtriple=thumbv6m -start-before=machine-cp -debug -o - %s 2>%t | \
+# RUN:   FileCheck %s --check-prefix=OUTPUT
+# RUN: FileCheck %s --check-prefix=DEBUG < %t
+#
+# DEBUG: Estimated function size for f = 4 bytes
+#
+# OUTPUT: mov r0, r1
+# OUTPUT: bx lr
+
+--- |
+  target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
+  target triple = "thumbv6m-unknown-none-eabi"
+
+  define i32 @f(i32 %x, i32 %y) {
+  entry:
+    ret i32 %y
+  }
+
+...
+---
+name: f
+tracksRegLiveness: true
+frameInfo:
+  isFrameAddressTaken: false
+  isReturnAddressTaken: false
+  localFrameSize: 0
+machineFunctionInfo:
+  isLRSpilled: false
+body: |
+  bb.0.entry:
+    liveins: $r1
+
+    renamable $r0 = COPY $r1
+    tBX_RET 14 /* CC::al */, $noreg, implicit $r0
+...
_______________________________________________
llvm-branch-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-branch-commits
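
The arithmetic behind the `4 bytes` expected by the test can be sketched with a toy model. This is only an illustration and not LLVM code: `ToyOpcode`, `ToyInstr`, `toyInstSizeInBytes`, `toyEstimateFunctionSize`, `toyTestFunctionBody`, and the `CountCopies` switch are hypothetical names invented here, and the model assumes the test's `tBX_RET` is a 2-byte Thumb instruction (it is emitted as `bx lr`).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for LLVM opcodes; these are NOT the real LLVM types.
enum class ToyOpcode { Copy, Narrow16, Wide32 };

struct ToyInstr {
  ToyOpcode Op;
};

// Per-instruction size. CountCopies=false models the behaviour the commit
// message describes as the bug (COPY contributes nothing); CountCopies=true
// models the patched rule: 2 bytes in Thumb, 4 bytes in Arm.
unsigned toyInstSizeInBytes(const ToyInstr &I, bool IsThumb, bool CountCopies) {
  switch (I.Op) {
  case ToyOpcode::Copy:
    if (!CountCopies)
      return 0;
    return IsThumb ? 2 : 4;
  case ToyOpcode::Narrow16:
    return 2;
  case ToyOpcode::Wide32:
    return 4;
  }
  return 0;
}

// Sum the per-instruction sizes, as a size estimate would.
unsigned toyEstimateFunctionSize(const std::vector<ToyInstr> &Body,
                                 bool IsThumb, bool CountCopies) {
  unsigned Size = 0;
  for (const ToyInstr &I : Body)
    Size += toyInstSizeInBytes(I, IsThumb, CountCopies);
  return Size;
}

// Models the body of the test function `f`: one COPY followed by tBX_RET.
std::vector<ToyInstr> toyTestFunctionBody() {
  return {{ToyOpcode::Copy}, {ToyOpcode::Narrow16}};
}

// Small self-check: with copies counted, the Thumb estimate is 2 + 2 bytes.
void toySelfCheck() {
  assert(toyEstimateFunctionSize(toyTestFunctionBody(), true, true) == 4);
}
```

In this model, calling `toyEstimateFunctionSize(toyTestFunctionBody(), /*IsThumb=*/true, /*CountCopies=*/false)` gives 2, while passing `CountCopies=true` gives 4, which is the gap the patch closes for a function with a single `COPY`.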
