llvmorg-github-actions[bot] wrote:
<!--LLVM PR SUMMARY COMMENT--> @llvm/pr-subscribers-clang-codegen Author: Xavier Roche (xroche) <details> <summary>Changes</summary> When a C function passes a struct by value to a `musttail` callee, Clang implements "by value" with a local `alloca + memcpy + pass-pointer` (the byval-temp). The alloca lives in the caller's frame, which the tail call deallocates before the callee dereferences the pointer. The callee reads freed stack memory. Reproduces on RV64, AArch64, ARM, LoongArch64, and SystemZ at every optimization level. This is the argument-side analog of the SRet musttail fix in `a96c14eeb8fc` (PR #<!-- -->104795). In `CGCall.cpp`'s `ABIArgInfo::Indirect` case, when the call argument's source LValue resolves to a forwarded incoming Indirect parameter, forward the incoming `llvm::Argument` directly. Falls through to byval-temp for any other source. `IndirectAliased` is left untouched (the byval-ness check would assert there, and the safe fallback is the existing path). Three guards: `byval` parity (Verifier V7), `noalias` deduplication when the same Argument would be forwarded to two slots in one call, and one `AddrSpaceCastInst` peek-through (no load unwrapping). ## Scope C source only. C++ for the same construct lowers through `CXXConstructExpr + EmitAnyExprToTemp` which materializes the `agg.tmp` before `EmitCall` runs; the fix correctly falls through (the source is a local alloca, not the Argument) but does not eliminate the dangle in C++. Follow-up planned. ## Tests `clang/test/CodeGen/musttail-indirect-arg.c` (modeled on `musttail-sret.cpp`): plain forward, two-arg, swapped args, mixed direct+indirect, modify-then-forward, cross-BB, same-Argument-to-two-slots, and negatives (local source, non-musttail). Triples: riscv64, aarch64, loongarch64, s390x. Full `clang/test/CodeGen` lit suite (6047 tests) clean. Local sanity check on the minimal C reproducer at `-O1` runs on x86_64 native and on aarch64/riscv64/arm/loongarch64/s390x under QEMU. PowerPC and MIPS hit pre-existing musttail support gaps (Clang diagnostic / backend `failed to perform tail call elimination`) -- not regressions from this PR. Fixes #<!-- -->56908. Partially addresses #<!-- -->46402 (RV64/ARM64 manifestation; X86 was addressed by #<!-- -->190540). Related: #<!-- -->56435, #<!-- -->72555, #<!-- -->84090, #<!-- -->116568, #<!-- -->157814, #<!-- -->190429. Backend complement of #<!-- -->185094 (RISC-V) and `0be65bac6907` (LoongArch). Assisted-by: Claude (Anthropic) --- Full diff: https://github.com/llvm/llvm-project/pull/199351.diff 2 Files Affected: - (modified) clang/lib/CodeGen/CGCall.cpp (+60) - (added) clang/test/CodeGen/musttail-indirect-arg.c (+118) ``````````diff diff --git a/clang/lib/CodeGen/CGCall.cpp b/clang/lib/CodeGen/CGCall.cpp index 40cc275d40273..ad402c459b892 100644 --- a/clang/lib/CodeGen/CGCall.cpp +++ b/clang/lib/CodeGen/CGCall.cpp @@ -5448,6 +5448,45 @@ static unsigned getMaxVectorWidth(const llvm::Type *Ty) { return MaxVectorWidth; } +/// Returns the incoming llvm::Argument of the current function if this +/// musttail call argument forwards an incoming Indirect parameter with a +/// matching ABI shape; nullptr to fall through to the byval-temp path. +/// Argument-side analog of a96c14eeb8fc (SRet musttail forwarding). +static llvm::Argument *getForwardableIncomingMustTailArg( + CodeGenFunction &CGF, const CallArg &CallArgument, + const ABIArgInfo &CallSlotInfo, + llvm::SmallPtrSetImpl<llvm::Argument *> &AlreadyForwarded) { + Address SrcAddr = Address::invalid(); + if (CallArgument.hasLValue()) + SrcAddr = CallArgument.getKnownLValue().getAddress(); + else if (CallArgument.getKnownRValue().isAggregate()) + SrcAddr = CallArgument.getKnownRValue().getAggregateAddress(); + else + return nullptr; + llvm::Value *SrcPtr = SrcAddr.emitRawPointer(CGF); + + // Peek through one AddrSpaceCastInst (NVPTX / AMDGPU / SPIR wrap incoming + // Indirect params via EmitParmDecl). Do not unwrap loads: a load through + // a local alloca means the source is a local. + if (auto *ASC = llvm::dyn_cast<llvm::AddrSpaceCastInst>(SrcPtr)) + SrcPtr = ASC->getOperand(0); + + auto *IncomingArg = llvm::dyn_cast<llvm::Argument>(SrcPtr); + if (!IncomingArg || IncomingArg->getParent() != CGF.CurFn) + return nullptr; + + // Verifier V7 requires matching ABI attributes across musttail. + if (IncomingArg->hasByValAttr() != CallSlotInfo.getIndirectByVal()) + return nullptr; + + // Do not forward the same noalias Argument to two slots in one call. + if (IncomingArg->hasNoAliasAttr() && + !AlreadyForwarded.insert(IncomingArg).second) + return nullptr; + + return IncomingArg; +} + RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, const CGCallee &Callee, ReturnValueSlot ReturnValue, @@ -5571,6 +5610,10 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, // markers that need to be ended right after the call. SmallVector<CallLifetimeEnd, 2> CallLifetimeEndAfterCall; + // Tracks incoming Arguments already forwarded by a musttail Indirect arg, + // for noalias deduplication in getForwardableIncomingMustTailArg. + llvm::SmallPtrSet<llvm::Argument *, 4> ForwardedMustTailArgs; + // Translate all of the arguments as necessary to match the IR lowering. assert(CallInfo.arg_size() == CallArgs.size() && "Mismatch between function signature & arguments."); @@ -5643,6 +5686,23 @@ RValue CodeGenFunction::EmitCall(const CGFunctionInfo &CallInfo, case ABIArgInfo::Indirect: case ABIArgInfo::IndirectAliased: { assert(NumIRArgs == 1); + + // For musttail, forward an incoming Indirect parameter directly. A + // local alloca would dangle after the tail call. Mirrors the SRet + // forwarding above (a96c14eeb8fc). Limited to Indirect (not + // IndirectAliased) so the byval-ness check in the helper does not + // assert on getIndirectByVal(). + if (IsMustTail && ArgInfo.isIndirect()) { + if (llvm::Argument *FwdArg = getForwardableIncomingMustTailArg( + *this, *I, ArgInfo, ForwardedMustTailArgs)) { + llvm::Value *Val = FwdArg; + if (ArgHasMaybeUndefAttr) + Val = Builder.CreateFreeze(Val); + IRCallArgs[FirstIRArg] = Val; + break; + } + } + if (I->isAggregate()) { // We want to avoid creating an unnecessary temporary+copy here; // however, we need one in three cases: diff --git a/clang/test/CodeGen/musttail-indirect-arg.c b/clang/test/CodeGen/musttail-indirect-arg.c new file mode 100644 index 0000000000000..0e7493f9e38b7 --- /dev/null +++ b/clang/test/CodeGen/musttail-indirect-arg.c @@ -0,0 +1,118 @@ +// Test that Clang forwards incoming Indirect parameters across musttail calls +// instead of creating a byval-temp alloca that would dangle after the tail call +// deallocates the caller's frame. +// +// Companion to musttail-sret.cpp (commit a96c14eeb8fc): same idea, applied to +// incoming arguments rather than the sret return slot. + +// RUN: %clang_cc1 -triple=riscv64-linux-gnu %s -emit-llvm -O1 -o - | FileCheck %s --check-prefix=COMMON +// RUN: %clang_cc1 -triple=aarch64-linux-gnu %s -emit-llvm -O1 -o - | FileCheck %s --check-prefix=COMMON +// RUN: %clang_cc1 -triple=loongarch64-linux-gnu %s -emit-llvm -O1 -o - | FileCheck %s --check-prefix=COMMON +// RUN: %clang_cc1 -triple=s390x-linux-gnu %s -emit-llvm -O1 -o - | FileCheck %s --check-prefix=COMMON + +// A struct large enough to land on the indirect-arg path on RV64 (>2*XLEN=16 +// bytes), AArch64 (>16 bytes), LoongArch64, SystemZ. +struct Big { + unsigned long long a, b, c, d; +}; + +// Plain forward: caller(B) musttails callee(B). The fix should emit no +// alloca for the forwarded arg; the call should forward the incoming +// parameter %a. +struct Big C1(struct Big a); +struct Big P1(struct Big a) { + __attribute__((musttail)) return C1(a); +} +// COMMON-LABEL: define {{.*}} @P1( +// COMMON-NOT: = alloca {{.*}}struct.Big +// COMMON-NOT: = alloca [32 x i8] +// COMMON: musttail call {{.*}} @C1({{.*}} %a) + +// Two indirect args, same forwarding: each forwards its own incoming param. +struct Big C2(struct Big a, struct Big b); +struct Big P2(struct Big a, struct Big b) { + __attribute__((musttail)) return C2(a, b); +} +// COMMON-LABEL: define {{.*}} @P2( +// COMMON-NOT: = alloca {{.*}}struct.Big +// COMMON: musttail call {{.*}} @C2({{.*}} %a, {{.*}} %b) + +// Swapped args: caller(a, b) musttails callee(b, a). Each forwarded slot +// must resolve to the correct incoming Argument, not by position. +struct Big C3(struct Big x, struct Big y); +struct Big P3(struct Big a, struct Big b) { + __attribute__((musttail)) return C3(b, a); +} +// COMMON-LABEL: define {{.*}} @P3( +// COMMON-NOT: = alloca {{.*}}struct.Big +// COMMON: musttail call {{.*}} @C3({{.*}} %b, {{.*}} %a) + +// Mixed direct + indirect: only the indirect arg is affected by the fix. +struct Big C4(int n, struct Big a); +struct Big P4(int n, struct Big a) { + __attribute__((musttail)) return C4(n, a); +} +// COMMON-LABEL: define {{.*}} @P4( +// COMMON-NOT: = alloca {{.*}}struct.Big +// COMMON: musttail call {{.*}} @C4({{.*}} %n, {{.*}} %a) + +// Caller modifies the parameter before the musttail. Clang lowers the +// write through the incoming pointer, and the fix forwards the same +// pointer to the callee. No byval-temp. +struct Big C5(struct Big a); +struct Big P5(struct Big a) { + a.a += 1; + __attribute__((musttail)) return C5(a); +} +// COMMON-LABEL: define {{.*}} @P5( +// COMMON-NOT: = alloca {{.*}}struct.Big +// COMMON: musttail call {{.*}} @C5({{.*}} %a) + +// musttail behind a branch: the forwarded pointer must remain live across +// the basic block transition. Tests that the helper does not assume the +// musttail is in the entry block. +struct Big C6(struct Big a, int cond); +struct Big P6(struct Big a, int cond) { + if (cond) + __attribute__((musttail)) return C6(a, cond); + return a; +} +// COMMON-LABEL: define {{.*}} @P6( +// COMMON-NOT: = alloca {{.*}}struct.Big +// COMMON: musttail call {{.*}} @C6({{.*}} %a, + +// Same Argument forwarded to two slots: the helper engages for both. The +// noalias deduplication, if it ever fired, would force the second slot +// back to a byval-temp; but incoming Indirect params under the Linux C +// ABI are not noalias, so both slots forward %a directly. This pins the +// behavior so a future change introducing noalias on Indirect params +// would surface here. (musttail requires matching prototypes, so caller +// and callee both take two Big args.) +struct Big C7(struct Big x, struct Big y); +struct Big P7(struct Big a, struct Big b) { + __attribute__((musttail)) return C7(a, a); +} +// COMMON-LABEL: define {{.*}} @P7( +// COMMON-NOT: = alloca {{.*}}struct.Big +// COMMON: musttail call {{.*}} @C7({{.*}} %a, {{.*}} %a) + +// Negative: local source. Caller takes Big a, but musttails with a LOCAL +// Big initialized in caller's frame. The byval-temp must remain because the +// source lives in caller's frame and would dangle if forwarded. +struct Big C8(struct Big a); +struct Big P8(struct Big a) { + struct Big local = {1, 2, 3, 4}; + __attribute__((musttail)) return C8(local); +} +// COMMON-LABEL: define {{.*}} @P8( +// COMMON: = alloca +// COMMON: musttail call {{.*}} @C8( + +// Non-musttail tail call: the fix must NOT engage. Existing path emits +// the byval-temp as before, no musttail in the IR. +struct Big C9(struct Big a); +struct Big P9(struct Big a) { + return C9(a); +} +// COMMON-LABEL: define {{.*}} @P9( +// COMMON-NOT: musttail `````````` </details> https://github.com/llvm/llvm-project/pull/199351 _______________________________________________ cfe-commits mailing list [email protected] https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
