On Jun 13, 2011, at 6:37 PM, Eli Friedman wrote: > Author: efriedma > Date: Mon Jun 13 20:37:52 2011 > New Revision: 132957 > > URL: http://llvm.org/viewvc/llvm-project?rev=132957&view=rev > Log: > The LLVM IR representation of byval arguments has a rather strange property: > if the alignment of an argument to a call is less than the specified byval > alignment for that argument, there is no way to specify the alignment of the > implied copy. Therefore, we must ensure that the alignment of the argument > is at least the byval alignment. To do this, we have to mess with the > alignment of relevant alloca's in some cases, and insert a copy that > conceptually shouldn't be necessary in some cases. > > This patch tries relatively hard to avoid creating an extra copy if it can be > avoided (see test3 in the included testcase), but it is not possible to avoid > in some cases (like test2 in the included testcase). > > rdar://9483886
Hi Eli, I think that this can be simplified and generalized through the use of getOrEnforceKnownAlignment in llvm/Transforms/Utils/Local.h. Something like: if (getOrEnforceKnownAlignment(RV.getAggregateAddr(), ArgInfo.getIndirectAlign(), TD) < ArgInfo.getIndirectAlign()) emit temporary. should work. -Chris > > > Modified: > cfe/trunk/lib/CodeGen/CGCall.cpp > cfe/trunk/test/CodeGen/byval-memcpy-elim.c > > Modified: cfe/trunk/lib/CodeGen/CGCall.cpp > URL: > http://llvm.org/viewvc/llvm-project/cfe/trunk/lib/CodeGen/CGCall.cpp?rev=132957&r1=132956&r2=132957&view=diff > ============================================================================== > --- cfe/trunk/lib/CodeGen/CGCall.cpp (original) > +++ cfe/trunk/lib/CodeGen/CGCall.cpp Mon Jun 13 20:37:52 2011 > @@ -1263,12 +1263,51 @@ > Alignment, I->Ty); > else > StoreComplexToAddr(RV.getComplexVal(), Args.back(), false); > - } else if (I->NeedsCopy && !ArgInfo.getIndirectByVal()) { > - Args.push_back(CreateMemTemp(I->Ty)); > - EmitAggregateCopy(Args.back(), RV.getAggregateAddr(), I->Ty, > - RV.isVolatileQualified()); > } else { > - Args.push_back(RV.getAggregateAddr()); > + // We want to avoid creating an unnecessary temporary+copy here; > + // however, we need one in two cases: > + // 1. If the argument is not byval, and we are required to copy the > + // source. (This case doesn't occur on any common architecture.) > + // 2. If the argument is byval, RV is not sufficiently aligned, and > + // we cannot force it to be sufficiently aligned. > + // FIXME: This code is ugly because we don't know the required > + // alignment when RV is generated. > + llvm::AllocaInst *AI = > + dyn_cast<llvm::AllocaInst>(RV.getAggregateAddr()); > + bool NeedsAggCopy = false; > + if (I->NeedsCopy && !ArgInfo.getIndirectByVal()) > + NeedsAggCopy = true; > + if (ArgInfo.getIndirectByVal()) { > + if (AI) { > + // The source is an alloca; we can force appropriate alignment. > + if (ArgInfo.getIndirectAlign() > AI->getAlignment()) > + AI->setAlignment(ArgInfo.getIndirectAlign()); > + } else if (llvm::Argument *A = > + dyn_cast<llvm::Argument>(RV.getAggregateAddr())) { > + // Check if the source is an appropriately aligned byval > argument. > + if (!A->hasByValAttr() || > + A->getParamAlignment() < ArgInfo.getIndirectAlign()) > + NeedsAggCopy = true; > + } else { > + // We don't know what the input is; force a temporary+copy if > + // the type alignment is not sufficient. > + assert(I->NeedsCopy && "Temporary must be AllocaInst"); > + if (ArgInfo.getIndirectAlign() > Alignment) > + NeedsAggCopy = true; > + } > + } > + if (NeedsAggCopy) { > + // Create an aligned temporary, and copy to it. > + AI = CreateMemTemp(I->Ty); > + if (ArgInfo.getIndirectAlign() > AI->getAlignment()) > + AI->setAlignment(ArgInfo.getIndirectAlign()); > + Args.push_back(AI); > + EmitAggregateCopy(AI, RV.getAggregateAddr(), I->Ty, > + RV.isVolatileQualified()); > + } else { > + // Skip the extra memcpy call. > + Args.push_back(RV.getAggregateAddr()); > + } > } > break; > } > > Modified: cfe/trunk/test/CodeGen/byval-memcpy-elim.c > URL: > http://llvm.org/viewvc/llvm-project/cfe/trunk/test/CodeGen/byval-memcpy-elim.c?rev=132957&r1=132956&r2=132957&view=diff > ============================================================================== > --- cfe/trunk/test/CodeGen/byval-memcpy-elim.c (original) > +++ cfe/trunk/test/CodeGen/byval-memcpy-elim.c Mon Jun 13 20:37:52 2011 > @@ -18,3 +18,36 @@ > void test1(struct Test1S *A, struct Test2S *B) { > test1a(*A, *B); > } > + > +// The above gets tricker when the byval argument requires higher alignment > +// than the natural alignment of the type in question. > +// rdar://9483886 > + > +// Make sure we do generate a memcpy when we cannot guarantee alignment. > +struct Test3S { > + int a,b,c,d,e,f,g,h,i,j,k,l; > +}; > +void test2a(struct Test3S q); > +// CHECK: define void @test2( > +// CHECK: alloca %struct.Test3S, align 8 > +// CHECK: memcpy > +// CHECK: call void @test2a > +void test2(struct Test3S *q) { > + test2a(*q); > +} > + > +// But make sure we don't generate a memcpy when we can guarantee alignment. > +void fooey(void); > +// CHECK: define void @test3( > +// CHECK: alloca %struct.Test3S, align 8 > +// CHECK: call void @fooey > +// CHECK-NOT: memcpy > +// CHECK: call void @test2a > +// CHECK-NOT: memcpy > +// CHECK: call void @test2a > +void test3(struct Test3S a) { > + struct Test3S b = a; > + fooey(); > + test2a(a); > + test2a(b); > +} > > > _______________________________________________ > cfe-commits mailing list > [email protected] > http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits _______________________________________________ cfe-commits mailing list [email protected] http://lists.cs.uiuc.edu/mailman/listinfo/cfe-commits
