avl marked an inline comment as done.
avl added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {
----------------
efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > avl wrote:
> > > > > > > > > > efriedma wrote:
> > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > analysis?  Is it not enough that the call you're trying 
> > > > > > > > > > > to TRE is marked "tail"?
> > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > > > 
> > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could 
> > > > > > > > > > be reused here. 
> > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > to make TRE not depending on markTails(). There is a review 
> > > > > > > > > > for this - https://reviews.llvm.org/D60031
> > > > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > > > markTails but have it computed inplace).
> > > > > > > > > > 
> > > > > > > > > > > Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > > marked "tail"?
> > > > > > > > > > 
> > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > marked "Tail".
> > > > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > > > pointer to local stack: 
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > >    return false;
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > marked "Tail". It also should be checked that other calls 
> > > > > > > > > > does not capture pointer to local stack:
> > > > > > > > > 
> > > > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > > > infer "tail" in the first place, would we?
> > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > understand whether this function receives pointer to alloca but 
> > > > > > > > does not capture it:
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > void test(int recurseCount)
> > > > > > > > {
> > > > > > > >     if (recurseCount == 0) return;
> > > > > > > >     int temp = 10;
> > > > > > > >     globalIncrement(&temp);
> > > > > > > >     test(recurseCount - 1);
> > > > > > > > }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > test - marked with Tail.
> > > > > > > > globalIncrement - not marked with Tail. But TRE could be done 
> > > > > > > > since it does not capture pointer. But if it will capture the 
> > > > > > > > pointer then we could not do TRE. So we need to check 
> > > > > > > > !Tracker.EscapePoints.empty().
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > test - marked with Tail.
> > > > > > > 
> > > > > > > For the given code, TRE won't mark the recursive call "tail".  
> > > > > > > That transform isn't legal: the recursive call could access the 
> > > > > > > caller's version of "temp".
> > > > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > > > >transform isn't legal: the recursive call could access the 
> > > > > > >caller's version of "temp".
> > > > > > 
> > > > > > it looks like recursive call could NOT access the caller's version 
> > > > > > of "temp":
> > > > > > 
> > > > > > ```
> > > > > > test(recurseCount - 1);
> > > > > > ```
> > > > > > 
> > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > 
> > > > > > ```
> > > > > > globalIncrement(&temp);
> > > > > > ```
> > > > > > 
> > > > > > If globalIncrement does not capture the "&temp" then TRE looks to 
> > > > > > be legal for that case. 
> > > > > > 
> > > > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > > > marked with Tail.
> > > > > > 
> > > > > > Thus the pre-requisite for TRE would be: tail-recursive call must 
> > > > > > not receive pointer to local stack(Tail) and non-recursive calls 
> > > > > > must not capture the pointer to local stack.
> > > > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > > > illegal?
> > > > > 
> > > > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > > > still do the TRE transform here?
> > > > >Can you give a complete IR example where we infer "tail", but TRE is 
> > > > >illegal?
> > > > 
> > > > there is no such example. Currently all cases where we  infer "tail" 
> > > > would be valid for TRE.
> > > > 
> > > > >Can you give a complete IR example, we we don't infer "tail", but we 
> > > > >still do the TRE transform here?
> > > > 
> > > > For the following example current code base would not infer "tail" for 
> > > > _Z15globalIncrementPKi and as the result would not do TRE for _Z4testi.
> > > > This patch changes this behavior: so that if _Z15globalIncrementPKi is 
> > > > not marked with "tail" and does not capture its pointer argument - TRE 
> > > > would be allowed for _Z4testi.
> > > > 
> > > > 
> > > > ```
> > > > @count = dso_local local_unnamed_addr global i32 0, align 4
> > > > 
> > > > ; Function Attrs: nofree noinline norecurse nounwind uwtable
> > > > define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly 
> > > > %param) local_unnamed_addr #0 {
> > > > entry:
> > > >   %0 = load i32, i32* %param, align 4
> > > >   %1 = load i32, i32* @count, align 4
> > > >   %add = add nsw i32 %1, %0
> > > >   store i32 %add, i32* @count, align 4
> > > >   ret void
> > > > }
> > > > 
> > > > ; Function Attrs: nounwind uwtable
> > > > define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr 
> > > > #1 {
> > > > entry:
> > > >   %temp = alloca i32, align 4
> > > >   %cmp = icmp eq i32 %recurseCount, 0
> > > >   br i1 %cmp, label %return, label %if.end
> > > > 
> > > > if.end:                                           ; preds = %entry
> > > >   %0 = bitcast i32* %temp to i8*
> > > >   call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
> > > >   store i32 10, i32* %temp, align 4
> > > >   call void @_Z15globalIncrementPKi(i32* nonnull %temp)
> > > >   %sub = add nsw i32 %recurseCount, -1
> > > >   call void @_Z4testi(i32 %sub)
> > > >   call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
> > > >   br label %return
> > > > 
> > > > return:                                           ; preds = %entry, 
> > > > %if.end
> > > >   ret void
> > > > }
> > > > 
> > > > ; Function Attrs: argmemonly nounwind willreturn
> > > > declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
> > > > 
> > > > ; Function Attrs: argmemonly nounwind willreturn
> > > > declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
> > > > 
> > > > attributes #0 = { nofree noinline norecurse nounwind uwtable }
> > > > attributes #1 = { nounwind uwtable }
> > > > attributes #2 = { argmemonly nounwind willreturn }
> > > > 
> > > > ```
> > > In your example, we don't infer "tail" for globalIncrement... but we do 
> > > infer it for the recursive call to test().  I'm suggesting you could just 
> > > check for "tail" on test(), instead of using AllocaDerivedValueTracker.
> > Checking only test() is not enough. There additionally should be checked 
> > globalIncrement().
> > 
> > It is correct that while checking test() there could be checked "Tail" 
> > flag. Which does not require using AllocaDerivedValueTracker.
> > 
> > But while checking globalIncrement() there should be checked whether some 
> > alloca value escaped or not. "Tail" flag could not be used for that. 
> > AllocaDerivedValueTracker allow to do such check:
> > 
> > Tracker.EscapePoints.empty()
> > 
> > If we would not do check for globalIncrement then it is not valid to do 
> > TRE. Thus it seems we need to check globalIncrement for escaping pointer 
> > and we need to use AllocaDerivedValueTracker for that.
> > Checking only test() is not enough. There additionally should be checked 
> > globalIncrement().
> 
> >> Can you give a complete IR example where we infer "tail", but TRE is 
> >> illegal?
> > there is no such example. Currently all cases where we infer "tail" would 
> > be valid for TRE.
> 
> I'm not sure how to reconcile this.  Are you saying we could infer "tail" in 
> some case where TRE is illegal, but don't right now?  Or are you saying that 
> you plan to extend TRE to handle cases where we can't infer "tail" on the 
> recursive call?
>Or are you saying that you plan to extend TRE to handle cases where we can't 
>infer "tail" on the recursive call?

I think not exactly. more precise would probably be : "I am saying that plan to 
extend TRE to handle cases where we can't infer "tail" on the NON-recursive 
NON-last call"

globalIncrement() is non-recursive non-last call in above example. But we need 
to check whether it captures argument pointer or not to decide whether it is OK 
to do TRE.

To make things clear - I am suggesting instead of current pre-requisite for TRE 
:

"All call sites are marked with Tail"

to make following: 

"Recursive last calls are marked with "Tail", non-recursive non-last calls are 
proved to not capture alloca".

For the above example it means : the requirement for test() should stay the 
same(should be marked with Tail). The requirement for globalIncrement() should 
be "does not capture alloca".



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to