avl marked an inline comment as done.
avl added inline comments.

================
Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument &Arg : F.args()) {
----------------
efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > avl wrote:
> > > > > > > > > > efriedma wrote:
> > > > > > > > > > > avl wrote:
> > > > > > > > > > > > efriedma wrote:
> > > > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > > > analysis?  Is it not enough that the call you're 
> > > > > > > > > > > > > trying to TRE is marked "tail"?
> > > > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > > >analysis?
> > > > > > > > > > > > 
> > > > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) 
> > > > > > > > > > > > could be reused here. 
> > > > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > > > to make TRE not depending on markTails(). There is a 
> > > > > > > > > > > > review for this - https://reviews.llvm.org/D60031
> > > > > > > > > > > > Thus such separation looks useful(To not reuse result 
> > > > > > > > > > > > of markTails but have it computed inplace).
> > > > > > > > > > > > 
> > > > > > > > > > > > > Is it not enough that the call you're trying to TRE 
> > > > > > > > > > > > > is marked "tail"?
> > > > > > > > > > > > 
> > > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > > marked "Tail".
> > > > > > > > > > > > It also should be checked that other calls does not 
> > > > > > > > > > > > capture pointer to local stack: 
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > // do not do TRE if any pointer to local stack has 
> > > > > > > > > > > > escaped.
> > > > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > > > >    return false;
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > 
> > > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > > marked "Tail". It also should be checked that other 
> > > > > > > > > > > > calls does not capture pointer to local stack:
> > > > > > > > > > > 
> > > > > > > > > > > If there's an escaped pointer to the local stack, we 
> > > > > > > > > > > wouldn't infer "tail" in the first place, would we?
> > > > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > > > understand whether this function receives pointer to alloca 
> > > > > > > > > > but does not capture it:
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > void test(int recurseCount)
> > > > > > > > > > {
> > > > > > > > > >     if (recurseCount == 0) return;
> > > > > > > > > >     int temp = 10;
> > > > > > > > > >     globalIncrement(&temp);
> > > > > > > > > >     test(recurseCount - 1);
> > > > > > > > > > }
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > test - marked with Tail.
> > > > > > > > > > globalIncrement - not marked with Tail. But TRE could be 
> > > > > > > > > > done since it does not capture pointer. But if it will 
> > > > > > > > > > capture the pointer then we could not do TRE. So we need to 
> > > > > > > > > > check !Tracker.EscapePoints.empty().
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > test - marked with Tail.
> > > > > > > > > 
> > > > > > > > > For the given code, TRE won't mark the recursive call "tail". 
> > > > > > > > >  That transform isn't legal: the recursive call could access 
> > > > > > > > > the caller's version of "temp".
> > > > > > > > >For the given code, TRE won't mark the recursive call "tail". 
> > > > > > > > >That transform isn't legal: the recursive call could access 
> > > > > > > > >the caller's version of "temp".
> > > > > > > > 
> > > > > > > > it looks like recursive call could NOT access the caller's 
> > > > > > > > version of "temp":
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > test(recurseCount - 1);
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > globalIncrement(&temp);
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > If globalIncrement does not capture the "&temp" then TRE looks 
> > > > > > > > to be legal for that case. 
> > > > > > > > 
> > > > > > > > globalIncrement() would not be marked with "Tail". test() would 
> > > > > > > > be marked with Tail.
> > > > > > > > 
> > > > > > > > Thus the pre-requisite for TRE would be: tail-recursive call 
> > > > > > > > must not receive pointer to local stack(Tail) and non-recursive 
> > > > > > > > calls must not capture the pointer to local stack.
> > > > > > > Can you give a complete IR example where we infer "tail", but TRE 
> > > > > > > is illegal?
> > > > > > > 
> > > > > > > Can you give a complete IR example, we we don't infer "tail", but 
> > > > > > > we still do the TRE transform here?
> > > > > > >Can you give a complete IR example where we infer "tail", but TRE 
> > > > > > >is illegal?
> > > > > > 
> > > > > > there is no such example. Currently all cases where we  infer 
> > > > > > "tail" would be valid for TRE.
> > > > > > 
> > > > > > >Can you give a complete IR example, we we don't infer "tail", but 
> > > > > > >we still do the TRE transform here?
> > > > > > 
> > > > > > For the following example current code base would not infer "tail" 
> > > > > > for _Z15globalIncrementPKi and as the result would not do TRE for 
> > > > > > _Z4testi.
> > > > > > This patch changes this behavior: so that if _Z15globalIncrementPKi 
> > > > > > is not marked with "tail" and does not capture its pointer argument 
> > > > > > - TRE would be allowed for _Z4testi.
> > > > > > 
> > > > > > 
> > > > > > ```
> > > > > > @count = dso_local local_unnamed_addr global i32 0, align 4
> > > > > > 
> > > > > > ; Function Attrs: nofree noinline norecurse nounwind uwtable
> > > > > > define dso_local void @_Z15globalIncrementPKi(i32* nocapture 
> > > > > > readonly %param) local_unnamed_addr #0 {
> > > > > > entry:
> > > > > >   %0 = load i32, i32* %param, align 4
> > > > > >   %1 = load i32, i32* @count, align 4
> > > > > >   %add = add nsw i32 %1, %0
> > > > > >   store i32 %add, i32* @count, align 4
> > > > > >   ret void
> > > > > > }
> > > > > > 
> > > > > > ; Function Attrs: nounwind uwtable
> > > > > > define dso_local void @_Z4testi(i32 %recurseCount) 
> > > > > > local_unnamed_addr #1 {
> > > > > > entry:
> > > > > >   %temp = alloca i32, align 4
> > > > > >   %cmp = icmp eq i32 %recurseCount, 0
> > > > > >   br i1 %cmp, label %return, label %if.end
> > > > > > 
> > > > > > if.end:                                           ; preds = %entry
> > > > > >   %0 = bitcast i32* %temp to i8*
> > > > > >   call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
> > > > > >   store i32 10, i32* %temp, align 4
> > > > > >   call void @_Z15globalIncrementPKi(i32* nonnull %temp)
> > > > > >   %sub = add nsw i32 %recurseCount, -1
> > > > > >   call void @_Z4testi(i32 %sub)
> > > > > >   call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
> > > > > >   br label %return
> > > > > > 
> > > > > > return:                                           ; preds = %entry, 
> > > > > > %if.end
> > > > > >   ret void
> > > > > > }
> > > > > > 
> > > > > > ; Function Attrs: argmemonly nounwind willreturn
> > > > > > declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
> > > > > > 
> > > > > > ; Function Attrs: argmemonly nounwind willreturn
> > > > > > declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
> > > > > > 
> > > > > > attributes #0 = { nofree noinline norecurse nounwind uwtable }
> > > > > > attributes #1 = { nounwind uwtable }
> > > > > > attributes #2 = { argmemonly nounwind willreturn }
> > > > > > 
> > > > > > ```
> > > > > In your example, we don't infer "tail" for globalIncrement... but we 
> > > > > do infer it for the recursive call to test().  I'm suggesting you 
> > > > > could just check for "tail" on test(), instead of using 
> > > > > AllocaDerivedValueTracker.
> > > > Checking only test() is not enough. There additionally should be 
> > > > checked globalIncrement().
> > > > 
> > > > It is correct that while checking test() there could be checked "Tail" 
> > > > flag. Which does not require using AllocaDerivedValueTracker.
> > > > 
> > > > But while checking globalIncrement() there should be checked whether 
> > > > some alloca value escaped or not. "Tail" flag could not be used for 
> > > > that. AllocaDerivedValueTracker allow to do such check:
> > > > 
> > > > Tracker.EscapePoints.empty()
> > > > 
> > > > If we would not do check for globalIncrement then it is not valid to do 
> > > > TRE. Thus it seems we need to check globalIncrement for escaping 
> > > > pointer and we need to use AllocaDerivedValueTracker for that.
> > > > Checking only test() is not enough. There additionally should be 
> > > > checked globalIncrement().
> > > 
> > > >> Can you give a complete IR example where we infer "tail", but TRE is 
> > > >> illegal?
> > > > there is no such example. Currently all cases where we infer "tail" 
> > > > would be valid for TRE.
> > > 
> > > I'm not sure how to reconcile this.  Are you saying we could infer "tail" 
> > > in some case where TRE is illegal, but don't right now?  Or are you 
> > > saying that you plan to extend TRE to handle cases where we can't infer 
> > > "tail" on the recursive call?
> > >Or are you saying that you plan to extend TRE to handle cases where we 
> > >can't infer "tail" on the recursive call?
> > 
> > I think not exactly. more precise would probably be : "I am saying that 
> > plan to extend TRE to handle cases where we can't infer "tail" on the 
> > NON-recursive NON-last call"
> > 
> > globalIncrement() is non-recursive non-last call in above example. But we 
> > need to check whether it captures argument pointer or not to decide whether 
> > it is OK to do TRE.
> > 
> > To make things clear - I am suggesting instead of current pre-requisite for 
> > TRE :
> > 
> > "All call sites are marked with Tail"
> > 
> > to make following: 
> > 
> > "Recursive last calls are marked with "Tail", non-recursive non-last calls 
> > are proved to not capture alloca".
> > 
> > For the above example it means : the requirement for test() should stay the 
> > same(should be marked with Tail). The requirement for globalIncrement() 
> > should be "does not capture alloca".
> > 
> "Recursive last calls are marked with 'tail'" implies "non-recursive non-last 
> calls are proved to not capture alloca".
I see, thank you for explanations. 
There is a test which makes me think that above rule is not always correct:

Transforms/TailCallElim/basic.ll:@test1


```
; PR615. Make sure that we do not move the alloca so that it interferes with 
the tail call.
define i32 @test1() {
; CHECK: i32 @test1()
; CHECK-NEXT: alloca
        %A = alloca i32         ; <i32*> [#uses=2]
        store i32 5, i32* %A
        call void @use(i32* %A)
; CHECK: tail call i32 @test1
        %X = tail call i32 @test1()             ; <i32> [#uses=1]
        ret i32 %X
}
```

I removed usages of AllocaDerivedValueTracker and corrected the test1 from 
Transforms/TailCallElim/basic.ll.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to