Re: [PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-13 Thread Alexey Lapshin via cfe-commits


Hi David,
 
Thank you for the information. I would revert that commit and work on the fix.
 
Thank you, Alexey. 
>Воскресенье, 12 июля 2020, 12:44 +03:00 от David Zarzycki via Phabricator 
>:
> 
>davezarzycki added a comment.
>
>Hello. I have an auto-bisecting multi-stage bot that is failing on two after 
>this change. Can we please revert this or commit a quick fix?
>
>  FAIL: Clang :: CXX/class/class.compare/class.spaceship/p1.cpp (6232 of 64222)
>   TEST 'Clang :: 
>CXX/class/class.compare/class.spaceship/p1.cpp' FAILED 
>  Script:
>  --
>  : 'RUN: at line 1'; /tmp/_update_lc/t/bin/clang -cc1 -internal-isystem 
>/tmp/_update_lc/t/lib/clang/11.0.0/include -nostdsysteminc -std=c++2a -verify 
>/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp 
>-fcxx-exceptions
>  --
>  Exit Code: 134
>  
>  Command Output (stderr):
>  --
>  clang: /home/dave/s/lp/clang/lib/Basic/SourceManager.cpp:917: clang::FileID 
>clang::SourceManager::getFileIDLoaded(unsigned int) const: Assertion `0 && 
>"Invalid SLocOffset or bad function choice"' failed.
>  PLEASE submit a bug report to  https://bugs.llvm.org/ and include the crash 
>backtrace, preprocessed source, and associated run script.
>  Stack dump:
>  0. Program arguments: /tmp/_update_lc/t/bin/clang -cc1 -internal-isystem 
>/tmp/_update_lc/t/lib/clang/11.0.0/include -nostdsysteminc -std=c++2a -verify 
>/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp 
>-fcxx-exceptions
>  1. 
>/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp:127:38:
> current parser token ','
>  2. 
>/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp:39:1:
> parsing namespace 'Deletedness'
>  3. 
>/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp:123:12:
> parsing function body 'Deletedness::g'
>  4. 
>/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp:123:12:
> in compound statement ('{}')
>   #0 0x0359273f llvm::sys::PrintStackTrace(llvm::raw_ostream&) 
>(/tmp/_update_lc/t/bin/clang+0x359273f)
>   #1 0x03590912 llvm::sys::RunSignalHandlers() 
>(/tmp/_update_lc/t/bin/clang+0x3590912)
>   #2 0x03592bb5 SignalHandler(int) 
>(/tmp/_update_lc/t/bin/clang+0x3592bb5)
>   #3 0x77fa6a90 __restore_rt (/lib64/libpthread.so.0+0x14a90)
>   #4 0x77b3da25 raise (/lib64/libc.so.6+0x3ca25)
>   #5 0x77b26895 abort (/lib64/libc.so.6+0x25895)
>   #6 0x77b26769 _nl_load_domain.cold (/lib64/libc.so.6+0x25769)
>   #7 0x77b35e86 (/lib64/libc.so.6+0x34e86)
>   #8 0x0375636c clang::SourceManager::getFileIDLoaded(unsigned int) 
>const (/tmp/_update_lc/t/bin/clang+0x375636c)
>   #9 0x03ee0bbb 
>clang::VerifyDiagnosticConsumer::HandleDiagnostic(clang::DiagnosticsEngine::Level,
> clang::Diagnostic const&) (/tmp/_update_lc/t/bin/clang+0x3ee0bbb)
>  #10 0x037501ab 
>clang::DiagnosticIDs::ProcessDiag(clang::DiagnosticsEngine&) const 
>(/tmp/_update_lc/t/bin/clang+0x37501ab)
>  #11 0x03749fca clang::DiagnosticsEngine::EmitCurrentDiagnostic(bool) 
>(/tmp/_update_lc/t/bin/clang+0x3749fca)
>  #12 0x04df0c60 clang::Sema::EmitCurrentDiagnostic(unsigned int) 
>(/tmp/_update_lc/t/bin/clang+0x4df0c60)
>  #13 0x05092783 (anonymous 
>namespace)::DefaultedComparisonAnalyzer::visitBinaryOperator(clang::OverloadedOperatorKind,
> llvm::ArrayRef, (anonymous 
>namespace)::DefaultedComparisonSubobject, clang::OverloadCandidateSet*) 
>(/tmp/_update_lc/t/bin/clang+0x5092783)
>  #14 0x05091dba (anonymous 
>namespace)::DefaultedComparisonAnalyzer::visitExpandedSubobject(clang::QualType,
> (anonymous namespace)::DefaultedComparisonSubobject) 
>(/tmp/_update_lc/t/bin/clang+0x5091dba)
>  #15 0x05091b86 (anonymous 
>namespace)::DefaultedComparisonVisitor<(anonymous 
>namespace)::DefaultedComparisonAnalyzer, (anonymous 
>namespace)::DefaultedComparisonInfo, (anonymous 
>namespace)::DefaultedComparisonInfo, (anonymous 
>namespace)::DefaultedComparisonSubobject>::visitSubobjects((anonymous 
>namespace)::DefaultedComparisonInfo&, clang::CXXRecordDecl*, 
>clang::Qualifiers) (/tmp/_update_lc/t/bin/clang+0x5091b86)
>  #16 0x05058c8c (anonymous 
>namespace)::DefaultedComparisonAnalyzer::visit() 
>(/tmp/_update_lc/t/bin/clang+0x5058c8c)
>  #17 0x0505ab22 
>clang::Sema::DiagnoseDeletedDefaultedFunction(clang::FunctionDecl*) 
>(/tmp/_update_lc/t/bin/clang+0x505ab22)
>  #18 0x053e60ed 
>clang::Sema::CreateOverloadedBinOp(clang::SourceLocation, 
>clang::BinaryOperatorKind, clang::UnresolvedSetImpl const&, clang::Expr*, 
>clang::Expr*, bool, bool, clang::FunctionDecl*) 
>(/tmp/_update_lc/t/bin/clang+0x53e60ed)
>  #19 0x0514270a BuildOverloadedBinOp(clang::Sema&, clang::Scope*, 
>clang::SourceLocation, clang::BinaryOperatorKind, clang::Expr*, clang::Expr*) 
>(/tmp/_update_lc/t/bin/clang+0x514270a)
>  #20 0x050fbf49 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-12 Thread David Zarzycki via Phabricator via cfe-commits
davezarzycki added a comment.

Hello. I have an auto-bisecting multi-stage bot that is failing on two after 
this change. Can we please revert this or commit a quick fix?

  FAIL: Clang :: CXX/class/class.compare/class.spaceship/p1.cpp (6232 of 64222)
   TEST 'Clang :: 
CXX/class/class.compare/class.spaceship/p1.cpp' FAILED 
  Script:
  --
  : 'RUN: at line 1';   /tmp/_update_lc/t/bin/clang -cc1 -internal-isystem 
/tmp/_update_lc/t/lib/clang/11.0.0/include -nostdsysteminc -std=c++2a -verify 
/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp 
-fcxx-exceptions
  --
  Exit Code: 134
  
  Command Output (stderr):
  --
  clang: /home/dave/s/lp/clang/lib/Basic/SourceManager.cpp:917: clang::FileID 
clang::SourceManager::getFileIDLoaded(unsigned int) const: Assertion `0 && 
"Invalid SLocOffset or bad function choice"' failed.
  PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash 
backtrace, preprocessed source, and associated run script.
  Stack dump:
  0.  Program arguments: /tmp/_update_lc/t/bin/clang -cc1 -internal-isystem 
/tmp/_update_lc/t/lib/clang/11.0.0/include -nostdsysteminc -std=c++2a -verify 
/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp 
-fcxx-exceptions
  1.  
/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp:127:38:
 current parser token ','
  2.  
/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp:39:1: 
parsing namespace 'Deletedness'
  3.  
/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp:123:12:
 parsing function body 'Deletedness::g'
  4.  
/home/dave/s/lp/clang/test/CXX/class/class.compare/class.spaceship/p1.cpp:123:12:
 in compound statement ('{}')
   #0 0x0359273f llvm::sys::PrintStackTrace(llvm::raw_ostream&) 
(/tmp/_update_lc/t/bin/clang+0x359273f)
   #1 0x03590912 llvm::sys::RunSignalHandlers() 
(/tmp/_update_lc/t/bin/clang+0x3590912)
   #2 0x03592bb5 SignalHandler(int) 
(/tmp/_update_lc/t/bin/clang+0x3592bb5)
   #3 0x77fa6a90 __restore_rt (/lib64/libpthread.so.0+0x14a90)
   #4 0x77b3da25 raise (/lib64/libc.so.6+0x3ca25)
   #5 0x77b26895 abort (/lib64/libc.so.6+0x25895)
   #6 0x77b26769 _nl_load_domain.cold (/lib64/libc.so.6+0x25769)
   #7 0x77b35e86 (/lib64/libc.so.6+0x34e86)
   #8 0x0375636c clang::SourceManager::getFileIDLoaded(unsigned int) 
const (/tmp/_update_lc/t/bin/clang+0x375636c)
   #9 0x03ee0bbb 
clang::VerifyDiagnosticConsumer::HandleDiagnostic(clang::DiagnosticsEngine::Level,
 clang::Diagnostic const&) (/tmp/_update_lc/t/bin/clang+0x3ee0bbb)
  #10 0x037501ab 
clang::DiagnosticIDs::ProcessDiag(clang::DiagnosticsEngine&) const 
(/tmp/_update_lc/t/bin/clang+0x37501ab)
  #11 0x03749fca clang::DiagnosticsEngine::EmitCurrentDiagnostic(bool) 
(/tmp/_update_lc/t/bin/clang+0x3749fca)
  #12 0x04df0c60 clang::Sema::EmitCurrentDiagnostic(unsigned int) 
(/tmp/_update_lc/t/bin/clang+0x4df0c60)
  #13 0x05092783 (anonymous 
namespace)::DefaultedComparisonAnalyzer::visitBinaryOperator(clang::OverloadedOperatorKind,
 llvm::ArrayRef, (anonymous 
namespace)::DefaultedComparisonSubobject, clang::OverloadCandidateSet*) 
(/tmp/_update_lc/t/bin/clang+0x5092783)
  #14 0x05091dba (anonymous 
namespace)::DefaultedComparisonAnalyzer::visitExpandedSubobject(clang::QualType,
 (anonymous namespace)::DefaultedComparisonSubobject) 
(/tmp/_update_lc/t/bin/clang+0x5091dba)
  #15 0x05091b86 (anonymous 
namespace)::DefaultedComparisonVisitor<(anonymous 
namespace)::DefaultedComparisonAnalyzer, (anonymous 
namespace)::DefaultedComparisonInfo, (anonymous 
namespace)::DefaultedComparisonInfo, (anonymous 
namespace)::DefaultedComparisonSubobject>::visitSubobjects((anonymous 
namespace)::DefaultedComparisonInfo&, clang::CXXRecordDecl*, clang::Qualifiers) 
(/tmp/_update_lc/t/bin/clang+0x5091b86)
  #16 0x05058c8c (anonymous 
namespace)::DefaultedComparisonAnalyzer::visit() 
(/tmp/_update_lc/t/bin/clang+0x5058c8c)
  #17 0x0505ab22 
clang::Sema::DiagnoseDeletedDefaultedFunction(clang::FunctionDecl*) 
(/tmp/_update_lc/t/bin/clang+0x505ab22)
  #18 0x053e60ed 
clang::Sema::CreateOverloadedBinOp(clang::SourceLocation, 
clang::BinaryOperatorKind, clang::UnresolvedSetImpl const&, clang::Expr*, 
clang::Expr*, bool, bool, clang::FunctionDecl*) 
(/tmp/_update_lc/t/bin/clang+0x53e60ed)
  #19 0x0514270a BuildOverloadedBinOp(clang::Sema&, clang::Scope*, 
clang::SourceLocation, clang::BinaryOperatorKind, clang::Expr*, clang::Expr*) 
(/tmp/_update_lc/t/bin/clang+0x514270a)
  #20 0x050fbf49 clang::Sema::ActOnBinOp(clang::Scope*, 
clang::SourceLocation, clang::tok::TokenKind, clang::Expr*, clang::Expr*) 
(/tmp/_update_lc/t/bin/clang+0x50fbf49)
  #21 0x04d52ccc 
clang::Parser::ParseRHSOfBinaryExpression(clang::ActionResult, clang::prec::Level) 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-11 Thread Alexey Lapshin via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGf7907e9d223d: [TRE] allow TRE for non-capturing calls. 
(authored by avl).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,74 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+declare void @_Z15globalIncrementPKi(i32* nocapture readonly %param) #0
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack.
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+; CHECK-LABEL: @_Z4testi(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[TEMP:%.*]] = alloca i32, align 4
+; CHECK-NEXT:br label [[TAILRECURSE:%.*]]
+; CHECK:   tailrecurse:
+; CHECK-NEXT:[[RECURSECOUNT_TR:%.*]] = phi i32 [ [[RECURSECOUNT:%.*]], [[ENTRY:%.*]] ], [ [[SUB:%.*]], [[IF_END:%.*]] ]
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[RECURSECOUNT_TR]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[RETURN:%.*]], label [[IF_END]]
+; CHECK:   if.end:
+; CHECK-NEXT:[[TMP0:%.*]] = bitcast i32* [[TEMP]] to i8*
+; CHECK-NEXT:call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull [[TMP0]])
+; CHECK-NEXT:store i32 10, i32* [[TEMP]], align 4
+; CHECK-NEXT:call void @_Z15globalIncrementPKi(i32* nonnull [[TEMP]])
+; CHECK-NEXT:[[SUB]] = add nsw i32 [[RECURSECOUNT_TR]], -1
+; CHECK-NEXT:call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull [[TMP0]])
+; CHECK-NEXT:br label [[TAILRECURSE]]
+; CHECK:   return:
+; CHECK-NEXT:ret void
+;
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
@@ -0,0 +1,125 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; This test checks that TRE would be done for only one recursive call.
+; The test_multiple_exits function has three recursive calls.
+; First recursive call could not be eliminated because there is
+; escaped pointer to local variable. Second recursive call could
+; be eliminated. Thrid recursive call could not be eliminated since
+; this is not last call. Thus, test checks that TRE would be done
+; for only second recursive call.
+
+; IR for that test was generated from the following C++ source:
+;
+; void capture_arg (int*);
+; void test_multiple_exits (int param);
+;   if (param >= 0 && param < 10) {
+; int temp;
+; capture_arg();
+; // TRE could not be done because pointer to local
+; // variable "temp" is escaped.
+; test_multiple_exits(param + 1);
+;   } else if (param >=10 && param < 20) {
+; // TRE should be done.
+; test_multiple_exits(param + 1);
+;   } else if 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-09 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

Thank you, for the review.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-09 Thread Eli Friedman via Phabricator via cfe-commits
efriedma accepted this revision.
efriedma added a comment.
This revision is now accepted and ready to land.

LGTM


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-09 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 276799.
avl added a comment.

addressed comments:

added test for multiple recursive calls, 
removed duplicated check for operand bundles, 
simplified and commented tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,74 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+declare void @_Z15globalIncrementPKi(i32* nocapture readonly %param) #0
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack.
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+; CHECK-LABEL: @_Z4testi(
+; CHECK-NEXT:  entry:
+; CHECK-NEXT:[[TEMP:%.*]] = alloca i32, align 4
+; CHECK-NEXT:br label [[TAILRECURSE:%.*]]
+; CHECK:   tailrecurse:
+; CHECK-NEXT:[[RECURSECOUNT_TR:%.*]] = phi i32 [ [[RECURSECOUNT:%.*]], [[ENTRY:%.*]] ], [ [[SUB:%.*]], [[IF_END:%.*]] ]
+; CHECK-NEXT:[[CMP:%.*]] = icmp eq i32 [[RECURSECOUNT_TR]], 0
+; CHECK-NEXT:br i1 [[CMP]], label [[RETURN:%.*]], label [[IF_END]]
+; CHECK:   if.end:
+; CHECK-NEXT:[[TMP0:%.*]] = bitcast i32* [[TEMP]] to i8*
+; CHECK-NEXT:call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull [[TMP0]])
+; CHECK-NEXT:store i32 10, i32* [[TEMP]], align 4
+; CHECK-NEXT:call void @_Z15globalIncrementPKi(i32* nonnull [[TEMP]])
+; CHECK-NEXT:[[SUB]] = add nsw i32 [[RECURSECOUNT_TR]], -1
+; CHECK-NEXT:call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull [[TMP0]])
+; CHECK-NEXT:br label [[TAILRECURSE]]
+; CHECK:   return:
+; CHECK-NEXT:ret void
+;
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-multiple-exits.ll
@@ -0,0 +1,125 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; This test checks that TRE would be done for only one recursive call.
+; The test_multiple_exits function has three recursive calls.
+; First recursive call could not be eliminated because there is
+; escaped pointer to local variable. Second recursive call could
+; be eliminated. Thrid recursive call could not be eliminated since
+; this is not last call. Thus, test checks that TRE would be done
+; for only second recursive call.
+
+; IR for that test was generated from the following C++ source:
+;
+; void capture_arg (int*);
+; void test_multiple_exits (int param);
+;   if (param >= 0 && param < 10) {
+; int temp;
+; capture_arg();
+; // TRE could not be done because pointer to local
+; // variable "temp" is escaped.
+; test_multiple_exits(param + 1);
+;   } else if (param >=10 && param < 20) {
+; // TRE should be done.
+; 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-08 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:474
+  // Operand Bundles or not marked as TailCall.
+  if (CI->isNoTailCall() || CI->hasOperandBundles() || !CI->isTailCall())
 return nullptr;

efriedma wrote:
> The hasOperandBundles() check looks completely new; is there some test for it?
> 
> The `isNoTailCall()` check is currently redundant; it isn't legal to write 
> "tail notail".  I guess it makes sense to guard against that, though.
>The hasOperandBundles() check looks completely new; is there some test for it?

it is not new. it is copied from 245 line. Now, when patch changed from its 
original state all above conditions could be changed just to :

if (!CI->isTailCall())

the test is Transforms/TailCallElim/deopt-bundle.ll

>The isNoTailCall() check is currently redundant; it isn't legal to write "tail 
>notail". I guess it makes sense to guard against that, though.

would add checking for that.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-08 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added a comment.

I think I'd like to see a testcase where there are multiple recursive calls, 
but only one is a tail call that can be eliminated.




Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:474
+  // Operand Bundles or not marked as TailCall.
+  if (CI->isNoTailCall() || CI->hasOperandBundles() || !CI->isTailCall())
 return nullptr;

The hasOperandBundles() check looks completely new; is there some test for it?

The `isNoTailCall()` check is currently redundant; it isn't legal to write 
"tail notail".  I guess it makes sense to guard against that, though.



Comment at: llvm/test/Transforms/TailCallElim/basic.ll:23
+; CHECK: call i32 @test1
+   %X = call i32 @test1()  ;  [#uses=1]
ret i32 %X

I'm not sure this is testing what it was originally supposed to.  I guess 
that's okay, but please fix the comment at least.



Comment at: 
llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll:20
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) 
local_unnamed_addr #0 {
+entry:

For the purpose of this testcase, we don't need the definition of 
_Z15globalIncrementPKi.



Comment at: 
llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll:37
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret

I think I'd prefer to just generate this with update_test_checks.py


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-08 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 276591.
avl added a comment.

addressed comments.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -81,6 +81,7 @@
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/Transforms/Scalar.h"
 #include "llvm/Transforms/Utils/BasicBlockUtils.h"
+#include "llvm/Transforms/Utils/Local.h"
 using namespace llvm;
 
 #define DEBUG_TYPE "tailcallelim"
@@ -92,7 +93,10 @@
 /// Scan the specified function for alloca instructions.
 /// If it contains any dynamic allocas, returns false.
 static bool canTRE(Function ) {
-  // Because of PR962, we don't TRE dynamic allocas.
+  // TODO: We don't do TRE if dynamic allocas are used.
+  // Dynamic allocas allocate stack space which should be
+  // deallocated before new iteration started. That is
+  // currently not implemented.
   return llvm::all_of(instructions(F), [](Instruction ) {
 auto *AI = dyn_cast();
 return !AI || AI->isStaticAlloca();
@@ -185,11 +189,9 @@
 };
 }
 
-static bool markTails(Function , bool ,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function , OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +274,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-08 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:94
 /// If it contains any dynamic allocas, returns false.
 static bool canTRE(Function ) {
   // Because of PR962, we don't TRE dynamic allocas.

avl wrote:
> efriedma wrote:
> > If we're not going to try to do TRE at all on calls not marked "tail", we 
> > can probably drop this check.
> It looks to me that original idea(PR962) was to avoid inefficient code which 
> is generated for dynamic alloca. 
> 
> Currently there would still be generated inefficient code:
> 
> Doing TRE for dynamic alloca requires correct stack adjustment to avoid extra 
> stack usage. 
> i.e. dynamic stack reservation done for alloca should be restored 
> in the end of the current iteration. Current TRE implementation does not do 
> this.
> 
> Please, consider the test case:
> 
> 
> ```
> #include 
> 
> int count;
> __attribute__((noinline)) void globalIncrement(const int* param) { 
>   count += *param; 
> }
> 
> void test(int recurseCount)
> {
> if (recurseCount == 0) return;
> {
> int *temp = (int*)alloca(100);
> globalIncrement(temp);
> }
> test(recurseCount - 1);
> }
> 
> 
> ```
> Following is the x86 asm generated for the above test case in assumption that 
> dynamic allocas are possible:
> 
> ```
> 
> .LBB1_2:
> movq%rsp, %rdi
> addq$-112, %rdi   << dynamic stack reservation, need 
> to be restored before "jne .LBB1_2"
> movq%rdi, %rsp
> callq   _Z15globalIncrementPKi
> addl$-1, %ebx
> jne .LBB1_2
> ```
> 
> So, it looks like we still have inefficient code here and it was a reason for 
> avoiding TRE.
I guess we can leave this for a later patch.

This isn't really any worse than the stack usage before TRE, assuming we can't 
emit a sibling call in the backend.  And we could avoid this by making TRE 
insert stacksave/stackrestore intrinsics.  But better to do one thing at a time.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-08 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked 3 inline comments as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:94
 /// If it contains any dynamic allocas, returns false.
 static bool canTRE(Function ) {
   // Because of PR962, we don't TRE dynamic allocas.

efriedma wrote:
> If we're not going to try to do TRE at all on calls not marked "tail", we can 
> probably drop this check.
It looks to me that original idea(PR962) was to avoid inefficient code which is 
generated for dynamic alloca. 

Currently there would still be generated inefficient code:

Doing TRE for dynamic alloca requires correct stack adjustment to avoid extra 
stack usage. 
i.e. dynamic stack reservation done for alloca should be restored 
in the end of the current iteration. Current TRE implementation does not do 
this.

Please, consider the test case:


```
#include 

int count;
__attribute__((noinline)) void globalIncrement(const int* param) { 
count += *param; 
}

void test(int recurseCount)
{
if (recurseCount == 0) return;
{
int *temp = (int*)alloca(100);
globalIncrement(temp);
}
test(recurseCount - 1);
}


```
Following is the x86 asm generated for the above test case in assumption that 
dynamic allocas are possible:

```

.LBB1_2:
movq%rsp, %rdi
addq$-112, %rdi   << dynamic stack reservation, need to 
be restored before "jne .LBB1_2"
movq%rdi, %rsp
callq   _Z15globalIncrementPKi
addl$-1, %ebx
jne .LBB1_2
```

So, it looks like we still have inefficient code here and it was a reason for 
avoiding TRE.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:808
   // Until this is resolved, disable this transformation if that would ever
   // happen.  This bug is PR962.
   for (Function::iterator BBI = F.begin(), E = F.end(); BBI != E; /*in loop*/) 
{

efriedma wrote:
> Can you move this FIXME into a more appropriate spot?
OK.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:335
+II->getIntrinsicID() == Intrinsic::assume)
+  return true;
+

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > What is the new handling for lifetime.end/assume doing?
> > They are just skipped. In following test case:
> > 
> > 
> > ```
> >   call void @_Z5test5i(i32 %sub)
> >   call void @llvm.lifetime.end.p0i8(i64 24, i8* nonnull %1) #5
> >   call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #5
> >   br label %return
> > 
> > ```
> > 
> > they are generated in between call and ret. It is safe to ignore them while 
> > checking whether transformation is possible.
> It makes sense we can ignore lifetime.end on an alloca: we know the call 
> doesn't refer to the alloca.  (Maybe we should check that the pointer 
> argument is pointing at an alloca?  That should usually be true anyway, but 
> better to be on the safe side, I guess.)
> 
> I don't think it's safe to hoist assume without additional checks; I think 
> we'd need to check that the call is marked "willreturn"?
> 
> Since this is sort of tricky, I'd prefer to split this off into a followup.
>It makes sense we can ignore lifetime.end on an alloca: we know the call 
>doesn't refer to the alloca. (Maybe we should check that the pointer argument 
>is pointing at an alloca? That should usually be true anyway, but better to be 
>on the safe side, I guess.)

OK, I would add checking that the pointer argument of lifetime.end is pointing 
to an alloca.

>I don't think it's safe to hoist assume without additional checks; I think 
>we'd need to check that the call is marked "willreturn"?

>Since this is sort of tricky, I'd prefer to split this off into a followup.

Ok, I would split Intrinsic::assume into another review.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-07 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:94
 /// If it contains any dynamic allocas, returns false.
 static bool canTRE(Function ) {
   // Because of PR962, we don't TRE dynamic allocas.

If we're not going to try to do TRE at all on calls not marked "tail", we can 
probably drop this check.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:808
   // Until this is resolved, disable this transformation if that would ever
   // happen.  This bug is PR962.
   for (Function::iterator BBI = F.begin(), E = F.end(); BBI != E; /*in loop*/) 
{

Can you move this FIXME into a more appropriate spot?



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:335
+II->getIntrinsicID() == Intrinsic::assume)
+  return true;
+

avl wrote:
> efriedma wrote:
> > What is the new handling for lifetime.end/assume doing?
> They are just skipped. In following test case:
> 
> 
> ```
>   call void @_Z5test5i(i32 %sub)
>   call void @llvm.lifetime.end.p0i8(i64 24, i8* nonnull %1) #5
>   call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #5
>   br label %return
> 
> ```
> 
> they are generated in between call and ret. It is safe to ignore them while 
> checking whether transformation is possible.
It makes sense we can ignore lifetime.end on an alloca: we know the call 
doesn't refer to the alloca.  (Maybe we should check that the pointer argument 
is pointing at an alloca?  That should usually be true anyway, but better to be 
on the safe side, I guess.)

I don't think it's safe to hoist assume without additional checks; I think we'd 
need to check that the call is marked "willreturn"?

Since this is sort of tricky, I'd prefer to split this off into a followup.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-02 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

@efriedma What do you think about current state of this patch? Is it OK?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-07-02 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 275123.
avl added a comment.

rebased.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -185,11 +185,9 @@
 };
 }
 
-static bool markTails(Function , bool ,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function , OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +270,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  }
 }
 
 for (auto *SuccBB : make_range(succ_begin(BB), succ_end(BB))) {
@@ -313,8 +308,6 @@
   LLVM_DEBUG(dbgs() << "Marked as tail call candidate: " << *CI << "\n");
   CI->setTailCall();
   Modified = true;
-} else {
-  AllCallsAreTailCalls = false;
 }
   }
 
@@ -326,6 +319,14 @@
 /// instructions between the call and this instruction are movable.
 ///
 static bool canMoveAboveCall(Instruction *I, CallInst *CI, AliasAnalysis *AA) {
+  if (isa(I))
+return true;
+
+  if (const IntrinsicInst *II = dyn_cast(I))
+if (II->getIntrinsicID() == Intrinsic::lifetime_end ||
+II->getIntrinsicID() == Intrinsic::assume)
+  

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-29 Thread Alexey Lapshin via Phabricator via cfe-commits
avl added a comment.

ping.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-25 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 273457.
avl added a comment.

removed early check for TRE candidates from canTRE().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -185,11 +185,9 @@
 };
 }
 
-static bool markTails(Function , bool ,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function , OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +270,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  }
 }
 
 for (auto *SuccBB : make_range(succ_begin(BB), succ_end(BB))) {
@@ -313,8 +308,6 @@
   LLVM_DEBUG(dbgs() << "Marked as tail call candidate: " << *CI << "\n");
   CI->setTailCall();
   Modified = true;
-} else {
-  AllCallsAreTailCalls = false;
 }
   }
 
@@ -326,6 +319,14 @@
 /// instructions between the call and this instruction are movable.
 ///
 static bool canMoveAboveCall(Instruction *I, CallInst *CI, AliasAnalysis *AA) {
+  if (isa(I))
+return true;
+
+  if (const IntrinsicInst *II = dyn_cast(I))
+if (II->getIntrinsicID() == Intrinsic::lifetime_end ||
+

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-25 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:838
+if (isValidTRECandidate(CI))
+  HasValidCandidates = true;
+  }

laytonio wrote:
> avl wrote:
> > laytonio wrote:
> > > Is there any reason to find and validate candidates now only to have to 
> > > redo it when we actually perform the eliminations? If so, is there any 
> > > reason this needs to follow a different code path than findTRECandidate? 
> > > findTRECandidate is doing the same checks, except for canMoveAboveCall 
> > > and canTransformAccumulatorRecursion, which should probably be refactored 
> > > into findTRECandidate from eliminateCall anyway. If not then all of this 
> > > code goes away and we're left with the same canTRE as in trunk.
> > We are enumerating all instructions here, so we could understand if there 
> > are not TRE candidates and stop earlier. That is the reason for doing it 
> > here.
> > 
> > I agree that findTRECandidate should be refactored to have the same checks 
> > as here. 
> > 
> > What do you think is better to do:
> > 
> > - leave early check for TRE candidates in canTRE or remove it
> > - refactor findTRECandidate or leave it as is
> > 
> > ?
> Yes we are iterating all the instructions here but, unless I am missing 
> something, we would literally just be doing the checks twice for no reason. 
> Look at it this way, best case scenario we have to check all possible 
> candidates once, find none and we're done. Worst case, we check all possible 
> candidates once, find one and have to check all possible candidates a second 
> time. Where as if we remove the early checks we only ever have to check the 
> candidates once. So we wouldn't really be stopping any earlier.
> 
> As for refactoring findTRECandidate, I do think that should be done and we 
> should strive to move all the failure conditions out of eliminateCall in 
> order to avoid having to fold a return only to find out we didn't need to. 
> But, I think that is out of the scope of this change, and if we do decide to 
> keep the early checks here then we should say that findTRECandidate does a 
> good enough job to consider this function as having valid candidates.
>Yes we are iterating all the instructions here but, unless I am missing 
>something, we would literally just be doing the checks twice for no reason. 
>Look at it this way, best case scenario we have to check all possible 
>candidates once, find none and we're done. Worst case, we check all possible 
>candidates once, find one and have to check all possible candidates a second 
>time. Where as if we remove the early checks we only ever have to check the 
>candidates once. So we wouldn't really be stopping any earlier.

yes. we would do check twice if there are TRE candidates. my idea was that 
number of cases when TRE is applicable less then when TRE is not applicable. 
Thus we would do double instruction navigation more often than double check for 
candidates. But, I did not measure real impact. Thus, let`s return old logic 
here as you suggested.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-25 Thread Layton Kifer via Phabricator via cfe-commits
laytonio added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:838
+if (isValidTRECandidate(CI))
+  HasValidCandidates = true;
+  }

avl wrote:
> laytonio wrote:
> > Is there any reason to find and validate candidates now only to have to 
> > redo it when we actually perform the eliminations? If so, is there any 
> > reason this needs to follow a different code path than findTRECandidate? 
> > findTRECandidate is doing the same checks, except for canMoveAboveCall and 
> > canTransformAccumulatorRecursion, which should probably be refactored into 
> > findTRECandidate from eliminateCall anyway. If not then all of this code 
> > goes away and we're left with the same canTRE as in trunk.
> We are enumerating all instructions here, so we could understand if there are 
> not TRE candidates and stop earlier. That is the reason for doing it here.
> 
> I agree that findTRECandidate should be refactored to have the same checks as 
> here. 
> 
> What do you think is better to do:
> 
> - leave early check for TRE candidates in canTRE or remove it
> - refactor findTRECandidate or leave it as is
> 
> ?
Yes we are iterating all the instructions here but, unless I am missing 
something, we would literally just be doing the checks twice for no reason. 
Look at it this way, best case scenario we have to check all possible 
candidates once, find none and we're done. Worst case, we check all possible 
candidates once, find one and have to check all possible candidates a second 
time. Where as if we remove the early checks we only ever have to check the 
candidates once. So we wouldn't really be stopping any earlier.

As for refactoring findTRECandidate, I do think that should be done and we 
should strive to move all the failure conditions out of eliminateCall in order 
to avoid having to fold a return only to find out we didn't need to. But, I 
think that is out of the scope of this change, and if we do decide to keep the 
early checks here then we should say that findTRECandidate does a good enough 
job to consider this function as having valid candidates.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-25 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:838
+if (isValidTRECandidate(CI))
+  HasValidCandidates = true;
+  }

laytonio wrote:
> Is there any reason to find and validate candidates now only to have to redo 
> it when we actually perform the eliminations? If so, is there any reason this 
> needs to follow a different code path than findTRECandidate? findTRECandidate 
> is doing the same checks, except for canMoveAboveCall and 
> canTransformAccumulatorRecursion, which should probably be refactored into 
> findTRECandidate from eliminateCall anyway. If not then all of this code goes 
> away and we're left with the same canTRE as in trunk.
We are enumerating all instructions here, so we could understand if there are 
not TRE candidates and stop earlier. That is the reason for doing it here.

I agree that findTRECandidate should be refactored to have the same checks as 
here. 

What do you think is better to do:

- leave early check for TRE candidates in canTRE or remove it
- refactor findTRECandidate or leave it as is

?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-25 Thread Layton Kifer via Phabricator via cfe-commits
laytonio added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:838
+if (isValidTRECandidate(CI))
+  HasValidCandidates = true;
+  }

Is there any reason to find and validate candidates now only to have to redo it 
when we actually perform the eliminations? If so, is there any reason this 
needs to follow a different code path than findTRECandidate? findTRECandidate 
is doing the same checks, except for canMoveAboveCall and 
canTransformAccumulatorRecursion, which should probably be refactored into 
findTRECandidate from eliminateCall anyway. If not then all of this code goes 
away and we're left with the same canTRE as in trunk.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 273174.
avl added a comment.

check valid TRE candidate into findTRECandidate()().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -89,16 +89,6 @@
 STATISTIC(NumRetDuped,   "Number of return duplicated");
 STATISTIC(NumAccumAdded, "Number of accumulators introduced");
 
-/// Scan the specified function for alloca instructions.
-/// If it contains any dynamic allocas, returns false.
-static bool canTRE(Function ) {
-  // Because of PR962, we don't TRE dynamic allocas.
-  return llvm::all_of(instructions(F), [](Instruction ) {
-auto *AI = dyn_cast();
-return !AI || AI->isStaticAlloca();
-  });
-}
-
 namespace {
 struct AllocaDerivedValueTracker {
   // Start at a root value and walk its use-def chain to mark calls that use the
@@ -185,11 +175,9 @@
 };
 }
 
-static bool markTails(Function , bool ,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function , OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +260,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  }
 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:830
+  !CI->isTailCall())
+return false;
+}

laytonio wrote:
> Is this correct? I think we want to check these per TRE candidate in 
> findTRECandidate, not just disable TRE in general if one is found.
I tried to minimize changes and keep old logic here - but yes, it is better to 
move that check into findTRECandidate(). Will do.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Layton Kifer via Phabricator via cfe-commits
laytonio added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:816
+  for (auto  : F) {
+for (Instruction  : BB) {
+  if (AllocaInst *AI = dyn_cast()) {

You can use `for (Instruction  : instructions(F))` here.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:830
+  !CI->isTailCall())
+return false;
+}

Is this correct? I think we want to check these per TRE candidate in 
findTRECandidate, not just disable TRE in general if one is found.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 273029.
avl added a comment.

removed usages of AllocaDerivedValueTracker from canTRE().


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/basic.ll
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/test/Transforms/TailCallElim/basic.ll
===
--- llvm/test/Transforms/TailCallElim/basic.ll
+++ llvm/test/Transforms/TailCallElim/basic.ll
@@ -19,8 +19,8 @@
 	%A = alloca i32		;  [#uses=2]
 	store i32 5, i32* %A
 	call void @use(i32* %A)
-; CHECK: tail call i32 @test1
-	%X = tail call i32 @test1()		;  [#uses=1]
+; CHECK: call i32 @test1
+	%X = call i32 @test1()		;  [#uses=1]
 	ret i32 %X
 }
 
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -89,16 +89,6 @@
 STATISTIC(NumRetDuped,   "Number of return duplicated");
 STATISTIC(NumAccumAdded, "Number of accumulators introduced");
 
-/// Scan the specified function for alloca instructions.
-/// If it contains any dynamic allocas, returns false.
-static bool canTRE(Function ) {
-  // Because of PR962, we don't TRE dynamic allocas.
-  return llvm::all_of(instructions(F), [](Instruction ) {
-auto *AI = dyn_cast();
-return !AI || AI->isStaticAlloca();
-  });
-}
-
 namespace {
 struct AllocaDerivedValueTracker {
   // Start at a root value and walk its use-def chain to mark calls that use the
@@ -185,11 +175,9 @@
 };
 }
 
-static bool markTails(Function , bool ,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function , OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +260,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-24 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > avl wrote:
> > > > > > > > > > efriedma wrote:
> > > > > > > > > > > avl wrote:
> > > > > > > > > > > > efriedma wrote:
> > > > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > > > analysis?  Is it not enough that the call you're 
> > > > > > > > > > > > > trying to TRE is marked "tail"?
> > > > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > > >analysis?
> > > > > > > > > > > > 
> > > > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) 
> > > > > > > > > > > > could be reused here. 
> > > > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > > > to make TRE not depending on markTails(). There is a 
> > > > > > > > > > > > review for this - https://reviews.llvm.org/D60031
> > > > > > > > > > > > Thus such separation looks useful(To not reuse result 
> > > > > > > > > > > > of markTails but have it computed inplace).
> > > > > > > > > > > > 
> > > > > > > > > > > > > Is it not enough that the call you're trying to TRE 
> > > > > > > > > > > > > is marked "tail"?
> > > > > > > > > > > > 
> > > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > > marked "Tail".
> > > > > > > > > > > > It also should be checked that other calls does not 
> > > > > > > > > > > > capture pointer to local stack: 
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > // do not do TRE if any pointer to local stack has 
> > > > > > > > > > > > escaped.
> > > > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > > > >return false;
> > > > > > > > > > > > 
> > > > > > > > > > > > ```
> > > > > > > > > > > > 
> > > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > > marked "Tail". It also should be checked that other 
> > > > > > > > > > > > calls does not capture pointer to local stack:
> > > > > > > > > > > 
> > > > > > > > > > > If there's an escaped pointer to the local stack, we 
> > > > > > > > > > > wouldn't infer "tail" in the first place, would we?
> > > > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > > > understand whether this function receives pointer to alloca 
> > > > > > > > > > but does not capture it:
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > void test(int recurseCount)
> > > > > > > > > > {
> > > > > > > > > > if (recurseCount == 0) return;
> > > > > > > > > > int temp = 10;
> > > > > > > > > > globalIncrement();
> > > > > > > > > > test(recurseCount - 1);
> > > > > > > > > > }
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > test - marked with Tail.
> > > > > > > > > > globalIncrement - not marked with Tail. But TRE could be 
> > > > > > > > > > done since it does not capture pointer. But if it will 
> > > > > > > > > > capture the pointer then we could not do TRE. So we need to 
> > > > > > > > > > check !Tracker.EscapePoints.empty().
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > 
> > > > > > > > > > test - marked with Tail.
> > > > > > > > > 
> > > > > > > > > For the given code, TRE won't mark the recursive call "tail". 
> > > > > > > > >  That transform isn't legal: the recursive call could access 
> > > > > > > > > the caller's version of "temp".
> > > > > > > > >For the given code, TRE won't mark the recursive call "tail". 
> > > > > > > > >That transform isn't legal: the recursive call could access 
> > > > > > > > >the caller's version of "temp".
> > > > > > > > 
> > > > > > > > it looks like recursive call could NOT access the caller's 
> > > > > > > > version of "temp":
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > test(recurseCount - 1);
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > globalIncrement();
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > If globalIncrement does not capture the "" then TRE looks 
> > > > > > > > to be legal for that case. 
> > > > > > > > 
> > > > > > > > globalIncrement() would not be marked with "Tail". test() would 
> > > > > > > > be marked with Tail.
> > > > > > > > 
> > > > > > > > Thus the pre-requisite for TRE 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > avl wrote:
> > > > > > > efriedma wrote:
> > > > > > > > avl wrote:
> > > > > > > > > efriedma wrote:
> > > > > > > > > > avl wrote:
> > > > > > > > > > > efriedma wrote:
> > > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > > analysis?  Is it not enough that the call you're trying 
> > > > > > > > > > > > to TRE is marked "tail"?
> > > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > >analysis?
> > > > > > > > > > > 
> > > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) 
> > > > > > > > > > > could be reused here. 
> > > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > > to make TRE not depending on markTails(). There is a 
> > > > > > > > > > > review for this - https://reviews.llvm.org/D60031
> > > > > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > > > > markTails but have it computed inplace).
> > > > > > > > > > > 
> > > > > > > > > > > > Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > > > marked "tail"?
> > > > > > > > > > > 
> > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > marked "Tail".
> > > > > > > > > > > It also should be checked that other calls does not 
> > > > > > > > > > > capture pointer to local stack: 
> > > > > > > > > > > 
> > > > > > > > > > > ```
> > > > > > > > > > > // do not do TRE if any pointer to local stack has 
> > > > > > > > > > > escaped.
> > > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > > >return false;
> > > > > > > > > > > 
> > > > > > > > > > > ```
> > > > > > > > > > > 
> > > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > > marked "Tail". It also should be checked that other calls 
> > > > > > > > > > > does not capture pointer to local stack:
> > > > > > > > > > 
> > > > > > > > > > If there's an escaped pointer to the local stack, we 
> > > > > > > > > > wouldn't infer "tail" in the first place, would we?
> > > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > > understand whether this function receives pointer to alloca 
> > > > > > > > > but does not capture it:
> > > > > > > > > 
> > > > > > > > > ```
> > > > > > > > > void test(int recurseCount)
> > > > > > > > > {
> > > > > > > > > if (recurseCount == 0) return;
> > > > > > > > > int temp = 10;
> > > > > > > > > globalIncrement();
> > > > > > > > > test(recurseCount - 1);
> > > > > > > > > }
> > > > > > > > > ```
> > > > > > > > > 
> > > > > > > > > test - marked with Tail.
> > > > > > > > > globalIncrement - not marked with Tail. But TRE could be done 
> > > > > > > > > since it does not capture pointer. But if it will capture the 
> > > > > > > > > pointer then we could not do TRE. So we need to check 
> > > > > > > > > !Tracker.EscapePoints.empty().
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > 
> > > > > > > > > test - marked with Tail.
> > > > > > > > 
> > > > > > > > For the given code, TRE won't mark the recursive call "tail".  
> > > > > > > > That transform isn't legal: the recursive call could access the 
> > > > > > > > caller's version of "temp".
> > > > > > > >For the given code, TRE won't mark the recursive call "tail". 
> > > > > > > >That transform isn't legal: the recursive call could access the 
> > > > > > > >caller's version of "temp".
> > > > > > > 
> > > > > > > it looks like recursive call could NOT access the caller's 
> > > > > > > version of "temp":
> > > > > > > 
> > > > > > > ```
> > > > > > > test(recurseCount - 1);
> > > > > > > ```
> > > > > > > 
> > > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > > 
> > > > > > > ```
> > > > > > > globalIncrement();
> > > > > > > ```
> > > > > > > 
> > > > > > > If globalIncrement does not capture the "" then TRE looks to 
> > > > > > > be legal for that case. 
> > > > > > > 
> > > > > > > globalIncrement() would not be marked with "Tail". test() would 
> > > > > > > be marked with Tail.
> > > > > > > 
> > > > > > > Thus the pre-requisite for TRE would be: tail-recursive call must 
> > > > > > > not receive pointer to local stack(Tail) and non-recursive calls 
> > > > > > > must not capture the pointer to local stack.
> > > > > > Can you give a complete IR example where we infer "tail", but 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > avl wrote:
> > > > > > > > > > efriedma wrote:
> > > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker 
> > > > > > > > > > > analysis?  Is it not enough that the call you're trying 
> > > > > > > > > > > to TRE is marked "tail"?
> > > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > > > 
> > > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could 
> > > > > > > > > > be reused here. 
> > > > > > > > > > But marking, done in markTails(), looks like separate 
> > > > > > > > > > tasks. i.e. it is better 
> > > > > > > > > > to make TRE not depending on markTails(). There is a review 
> > > > > > > > > > for this - https://reviews.llvm.org/D60031
> > > > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > > > markTails but have it computed inplace).
> > > > > > > > > > 
> > > > > > > > > > > Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > > marked "tail"?
> > > > > > > > > > 
> > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > marked "Tail".
> > > > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > > > pointer to local stack: 
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > > >return false;
> > > > > > > > > > 
> > > > > > > > > > ```
> > > > > > > > > > 
> > > > > > > > > > It is not enough that call which is subject to TRE is 
> > > > > > > > > > marked "Tail". It also should be checked that other calls 
> > > > > > > > > > does not capture pointer to local stack:
> > > > > > > > > 
> > > > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > > > infer "tail" in the first place, would we?
> > > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > > understand whether this function receives pointer to alloca but 
> > > > > > > > does not capture it:
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > void test(int recurseCount)
> > > > > > > > {
> > > > > > > > if (recurseCount == 0) return;
> > > > > > > > int temp = 10;
> > > > > > > > globalIncrement();
> > > > > > > > test(recurseCount - 1);
> > > > > > > > }
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > test - marked with Tail.
> > > > > > > > globalIncrement - not marked with Tail. But TRE could be done 
> > > > > > > > since it does not capture pointer. But if it will capture the 
> > > > > > > > pointer then we could not do TRE. So we need to check 
> > > > > > > > !Tracker.EscapePoints.empty().
> > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > > test - marked with Tail.
> > > > > > > 
> > > > > > > For the given code, TRE won't mark the recursive call "tail".  
> > > > > > > That transform isn't legal: the recursive call could access the 
> > > > > > > caller's version of "temp".
> > > > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > > > >transform isn't legal: the recursive call could access the 
> > > > > > >caller's version of "temp".
> > > > > > 
> > > > > > it looks like recursive call could NOT access the caller's version 
> > > > > > of "temp":
> > > > > > 
> > > > > > ```
> > > > > > test(recurseCount - 1);
> > > > > > ```
> > > > > > 
> > > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > > 
> > > > > > ```
> > > > > > globalIncrement();
> > > > > > ```
> > > > > > 
> > > > > > If globalIncrement does not capture the "" then TRE looks to 
> > > > > > be legal for that case. 
> > > > > > 
> > > > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > > > marked with Tail.
> > > > > > 
> > > > > > Thus the pre-requisite for TRE would be: tail-recursive call must 
> > > > > > not receive pointer to local stack(Tail) and non-recursive calls 
> > > > > > must not capture the pointer to local stack.
> > > > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > > > illegal?
> > > > > 
> > > > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > > > still do the TRE transform here?
> > > > >Can you give a complete IR example where we infer "tail", 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > avl wrote:
> > > > > > > efriedma wrote:
> > > > > > > > avl wrote:
> > > > > > > > > efriedma wrote:
> > > > > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis? 
> > > > > > > > > >  Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > marked "tail"?
> > > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > > 
> > > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could 
> > > > > > > > > be reused here. 
> > > > > > > > > But marking, done in markTails(), looks like separate tasks. 
> > > > > > > > > i.e. it is better 
> > > > > > > > > to make TRE not depending on markTails(). There is a review 
> > > > > > > > > for this - https://reviews.llvm.org/D60031
> > > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > > markTails but have it computed inplace).
> > > > > > > > > 
> > > > > > > > > > Is it not enough that the call you're trying to TRE is 
> > > > > > > > > > marked "tail"?
> > > > > > > > > 
> > > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > > "Tail".
> > > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > > pointer to local stack: 
> > > > > > > > > 
> > > > > > > > > ```
> > > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > > >return false;
> > > > > > > > > 
> > > > > > > > > ```
> > > > > > > > > 
> > > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > > "Tail". It also should be checked that other calls does not 
> > > > > > > > > capture pointer to local stack:
> > > > > > > > 
> > > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > > infer "tail" in the first place, would we?
> > > > > > > If function receives pointer to alloca then it would not be 
> > > > > > > marked with "Tail". Then we do not have a possibility to 
> > > > > > > understand whether this function receives pointer to alloca but 
> > > > > > > does not capture it:
> > > > > > > 
> > > > > > > ```
> > > > > > > void test(int recurseCount)
> > > > > > > {
> > > > > > > if (recurseCount == 0) return;
> > > > > > > int temp = 10;
> > > > > > > globalIncrement();
> > > > > > > test(recurseCount - 1);
> > > > > > > }
> > > > > > > ```
> > > > > > > 
> > > > > > > test - marked with Tail.
> > > > > > > globalIncrement - not marked with Tail. But TRE could be done 
> > > > > > > since it does not capture pointer. But if it will capture the 
> > > > > > > pointer then we could not do TRE. So we need to check 
> > > > > > > !Tracker.EscapePoints.empty().
> > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > > > test - marked with Tail.
> > > > > > 
> > > > > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > > > > transform isn't legal: the recursive call could access the caller's 
> > > > > > version of "temp".
> > > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > > >transform isn't legal: the recursive call could access the caller's 
> > > > > >version of "temp".
> > > > > 
> > > > > it looks like recursive call could NOT access the caller's version of 
> > > > > "temp":
> > > > > 
> > > > > ```
> > > > > test(recurseCount - 1);
> > > > > ```
> > > > > 
> > > > > Caller`s version of temp is accessed by non-recursive call:
> > > > > 
> > > > > ```
> > > > > globalIncrement();
> > > > > ```
> > > > > 
> > > > > If globalIncrement does not capture the "" then TRE looks to be 
> > > > > legal for that case. 
> > > > > 
> > > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > > marked with Tail.
> > > > > 
> > > > > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > > > > receive pointer to local stack(Tail) and non-recursive calls must not 
> > > > > capture the pointer to local stack.
> > > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > > illegal?
> > > > 
> > > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > > still do the TRE transform here?
> > > >Can you give a complete IR example where we infer "tail", but TRE is 
> > > >illegal?
> > > 
> > > there is no such example. Currently all cases where we  infer "tail" 
> > > would be valid for TRE.
> > > 
> > > >Can you give a complete IR example, we we don't infer "tail", but we 
> > > >still do the TRE transform 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 272829.
avl edited the summary of this revision.
avl added a comment.

addressed comments:

1. removed PointerMayBeCaptured() used for CalledFunction.
2. rewrote CanTRE() to visiting instructions only once.
3. replaced areAllLastFuncCallsRecursive() with isInTREPosition().

I did not address request for not using AllocaDerivedValueTracker yet.
Since there is open question on it. I would address it as soon as the question 
would be resolved.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,69 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK-NOT: call void @_Z4testi
+; CHECK: ret
+
+; Function Attrs: nounwind uwtable
+define dso_local void @_Z4testi(i32 %recurseCount) local_unnamed_addr #1 {
+entry:
+  %temp = alloca i32, align 4
+  %cmp = icmp eq i32 %recurseCount, 0
+  br i1 %cmp, label %return, label %if.end
+
+if.end:   ; preds = %entry
+  %0 = bitcast i32* %temp to i8*
+  call void @llvm.lifetime.start.p0i8(i64 4, i8* nonnull %0) #6
+  store i32 10, i32* %temp, align 4
+  call void @_Z15globalIncrementPKi(i32* nonnull %temp)
+  %sub = add nsw i32 %recurseCount, -1
+  call void @_Z4testi(i32 %sub)
+  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #6
+  br label %return
+
+return:   ; preds = %entry, %if.end
+  ret void
+}
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.start.p0i8(i64 immarg, i8* nocapture) #2
+
+; Function Attrs: argmemonly nounwind willreturn
+declare void @llvm.lifetime.end.p0i8(i64 immarg, i8* nocapture) #2
+
+attributes #0 = { nofree noinline norecurse nounwind uwtable }
+attributes #1 = { nounwind uwtable }
+attributes #2 = { argmemonly nounwind willreturn }
Index: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
===
--- llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
+++ llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
@@ -89,16 +89,6 @@
 STATISTIC(NumRetDuped,   "Number of return duplicated");
 STATISTIC(NumAccumAdded, "Number of accumulators introduced");
 
-/// Scan the specified function for alloca instructions.
-/// If it contains any dynamic allocas, returns false.
-static bool canTRE(Function ) {
-  // Because of PR962, we don't TRE dynamic allocas.
-  return llvm::all_of(instructions(F), [](Instruction ) {
-auto *AI = dyn_cast();
-return !AI || AI->isStaticAlloca();
-  });
-}
-
 namespace {
 struct AllocaDerivedValueTracker {
   // Start at a root value and walk its use-def chain to mark calls that use the
@@ -185,11 +175,9 @@
 };
 }
 
-static bool markTails(Function , bool ,
-  OptimizationRemarkEmitter *ORE) {
+static bool markTails(Function , OptimizationRemarkEmitter *ORE) {
   if (F.callsFunctionThatReturnsTwice())
 return false;
-  AllCallsAreTailCalls = true;
 
   // The local stack holds all alloca instructions and all byval arguments.
   AllocaDerivedValueTracker Tracker;
@@ -272,11 +260,8 @@
 }
   }
 
-  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI)) {
+  if (!IsNoTail && Escaped == UNESCAPED && !Tracker.AllocaUsers.count(CI))
 DeferredTails.push_back(CI);
-  } else {
-AllCallsAreTailCalls = false;
-  }
 }
 
 for (auto *SuccBB : make_range(succ_begin(BB), succ_end(BB))) {
@@ -313,8 +298,6 @@
   LLVM_DEBUG(dbgs() << "Marked as tail call 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > avl wrote:
> > > > > > > > efriedma wrote:
> > > > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  
> > > > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > > > "tail"?
> > > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > > 
> > > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be 
> > > > > > > > reused here. 
> > > > > > > > But marking, done in markTails(), looks like separate tasks. 
> > > > > > > > i.e. it is better 
> > > > > > > > to make TRE not depending on markTails(). There is a review for 
> > > > > > > > this - https://reviews.llvm.org/D60031
> > > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > > markTails but have it computed inplace).
> > > > > > > > 
> > > > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > > > "tail"?
> > > > > > > > 
> > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > "Tail".
> > > > > > > > It also should be checked that other calls does not capture 
> > > > > > > > pointer to local stack: 
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > > >return false;
> > > > > > > > 
> > > > > > > > ```
> > > > > > > > 
> > > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > > "Tail". It also should be checked that other calls does not 
> > > > > > > > capture pointer to local stack:
> > > > > > > 
> > > > > > > If there's an escaped pointer to the local stack, we wouldn't 
> > > > > > > infer "tail" in the first place, would we?
> > > > > > If function receives pointer to alloca then it would not be marked 
> > > > > > with "Tail". Then we do not have a possibility to understand 
> > > > > > whether this function receives pointer to alloca but does not 
> > > > > > capture it:
> > > > > > 
> > > > > > ```
> > > > > > void test(int recurseCount)
> > > > > > {
> > > > > > if (recurseCount == 0) return;
> > > > > > int temp = 10;
> > > > > > globalIncrement();
> > > > > > test(recurseCount - 1);
> > > > > > }
> > > > > > ```
> > > > > > 
> > > > > > test - marked with Tail.
> > > > > > globalIncrement - not marked with Tail. But TRE could be done since 
> > > > > > it does not capture pointer. But if it will capture the pointer 
> > > > > > then we could not do TRE. So we need to check 
> > > > > > !Tracker.EscapePoints.empty().
> > > > > > 
> > > > > > 
> > > > > > 
> > > > > > test - marked with Tail.
> > > > > 
> > > > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > > > transform isn't legal: the recursive call could access the caller's 
> > > > > version of "temp".
> > > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > > >transform isn't legal: the recursive call could access the caller's 
> > > > >version of "temp".
> > > > 
> > > > it looks like recursive call could NOT access the caller's version of 
> > > > "temp":
> > > > 
> > > > ```
> > > > test(recurseCount - 1);
> > > > ```
> > > > 
> > > > Caller`s version of temp is accessed by non-recursive call:
> > > > 
> > > > ```
> > > > globalIncrement();
> > > > ```
> > > > 
> > > > If globalIncrement does not capture the "" then TRE looks to be 
> > > > legal for that case. 
> > > > 
> > > > globalIncrement() would not be marked with "Tail". test() would be 
> > > > marked with Tail.
> > > > 
> > > > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > > > receive pointer to local stack(Tail) and non-recursive calls must not 
> > > > capture the pointer to local stack.
> > > Can you give a complete IR example where we infer "tail", but TRE is 
> > > illegal?
> > > 
> > > Can you give a complete IR example, we we don't infer "tail", but we 
> > > still do the TRE transform here?
> > >Can you give a complete IR example where we infer "tail", but TRE is 
> > >illegal?
> > 
> > there is no such example. Currently all cases where we  infer "tail" would 
> > be valid for TRE.
> > 
> > >Can you give a complete IR example, we we don't infer "tail", but we still 
> > >do the TRE transform here?
> > 
> > For the following example current code base would not infer "tail" for 
> > _Z15globalIncrementPKi and as the result would not do TRE for _Z4testi.
> > This patch changes this behavior: 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > avl wrote:
> > > > > > > efriedma wrote:
> > > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is 
> > > > > > > > it not enough that the call you're trying to TRE is marked 
> > > > > > > > "tail"?
> > > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > > 
> > > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be 
> > > > > > > reused here. 
> > > > > > > But marking, done in markTails(), looks like separate tasks. i.e. 
> > > > > > > it is better 
> > > > > > > to make TRE not depending on markTails(). There is a review for 
> > > > > > > this - https://reviews.llvm.org/D60031
> > > > > > > Thus such separation looks useful(To not reuse result of 
> > > > > > > markTails but have it computed inplace).
> > > > > > > 
> > > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > > "tail"?
> > > > > > > 
> > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > "Tail".
> > > > > > > It also should be checked that other calls does not capture 
> > > > > > > pointer to local stack: 
> > > > > > > 
> > > > > > > ```
> > > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > > if (!Tracker.EscapePoints.empty())
> > > > > > >return false;
> > > > > > > 
> > > > > > > ```
> > > > > > > 
> > > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > > "Tail". It also should be checked that other calls does not 
> > > > > > > capture pointer to local stack:
> > > > > > 
> > > > > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > > > > "tail" in the first place, would we?
> > > > > If function receives pointer to alloca then it would not be marked 
> > > > > with "Tail". Then we do not have a possibility to understand whether 
> > > > > this function receives pointer to alloca but does not capture it:
> > > > > 
> > > > > ```
> > > > > void test(int recurseCount)
> > > > > {
> > > > > if (recurseCount == 0) return;
> > > > > int temp = 10;
> > > > > globalIncrement();
> > > > > test(recurseCount - 1);
> > > > > }
> > > > > ```
> > > > > 
> > > > > test - marked with Tail.
> > > > > globalIncrement - not marked with Tail. But TRE could be done since 
> > > > > it does not capture pointer. But if it will capture the pointer then 
> > > > > we could not do TRE. So we need to check 
> > > > > !Tracker.EscapePoints.empty().
> > > > > 
> > > > > 
> > > > > 
> > > > > test - marked with Tail.
> > > > 
> > > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > > transform isn't legal: the recursive call could access the caller's 
> > > > version of "temp".
> > > >For the given code, TRE won't mark the recursive call "tail". That 
> > > >transform isn't legal: the recursive call could access the caller's 
> > > >version of "temp".
> > > 
> > > it looks like recursive call could NOT access the caller's version of 
> > > "temp":
> > > 
> > > ```
> > > test(recurseCount - 1);
> > > ```
> > > 
> > > Caller`s version of temp is accessed by non-recursive call:
> > > 
> > > ```
> > > globalIncrement();
> > > ```
> > > 
> > > If globalIncrement does not capture the "" then TRE looks to be 
> > > legal for that case. 
> > > 
> > > globalIncrement() would not be marked with "Tail". test() would be marked 
> > > with Tail.
> > > 
> > > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > > receive pointer to local stack(Tail) and non-recursive calls must not 
> > > capture the pointer to local stack.
> > Can you give a complete IR example where we infer "tail", but TRE is 
> > illegal?
> > 
> > Can you give a complete IR example, we we don't infer "tail", but we still 
> > do the TRE transform here?
> >Can you give a complete IR example where we infer "tail", but TRE is illegal?
> 
> there is no such example. Currently all cases where we  infer "tail" would be 
> valid for TRE.
> 
> >Can you give a complete IR example, we we don't infer "tail", but we still 
> >do the TRE transform here?
> 
> For the following example current code base would not infer "tail" for 
> _Z15globalIncrementPKi and as the result would not do TRE for _Z4testi.
> This patch changes this behavior: so that if _Z15globalIncrementPKi is not 
> marked with "tail" and does not capture its pointer argument - TRE would be 
> allowed for _Z4testi.
> 
> 
> ```
> @count = dso_local local_unnamed_addr global i32 0, align 4
> 
> ; Function Attrs: nofree noinline norecurse nounwind uwtable
> 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > avl wrote:
> > > > > > efriedma wrote:
> > > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is 
> > > > > > > it not enough that the call you're trying to TRE is marked "tail"?
> > > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > > 
> > > > > > AllocaDerivedValueTracker analysis(done in markTails) could be 
> > > > > > reused here. 
> > > > > > But marking, done in markTails(), looks like separate tasks. i.e. 
> > > > > > it is better 
> > > > > > to make TRE not depending on markTails(). There is a review for 
> > > > > > this - https://reviews.llvm.org/D60031
> > > > > > Thus such separation looks useful(To not reuse result of markTails 
> > > > > > but have it computed inplace).
> > > > > > 
> > > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > > "tail"?
> > > > > > 
> > > > > > It is not enough that call which is subject to TRE is marked "Tail".
> > > > > > It also should be checked that other calls does not capture pointer 
> > > > > > to local stack: 
> > > > > > 
> > > > > > ```
> > > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > > if (!Tracker.EscapePoints.empty())
> > > > > >return false;
> > > > > > 
> > > > > > ```
> > > > > > 
> > > > > > It is not enough that call which is subject to TRE is marked 
> > > > > > "Tail". It also should be checked that other calls does not capture 
> > > > > > pointer to local stack:
> > > > > 
> > > > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > > > "tail" in the first place, would we?
> > > > If function receives pointer to alloca then it would not be marked with 
> > > > "Tail". Then we do not have a possibility to understand whether this 
> > > > function receives pointer to alloca but does not capture it:
> > > > 
> > > > ```
> > > > void test(int recurseCount)
> > > > {
> > > > if (recurseCount == 0) return;
> > > > int temp = 10;
> > > > globalIncrement();
> > > > test(recurseCount - 1);
> > > > }
> > > > ```
> > > > 
> > > > test - marked with Tail.
> > > > globalIncrement - not marked with Tail. But TRE could be done since it 
> > > > does not capture pointer. But if it will capture the pointer then we 
> > > > could not do TRE. So we need to check !Tracker.EscapePoints.empty().
> > > > 
> > > > 
> > > > 
> > > > test - marked with Tail.
> > > 
> > > For the given code, TRE won't mark the recursive call "tail".  That 
> > > transform isn't legal: the recursive call could access the caller's 
> > > version of "temp".
> > >For the given code, TRE won't mark the recursive call "tail". That 
> > >transform isn't legal: the recursive call could access the caller's 
> > >version of "temp".
> > 
> > it looks like recursive call could NOT access the caller's version of 
> > "temp":
> > 
> > ```
> > test(recurseCount - 1);
> > ```
> > 
> > Caller`s version of temp is accessed by non-recursive call:
> > 
> > ```
> > globalIncrement();
> > ```
> > 
> > If globalIncrement does not capture the "" then TRE looks to be legal 
> > for that case. 
> > 
> > globalIncrement() would not be marked with "Tail". test() would be marked 
> > with Tail.
> > 
> > Thus the pre-requisite for TRE would be: tail-recursive call must not 
> > receive pointer to local stack(Tail) and non-recursive calls must not 
> > capture the pointer to local stack.
> Can you give a complete IR example where we infer "tail", but TRE is illegal?
> 
> Can you give a complete IR example, we we don't infer "tail", but we still do 
> the TRE transform here?
>Can you give a complete IR example where we infer "tail", but TRE is illegal?

there is no such example. Currently all cases where we  infer "tail" would be 
valid for TRE.

>Can you give a complete IR example, we we don't infer "tail", but we still do 
>the TRE transform here?

For the following example current code base would not infer "tail" for 
_Z15globalIncrementPKi and as the result would not do TRE for _Z4testi.
This patch changes this behavior: so that if _Z15globalIncrementPKi is not 
marked with "tail" and does not capture its pointer argument - TRE would be 
allowed for _Z4testi.


```
@count = dso_local local_unnamed_addr global i32 0, align 4

; Function Attrs: nofree noinline norecurse nounwind uwtable
define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) 
local_unnamed_addr #0 {
entry:
  %0 = load i32, i32* %param, align 4
  %1 = load i32, i32* @count, align 4
  %add = add nsw i32 %1, %0
  store i32 %add, i32* @count, align 4
  ret 

[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > avl wrote:
> > > > > efriedma wrote:
> > > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it 
> > > > > > not enough that the call you're trying to TRE is marked "tail"?
> > > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > > 
> > > > > AllocaDerivedValueTracker analysis(done in markTails) could be reused 
> > > > > here. 
> > > > > But marking, done in markTails(), looks like separate tasks. i.e. it 
> > > > > is better 
> > > > > to make TRE not depending on markTails(). There is a review for this 
> > > > > - https://reviews.llvm.org/D60031
> > > > > Thus such separation looks useful(To not reuse result of markTails 
> > > > > but have it computed inplace).
> > > > > 
> > > > > > Is it not enough that the call you're trying to TRE is marked 
> > > > > > "tail"?
> > > > > 
> > > > > It is not enough that call which is subject to TRE is marked "Tail".
> > > > > It also should be checked that other calls does not capture pointer 
> > > > > to local stack: 
> > > > > 
> > > > > ```
> > > > > // do not do TRE if any pointer to local stack has escaped.
> > > > > if (!Tracker.EscapePoints.empty())
> > > > >return false;
> > > > > 
> > > > > ```
> > > > > 
> > > > > It is not enough that call which is subject to TRE is marked "Tail". 
> > > > > It also should be checked that other calls does not capture pointer 
> > > > > to local stack:
> > > > 
> > > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > > "tail" in the first place, would we?
> > > If function receives pointer to alloca then it would not be marked with 
> > > "Tail". Then we do not have a possibility to understand whether this 
> > > function receives pointer to alloca but does not capture it:
> > > 
> > > ```
> > > void test(int recurseCount)
> > > {
> > > if (recurseCount == 0) return;
> > > int temp = 10;
> > > globalIncrement();
> > > test(recurseCount - 1);
> > > }
> > > ```
> > > 
> > > test - marked with Tail.
> > > globalIncrement - not marked with Tail. But TRE could be done since it 
> > > does not capture pointer. But if it will capture the pointer then we 
> > > could not do TRE. So we need to check !Tracker.EscapePoints.empty().
> > > 
> > > 
> > > 
> > > test - marked with Tail.
> > 
> > For the given code, TRE won't mark the recursive call "tail".  That 
> > transform isn't legal: the recursive call could access the caller's version 
> > of "temp".
> >For the given code, TRE won't mark the recursive call "tail". That transform 
> >isn't legal: the recursive call could access the caller's version of "temp".
> 
> it looks like recursive call could NOT access the caller's version of "temp":
> 
> ```
> test(recurseCount - 1);
> ```
> 
> Caller`s version of temp is accessed by non-recursive call:
> 
> ```
> globalIncrement();
> ```
> 
> If globalIncrement does not capture the "" then TRE looks to be legal 
> for that case. 
> 
> globalIncrement() would not be marked with "Tail". test() would be marked 
> with Tail.
> 
> Thus the pre-requisite for TRE would be: tail-recursive call must not receive 
> pointer to local stack(Tail) and non-recursive calls must not capture the 
> pointer to local stack.
Can you give a complete IR example where we infer "tail", but TRE is illegal?

Can you give a complete IR example, we we don't infer "tail", but we still do 
the TRE transform here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-23 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked an inline comment as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > avl wrote:
> > > > efriedma wrote:
> > > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it 
> > > > > not enough that the call you're trying to TRE is marked "tail"?
> > > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > > 
> > > > AllocaDerivedValueTracker analysis(done in markTails) could be reused 
> > > > here. 
> > > > But marking, done in markTails(), looks like separate tasks. i.e. it is 
> > > > better 
> > > > to make TRE not depending on markTails(). There is a review for this - 
> > > > https://reviews.llvm.org/D60031
> > > > Thus such separation looks useful(To not reuse result of markTails but 
> > > > have it computed inplace).
> > > > 
> > > > > Is it not enough that the call you're trying to TRE is marked "tail"?
> > > > 
> > > > It is not enough that call which is subject to TRE is marked "Tail".
> > > > It also should be checked that other calls does not capture pointer to 
> > > > local stack: 
> > > > 
> > > > ```
> > > > // do not do TRE if any pointer to local stack has escaped.
> > > > if (!Tracker.EscapePoints.empty())
> > > >return false;
> > > > 
> > > > ```
> > > > 
> > > > It is not enough that call which is subject to TRE is marked "Tail". It 
> > > > also should be checked that other calls does not capture pointer to 
> > > > local stack:
> > > 
> > > If there's an escaped pointer to the local stack, we wouldn't infer 
> > > "tail" in the first place, would we?
> > If function receives pointer to alloca then it would not be marked with 
> > "Tail". Then we do not have a possibility to understand whether this 
> > function receives pointer to alloca but does not capture it:
> > 
> > ```
> > void test(int recurseCount)
> > {
> > if (recurseCount == 0) return;
> > int temp = 10;
> > globalIncrement();
> > test(recurseCount - 1);
> > }
> > ```
> > 
> > test - marked with Tail.
> > globalIncrement - not marked with Tail. But TRE could be done since it does 
> > not capture pointer. But if it will capture the pointer then we could not 
> > do TRE. So we need to check !Tracker.EscapePoints.empty().
> > 
> > 
> > 
> > test - marked with Tail.
> 
> For the given code, TRE won't mark the recursive call "tail".  That transform 
> isn't legal: the recursive call could access the caller's version of "temp".
>For the given code, TRE won't mark the recursive call "tail". That transform 
>isn't legal: the recursive call could access the caller's version of "temp".

it looks like recursive call could NOT access the caller's version of "temp":

```
test(recurseCount - 1);
```

Caller`s version of temp is accessed by non-recursive call:

```
globalIncrement();
```

If globalIncrement does not capture the "" then TRE looks to be legal for 
that case. 

globalIncrement() would not be marked with "Tail". test() would be marked with 
Tail.

Thus the pre-requisite for TRE would be: tail-recursive call must not receive 
pointer to local stack(Tail) and non-recursive calls must not capture the 
pointer to local stack.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > avl wrote:
> > > efriedma wrote:
> > > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it not 
> > > > enough that the call you're trying to TRE is marked "tail"?
> > > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > > 
> > > AllocaDerivedValueTracker analysis(done in markTails) could be reused 
> > > here. 
> > > But marking, done in markTails(), looks like separate tasks. i.e. it is 
> > > better 
> > > to make TRE not depending on markTails(). There is a review for this - 
> > > https://reviews.llvm.org/D60031
> > > Thus such separation looks useful(To not reuse result of markTails but 
> > > have it computed inplace).
> > > 
> > > > Is it not enough that the call you're trying to TRE is marked "tail"?
> > > 
> > > It is not enough that call which is subject to TRE is marked "Tail".
> > > It also should be checked that other calls does not capture pointer to 
> > > local stack: 
> > > 
> > > ```
> > > // do not do TRE if any pointer to local stack has escaped.
> > > if (!Tracker.EscapePoints.empty())
> > >return false;
> > > 
> > > ```
> > > 
> > > It is not enough that call which is subject to TRE is marked "Tail". It 
> > > also should be checked that other calls does not capture pointer to local 
> > > stack:
> > 
> > If there's an escaped pointer to the local stack, we wouldn't infer "tail" 
> > in the first place, would we?
> If function receives pointer to alloca then it would not be marked with 
> "Tail". Then we do not have a possibility to understand whether this function 
> receives pointer to alloca but does not capture it:
> 
> ```
> void test(int recurseCount)
> {
> if (recurseCount == 0) return;
> int temp = 10;
> globalIncrement();
> test(recurseCount - 1);
> }
> ```
> 
> test - marked with Tail.
> globalIncrement - not marked with Tail. But TRE could be done since it does 
> not capture pointer. But if it will capture the pointer then we could not do 
> TRE. So we need to check !Tracker.EscapePoints.empty().
> 
> 
> 
> test - marked with Tail.

For the given code, TRE won't mark the recursive call "tail".  That transform 
isn't legal: the recursive call could access the caller's version of "temp".


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked 3 inline comments as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:823
+
+bool TailRecursionEliminator::canTRE(Function ) {
+  // The local stack holds all alloca instructions and all byval arguments.

laytonio wrote:
> There is no need to pass the function here since its a member variable.
Ok.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it not 
> > > enough that the call you're trying to TRE is marked "tail"?
> > >Do you have to redo the AllocaDerivedValueTracker analysis?
> > 
> > AllocaDerivedValueTracker analysis(done in markTails) could be reused here. 
> > But marking, done in markTails(), looks like separate tasks. i.e. it is 
> > better 
> > to make TRE not depending on markTails(). There is a review for this - 
> > https://reviews.llvm.org/D60031
> > Thus such separation looks useful(To not reuse result of markTails but have 
> > it computed inplace).
> > 
> > > Is it not enough that the call you're trying to TRE is marked "tail"?
> > 
> > It is not enough that call which is subject to TRE is marked "Tail".
> > It also should be checked that other calls does not capture pointer to 
> > local stack: 
> > 
> > ```
> > // do not do TRE if any pointer to local stack has escaped.
> > if (!Tracker.EscapePoints.empty())
> >return false;
> > 
> > ```
> > 
> > It is not enough that call which is subject to TRE is marked "Tail". It 
> > also should be checked that other calls does not capture pointer to local 
> > stack:
> 
> If there's an escaped pointer to the local stack, we wouldn't infer "tail" in 
> the first place, would we?
If function receives pointer to alloca then it would not be marked with "Tail". 
Then we do not have a possibility to understand whether this function receives 
pointer to alloca but does not capture it:

```
void test(int recurseCount)
{
if (recurseCount == 0) return;
int temp = 10;
globalIncrement();
test(recurseCount - 1);
}
```

test - marked with Tail.
globalIncrement - not marked with Tail. But TRE could be done since it does not 
capture pointer. But if it will capture the pointer then we could not do TRE. 
So we need to check !Tracker.EscapePoints.empty().






Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:852
+
+// Do not do TRE if exists recursive calls which are not last calls.
+if (!isTailBlock(CI->getParent()) ||

efriedma wrote:
> avl wrote:
> > efriedma wrote:
> > > I thought we had some tests where we TRE in the presence of recursive 
> > > calls, like a simple recursive fibonacci.  Am I misunderstanding this?
> > right, there is a testcase for fibonacchi:
> > 
> > llvm/test/Transforms/TailCallElim/accum_recursion.ll:@test3_fib
> > 
> > areAllLastFuncCallsRecursive() checking works well for fibonacci testcase:
> > 
> > 
> > ```
> > return fib(x-1)+fib(x-2);
> > 
> > ```
> > 
> > Since, Last funcs call chain is : fib()->fib()->ret. 
> > That check should prevent from such cases:
> > 
> > 
> > ```
> > return fib(x-1)+another_call()+fib(x-2);
> > ```
> > 
> > 
> > That check should prevent from such cases: return 
> > fib(x-1)+another_call()+fib(x-2);
> 
> Why do we need to prevent this?
We do not. I misunderstood the canTransformAccumulatorRecursion(). That check 
could be removed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Aditya Kumar via Phabricator via cfe-commits
hiraditya added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:801
+if (Branch->isUnconditional())
+  if (ReturnInst *Ret = dyn_cast(
+  Branch->getSuccessor(0)->getFirstNonPHIOrDbg()))

can we use isa<> here?



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:816
+if (CallInst *CI = dyn_cast(&*BBI))
+  if (!canMoveAboveCall(CI, Inst, AA) && CI->getCalledFunction() != )
+return false;

`CI->getCalledFunction() != ` seems cheaper than `canMoveAboveCall`



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:840
+
+  return !llvm::any_of(instructions(F), [&](Instruction ) {
+// Because of PR962, we don't TRE dynamic allocas.

Do we need to visit all the instructions twice?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Layton Kifer via Phabricator via cfe-commits
laytonio added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:823
+
+bool TailRecursionEliminator::canTRE(Function ) {
+  // The local stack holds all alloca instructions and all byval arguments.

There is no need to pass the function here since its a member variable.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

avl wrote:
> efriedma wrote:
> > Do you have to redo the AllocaDerivedValueTracker analysis?  Is it not 
> > enough that the call you're trying to TRE is marked "tail"?
> >Do you have to redo the AllocaDerivedValueTracker analysis?
> 
> AllocaDerivedValueTracker analysis(done in markTails) could be reused here. 
> But marking, done in markTails(), looks like separate tasks. i.e. it is 
> better 
> to make TRE not depending on markTails(). There is a review for this - 
> https://reviews.llvm.org/D60031
> Thus such separation looks useful(To not reuse result of markTails but have 
> it computed inplace).
> 
> > Is it not enough that the call you're trying to TRE is marked "tail"?
> 
> It is not enough that call which is subject to TRE is marked "Tail".
> It also should be checked that other calls does not capture pointer to local 
> stack: 
> 
> ```
> // do not do TRE if any pointer to local stack has escaped.
> if (!Tracker.EscapePoints.empty())
>return false;
> 
> ```
> 
> It is not enough that call which is subject to TRE is marked "Tail". It also 
> should be checked that other calls does not capture pointer to local stack:

If there's an escaped pointer to the local stack, we wouldn't infer "tail" in 
the first place, would we?



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:852
+
+// Do not do TRE if exists recursive calls which are not last calls.
+if (!isTailBlock(CI->getParent()) ||

avl wrote:
> efriedma wrote:
> > I thought we had some tests where we TRE in the presence of recursive 
> > calls, like a simple recursive fibonacci.  Am I misunderstanding this?
> right, there is a testcase for fibonacchi:
> 
> llvm/test/Transforms/TailCallElim/accum_recursion.ll:@test3_fib
> 
> areAllLastFuncCallsRecursive() checking works well for fibonacci testcase:
> 
> 
> ```
> return fib(x-1)+fib(x-2);
> 
> ```
> 
> Since, Last funcs call chain is : fib()->fib()->ret. 
> That check should prevent from such cases:
> 
> 
> ```
> return fib(x-1)+another_call()+fib(x-2);
> ```
> 
> 
> That check should prevent from such cases: return 
> fib(x-1)+another_call()+fib(x-2);

Why do we need to prevent this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Alexey Lapshin via Phabricator via cfe-commits
avl marked 4 inline comments as done.
avl added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:130
+IsNocapture = true;
+  else if (Function *CalledFunction = CB.getCalledFunction()) {
+if (CalledFunction->getBasicBlockList().size() > 0 &&

efriedma wrote:
> Please don't add code to examine the callee; if we're not deducing nocapture 
> appropriately, we should fix that elsewhere.
Ok. 



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:335
+II->getIntrinsicID() == Intrinsic::assume)
+  return true;
+

efriedma wrote:
> What is the new handling for lifetime.end/assume doing?
They are just skipped. In following test case:


```
  call void @_Z5test5i(i32 %sub)
  call void @llvm.lifetime.end.p0i8(i64 24, i8* nonnull %1) #5
  call void @llvm.lifetime.end.p0i8(i64 4, i8* nonnull %0) #5
  br label %return

```

they are generated in between call and ret. It is safe to ignore them while 
checking whether transformation is possible.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

efriedma wrote:
> Do you have to redo the AllocaDerivedValueTracker analysis?  Is it not enough 
> that the call you're trying to TRE is marked "tail"?
>Do you have to redo the AllocaDerivedValueTracker analysis?

AllocaDerivedValueTracker analysis(done in markTails) could be reused here. 
But marking, done in markTails(), looks like separate tasks. i.e. it is better 
to make TRE not depending on markTails(). There is a review for this - 
https://reviews.llvm.org/D60031
Thus such separation looks useful(To not reuse result of markTails but have it 
computed inplace).

> Is it not enough that the call you're trying to TRE is marked "tail"?

It is not enough that call which is subject to TRE is marked "Tail".
It also should be checked that other calls does not capture pointer to local 
stack: 

```
// do not do TRE if any pointer to local stack has escaped.
if (!Tracker.EscapePoints.empty())
   return false;

```




Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:852
+
+// Do not do TRE if exists recursive calls which are not last calls.
+if (!isTailBlock(CI->getParent()) ||

efriedma wrote:
> I thought we had some tests where we TRE in the presence of recursive calls, 
> like a simple recursive fibonacci.  Am I misunderstanding this?
right, there is a testcase for fibonacchi:

llvm/test/Transforms/TailCallElim/accum_recursion.ll:@test3_fib

areAllLastFuncCallsRecursive() checking works well for fibonacci testcase:


```
return fib(x-1)+fib(x-2);

```

Since, Last funcs call chain is : fib()->fib()->ret. 
That check should prevent from such cases:


```
return fib(x-1)+another_call()+fib(x-2);
```




Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Eli Friedman via Phabricator via cfe-commits
efriedma added inline comments.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:130
+IsNocapture = true;
+  else if (Function *CalledFunction = CB.getCalledFunction()) {
+if (CalledFunction->getBasicBlockList().size() > 0 &&

Please don't add code to examine the callee; if we're not deducing nocapture 
appropriately, we should fix that elsewhere.



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:335
+II->getIntrinsicID() == Intrinsic::assume)
+  return true;
+

What is the new handling for lifetime.end/assume doing?



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:825
+  // The local stack holds all alloca instructions and all byval arguments.
+  AllocaDerivedValueTracker Tracker;
+  for (Argument  : F.args()) {

Do you have to redo the AllocaDerivedValueTracker analysis?  Is it not enough 
that the call you're trying to TRE is marked "tail"?



Comment at: llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp:852
+
+// Do not do TRE if exists recursive calls which are not last calls.
+if (!isTailBlock(CI->getParent()) ||

I thought we had some tests where we TRE in the presence of recursive calls, 
like a simple recursive fibonacci.  Am I misunderstanding this?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82085: [TRE] allow TRE for non-capturing calls.

2020-06-22 Thread Alexey Lapshin via Phabricator via cfe-commits
avl updated this revision to Diff 272378.
avl added a comment.

1. deleted code doing more strict tailcall marking.
2. left removal of "AllCallsAreTailCalls".
3. added check for non-capturing calls while tracking alloca.
4. re-titled the patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82085/new/

https://reviews.llvm.org/D82085

Files:
  llvm/lib/Transforms/Scalar/TailRecursionElimination.cpp
  llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll

Index: llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
===
--- /dev/null
+++ llvm/test/Transforms/TailCallElim/tre-noncapturing-alloca-calls.ll
@@ -0,0 +1,356 @@
+; RUN: opt < %s -tailcallelim -verify-dom-info -S | FileCheck %s
+
+; IR for that test was generated from the following C++ source:
+;
+;int count;
+;__attribute__((noinline)) void globalIncrement(const int* param) { count += *param; }
+;
+;void test(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;globalIncrement();
+;test(recurseCount - 1);
+;}
+;
+;struct Counter
+;{
+;int count = 0;
+;
+;__attribute__((noinline)) void increment(const int param) { count += param;
+;}
+;virtual __attribute__((noinline)) void increment(const int* param) { count += *param; }
+;
+;__attribute__((noinline)) int getCount () const { return count; }
+;};
+;
+;void test2(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;Counter counter;
+;counter.increment(0);
+;test2(recurseCount - 1);
+;}
+;
+;void test3(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;Counter counter;
+;counter.increment(counter.getCount());
+;test3(recurseCount - 1);
+;}
+;
+;void test4(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;Counter counter;
+;counter.increment();
+;test4(recurseCount - 1);
+;}
+;
+;struct Counter2 : public Counter
+;{
+;virtual __attribute__((noinline)) void increment(const int* param) {
+;ptr = param;
+;count += *param;
+;}
+;
+;const int* ptr;
+;};
+;
+;void test5(int recurseCount)
+;{
+;if (recurseCount == 0) return;
+;int temp = 10;
+;Counter2 counter;
+;counter.increment();
+;test5(recurseCount - 1);
+;}
+;
+
+%struct.Counter = type <{ i32 (...)**, i32, [4 x i8] }>
+%struct.Counter2 = type { %struct.Counter.base, i32* }
+%struct.Counter.base = type <{ i32 (...)**, i32 }>
+
+$_ZN7Counter9incrementEi = comdat any
+
+$_ZNK7Counter8getCountEv = comdat any
+
+$_ZN7Counter9incrementEPKi = comdat any
+
+$_ZN8Counter29incrementEPKi = comdat any
+
+$_ZTV7Counter = comdat any
+
+$_ZTS7Counter = comdat any
+
+$_ZTI7Counter = comdat any
+
+$_ZTV8Counter2 = comdat any
+
+$_ZTS8Counter2 = comdat any
+
+$_ZTI8Counter2 = comdat any
+
+@count = dso_local local_unnamed_addr global i32 0, align 4
+
+@_ZTV7Counter = linkonce_odr dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8* }* @_ZTI7Counter to i8*), i8* bitcast (void (%struct.Counter*, i32*)* @_ZN7Counter9incrementEPKi to i8*)] }, comdat, align 8
+@_ZTVN10__cxxabiv117__class_type_infoE = external dso_local global i8*
+@_ZTS7Counter = linkonce_odr dso_local constant [9 x i8] c"7Counter\00", comdat, align 1
+@_ZTI7Counter = linkonce_odr dso_local constant { i8*, i8* } { i8* bitcast (i8** getelementptr inbounds (i8*, i8** @_ZTVN10__cxxabiv117__class_type_infoE, i64 2) to i8*), i8* getelementptr inbounds ([9 x i8], [9 x i8]* @_ZTS7Counter, i32 0, i32 0) }, comdat, align 8
+@_ZTV8Counter2 = linkonce_odr dso_local unnamed_addr constant { [3 x i8*] } { [3 x i8*] [i8* null, i8* bitcast ({ i8*, i8*, i8* }* @_ZTI8Counter2 to i8*), i8* bitcast (void (%struct.Counter2*, i32*)* @_ZN8Counter29incrementEPKi to i8*)] }, comdat, align 8
+@_ZTVN10__cxxabiv120__si_class_type_infoE = external dso_local global i8*
+@_ZTS8Counter2 = linkonce_odr dso_local constant [10 x i8] c"8Counter2\00", comdat, align 1
+@_ZTI8Counter2 = linkonce_odr dso_local constant { i8*, i8*, i8* } { i8* bitcast (i8** getelementptr inbounds (i8*, i8** @_ZTVN10__cxxabiv120__si_class_type_infoE, i64 2) to i8*), i8* getelementptr inbounds ([10 x i8], [10 x i8]* @_ZTS8Counter2, i32 0, i32 0), i8* bitcast ({ i8*, i8* }* @_ZTI7Counter to i8*) }, comdat, align 8
+
+
+; Function Attrs: nofree noinline norecurse nounwind uwtable
+define dso_local void @_Z15globalIncrementPKi(i32* nocapture readonly %param) local_unnamed_addr #0 {
+entry:
+  %0 = load i32, i32* %param, align 4
+  %1 = load i32, i32* @count, align 4
+  %add = add nsw i32 %1, %0
+  store i32 %add, i32* @count, align 4
+  ret void
+}
+
+; Test that TRE could be done for recursive tail routine containing
+; call to function receiving a pointer to local stack. 
+
+; CHECK: void @_Z4testi
+; CHECK: br label %tailrecurse
+; CHECK: tailrecurse:
+; CHECK-NOT: call void @_Z4testi
+; CHECK: br