echristo created this revision.
echristo added reviewers: chandlerc, hfinkel.
Herald added subscribers: cfe-commits, jfb, dexonsmith, steven_wu, hiraditya, 
javed.absar, mcrosier, mehdi_amini.
Herald added projects: clang, LLVM.

As a follow-up to my initial mail to llvm-dev, here's a first pass at the O1 
pipeline described there.

Some rough internal testing using a bootstrap and test of clang has shown a 
combined build and test time for clang that is nearly equivalent to O3 and 
quite a speedup over O0. It's currently a little slower than the existing O1, 
likely because the clang+llvm testsuite runs the same binaries many times 
rather than a few times for individual tests. Build time is a bit better. For 
a larger build and a smaller test time (think a couple of unittests), this is 
a bit better than any of O3, O0, or the existing O1. Overall binary size drops 
significantly compared to O0.

This change doesn't include any move from SelectionDAG to FastISel; that will 
come separately, with other numbers that should help inform that decision. I 
also haven't done any real debuggability studies with this pipeline yet. I 
wanted to get the initial start done so that people could see it and we could 
start tweaking after.

Test updates: Outside of the newpm tests, most of the updates come from 
optimization passes that are no longer run at O1 (there is no compelling 
argument for keeping them at the moment) and that were largely used for 
canonicalization in clang.
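
For readers skimming the test churn, here is a rough, self-contained sketch of the gating the hunks below apply. It is not the PassBuilder code: `Level`, `sketchFunctionPipeline`, and `runsPass` are hypothetical names made up for illustration, the pass names are just the strings the newpm tests print, and only passes touched by this patch (plus a few neighbours for context) are listed. The flag-dependent GVNHoist/GVNSink passes are left out.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the optimization level enum; only the ordering
// (O1 < O2 < O3 < Os < Oz) matters for the `> O1` checks below.
enum class Level { O0, O1, O2, O3, Os, Oz };

// Returns, in order, the names of the function passes this sketch schedules.
// Assumption: a heavily trimmed model of the function simplification
// pipeline, covering only the guards added in this patch.
std::vector<std::string> sketchFunctionPipeline(Level L) {
  assert(L != Level::O0 && "this sketch only models optimizing levels");
  const bool AboveO1 = L > Level::O1;
  std::vector<std::string> P;

  if (AboveO1)
    P.push_back("SROA"); // skipped at O1 for now (see the TODO in the diff)
  P.push_back("EarlyCSEPass");
  if (AboveO1) {
    P.push_back("SpeculativeExecutionPass");
    P.push_back("JumpThreadingPass");
    P.push_back("CorrelatedValuePropagationPass");
  }
  P.push_back("SimplifyCFGPass");
  if (L == Level::O3)
    P.push_back("AggressiveInstCombinePass");
  P.push_back("InstCombinePass");

  // Late control-flow and redundancy cleanup, also skipped at O1.
  if (AboveO1) {
    P.push_back("JumpThreadingPass");
    P.push_back("CorrelatedValuePropagationPass");
    P.push_back("DSEPass");
    P.push_back("LICMPass");
  }
  P.push_back("ADCEPass");
  P.push_back("SimplifyCFGPass");
  return P;
}

// True if the sketched pipeline for level L schedules a pass named Name.
bool runsPass(Level L, const std::string &Name) {
  const std::vector<std::string> P = sketchFunctionPipeline(L);
  return std::find(P.begin(), P.end(), Name) != P.end();
}
```

With this model, `runsPass(Level::O1, "SROA")` is false while `runsPass(Level::O2, "SROA")` is true, which is why the shared `CHECK-O` lines for these passes in the newpm tests are split into per-level `CHECK-O2`, `CHECK-O3`, `CHECK-Os`, and `CHECK-Oz` lines.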

Original post:

http://lists.llvm.org/pipermail/llvm-dev/2019-April/131494.html


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D65410

Files:
  clang/test/CodeGen/2008-07-30-implicit-initialization.c
  clang/test/CodeGen/arm-fp16-arguments.c
  clang/test/CodeGen/arm-vfp16-arguments2.cpp
  clang/test/CodeGenCXX/auto-var-init.cpp
  clang/test/CodeGenCXX/stack-reuse.cpp
  llvm/include/llvm/Passes/PassBuilder.h
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
  llvm/test/Feature/optnone-opt.ll
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Transforms/MemCpyOpt/lifetime.ll
  llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll

Index: llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll
===================================================================
--- llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll
+++ llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll
@@ -7,7 +7,7 @@
 
 define i1 @PR33605(i32 %a, i32 %b, i32* %c) {
 ; ALL-LABEL: @PR33605(
-; ALL-NEXT:  for.body:
+; ALL-NEXT:  entry:
 ; ALL-NEXT:    [[OR:%.*]] = or i32 [[B:%.*]], [[A:%.*]]
 ; ALL-NEXT:    [[ARRAYIDX:%.*]] = getelementptr inbounds i32, i32* [[C:%.*]], i64 1
 ; ALL-NEXT:    [[TMP0:%.*]] = load i32, i32* [[ARRAYIDX]], align 4
@@ -18,7 +18,7 @@
 ; ALL-NEXT:    tail call void @foo()
 ; ALL-NEXT:    br label [[IF_END]]
 ; ALL:       if.end:
-; ALL-NEXT:    [[CHANGED_1_OFF0:%.*]] = phi i1 [ true, [[IF_THEN]] ], [ false, [[FOR_BODY:%.*]] ]
+; ALL-NEXT:    [[CHANGED_1_OFF0:%.*]] = phi i1 [ true, [[IF_THEN]] ], [ false, [[ENTRY:%.*]] ]
 ; ALL-NEXT:    [[TMP1:%.*]] = load i32, i32* [[C]], align 4
 ; ALL-NEXT:    [[CMP_1:%.*]] = icmp eq i32 [[OR]], [[TMP1]]
 ; ALL-NEXT:    br i1 [[CMP_1]], label [[IF_END_1:%.*]], label [[IF_THEN_1:%.*]]
Index: llvm/test/Transforms/MemCpyOpt/lifetime.ll
===================================================================
--- llvm/test/Transforms/MemCpyOpt/lifetime.ll
+++ llvm/test/Transforms/MemCpyOpt/lifetime.ll
@@ -1,4 +1,4 @@
-; RUN: opt < %s -O1 -S | FileCheck %s
+; RUN: opt < %s -O2 -S | FileCheck %s
 
 ; performCallSlotOptzn in MemCpy should not exchange the calls to
 ; @llvm.lifetime.start and @llvm.memcpy.
Index: llvm/test/Other/new-pm-thinlto-defaults.ll
===================================================================
--- llvm/test/Other/new-pm-thinlto-defaults.ll
+++ llvm/test/Other/new-pm-thinlto-defaults.ll
@@ -108,13 +108,28 @@
 ; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass
 ; CHECK-O-NEXT: Running pass: CGSCCToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>
 ; CHECK-O-NEXT: Starting llvm::Function pass manager run.
-; CHECK-O-NEXT: Running pass: SROA
+; CHECK-O2-NEXT: Running pass: SROA
+; CHECK-O3-NEXT: Running pass: SROA
+; CHECK-Os-NEXT: Running pass: SROA
+; CHECK-Oz-NEXT: Running pass: SROA
 ; CHECK-O-NEXT: Running pass: EarlyCSEPass
 ; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
-; CHECK-O-NEXT: Running pass: SpeculativeExecutionPass
-; CHECK-O-NEXT: Running pass: JumpThreadingPass
-; CHECK-O-NEXT: Running analysis: LazyValueAnalysis
-; CHECK-O-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-O2-NEXT: Running pass: SpeculativeExecutionPass
+; CHECK-O3-NEXT: Running pass: SpeculativeExecutionPass
+; CHECK-Os-NEXT: Running pass: SpeculativeExecutionPass
+; CHECK-Oz-NEXT: Running pass: SpeculativeExecutionPass
+; CHECK-O2-NEXT: Running pass: JumpThreadingPass
+; CHECK-O3-NEXT: Running pass: JumpThreadingPass
+; CHECK-Os-NEXT: Running pass: JumpThreadingPass
+; CHECK-Oz-NEXT: Running pass: JumpThreadingPass
+; CHECK-O2-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O3-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-Os-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-Oz-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O2-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-O3-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-Os-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-Oz-NEXT: Running pass: CorrelatedValuePropagationPass
 ; CHECK-O-NEXT: Running pass: SimplifyCFGPass
 ; CHECK-O3-NEXT: Running pass: AggressiveInstCombinePass
 ; CHECK-O-NEXT: Running pass: InstCombinePass
@@ -178,14 +193,38 @@
 ; CHECK-O-NEXT: Running pass: BDCEPass
 ; CHECK-O-NEXT: Running analysis: DemandedBitsAnalysis
 ; CHECK-O-NEXT: Running pass: InstCombinePass
-; CHECK-O-NEXT: Running pass: JumpThreadingPass
-; CHECK-O-NEXT: Running pass: CorrelatedValuePropagationPass
-; CHECK-O-NEXT: Running pass: DSEPass
-; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
-; CHECK-O-NEXT: Starting llvm::Function pass manager run
-; CHECK-O-NEXT: Running pass: LoopSimplifyPass
-; CHECK-O-NEXT: Running pass: LCSSAPass
-; CHECK-O-NEXT: Finished llvm::Function pass manager run
+; CHECK-O2-NEXT: Running pass: JumpThreadingPass
+; CHECK-O3-NEXT: Running pass: JumpThreadingPass
+; CHECK-Os-NEXT: Running pass: JumpThreadingPass
+; CHECK-Oz-NEXT: Running pass: JumpThreadingPass
+; CHECK-O2-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-O3-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-Os-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-Oz-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-O2-NEXT: Running pass: DSEPass
+; CHECK-O3-NEXT: Running pass: DSEPass
+; CHECK-Os-NEXT: Running pass: DSEPass
+; CHECK-Oz-NEXT: Running pass: DSEPass
+; CHECK-O2-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
+; CHECK-O3-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
+; CHECK-Os-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
+; CHECK-Oz-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
+; CHECK-O2-NEXT: Starting llvm::Function pass manager run
+; CHECK-O3-NEXT: Starting llvm::Function pass manager run
+; CHECK-Os-NEXT: Starting llvm::Function pass manager run
+; CHECK-Oz-NEXT: Starting llvm::Function pass manager run
+; CHECK-O2-NEXT: Running pass: LoopSimplifyPass
+; CHECK-O3-NEXT: Running pass: LoopSimplifyPass
+; CHECK-Os-NEXT: Running pass: LoopSimplifyPass
+; CHECK-Oz-NEXT: Running pass: LoopSimplifyPass
+; CHECK-O2-NEXT: Running pass: LCSSAPass
+; CHECK-O3-NEXT: Running pass: LCSSAPass
+; CHECK-Os-NEXT: Running pass: LCSSAPass
+; CHECK-Oz-NEXT: Running pass: LCSSAPass
+; CHECK-O2-NEXT: Finished llvm::Function pass manager run
+; CHECK-O3-NEXT: Finished llvm::Function pass manager run
+; CHECK-Os-NEXT: Finished llvm::Function pass manager run
+; CHECK-Oz-NEXT: Finished llvm::Function pass manager run
 ; CHECK-O-NEXT: Running pass: ADCEPass
 ; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis
 ; CHECK-O-NEXT: Running pass: SimplifyCFGPass
Index: llvm/test/Other/new-pm-defaults.ll
===================================================================
--- llvm/test/Other/new-pm-defaults.ll
+++ llvm/test/Other/new-pm-defaults.ll
@@ -128,13 +128,28 @@
 ; CHECK-O3-NEXT: Running pass: ArgumentPromotionPass
 ; CHECK-O-NEXT: Running pass: CGSCCToFunctionPassAdaptor<{{.*}}PassManager{{.*}}>
 ; CHECK-O-NEXT: Starting llvm::Function pass manager run.
-; CHECK-O-NEXT: Running pass: SROA
+; CHECK-O2-NEXT: Running pass: SROA
+; CHECK-O3-NEXT: Running pass: SROA
+; CHECK-Os-NEXT: Running pass: SROA
+; CHECK-Oz-NEXT: Running pass: SROA
 ; CHECK-O-NEXT: Running pass: EarlyCSEPass
 ; CHECK-O-NEXT: Running analysis: MemorySSAAnalysis
-; CHECK-O-NEXT: Running pass: SpeculativeExecutionPass
-; CHECK-O-NEXT: Running pass: JumpThreadingPass
-; CHECK-O-NEXT: Running analysis: LazyValueAnalysis
-; CHECK-O-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-O2-NEXT: Running pass: SpeculativeExecutionPass
+; CHECK-O3-NEXT: Running pass: SpeculativeExecutionPass
+; CHECK-Os-NEXT: Running pass: SpeculativeExecutionPass
+; CHECK-Oz-NEXT: Running pass: SpeculativeExecutionPass
+; CHECK-O2-NEXT: Running pass: JumpThreadingPass
+; CHECK-O3-NEXT: Running pass: JumpThreadingPass
+; CHECK-Os-NEXT: Running pass: JumpThreadingPass
+; CHECK-Oz-NEXT: Running pass: JumpThreadingPass
+; CHECK-O2-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O3-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-Os-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-Oz-NEXT: Running analysis: LazyValueAnalysis
+; CHECK-O2-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-O3-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-Os-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-Oz-NEXT: Running pass: CorrelatedValuePropagationPass
 ; CHECK-O-NEXT: Running pass: SimplifyCFGPass
 ; CHECK-O3-NEXT: AggressiveInstCombinePass
 ; CHECK-O-NEXT: Running pass: InstCombinePass
@@ -202,14 +217,38 @@
 ; CHECK-O-NEXT: Running analysis: DemandedBitsAnalysis
 ; CHECK-O-NEXT: Running pass: InstCombinePass
 ; CHECK-EP-PEEPHOLE-NEXT: Running pass: NoOpFunctionPass
-; CHECK-O-NEXT: Running pass: JumpThreadingPass
-; CHECK-O-NEXT: Running pass: CorrelatedValuePropagationPass
-; CHECK-O-NEXT: Running pass: DSEPass
-; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
-; CHECK-O-NEXT: Starting llvm::Function pass manager run.
-; CHECK-O-NEXT: Running pass: LoopSimplifyPass
-; CHECK-O-NEXT: Running pass: LCSSAPass
-; CHECK-O-NEXT: Finished llvm::Function pass manager run.
+; CHECK-O2-NEXT: Running pass: JumpThreadingPass
+; CHECK-O3-NEXT: Running pass: JumpThreadingPass
+; CHECK-Os-NEXT: Running pass: JumpThreadingPass
+; CHECK-Oz-NEXT: Running pass: JumpThreadingPass
+; CHECK-O2-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-O3-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-Os-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-Oz-NEXT: Running pass: CorrelatedValuePropagationPass
+; CHECK-O2-NEXT: Running pass: DSEPass
+; CHECK-O3-NEXT: Running pass: DSEPass
+; CHECK-Os-NEXT: Running pass: DSEPass
+; CHECK-Oz-NEXT: Running pass: DSEPass
+; CHECK-O2-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
+; CHECK-O3-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
+; CHECK-Os-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
+; CHECK-Oz-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass{{.*}}>
+; CHECK-O2-NEXT: Starting llvm::Function pass manager run.
+; CHECK-O3-NEXT: Starting llvm::Function pass manager run.
+; CHECK-Os-NEXT: Starting llvm::Function pass manager run.
+; CHECK-Oz-NEXT: Starting llvm::Function pass manager run.
+; CHECK-O2-NEXT: Running pass: LoopSimplifyPass
+; CHECK-O3-NEXT: Running pass: LoopSimplifyPass
+; CHECK-Os-NEXT: Running pass: LoopSimplifyPass
+; CHECK-Oz-NEXT: Running pass: LoopSimplifyPass
+; CHECK-O2-NEXT: Running pass: LCSSAPass
+; CHECK-O3-NEXT: Running pass: LCSSAPass
+; CHECK-Os-NEXT: Running pass: LCSSAPass
+; CHECK-Oz-NEXT: Running pass: LCSSAPass
+; CHECK-O2-NEXT: Finished llvm::Function pass manager run.
+; CHECK-O3-NEXT: Finished llvm::Function pass manager run.
+; CHECK-Os-NEXT: Finished llvm::Function pass manager run.
+; CHECK-Oz-NEXT: Finished llvm::Function pass manager run.
 ; CHECK-EP-SCALAR-LATE-NEXT: Running pass: NoOpFunctionPass
 ; CHECK-O-NEXT: Running pass: ADCEPass
 ; CHECK-O-NEXT: Running analysis: PostDominatorTreeAnalysis
Index: llvm/test/Feature/optnone-opt.ll
===================================================================
--- llvm/test/Feature/optnone-opt.ll
+++ llvm/test/Feature/optnone-opt.ll
@@ -39,16 +39,11 @@
 ; IR passes run at -O1 and higher.
 ; OPT-O1-DAG: Skipping pass 'Aggressive Dead Code Elimination'
 ; OPT-O1-DAG: Skipping pass 'Combine redundant instructions'
-; OPT-O1-DAG: Skipping pass 'Dead Store Elimination'
 ; OPT-O1-DAG: Skipping pass 'Early CSE'
-; OPT-O1-DAG: Skipping pass 'Jump Threading'
-; OPT-O1-DAG: Skipping pass 'MemCpy Optimization'
 ; OPT-O1-DAG: Skipping pass 'Reassociate expressions'
 ; OPT-O1-DAG: Skipping pass 'Simplify the CFG'
 ; OPT-O1-DAG: Skipping pass 'Sparse Conditional Constant Propagation'
-; OPT-O1-DAG: Skipping pass 'SROA'
 ; OPT-O1-DAG: Skipping pass 'Tail Call Elimination'
-; OPT-O1-DAG: Skipping pass 'Value Propagation'
 
 ; Additional IR passes run at -O2 and higher.
 ; OPT-O2O3-DAG: Skipping pass 'Global Value Numbering'
Index: llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
===================================================================
--- llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
+++ llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
@@ -260,7 +260,9 @@
   addInitialAliasAnalysisPasses(FPM);
 
   FPM.add(createCFGSimplificationPass());
-  FPM.add(createSROAPass());
+  // TODO: Investigate for O1. We'd like mem2reg, but the full SROA is a bit of a cost.
+  if (OptLevel > 1)
+    FPM.add(createSROAPass());
   FPM.add(createEarlyCSEPass());
   FPM.add(createLowerExpectIntrinsicPass());
 }
@@ -290,7 +292,9 @@
     IP.HintThreshold = 325;
 
     MPM.add(createFunctionInliningPass(IP));
-    MPM.add(createSROAPass());
+    // TODO: Investigate for O1. We'd like mem2reg, but the full SROA is a bit of a cost.
+    if (OptLevel > 1)
+      MPM.add(createSROAPass());
     MPM.add(createEarlyCSEPass());             // Catch trivial redundancies
     MPM.add(createCFGSimplificationPass());    // Merge & remove BBs
     MPM.add(createInstructionCombiningPass()); // Combine silly seq's
@@ -320,19 +324,28 @@
     legacy::PassManagerBase &MPM) {
   // Start of function pass.
   // Break up aggregate allocas, using SSAUpdater.
-  MPM.add(createSROAPass());
+  // TODO: Investigate for O1. We'd like mem2reg, but the full SROA is a bit of a cost.
+  if (OptLevel > 1)
+    MPM.add(createSROAPass());
   MPM.add(createEarlyCSEPass(true /* Enable mem-ssa. */)); // Catch trivial redundancies
-  if (EnableGVNHoist)
-    MPM.add(createGVNHoistPass());
-  if (EnableGVNSink) {
-    MPM.add(createGVNSinkPass());
-    MPM.add(createCFGSimplificationPass());
+
+  if (OptLevel > 1) {
+    if (EnableGVNHoist)
+      MPM.add(createGVNHoistPass());
+    if (EnableGVNSink) {
+      MPM.add(createGVNSinkPass());
+      MPM.add(createCFGSimplificationPass());
+    }
   }
 
   // Speculative execution if the target has divergent branches; otherwise nop.
-  MPM.add(createSpeculativeExecutionIfHasBranchDivergencePass());
-  MPM.add(createJumpThreadingPass());         // Thread jumps.
-  MPM.add(createCorrelatedValuePropagationPass()); // Propagate conditionals
+  if (OptLevel > 1)
+    MPM.add(createSpeculativeExecutionIfHasBranchDivergencePass());
+
+  if (OptLevel > 1) {
+    MPM.add(createJumpThreadingPass());         // Thread jumps.
+    MPM.add(createCorrelatedValuePropagationPass()); // Propagate conditionals
+  }
   MPM.add(createCFGSimplificationPass());     // Merge & remove BBs
   // Combine silly seq's
   if (OptLevel > 2)
@@ -346,6 +359,7 @@
   if (SizeLevel == 0)
     MPM.add(createPGOMemOPSizeOptLegacyPass());
 
+  // TODO: Investigate the cost/benefit of tail call elimination on debugging.
   MPM.add(createTailCallEliminationPass()); // Eliminate tail calls
   MPM.add(createCFGSimplificationPass());     // Merge & remove BBs
   MPM.add(createReassociatePass());           // Reassociate expressions
@@ -360,6 +374,7 @@
   }
   // Rotate Loop - disable header duplication at -Oz
   MPM.add(createLoopRotatePass(SizeLevel == 2 ? 0 : -1));
+  // TODO: Investigate promotion cap for O1.
   MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));
   if (EnableSimpleLoopUnswitch)
     MPM.add(createSimpleLoopUnswitchLegacyPass());
@@ -402,16 +417,19 @@
   // opened up by them.
   addInstructionCombiningPass(MPM);
   addExtensionsToPM(EP_Peephole, MPM);
-  MPM.add(createJumpThreadingPass());         // Thread jumps
-  MPM.add(createCorrelatedValuePropagationPass());
-  MPM.add(createDeadStoreEliminationPass());  // Delete dead stores
-  MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));
+  if (OptLevel > 1) {
+    MPM.add(createJumpThreadingPass());         // Thread jumps
+    MPM.add(createCorrelatedValuePropagationPass());
+    MPM.add(createDeadStoreEliminationPass());  // Delete dead stores
+    MPM.add(createLICMPass(LicmMssaOptCap, LicmMssaNoAccForPromotionCap));
+  }
 
   addExtensionsToPM(EP_ScalarOptimizerLate, MPM);
 
   if (RerollLoops)
     MPM.add(createLoopRerollPass());
 
+  // TODO: Investigate if this is too expensive at O1.
   MPM.add(createAggressiveDCEPass());         // Delete dead instructions
   MPM.add(createCFGSimplificationPass()); // Merge & remove BBs
   // Clean up after everything.
Index: llvm/lib/Passes/PassBuilder.cpp
===================================================================
--- llvm/lib/Passes/PassBuilder.cpp
+++ llvm/lib/Passes/PassBuilder.cpp
@@ -387,27 +387,34 @@
 
   // Form SSA out of local memory accesses after breaking apart aggregates into
   // scalars.
-  FPM.addPass(SROA());
+  // TODO: Investigate for O1. We'd like mem2reg, but the full SROA is a bit of a cost.
+  if (Level > O1)
+    FPM.addPass(SROA());
 
   // Catch trivial redundancies
   FPM.addPass(EarlyCSEPass(true /* Enable mem-ssa. */));
 
   // Hoisting of scalars and load expressions.
-  if (EnableGVNHoist)
-    FPM.addPass(GVNHoistPass());
-
-  // Global value numbering based sinking.
-  if (EnableGVNSink) {
-    FPM.addPass(GVNSinkPass());
-    FPM.addPass(SimplifyCFGPass());
+  if (Level > O1) {
+    if (EnableGVNHoist)
+      FPM.addPass(GVNHoistPass());
+
+    // Global value numbering based sinking.
+    if (EnableGVNSink) {
+      FPM.addPass(GVNSinkPass());
+      FPM.addPass(SimplifyCFGPass());
+    }
   }
 
   // Speculative execution if the target has divergent branches; otherwise nop.
-  FPM.addPass(SpeculativeExecutionPass());
+  if (Level > O1)
+    FPM.addPass(SpeculativeExecutionPass());
 
   // Optimize based on known information about branches, and cleanup afterward.
-  FPM.addPass(JumpThreadingPass());
-  FPM.addPass(CorrelatedValuePropagationPass());
+  if (Level > O1) {
+    FPM.addPass(JumpThreadingPass());
+    FPM.addPass(CorrelatedValuePropagationPass());
+  }
   FPM.addPass(SimplifyCFGPass());
   if (Level == O3)
     FPM.addPass(AggressiveInstCombinePass());
@@ -421,9 +428,10 @@
   // For PGO use pipeline, try to optimize memory intrinsics such as memcpy
   // using the size value profile. Don't perform this when optimizing for size.
   if (PGOOpt && PGOOpt->Action == PGOOptions::IRUse &&
-      !isOptimizingForSize(Level))
+      !isOptimizingForSize(Level) && Level > O1)
     FPM.addPass(PGOMemOPSizeOpt());
 
+  // TODO: Investigate the cost/benefit of tail call elimination on debugging.
   FPM.addPass(TailCallElimPass());
   FPM.addPass(SimplifyCFGPass());
 
@@ -451,6 +459,7 @@
 
   // Rotate Loop - disable header duplication at -Oz
   LPM1.addPass(LoopRotatePass(Level != Oz));
+  // TODO: Investigate promotion cap for O1.
   LPM1.addPass(LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap));
   LPM1.addPass(SimpleLoopUnswitchPass());
   LPM2.addPass(IndVarSimplifyPass());
@@ -510,18 +519,21 @@
 
   // Re-consider control flow based optimizations after redundancy elimination,
   // redo DCE, etc.
-  FPM.addPass(JumpThreadingPass());
-  FPM.addPass(CorrelatedValuePropagationPass());
-  FPM.addPass(DSEPass());
-  FPM.addPass(createFunctionToLoopPassAdaptor(
-      LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap),
-      DebugLogging));
+  if (Level > O1) {
+    FPM.addPass(JumpThreadingPass());
+    FPM.addPass(CorrelatedValuePropagationPass());
+    FPM.addPass(DSEPass());
+    FPM.addPass(createFunctionToLoopPassAdaptor(
+        LICMPass(PTO.LicmMssaOptCap, PTO.LicmMssaNoAccForPromotionCap),
+        DebugLogging));
+  }
 
   for (auto &C : ScalarOptimizerLateEPCallbacks)
     C(FPM, Level);
 
   // Finally, do an expensive DCE pass to catch all the dead code exposed by
   // the simplifications and basic cleanup after all the simplifications.
+  // TODO: Investigate if this is too expensive.
   FPM.addPass(ADCEPass());
   FPM.addPass(SimplifyCFGPass());
   FPM.addPass(InstCombinePass());
Index: llvm/include/llvm/Passes/PassBuilder.h
===================================================================
--- llvm/include/llvm/Passes/PassBuilder.h
+++ llvm/include/llvm/Passes/PassBuilder.h
@@ -151,10 +151,6 @@
 
     /// Optimize quickly without destroying debuggability.
     ///
-    /// FIXME: The current and historical behavior of this level does *not*
-    /// agree with this goal, but we would like to move toward this goal in the
-    /// future.
-    ///
     /// This level is tuned to produce a result from the optimizer as quickly
     /// as possible and to avoid destroying debuggability. This tends to result
     /// in a very good development mode where the compiled code will be
@@ -164,9 +160,9 @@
     /// debugging of the resulting binary.
     ///
     /// As an example, complex loop transformations such as versioning,
-    /// vectorization, or fusion might not make sense here due to the degree to
-    /// which the executed code would differ from the source code, and the
-    /// potential compile time cost.
+    /// vectorization, or fusion don't make sense here due to the degree to
+    /// which the executed code differs from the source code, and the compile time
+    /// cost.
     O1,
 
     /// Optimize for fast execution as much as possible without triggering
Index: clang/test/CodeGenCXX/stack-reuse.cpp
===================================================================
--- clang/test/CodeGenCXX/stack-reuse.cpp
+++ clang/test/CodeGenCXX/stack-reuse.cpp
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple armv7-unknown-linux-gnueabihf %s -o - -emit-llvm -O1 | FileCheck %s
+// RUN: %clang_cc1 -triple armv7-unknown-linux-gnueabihf %s -o - -emit-llvm -O2 | FileCheck %s
 
 // Stack should be reused when possible, no need to allocate two separate slots
 // if they have disjoint lifetime.
Index: clang/test/CodeGenCXX/auto-var-init.cpp
===================================================================
--- clang/test/CodeGenCXX/auto-var-init.cpp
+++ clang/test/CodeGenCXX/auto-var-init.cpp
@@ -645,7 +645,7 @@
 // ZERO-LABEL: @test_smallpartinit_uninit()
 // ZERO-O0: call void @llvm.memset{{.*}}, i8 0,
 // ZERO-O1-LEGACY: store i16 0, i16* %uninit, align 2
-// ZERO-O1-NEWPM: store i16 42, i16* %uninit, align 2
+// ZERO-O1-NEWPM: store i16 0, i16* %uninit, align 2
 
 TEST_BRACES(smallpartinit, smallpartinit);
 // CHECK-LABEL: @test_smallpartinit_braces()
@@ -718,7 +718,7 @@
 // PATTERN-LABEL: @test_paddednullinit_uninit()
 // PATTERN-O0: call void @llvm.memcpy{{.*}} @__const.test_paddednullinit_uninit.uninit
 // PATTERN-O1-LEGACY: store i64 [[I64]], i64* %uninit, align 8
-// PATTERN-O1-NEWPM: store i64 2863311360, i64* %uninit, align 8
+// PATTERN-O1-NEWPM: store i64 [[I64]], i64* %uninit, align 8
 // ZERO-LABEL: @test_paddednullinit_uninit()
 // ZERO-O0: call void @llvm.memset{{.*}}, i8 0,
 // ZERO-O1: store i64 0, i64* %uninit, align 8
@@ -1345,10 +1345,7 @@
 // ZERO-LABEL: @test_virtualderived_uninit()
 // ZERO-O0: call void @llvm.memset{{.*}}, i8 0,
 // ZERO-O1-LEGACY: call void @llvm.memset{{.*}}, i8 0,
-// ZERO-O1-NEWPM: [[FIELD1:%.*]] = getelementptr inbounds %struct.virtualderived, %struct.virtualderived* %uninit, i64 0, i32 1, i32 0, i32 0
-// ZERO-O1-NEWPM: [[FIELD0:%.*]] = getelementptr inbounds %struct.virtualderived, %struct.virtualderived* %uninit, i64 0, i32 0, i32 0
-// ZERO-O1-NEWPM: store i32 (...)** bitcast (i8** getelementptr inbounds ({ [7 x i8*], [5 x i8*] }, { [7 x i8*], [5 x i8*] }* @_ZTV14virtualderived, i64 0, inrange i32 0, i64 5) to i32 (...)**), i32 (...)*** [[FIELD0]], align 8
-// ZERO-O1-NEWPM: store i32 (...)** bitcast (i8** getelementptr inbounds ({ [7 x i8*], [5 x i8*] }, { [7 x i8*], [5 x i8*] }* @_ZTV14virtualderived, i64 0, inrange i32 1, i64 3) to i32 (...)**), i32 (...)*** [[FIELD1]], align 8
+// ZERO-O1-NEWPM: call void @llvm.memset{{.*}}, i8 0,
 
 TEST_BRACES(virtualderived, virtualderived);
 // CHECK-LABEL: @test_virtualderived_braces()
Index: clang/test/CodeGen/arm-vfp16-arguments2.cpp
===================================================================
--- clang/test/CodeGen/arm-vfp16-arguments2.cpp
+++ clang/test/CodeGen/arm-vfp16-arguments2.cpp
@@ -1,12 +1,12 @@
 // RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \
-// RUN:   -mfloat-abi soft -target-feature +neon -emit-llvm -o - -O1 %s \
+// RUN:   -mfloat-abi soft -target-feature +neon -emit-llvm -o - -O2 %s \
 // RUN:   | FileCheck %s --check-prefix=CHECK-SOFT
 // RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \
-// RUN:   -mfloat-abi hard -target-feature +neon -emit-llvm -o - -O1 %s \
+// RUN:   -mfloat-abi hard -target-feature +neon -emit-llvm -o - -O2 %s \
 // RUN:   | FileCheck %s --check-prefix=CHECK-HARD
 // RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs \
 // RUN:   -mfloat-abi hard -target-feature +neon -target-feature +fullfp16 \
-// RUN:   -emit-llvm -o - -O1 %s \
+// RUN:   -emit-llvm -o - -O2 %s \
 // RUN:   | FileCheck %s --check-prefix=CHECK-FULL
 
 typedef float float32_t;
Index: clang/test/CodeGen/arm-fp16-arguments.c
===================================================================
--- clang/test/CodeGen/arm-fp16-arguments.c
+++ clang/test/CodeGen/arm-fp16-arguments.c
@@ -1,6 +1,6 @@
-// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi soft -fallow-half-arguments-and-returns -emit-llvm -o - -O1 %s | FileCheck %s --check-prefix=CHECK --check-prefix=SOFT
-// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi hard -fallow-half-arguments-and-returns -emit-llvm -o - -O1 %s | FileCheck %s --check-prefix=CHECK --check-prefix=HARD
-// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi soft -fnative-half-arguments-and-returns -emit-llvm -o - -O1 %s | FileCheck %s --check-prefix=NATIVE
+// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi soft -fallow-half-arguments-and-returns -emit-llvm -o - -O2 %s | FileCheck %s --check-prefix=CHECK --check-prefix=SOFT
+// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi hard -fallow-half-arguments-and-returns -emit-llvm -o - -O2 %s | FileCheck %s --check-prefix=CHECK --check-prefix=HARD
+// RUN: %clang_cc1 -triple armv7a--none-eabi -target-abi aapcs -mfloat-abi soft -fnative-half-arguments-and-returns -emit-llvm -o - -O2 %s | FileCheck %s --check-prefix=NATIVE
 
 __fp16 g;
 
Index: clang/test/CodeGen/2008-07-30-implicit-initialization.c
===================================================================
--- clang/test/CodeGen/2008-07-30-implicit-initialization.c
+++ clang/test/CodeGen/2008-07-30-implicit-initialization.c
@@ -1,4 +1,4 @@
-// RUN: %clang_cc1 -triple i386-unknown-unknown -O1 -emit-llvm -o - %s | FileCheck %s
+// RUN: %clang_cc1 -triple i386-unknown-unknown -O2 -emit-llvm -o - %s | FileCheck %s
 // CHECK-LABEL: define i32 @f0()
 // CHECK:   ret i32 0
 // CHECK-LABEL: define i32 @f1()
_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits