chandlerc added a comment. In D63156#1543937 <https://reviews.llvm.org/D63156#1543937>, @leonardchan wrote:
> I did some more digging and found the following differences between PMs for > each test, and they seem to all differ and can be fixed for different reasons. Awesome, and sorry this is such a big effort, but I think these are all going to end up being pretty interesting root causes. I guess yay, Clang's testing is doing its job? > **CodeGen/aggregate-assign-call.c**: The new PM on -O1 codegen produces the > do/while loop differently but still emits the lifetime start/end intrinsics > around temporary variables, which is what the test checks for. Here we can > just check for the instructions while ignoring the branches generated. Weird, but makes sense. > **CodeGen/arm_acle.c**: A bunch of instructions seem to have been combined > into a call to `llvm.fshl.i32`, so we just check for those under the new PM. Same. > **CodeGen/available-externally-suppress.c**: `available_externally` was not > emitted during `-O2 -flto` runs when it should still be retained for link > time inlining purposes. This can be fixed by checking that we aren't > LTOPrelinking when adding the `EliminateAvailableExternallyPass`. Yikes, this is a good bug to fix! > **CodeGen/pgo-sample.c [No new fix]**: The test checks that `PruneEH` runs, > but there doesn’t seem to be a corresponding new PM pass for it. Should there > be? If there is one then we can just check for that in the debug PM dump. The analog would be `function-attrs` and `function(simplify-cfg)` in some combination. Maybe this should just check that some specific thing is optimized away at `-O2` rather than checking the specific details of pass pipeline construction? > **CodeGen/x86_64-instrument-functions.c [No new fix]**: This one's a bit > complicated. The initial problem seems to be that `EntryExitInstrumenterPass` > was not added into the pipeline to emit `__cyg_profile_func_enter/exit()` > calls. Adding that to the pipeline seems to instrument correctly now and add > these calls, but we run into a problem with inlining. `root()` expects 2 > calls to `__cyg_profile_func_enter` and 2 calls to `__cyg_profile_func_exit` > in its body, > > ; Function Attrs: nounwind > define i32 @root(i32 returned %x) #0 { > entry: > %0 = tail call i8* @llvm.returnaddress(i32 0) > tail call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @root to > i8*), i8* %0) #2 > tail call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @leaf to > i8*), i8* %0) #2 > tail call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @leaf to > i8*), i8* %0) #2 > tail call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @root to > i8*), i8* %0) #2 > ret i32 %x > } > > > but we get only 1 call > > ; Function Attrs: norecurse nounwind readnone > define i32 @root(i32 returned %x) local_unnamed_addr #0 { > entry: > %0 = call i8* @llvm.returnaddress(i32 0) > call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @root to > i8*), i8* %0) > %1 = call i8* @llvm.returnaddress(i32 0) > call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @root to i8*), > i8* %1) > ret i32 %x > } > > > I suspect this is due to the `leaf()` being inlined into `root()` even though > inlining should happen after it. The legacy `EntryExitInstrumenter` pass > seems to be added to the legacy FunctionPassManager first before all others > passes. Under the legacy PM, when this pass runs, `root()` should look like: > > ; Function Attrs: nounwind > define i32 @root(i32 %x) #1 { > entry: > %x.addr = alloca i32, align 4 > store i32 %x, i32* %x.addr, align 4, !tbaa !2 > %0 = load i32, i32* %x.addr, align 4, !tbaa !2 > %call = call i32 @leaf(i32 %0) ; this call is not yet inlined > ret i32 %call > } > > > but `root()` looks like this in the new PM pass: > > ; Function Attrs: norecurse nounwind readnone > define i32 @root(i32 returned %x) local_unnamed_addr #1 { > entry: > ret i32 %x > } Yeah, I think the entry/exit pass really wants to come *before* any other pass. That should be possible even in the new PM by using an extension point? > **CodeGenCXX/auto-var-init.cpp**: Auto var initialization is performed > differently under new PM than legacy. With initialization for structs with > default values, the legacy PM will store the stream of 0xAAs, then initialize > the members with default values in a constructor, whereas the new PM will > just store the correct value for the whole struct right away without > initialization. In one case, it seems that ctor function was also inlined. > Fixed by check accounting for these checks separately. Cool, makes sense. > **CodeGenCXX/conditional-temporaries.cpp**: The new pm seems to be unable to > statically determine the return value of some functions. For now I just add > separate checks for the new PM IR. Interesting. Definitely the right thing for this patch. Maybe file a PR to track understanding why the new PM struggles here. > **CodeGenCXX/member-function-pointer-calls.cpp**: Same as > `CodeGenCXX/conditional-temporaries.cpp`. In this case, the new PM codegen > also contains calls to lifetime start/end intrinsics. Perhaps this time it > could be due to the intrinsics not being optimized out? Weird, but that's also my best guess. Again, maybe we need a bug to track why this is different. > **CodeGenObjC/os_log.m**: Legacy test expects argument of `ptrtoint` and some > functions to be a specific type and argument, though the new PM codegen uses > a different type and argument which are also valid since > `@llvm.objc.retainAutoreleasedReturnValue(val)` always returns `val`. > > **CodeGenObjCXX/os_log.mm**: A function seems to have been inlined under the > new PM. Here we just prevent inlining for the new PM run since the test > expects this specific function call. You could also use a `noinline` attribute in the code to express the *specific* call that needs to be retained? > **Misc/pr32207.c [No new fix]**: The banner is something that appears only in > the legacy PM. I think the test only checks that the banner is printed and > not so much that the particular pass is run. Yeah, this seems far enough down in the weeds that just making it PM-specific is about the only option. ================ Comment at: clang/lib/CodeGen/BackendUtil.cpp:1135-1139 + PB.registerOptimizerLastEPCallback( + [](FunctionPassManager &FPM, PassBuilder::OptimizationLevel Level) { + FPM.addPass(EntryExitInstrumenterPass(/*PostInlining=*/false)); + }); + ---------------- FYI, feel free to split out the entry/exit change (and its test) to a separate patch if you want. Either way really. ================ Comment at: llvm/lib/Passes/PassBuilder.cpp:815-818 + // running remaining passes on the eliminated functions. These should be + // preserved during prelinking for link-time inlining decisions. + if (!LTOPreLink) + MPM.addPass(EliminateAvailableExternallyPass()); ---------------- I'd suggest splitting this into a separate patch to LLVM and adding an LLVM-side test to cover it as well. Repository: rG LLVM Github Monorepo CHANGES SINCE LAST ACTION https://reviews.llvm.org/D63156/new/ https://reviews.llvm.org/D63156 _______________________________________________ cfe-commits mailing list cfe-commits@lists.llvm.org https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits