chandlerc added a comment.
In D63156#1543937 <https://reviews.llvm.org/D63156#1543937>, @leonardchan wrote:
> I did some more digging and found the following differences between PMs for
> each test, and they seem to all differ and can be fixed for different reasons.
Awesome, and sorry this is such a big effort, but I think these are all going
to end up being pretty interesting root causes. I guess yay, Clang's testing is
doing its job?
> **CodeGen/aggregate-assign-call.c**: The new PM on -O1 codegen produces the
> do/while loop differently but still emits the lifetime start/end intrinsics
> around temporary variables, which is what the test checks for. Here we can
> just check for the instructions while ignoring the branches generated.
Weird, but makes sense.
> **CodeGen/arm_acle.c**: A bunch of instructions seem to have been combined
> into a call to `llvm.fshl.i32`, so we just check for those under the new PM.
Same.
> **CodeGen/available-externally-suppress.c**: `available_externally` was not
> emitted during `-O2 -flto` runs when it should still be retained for link
> time inlining purposes. This can be fixed by checking that we aren't
> LTOPrelinking when adding the `EliminateAvailableExternallyPass`.
Yikes, this is a good bug to fix!
> **CodeGen/pgo-sample.c [No new fix]**: The test checks that `PruneEH` runs,
> but there doesn’t seem to be a corresponding new PM pass for it. Should there
> be? If there is one then we can just check for that in the debug PM dump.
The analog would be `function-attrs` and `function(simplify-cfg)` in some
combination. Maybe this should just check that some specific thing is optimized
away at `-O2` rather than checking the specific details of pass pipeline
construction?
> **CodeGen/x86_64-instrument-functions.c [No new fix]**: This one's a bit
> complicated. The initial problem seems to be that `EntryExitInstrumenterPass`
> was not added into the pipeline to emit `__cyg_profile_func_enter/exit()`
> calls. Adding that to the pipeline seems to instrument correctly now and add
> these calls, but we run into a problem with inlining. `root()` expects 2
> calls to `__cyg_profile_func_enter` and 2 calls to `__cyg_profile_func_exit`
> in its body,
>
> ; Function Attrs: nounwind
> define i32 @root(i32 returned %x) #0 {
> entry:
> %0 = tail call i8* @llvm.returnaddress(i32 0)
> tail call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @root to
> i8*), i8* %0) #2
> tail call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @leaf to
> i8*), i8* %0) #2
> tail call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @leaf to
> i8*), i8* %0) #2
> tail call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @root to
> i8*), i8* %0) #2
> ret i32 %x
> }
>
>
> but we get only 1 call
>
> ; Function Attrs: norecurse nounwind readnone
> define i32 @root(i32 returned %x) local_unnamed_addr #0 {
> entry:
> %0 = call i8* @llvm.returnaddress(i32 0)
> call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @root to
> i8*), i8* %0)
> %1 = call i8* @llvm.returnaddress(i32 0)
> call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @root to i8*),
> i8* %1)
> ret i32 %x
> }
>
>
> I suspect this is due to the `leaf()` being inlined into `root()` even though
> inlining should happen after it. The legacy `EntryExitInstrumenter` pass
> seems to be added to the legacy FunctionPassManager first before all others
> passes. Under the legacy PM, when this pass runs, `root()` should look like:
>
> ; Function Attrs: nounwind
> define i32 @root(i32 %x) #1 {
> entry:
> %x.addr = alloca i32, align 4
> store i32 %x, i32* %x.addr, align 4, !tbaa !2
> %0 = load i32, i32* %x.addr, align 4, !tbaa !2
> %call = call i32 @leaf(i32 %0) ; this call is not yet inlined
> ret i32 %call
> }
>
>
> but `root()` looks like this in the new PM pass:
>
> ; Function Attrs: norecurse nounwind readnone
> define i32 @root(i32 returned %x) local_unnamed_addr #1 {
> entry:
> ret i32 %x
> }
Yeah, I think the entry/exit pass really wants to come *before* any other pass.
That should be possible even in the new PM by using an extension point?
> **CodeGenCXX/auto-var-init.cpp**: Auto var initialization is performed
> differently under new PM than legacy. With initialization for structs with
> default values, the legacy PM will store the stream of 0xAAs, then initialize
> the members with default values in a constructor, whereas the new PM will
> just store the correct value for the whole struct right away without
> initialization. In one case, it seems that ctor function was also inlined.
> Fixed by check accounting for these checks separately.
Cool, makes sense.
> **CodeGenCXX/conditional-temporaries.cpp**: The new pm seems to be unable to
> statically determine the return value of some functions. For now I just add
> separate checks for the new PM IR.
Interesting. Definitely the right thing for this patch. Maybe file a PR to
track understanding why the new PM struggles here.
> **CodeGenCXX/member-function-pointer-calls.cpp**: Same as
> `CodeGenCXX/conditional-temporaries.cpp`. In this case, the new PM codegen
> also contains calls to lifetime start/end intrinsics. Perhaps this time it
> could be due to the intrinsics not being optimized out?
Weird, but that's also my best guess. Again, maybe we need a bug to track why
this is different.
> **CodeGenObjC/os_log.m**: Legacy test expects argument of `ptrtoint` and some
> functions to be a specific type and argument, though the new PM codegen uses
> a different type and argument which are also valid since
> `@llvm.objc.retainAutoreleasedReturnValue(val)` always returns `val`.
>
> **CodeGenObjCXX/os_log.mm**: A function seems to have been inlined under the
> new PM. Here we just prevent inlining for the new PM run since the test
> expects this specific function call.
You could also use a `noinline` attribute in the code to express the *specific*
call that needs to be retained?
> **Misc/pr32207.c [No new fix]**: The banner is something that appears only in
> the legacy PM. I think the test only checks that the banner is printed and
> not so much that the particular pass is run.
Yeah, this seems far enough down in the weeds that just making it PM-specific
is about the only option.
================
Comment at: clang/lib/CodeGen/BackendUtil.cpp:1135-1139
+ PB.registerOptimizerLastEPCallback(
+ [](FunctionPassManager &FPM, PassBuilder::OptimizationLevel Level) {
+ FPM.addPass(EntryExitInstrumenterPass(/*PostInlining=*/false));
+ });
+
----------------
FYI, feel free to split out the entry/exit change (and its test) to a separate
patch if you want. Either way really.
================
Comment at: llvm/lib/Passes/PassBuilder.cpp:815-818
+ // running remaining passes on the eliminated functions. These should be
+ // preserved during prelinking for link-time inlining decisions.
+ if (!LTOPreLink)
+ MPM.addPass(EliminateAvailableExternallyPass());
----------------
I'd suggest splitting this into a separate patch to LLVM and adding an
LLVM-side test to cover it as well.
Repository:
rG LLVM Github Monorepo
CHANGES SINCE LAST ACTION
https://reviews.llvm.org/D63156/new/
https://reviews.llvm.org/D63156
_______________________________________________
cfe-commits mailing list
[email protected]
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits