chandlerc added a comment.

In D63156#1543937 <https://reviews.llvm.org/D63156#1543937>, @leonardchan wrote:

> I did some more digging and found the following differences between PMs for 
> each test, and they seem to all differ and can be fixed for different reasons.


Awesome, and sorry this is such a big effort, but I think these are all going 
to end up being pretty interesting root causes. I guess yay, Clang's testing is 
doing its job?

> **CodeGen/aggregate-assign-call.c**: The new PM on -O1 codegen produces the 
> do/while loop differently but still emits the lifetime start/end intrinsics 
> around temporary variables, which is what the test checks for. Here we can 
> just check for the instructions while ignoring the branches generated.

Weird, but makes sense.

> **CodeGen/arm_acle.c**: A bunch of instructions seem to have been combined 
> into a call to `llvm.fshl.i32`, so we just check for those under the new PM.

Same.

> **CodeGen/available-externally-suppress.c**: `available_externally` was not 
> emitted during `-O2 -flto` runs when it should still be retained for link 
> time inlining purposes. This can be fixed by checking that we aren't 
> LTOPrelinking when adding the `EliminateAvailableExternallyPass`.

Yikes, this is a good bug to fix!

> **CodeGen/pgo-sample.c [No new fix]**: The test checks that `PruneEH` runs, 
> but there doesn’t seem to be a corresponding new PM pass for it. Should there 
> be? If there is one then we can just check for that in the debug PM dump.

The analog would be `function-attrs` and `function(simplify-cfg)` in some 
combination. Maybe this should just check that some specific thing is optimized 
away at `-O2` rather than checking the specific details of pass pipeline 
construction?

> **CodeGen/x86_64-instrument-functions.c [No new fix]**: This one's a bit 
> complicated. The initial problem seems to be that `EntryExitInstrumenterPass` 
> was not added into the pipeline to emit `__cyg_profile_func_enter/exit()` 
> calls. Adding that to the pipeline seems to instrument correctly now and add 
> these calls, but we run into a problem with inlining. `root()` expects 2 
> calls to `__cyg_profile_func_enter` and 2 calls to `__cyg_profile_func_exit` 
> in its body,
> 
>   ; Function Attrs: nounwind
>   define i32 @root(i32 returned %x) #0 {
>   entry:
>     %0 = tail call i8* @llvm.returnaddress(i32 0)
>     tail call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @root to 
> i8*), i8* %0) #2
>     tail call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @leaf to 
> i8*), i8* %0) #2
>     tail call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @leaf to 
> i8*), i8* %0) #2
>     tail call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @root to 
> i8*), i8* %0) #2
>     ret i32 %x
>   }
> 
> 
> but we get only 1 call
> 
>   ; Function Attrs: norecurse nounwind readnone
>   define i32 @root(i32 returned %x) local_unnamed_addr #0 {
>   entry:
>     %0 = call i8* @llvm.returnaddress(i32 0)
>     call void @__cyg_profile_func_enter(i8* bitcast (i32 (i32)* @root to 
> i8*), i8* %0)
>     %1 = call i8* @llvm.returnaddress(i32 0)
>     call void @__cyg_profile_func_exit(i8* bitcast (i32 (i32)* @root to i8*), 
> i8* %1)
>     ret i32 %x
>   }
> 
> 
> I suspect this is due to the `leaf()` being inlined into `root()` even though 
> inlining should happen after it. The legacy `EntryExitInstrumenter` pass 
> seems to be added to the legacy FunctionPassManager first before all others 
> passes. Under the legacy PM, when this pass runs, `root()` should look like:
> 
>   ; Function Attrs: nounwind
>   define i32 @root(i32 %x) #1 {
>   entry:
>     %x.addr = alloca i32, align 4
>     store i32 %x, i32* %x.addr, align 4, !tbaa !2
>     %0 = load i32, i32* %x.addr, align 4, !tbaa !2
>     %call = call i32 @leaf(i32 %0)  ; this call is not yet inlined
>     ret i32 %call
>   }
> 
> 
> but `root()` looks like this in the new PM pass:
> 
>   ; Function Attrs: norecurse nounwind readnone
>   define i32 @root(i32 returned %x) local_unnamed_addr #1 {
>   entry:
>     ret i32 %x
>   }

Yeah, I think the entry/exit pass really wants to come *before* any other pass. 
That should be possible even in the new PM by using an extension point?

> **CodeGenCXX/auto-var-init.cpp**: Auto var initialization is performed 
> differently under new PM than legacy. With initialization for structs with 
> default values, the legacy PM will store the stream of 0xAAs, then initialize 
> the members with default values in a constructor, whereas the new PM will 
> just store the correct value for the whole struct right away without 
> initialization. In one case, it seems that ctor function was also inlined. 
> Fixed by check accounting for these checks separately.

Cool, makes sense.

> **CodeGenCXX/conditional-temporaries.cpp**: The new pm seems to be unable to 
> statically determine the return value of some functions. For now I just add 
> separate checks for the new PM IR.

Interesting. Definitely the right thing for this patch. Maybe file a PR to 
track understanding why the new PM struggles here.

> **CodeGenCXX/member-function-pointer-calls.cpp**: Same as 
> `CodeGenCXX/conditional-temporaries.cpp`. In this case, the new PM codegen 
> also contains calls to lifetime start/end intrinsics. Perhaps this time it 
> could be due to the intrinsics not being optimized out?

Weird, but that's also my best guess. Again, maybe we need a bug to track why 
this is different.

> **CodeGenObjC/os_log.m**: Legacy test expects argument of `ptrtoint` and some 
> functions to be a specific type and argument, though the new PM codegen uses 
> a different type and argument which are also valid since 
> `@llvm.objc.retainAutoreleasedReturnValue(val)` always returns `val`.
> 
> **CodeGenObjCXX/os_log.mm**: A function seems to have been inlined under the 
> new PM. Here we just prevent inlining for the new PM run since the test 
> expects this specific function call.

You could also use a `noinline` attribute in the code to express the *specific* 
call that needs to be retained?

> **Misc/pr32207.c [No new fix]**: The banner is something that appears only in 
> the legacy PM. I think the test only checks that the banner is printed and 
> not so much that the particular pass is run.

Yeah, this seems far enough down in the weeds that just making it PM-specific 
is about the only option.



================
Comment at: clang/lib/CodeGen/BackendUtil.cpp:1135-1139
+      PB.registerOptimizerLastEPCallback(
+          [](FunctionPassManager &FPM, PassBuilder::OptimizationLevel Level) {
+            FPM.addPass(EntryExitInstrumenterPass(/*PostInlining=*/false));
+          });
+
----------------
FYI, feel free to split out the entry/exit change (and its test) to a separate 
patch if you want. Either way really.


================
Comment at: llvm/lib/Passes/PassBuilder.cpp:815-818
+  // running remaining passes on the eliminated functions. These should be
+  // preserved during prelinking for link-time inlining decisions.
+  if (!LTOPreLink)
+    MPM.addPass(EliminateAvailableExternallyPass());
----------------
I'd suggest splitting this into a separate patch to LLVM and adding an 
LLVM-side test to cover it as well.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D63156/new/

https://reviews.llvm.org/D63156



_______________________________________________
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits

Reply via email to