[PATCH] D152741: [WPD] implement -funknown-vtable-visibility-filepaths

2023-06-26 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D152741#4445807, @wenlei wrote:

>> For concrete data, do you mean a comparison between the different solutions, 
>> e.g. --lto-whole-program-visibility, -funknown-vtable-visibility-filepaths, 
>> RTTI-based, FatLTO-based, etc., or something else?
>
> Right, between the different solutions. RTTI based solution doesn't exist 
> yet, so maybe just compare using `-fwhole-program-vtables` on a known safe 
> set of files vs using `-funknown-vtable-visibility-filepaths` on a known 
> unsafe set of files first.

On a large Meta service, manually opting in an internal source folder with 
`-fwhole-program-vtables` devirtualizes 2,933 single-implementation methods. 
Using `-funknown-vtable-visibility-filepaths` on the same service for the 
`third-party` directory yields 32,800 single-implementation method 
devirtualizations.

>> The ordering for conflicts is embedded in the logic for 
>> CodeGenModule::GetVCallVisibilityLevel which has priority order of
>
> I was thinking about different source of visibility instead of absolute order 
> of visibility itself - i.e. what is the rule if 
> `__attribute__((visibility("...")))` conflicts with 
> `-funknown-vtable-visibility-filepaths` setting for a specific type? This may 
> not be an immediately important question, but just as example of the knock on 
> effect of added complexity, which may or may not be justified depending on 
> the benefit, which goes back to data from experiments.

That complexity already exists with `-fvisibility=hidden` interacting with 
`__attribute__((visibility("...")))` where the most conservative annotation 
wins out. Having a type annotated with `unknown` visibility is just adding a 
more conservative option than `public`.

> We have `-wholeprogramdevirt-skip`; with this patch, we will have 
> `-funknown-vtable-visibility-filepaths`; later on, we will have another RTTI 
> based solution, then there's FatObj solution. It feels like a lot of stuff 
> trying to solve one problem, so wondering if this addition here is going to 
> provide enough value in the end state.

My current prototype RTTI implementation doesn't really need an `unknown` 
visibility because it's generating and passing a blocklist at symbol resolution 
time. For FatObj, the input into WPD is identical to when everything is built 
with ThinLTO, so `unknown` isn't that valuable either. The original intent was 
to use this to roll out WPD to select services, but performance-wise, opting in 
folders with `-fwhole-program-vtables` proves just as effective without having 
to modify LLVM. With that use case gone, there's no longer a need on my side 
for this change. Others may find value in this in the interim to 
onboard/evaluate WPD, but that's not very concrete value.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152741: [WPD] implement -funknown-vtable-visibility-filepaths

2023-06-23 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 534134.
modimo added a comment.

Address feedback; add documentation for the flag and for `unknown` visibility.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

Files:
  clang/docs/LTOVisibility.rst
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/Basic/DiagnosticDriverKinds.td
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGVTables.cpp
  clang/lib/CodeGen/CodeGenAction.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGenCXX/type-metadata-unknown-vtable-visibility-filepath.cpp
  clang/test/Driver/funknown-vtable-visibility-filepath.c
  llvm/include/llvm/IR/GlobalObject.h
  llvm/lib/IR/Metadata.cpp
  llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
  llvm/test/ThinLTO/X86/devirt.ll

Index: llvm/test/ThinLTO/X86/devirt.ll
===
--- llvm/test/ThinLTO/X86/devirt.ll
+++ llvm/test/ThinLTO/X86/devirt.ll
@@ -42,9 +42,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; Check that we're able to prevent specific function from being
@@ -58,9 +60,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=SKIP
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=SKIP
 
 ; RUN: llvm-lto2 run %t.o -save-temps -pass-remarks=. \
 ; RUN:   -whole-program-visibility \
@@ -70,16 +74,20 @@
 ; RUN:   -r=%t.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t.o,_ZN1D1mEi,p \
+; RUN:   -r=%t.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t.o,_ZTV1B, \
 ; RUN:   -r=%t.o,_ZTV1C, \
 ; RUN:   -r=%t.o,_ZTV1D, \
+; RUN:   -r=%t.o,_ZTV1F, \
 ; RUN:   -r=%t.o,_ZN1A1nEi, \
 ; RUN:   -r=%t.o,_ZN1B1fEi, \
 ; RUN:   -r=%t.o,_ZN1C1fEi, \
 ; RUN:   -r=%t.o,_ZN1D1mEi, \
+; RUN:   -r=%t.o,_ZN1E1nEi, \
 ; RUN:   -r=%t.o,_ZTV1B,px \
 ; RUN:   -r=%t.o,_ZTV1C,px \
-; RUN:   -r=%t.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
+; RUN:   -r=%t.o,_ZTV1D,px \
+; RUN:   -r=%t.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; REMARK-DAG: single-impl: devirtualized a call to _ZN1A1nEi
@@ -95,13 +103,18 @@
 %struct.C = type { %struct.A }
 %struct.D = type { ptr }
 
+%struct.E = type { ptr }
+%struct.F = type { %struct.E }
+
 @_ZTV1B = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1B1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !1
 @_ZTV1C = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1C1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !2
 @_ZTV1D = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1D1mEi] }, !type !3
 
+@_ZTV1F = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1E1nEi] }, !type !5, !vcall_visibility !6
+
 
 ; CHECK-IR-LABEL: define i32 @test
-define i32 @test(ptr %obj, ptr %obj2, i32 %a) {
+define i32 @test(ptr %obj, ptr %obj2, ptr %obj3, i32 %a) {
 entry:
   %vtable = load ptr, ptr %obj
   %p = call i1 @llvm.type.test(ptr %vtable, metadata !"_ZTS1A")
@@ -114,7 +127,7 @@
   ; Ensure !prof and !callees metadata for indirect call promotion removed.
   ; CHECK-IR-NOT: prof
   ; CHECK-IR-NOT: callees
-  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !5, !callees !6
+  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !7, !callees !8
 
   %fptr22 = load ptr, ptr %vtable, align 8
 
@@ -131,7 +144,18 @@
   ; Check that the call was devirtualized.
   ; CHECK-IR: %call4 = tail call i32 @_ZN1D1mEi
   %call4 = tail call i32 %fptr33(ptr nonnull %obj2, i32 %call3)
-  ret i32 %call4
+
+  %vtable1 = load ptr, ptr %obj3
+  %p3 = call i1 @llvm.type.test(ptr %vtable1, metadata !"_ZTS1E")
+  call void @llvm.assume(i1 %p3)
+  %fptrptr1 = getelementptr ptr, ptr %vtable1, i32 0
+  %fptr44 = load ptr, ptr %fptrptr1, align 8
+
+  ; Check that the call was not devirtualized because of "unknown"
+  ; vcall_visibility.
+  ; CHECK-IR: %call5 = tail call i32 %fptr44
+  %call5 = tail call i32 %fptr44(ptr nonnull %obj, i32 %call4)
+  ret i32 %call5
 }
 ; CHECK-IR-LABEL: ret i32
 ; CHECK-IR-LABEL: }
@@ -155,6 +179,10 @@
ret i32 0;
 }
 
+define i32 @_ZN1E1nEi(ptr %this, i32 %a) #0 {
+   ret i32 0;
+}
+
 ; Make sure we don't inline or 

[PATCH] D152741: [WPD] implement -funknown-vtable-visibility-filepaths

2023-06-23 Thread Di Mo via Phabricator via cfe-commits
modimo marked 4 inline comments as done.
modimo added a comment.

In D152741#4445112, @wenlei wrote:

>> The big advantage of doing this in the FE is that we know which types are 
>> actually coming from the native headers. Blocking all types in the TU is 
>> overly conservative and also less stable as header changes can effectively 
>> turn on/off unrelated large chunks of WPD.
>
> This is clearly going to be selective in punting unsafe devirt, however do 
> you have data comparing the effectiveness of the two (module granularity vs 
> header granularity)?

Some data would help quantify the difference. I'll hack up a module-granularity 
implementation and compare it to the current one.

> I also think introducing unknown visibility is a good idea for this to work, 
> but this is going to be exposed to users (not hidden implementing only), so 
> we would probably need to have spec/rule to handle conflicting visibility 
> from different source and make those explicit here: 
> https://clang.llvm.org/docs/LTOVisibility.html.

The ordering for conflicts is embedded in the logic for 
`CodeGenModule::GetVCallVisibilityLevel` which has priority order of:

1. Unknown
2. Public
3. LinkageUnit
4. TranslationUnit

I'll update the documentation to reflect this.
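
As a rough illustration of the "most conservative annotation wins" rule described above, here is a minimal C++ sketch. The enum, its numeric values, and the `mostConservative` helper are hypothetical and only mirror the ordering in the list; they are not the actual definitions in `CodeGenModule` or LLVM, whose numbering may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <initializer_list>

// Hypothetical mirror of the priority list above: a lower value means a more
// conservative level. These values are illustrative, not LLVM's encoding.
enum class VCallVisibility {
  Unknown = 0,
  Public = 1,
  LinkageUnit = 2,
  TranslationUnit = 3
};

// Resolve conflicting visibility sources (flag, attribute, -fvisibility, ...)
// by keeping the most conservative level among them.
inline VCallVisibility
mostConservative(std::initializer_list<VCallVisibility> Sources) {
  assert(Sources.size() > 0 && "need at least one visibility source");
  return *std::min_element(Sources.begin(), Sources.end());
}
```

Under this sketch, a type that one source marks `Public` and another marks `Unknown` resolves to `Unknown`, which matches the description of `unknown` as a more conservative option than `public`.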

> There's a spectrum of solutions we could use to make WPD safer, but we need 
> to be careful not to make this whole thing too convoluted with multiple 
> solutions implemented, but little differentiation in their incremental value 
> (extra perf, extra safety). So having concrete data backing the incremental 
> value of this solution would be helpful.

For concrete data, do you mean a comparison between the different solutions, 
e.g. --lto-whole-program-visibility, -funknown-vtable-visibility-filepaths, 
RTTI-based, FatLTO-based, etc., or something else?




Comment at: clang/include/clang/Basic/CodeGenOptions.h:191
+  struct RegexWithPattern {
+    std::string Pattern;
+    std::shared_ptr<llvm::Regex> Regex;

wenlei wrote:
> Pattern string doesn't seem to be used anywhere, can we simplify this using 
> `llvm::Regex` instead of `RegexWithPattern`?
It's used here in CompilerInvocation.cpp:
```
  if (Opts.SkipVtableFilepaths.hasValidPattern())
    GenerateArg(Args, OPT_funknown_vtable_visibility_filepaths_EQ,
                Opts.SkipVtableFilepaths.Pattern, SA);
```
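
To make the point about keeping the pattern string concrete, the following is a self-contained sketch of why both fields might be useful: the compiled regex is used for matching, while the original text is needed to regenerate the command-line argument. It swaps in `std::regex` for `llvm::Regex`, and `parseFilepaths` and `generateArgs` are hypothetical helpers, not functions from the patch. Treating an empty value as "unset" is an assumption taken from the revision note about allowing an empty string to unset the flag.

```cpp
#include <cassert>
#include <memory>
#include <regex>
#include <string>
#include <vector>

// Stand-in for the struct under review; std::regex replaces llvm::Regex.
struct RegexWithPattern {
  std::string Pattern;               // original text, needed to re-emit the flag
  std::shared_ptr<std::regex> Regex; // compiled form, used for matching

  bool hasValidPattern() const { return Regex != nullptr; }
};

// Hypothetical parser: an empty value leaves the option unset.
inline RegexWithPattern parseFilepaths(const std::string &Value) {
  RegexWithPattern R;
  if (Value.empty())
    return R;
  R.Pattern = Value;
  R.Regex = std::make_shared<std::regex>(Value);
  return R;
}

// Hypothetical round trip: rebuild the argument list from the parsed option.
inline std::vector<std::string> generateArgs(const RegexWithPattern &Opt) {
  std::vector<std::string> Args;
  if (Opt.hasValidPattern()) {
    assert(!Opt.Pattern.empty() && "a compiled regex implies a stored pattern");
    Args.push_back("-funknown-vtable-visibility-filepaths=" + Opt.Pattern);
  }
  return Args;
}
```

A compiled regex generally cannot be turned back into its source text, which is one plausible reason to store the string alongside it.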


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741



[PATCH] D152741: [WPD] implement -fskip-vtable-filepaths

2023-06-21 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 533326.
modimo added a comment.

Address feedback. Allow empty string to unset the flag


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGVTables.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGenCXX/type-metadata-unknown-vtable-visibility-filepaths.cpp
  clang/test/Driver/funknown-vtable-visibility-filepaths.c
  llvm/include/llvm/IR/GlobalObject.h
  llvm/lib/IR/Metadata.cpp
  llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
  llvm/test/ThinLTO/X86/devirt.ll

Index: llvm/test/ThinLTO/X86/devirt.ll
===
--- llvm/test/ThinLTO/X86/devirt.ll
+++ llvm/test/ThinLTO/X86/devirt.ll
@@ -42,9 +42,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; Check that we're able to prevent specific function from being
@@ -58,9 +60,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=SKIP
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=SKIP
 
 ; RUN: llvm-lto2 run %t.o -save-temps -pass-remarks=. \
 ; RUN:   -whole-program-visibility \
@@ -70,16 +74,20 @@
 ; RUN:   -r=%t.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t.o,_ZN1D1mEi,p \
+; RUN:   -r=%t.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t.o,_ZTV1B, \
 ; RUN:   -r=%t.o,_ZTV1C, \
 ; RUN:   -r=%t.o,_ZTV1D, \
+; RUN:   -r=%t.o,_ZTV1F, \
 ; RUN:   -r=%t.o,_ZN1A1nEi, \
 ; RUN:   -r=%t.o,_ZN1B1fEi, \
 ; RUN:   -r=%t.o,_ZN1C1fEi, \
 ; RUN:   -r=%t.o,_ZN1D1mEi, \
+; RUN:   -r=%t.o,_ZN1E1nEi, \
 ; RUN:   -r=%t.o,_ZTV1B,px \
 ; RUN:   -r=%t.o,_ZTV1C,px \
-; RUN:   -r=%t.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
+; RUN:   -r=%t.o,_ZTV1D,px \
+; RUN:   -r=%t.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; REMARK-DAG: single-impl: devirtualized a call to _ZN1A1nEi
@@ -95,13 +103,18 @@
 %struct.C = type { %struct.A }
 %struct.D = type { ptr }
 
+%struct.E = type { ptr }
+%struct.F = type { %struct.E }
+
 @_ZTV1B = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1B1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !1
 @_ZTV1C = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1C1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !2
 @_ZTV1D = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1D1mEi] }, !type !3
 
+@_ZTV1F = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1E1nEi] }, !type !5, !vcall_visibility !6
+
 
 ; CHECK-IR-LABEL: define i32 @test
-define i32 @test(ptr %obj, ptr %obj2, i32 %a) {
+define i32 @test(ptr %obj, ptr %obj2, ptr %obj3, i32 %a) {
 entry:
   %vtable = load ptr, ptr %obj
   %p = call i1 @llvm.type.test(ptr %vtable, metadata !"_ZTS1A")
@@ -114,7 +127,7 @@
   ; Ensure !prof and !callees metadata for indirect call promotion removed.
   ; CHECK-IR-NOT: prof
   ; CHECK-IR-NOT: callees
-  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !5, !callees !6
+  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !7, !callees !8
 
   %fptr22 = load ptr, ptr %vtable, align 8
 
@@ -131,7 +144,18 @@
   ; Check that the call was devirtualized.
   ; CHECK-IR: %call4 = tail call i32 @_ZN1D1mEi
   %call4 = tail call i32 %fptr33(ptr nonnull %obj2, i32 %call3)
-  ret i32 %call4
+
+  %vtable1 = load ptr, ptr %obj3
+  %p3 = call i1 @llvm.type.test(ptr %vtable1, metadata !"_ZTS1E")
+  call void @llvm.assume(i1 %p3)
+  %fptrptr1 = getelementptr ptr, ptr %vtable1, i32 0
+  %fptr44 = load ptr, ptr %fptrptr1, align 8
+
+  ; Check that the call was not devirtualized because of "unknown"
+  ; vcall_visibility.
+  ; CHECK-IR: %call5 = tail call i32 %fptr44
+  %call5 = tail call i32 %fptr44(ptr nonnull %obj, i32 %call4)
+  ret i32 %call5
 }
 ; CHECK-IR-LABEL: ret i32
 ; CHECK-IR-LABEL: }
@@ -155,6 +179,10 @@
ret i32 0;
 }
 
+define i32 @_ZN1E1nEi(ptr %this, i32 %a) #0 {
+   ret i32 0;
+}
+
 ; Make sure we don't inline or otherwise optimize out the direct calls.
 attributes #0 = { noinline optnone }
 
@@ -163,5 +191,7 @@
 !2 = !{i64 16, !"_ZTS1C"}
 !3 

[PATCH] D152741: [WPD] implement -fskip-vtable-filepaths

2023-06-21 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D152741#4438007, @tejohnson wrote:

> Ok what I missed is that you don't want to apply this to entire TUs, but 
> rather just some paths that are header files, which may be included in many 
> source files. So in your above example, you really only need to apply to the 
> path of third-party/include/boost.h - is that correct?

Yep!

> That would mark class A, and therefore anything derived from it won't get 
> devirtualized. I guess in your example above, you are trying to prevent the 
> devirtualization in a.cpp since there are hidden overrides (class C) in 
> boost.a native objects.

Exactly, we saw this scenario causing issues when enabling WPD.

> The example included with the patch applies the option to the source file of 
> the test case, and therefore its entire TU. It would be helpful to have a 
> test case structured like your example above, where the file path is just 
> that of the header.

Makes sense and yeah the test case is confusing. Changed it to apply to just 
the header file.

> Maybe a better option name is something like 
> -funknown-vtable-visibility-filepaths= ? It seems a bit more descriptive.

Sure, changed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741



[PATCH] D152741: [WPD] implement -fskip-vtable-filepaths

2023-06-20 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 533048.
modimo added a comment.

Remove leftover code from original implementation


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGVTables.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGenCXX/type-metadata-skip-vtable-filepaths.cpp
  clang/test/Driver/fskip-vtable-filepaths.c
  llvm/include/llvm/IR/GlobalObject.h
  llvm/lib/IR/Metadata.cpp
  llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
  llvm/test/ThinLTO/X86/devirt.ll

Index: llvm/test/ThinLTO/X86/devirt.ll
===
--- llvm/test/ThinLTO/X86/devirt.ll
+++ llvm/test/ThinLTO/X86/devirt.ll
@@ -42,9 +42,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; Check that we're able to prevent specific function from being
@@ -58,9 +60,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=SKIP
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=SKIP
 
 ; RUN: llvm-lto2 run %t.o -save-temps -pass-remarks=. \
 ; RUN:   -whole-program-visibility \
@@ -70,16 +74,20 @@
 ; RUN:   -r=%t.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t.o,_ZN1D1mEi,p \
+; RUN:   -r=%t.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t.o,_ZTV1B, \
 ; RUN:   -r=%t.o,_ZTV1C, \
 ; RUN:   -r=%t.o,_ZTV1D, \
+; RUN:   -r=%t.o,_ZTV1F, \
 ; RUN:   -r=%t.o,_ZN1A1nEi, \
 ; RUN:   -r=%t.o,_ZN1B1fEi, \
 ; RUN:   -r=%t.o,_ZN1C1fEi, \
 ; RUN:   -r=%t.o,_ZN1D1mEi, \
+; RUN:   -r=%t.o,_ZN1E1nEi, \
 ; RUN:   -r=%t.o,_ZTV1B,px \
 ; RUN:   -r=%t.o,_ZTV1C,px \
-; RUN:   -r=%t.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
+; RUN:   -r=%t.o,_ZTV1D,px \
+; RUN:   -r=%t.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; REMARK-DAG: single-impl: devirtualized a call to _ZN1A1nEi
@@ -95,13 +103,18 @@
 %struct.C = type { %struct.A }
 %struct.D = type { ptr }
 
+%struct.E = type { ptr }
+%struct.F = type { %struct.E }
+
 @_ZTV1B = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1B1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !1
 @_ZTV1C = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1C1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !2
 @_ZTV1D = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1D1mEi] }, !type !3
 
+@_ZTV1F = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1E1nEi] }, !type !5, !vcall_visibility !6
+
 
 ; CHECK-IR-LABEL: define i32 @test
-define i32 @test(ptr %obj, ptr %obj2, i32 %a) {
+define i32 @test(ptr %obj, ptr %obj2, ptr %obj3, i32 %a) {
 entry:
   %vtable = load ptr, ptr %obj
   %p = call i1 @llvm.type.test(ptr %vtable, metadata !"_ZTS1A")
@@ -114,7 +127,7 @@
   ; Ensure !prof and !callees metadata for indirect call promotion removed.
   ; CHECK-IR-NOT: prof
   ; CHECK-IR-NOT: callees
-  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !5, !callees !6
+  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !7, !callees !8
 
   %fptr22 = load ptr, ptr %vtable, align 8
 
@@ -131,7 +144,18 @@
   ; Check that the call was devirtualized.
   ; CHECK-IR: %call4 = tail call i32 @_ZN1D1mEi
   %call4 = tail call i32 %fptr33(ptr nonnull %obj2, i32 %call3)
-  ret i32 %call4
+
+  %vtable1 = load ptr, ptr %obj3
+  %p3 = call i1 @llvm.type.test(ptr %vtable1, metadata !"_ZTS1E")
+  call void @llvm.assume(i1 %p3)
+  %fptrptr1 = getelementptr ptr, ptr %vtable1, i32 0
+  %fptr44 = load ptr, ptr %fptrptr1, align 8
+
+  ; Check that the call was not devirtualized because of "unknown"
+  ; vcall_visibility.
+  ; CHECK-IR: %call5 = tail call i32 %fptr44
+  %call5 = tail call i32 %fptr44(ptr nonnull %obj, i32 %call4)
+  ret i32 %call5
 }
 ; CHECK-IR-LABEL: ret i32
 ; CHECK-IR-LABEL: }
@@ -155,6 +179,10 @@
ret i32 0;
 }
 
+define i32 @_ZN1E1nEi(ptr %this, i32 %a) #0 {
+   ret i32 0;
+}
+
 ; Make sure we don't inline or otherwise optimize out the direct calls.
 attributes #0 = { noinline optnone }
 
@@ -163,5 +191,7 @@
 !2 = !{i64 16, !"_ZTS1C"}
 !3 = !{i64 16, !4}
 !4 = distinct 

[PATCH] D152741: [WPD] implement -fskip-vtable-filepaths

2023-06-14 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D152741#4421067, @tejohnson wrote:

> In D152741#4419366, @modimo wrote:
>
>> In D152741#4419324, @tejohnson wrote:
>>
>>> In D152741#4419265, @modimo wrote:
>>>
>>>> In D152741#4418831, @tejohnson wrote:
>>>>
>>>>> I think I understand the motivation, but not sure I agree this is the 
>>>>> right approach - can you simply not pass -flto-unit and 
>>>>> -fwhole-program-vtables for these files?
>>>>
>>>> For our third-party libraries, they're pre-built into native files by GCC 
>>>> so that's unfortunately not an option.
>>>
>>> I'm confused - how would you pass this new option then? I was assuming you 
>>> were passing this option to some LLVM built files at the interface of those 
>>> libraries. In which case not passing -flto-unit and 
>>> -fwhole-program-visibility should have a similar effect (suppress the type 
>>> metadata).
>>
>> Oh I see, I misunderstood. Yes this is being passed to LLVM built files. We 
>> want to avoid manual allowlists/blocklists because code changes make it less 
>> flexible and scalable than an automatic option.
>
> It seems like you need allowlists or blocklists in either case - either it is 
> passed as a regex via the option proposed here, or the build system modifies 
> the options for that set of files.
>
>> This can also be pretty tricky to do correctly since we can get type 
>> metadata from multiple TUs and all of them would need to be opted out for 
>> WPD to not kick in.
>
> But clang is presumably compiling a single TU at a time, so your regex needs 
> to cover them all anyway? I'm not sure I understand the distinction between 
> doing something like -fskip-vtable-filepaths=third-party/.* vs something like 
> applying -funknown-vtable-visibility to each third-party/*.cc compile.

The blocklists need to be enforced on internal files that interact with native 
libraries and those live in many different areas:

  // /third-party/include/boost.h
  class A {};

  // /internal-source/a.cpp
  #include "boost.h"
  class B : public A {};

  // /third-party/lib/boost.a (pre-built native archive; its source is
  // shown here only for illustration)
  #include "boost.h"
  class C : public A {};

That being said, this is something the build system can detect and mark.

> I really think the logic for which files to apply this option to belongs in 
> the build system, not in the clang driver - just like any other clang option. 
> It isn't clear to me why this particular option should be applied based on a 
> file regex.

The big advantage of doing this in the FE is that we know which types are 
actually coming from the native headers. Blocking all types in the TU is overly 
conservative and also less stable as header changes can effectively turn on/off 
unrelated large chunks of WPD.
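
The difference between the two granularities can be sketched as follows. This is only an illustration under stated assumptions: `TypeDecl`, `blockedByHeader`, and `blockedByTU` are hypothetical names, `std::regex_search` stands in for whatever matching the patch performs, and inheritance is not modeled.

```cpp
#include <algorithm>
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical record: a class and the file that declares it.
struct TypeDecl {
  std::string Name;
  std::string DeclFile;
};

// Header granularity: only types declared in a matching file are blocked.
inline std::vector<std::string>
blockedByHeader(const std::vector<TypeDecl> &TU, const std::regex &Paths) {
  std::vector<std::string> Out;
  for (const TypeDecl &T : TU)
    if (std::regex_search(T.DeclFile, Paths))
      Out.push_back(T.Name);
  return Out;
}

// TU granularity: if the TU pulls in any matching file, every type it sees
// is blocked, including unrelated ones.
inline std::vector<std::string>
blockedByTU(const std::vector<TypeDecl> &TU, const std::regex &Paths) {
  bool Touches = std::any_of(
      TU.begin(), TU.end(),
      [&](const TypeDecl &T) { return std::regex_search(T.DeclFile, Paths); });
  std::vector<std::string> Out;
  if (Touches)
    for (const TypeDecl &T : TU)
      Out.push_back(T.Name);
  return Out;
}
```

For a translation unit containing `A` from `/third-party/include/boost.h`, `B` from `/internal-source/a.cpp`, and some unrelated internal type, the pattern `third-party/.*` blocks only `A` at header granularity but all three at TU granularity. In the real scheme, classes derived from `A` would also be affected through `A`'s type metadata, which this sketch leaves out.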


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741



[PATCH] D152741: [WPD] implement -fskip-vtable-filepaths

2023-06-13 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 531151.
modimo added a comment.

Remove unrelated change, fix typo


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

Files:
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGVTables.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGenCXX/type-metadata-skip-vtable-filepaths.cpp
  clang/test/Driver/fskip-vtable-filepaths.c
  llvm/include/llvm/IR/GlobalObject.h
  llvm/lib/IR/Metadata.cpp
  llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
  llvm/test/ThinLTO/X86/devirt.ll

Index: llvm/test/ThinLTO/X86/devirt.ll
===
--- llvm/test/ThinLTO/X86/devirt.ll
+++ llvm/test/ThinLTO/X86/devirt.ll
@@ -42,9 +42,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; Check that we're able to prevent specific function from being
@@ -58,9 +60,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=SKIP
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=SKIP
 
 ; RUN: llvm-lto2 run %t.o -save-temps -pass-remarks=. \
 ; RUN:   -whole-program-visibility \
@@ -70,16 +74,20 @@
 ; RUN:   -r=%t.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t.o,_ZN1D1mEi,p \
+; RUN:   -r=%t.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t.o,_ZTV1B, \
 ; RUN:   -r=%t.o,_ZTV1C, \
 ; RUN:   -r=%t.o,_ZTV1D, \
+; RUN:   -r=%t.o,_ZTV1F, \
 ; RUN:   -r=%t.o,_ZN1A1nEi, \
 ; RUN:   -r=%t.o,_ZN1B1fEi, \
 ; RUN:   -r=%t.o,_ZN1C1fEi, \
 ; RUN:   -r=%t.o,_ZN1D1mEi, \
+; RUN:   -r=%t.o,_ZN1E1nEi, \
 ; RUN:   -r=%t.o,_ZTV1B,px \
 ; RUN:   -r=%t.o,_ZTV1C,px \
-; RUN:   -r=%t.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
+; RUN:   -r=%t.o,_ZTV1D,px \
+; RUN:   -r=%t.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; REMARK-DAG: single-impl: devirtualized a call to _ZN1A1nEi
@@ -95,13 +103,18 @@
 %struct.C = type { %struct.A }
 %struct.D = type { ptr }
 
+%struct.E = type { ptr }
+%struct.F = type { %struct.E }
+
 @_ZTV1B = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1B1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !1
 @_ZTV1C = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1C1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !2
 @_ZTV1D = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1D1mEi] }, !type !3
 
+@_ZTV1F = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1E1nEi] }, !type !5, !vcall_visibility !6
+
 
 ; CHECK-IR-LABEL: define i32 @test
-define i32 @test(ptr %obj, ptr %obj2, i32 %a) {
+define i32 @test(ptr %obj, ptr %obj2, ptr %obj3, i32 %a) {
 entry:
   %vtable = load ptr, ptr %obj
   %p = call i1 @llvm.type.test(ptr %vtable, metadata !"_ZTS1A")
@@ -114,7 +127,7 @@
   ; Ensure !prof and !callees metadata for indirect call promotion removed.
   ; CHECK-IR-NOT: prof
   ; CHECK-IR-NOT: callees
-  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !5, !callees !6
+  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !7, !callees !8
 
   %fptr22 = load ptr, ptr %vtable, align 8
 
@@ -131,7 +144,18 @@
   ; Check that the call was devirtualized.
   ; CHECK-IR: %call4 = tail call i32 @_ZN1D1mEi
   %call4 = tail call i32 %fptr33(ptr nonnull %obj2, i32 %call3)
-  ret i32 %call4
+
+  %vtable1 = load ptr, ptr %obj3
+  %p3 = call i1 @llvm.type.test(ptr %vtable1, metadata !"_ZTS1E")
+  call void @llvm.assume(i1 %p3)
+  %fptrptr1 = getelementptr ptr, ptr %vtable1, i32 0
+  %fptr44 = load ptr, ptr %fptrptr1, align 8
+
+  ; Check that the call was not devirtualized because of "unknown"
+  ; vcall_visibility.
+  ; CHECK-IR: %call5 = tail call i32 %fptr44
+  %call5 = tail call i32 %fptr44(ptr nonnull %obj, i32 %call4)
+  ret i32 %call5
 }
 ; CHECK-IR-LABEL: ret i32
 ; CHECK-IR-LABEL: }
@@ -155,6 +179,10 @@
ret i32 0;
 }
 
+define i32 @_ZN1E1nEi(ptr %this, i32 %a) #0 {
+   ret i32 0;
+}
+
 ; Make sure we don't inline or otherwise optimize out the direct calls.
 attributes #0 = { noinline optnone }
 
@@ -163,5 +191,7 @@
 !2 = !{i64 16, !"_ZTS1C"}
 !3 = !{i64 16, 

[PATCH] D152741: [WPD] implement -fskip-vtable-filepaths

2023-06-13 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 531149.
modimo added a comment.
Herald added subscribers: llvm-commits, ormris, steven_wu, hiraditya.
Herald added a project: LLVM.

Implement using VCallVisibilityUnknown
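
A simplified model of the behavior the accompanying `devirt.ll` test checks might look like the sketch below. It is not the logic in `WholeProgramDevirt.cpp`: the `Visibility` enum, `VTable` struct, and `singleImplTarget` function are hypothetical, and the rule that `-whole-program-visibility` upgrades `Public` but never `Unknown` is an assumption inferred from the test, where the call through `_ZTV1F` stays indirect.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical visibility levels; not LLVM's actual encoding.
enum class Visibility { Unknown, Public, LinkageUnit, TranslationUnit };

// Hypothetical summary of one vtable compatible with the called type.
struct VTable {
  std::string Symbol; // e.g. "_ZTV1F"
  Visibility Vis;
  std::string Target; // function stored in the called slot
};

// Returns the single implementation to call directly, or an empty string
// when the indirect call has to be kept.
inline std::string singleImplTarget(const std::vector<VTable> &VTables,
                                    bool WholeProgramVisibility) {
  if (VTables.empty())
    return "";
  for (const VTable &VT : VTables) {
    Visibility V = VT.Vis;
    // Assumption: the whole-program option upgrades Public, never Unknown.
    if (V == Visibility::Public && WholeProgramVisibility)
      V = Visibility::LinkageUnit;
    if (V == Visibility::Unknown || V == Visibility::Public)
      return ""; // overrides may exist outside the LTO unit
    if (VT.Target != VTables.front().Target)
      return ""; // more than one implementation
  }
  return VTables.front().Target;
}
```

With whole-program visibility enabled, this model devirtualizes the call through `_ZTV1D` to `_ZN1D1mEi` and keeps the call through the `unknown` vtable `_ZTV1F` indirect.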


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CGVTables.cpp
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/CodeGen/CodeGenModule.h
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGenCXX/type-metadata-skip-vtable-filepaths.cpp
  clang/test/Driver/fskip-vtable-filepaths.c
  llvm/include/llvm/IR/GlobalObject.h
  llvm/lib/IR/Metadata.cpp
  llvm/lib/Transforms/IPO/WholeProgramDevirt.cpp
  llvm/test/ThinLTO/X86/devirt.ll

Index: llvm/test/ThinLTO/X86/devirt.ll
===
--- llvm/test/ThinLTO/X86/devirt.ll
+++ llvm/test/ThinLTO/X86/devirt.ll
@@ -42,9 +42,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; Check that we're able to prevent specific function from being
@@ -58,9 +60,11 @@
 ; RUN:   -r=%t2.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t2.o,_ZN1D1mEi,p \
+; RUN:   -r=%t2.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t2.o,_ZTV1B,px \
 ; RUN:   -r=%t2.o,_ZTV1C,px \
-; RUN:   -r=%t2.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=SKIP
+; RUN:   -r=%t2.o,_ZTV1D,px \
+; RUN:   -r=%t2.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=SKIP
 
 ; RUN: llvm-lto2 run %t.o -save-temps -pass-remarks=. \
 ; RUN:   -whole-program-visibility \
@@ -70,16 +74,20 @@
 ; RUN:   -r=%t.o,_ZN1B1fEi,p \
 ; RUN:   -r=%t.o,_ZN1C1fEi,p \
 ; RUN:   -r=%t.o,_ZN1D1mEi,p \
+; RUN:   -r=%t.o,_ZN1E1nEi,p \
 ; RUN:   -r=%t.o,_ZTV1B, \
 ; RUN:   -r=%t.o,_ZTV1C, \
 ; RUN:   -r=%t.o,_ZTV1D, \
+; RUN:   -r=%t.o,_ZTV1F, \
 ; RUN:   -r=%t.o,_ZN1A1nEi, \
 ; RUN:   -r=%t.o,_ZN1B1fEi, \
 ; RUN:   -r=%t.o,_ZN1C1fEi, \
 ; RUN:   -r=%t.o,_ZN1D1mEi, \
+; RUN:   -r=%t.o,_ZN1E1nEi, \
 ; RUN:   -r=%t.o,_ZTV1B,px \
 ; RUN:   -r=%t.o,_ZTV1C,px \
-; RUN:   -r=%t.o,_ZTV1D,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
+; RUN:   -r=%t.o,_ZTV1D,px \
+; RUN:   -r=%t.o,_ZTV1F,px 2>&1 | FileCheck %s --check-prefix=REMARK --dump-input=fail
 ; RUN: llvm-dis %t3.1.4.opt.bc -o - | FileCheck %s --check-prefix=CHECK-IR
 
 ; REMARK-DAG: single-impl: devirtualized a call to _ZN1A1nEi
@@ -95,13 +103,18 @@
 %struct.C = type { %struct.A }
 %struct.D = type { ptr }
 
+%struct.E = type { ptr }
+%struct.F = type { %struct.E }
+
 @_ZTV1B = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1B1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !1
 @_ZTV1C = constant { [4 x ptr] } { [4 x ptr] [ptr null, ptr undef, ptr @_ZN1C1fEi, ptr @_ZN1A1nEi] }, !type !0, !type !2
 @_ZTV1D = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1D1mEi] }, !type !3
 
+@_ZTV1F = constant { [3 x ptr] } { [3 x ptr] [ptr null, ptr undef, ptr @_ZN1E1nEi] }, !type !5, !vcall_visibility !6
+
 
 ; CHECK-IR-LABEL: define i32 @test
-define i32 @test(ptr %obj, ptr %obj2, i32 %a) {
+define i32 @test(ptr %obj, ptr %obj2, ptr %obj3, i32 %a) {
 entry:
   %vtable = load ptr, ptr %obj
   %p = call i1 @llvm.type.test(ptr %vtable, metadata !"_ZTS1A")
@@ -114,7 +127,7 @@
   ; Ensure !prof and !callees metadata for indirect call promotion removed.
   ; CHECK-IR-NOT: prof
   ; CHECK-IR-NOT: callees
-  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !5, !callees !6
+  %call = tail call i32 %fptr1(ptr nonnull %obj, i32 %a), !prof !7, !callees !8
 
   %fptr22 = load ptr, ptr %vtable, align 8
 
@@ -131,7 +144,18 @@
   ; Check that the call was devirtualized.
   ; CHECK-IR: %call4 = tail call i32 @_ZN1D1mEi
   %call4 = tail call i32 %fptr33(ptr nonnull %obj2, i32 %call3)
-  ret i32 %call4
+
+  %vtable1 = load ptr, ptr %obj3
+  %p3 = call i1 @llvm.type.test(ptr %vtable1, metadata !"_ZTS1E")
+  call void @llvm.assume(i1 %p3)
+  %fptrptr1 = getelementptr ptr, ptr %vtable1, i32 0
+  %fptr44 = load ptr, ptr %fptrptr1, align 8
+
+  ; Check that the call was not devirtualized because of "unknown"
+  ; vcall_visibility.
+  ; CHECK-IR: %call5 = tail call i32 %fptr44
+  %call5 = tail call i32 %fptr44(ptr nonnull %obj, i32 %call4)
+  ret i32 %call5
 }
 ; CHECK-IR-LABEL: ret i32
 ; CHECK-IR-LABEL: }
@@ -155,6 +179,10 @@
ret i32 0;
 }
 
+define i32 @_ZN1E1nEi(ptr %this, i32 %a) #0 {
+   ret i32 0;
+}
+
 ; Make sure we don't 

[PATCH] D152741: [WPD] implement -fskip-vtable-filepaths

2023-06-13 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D152741#4419324 , @tejohnson wrote:

> In D152741#4419265 , @modimo wrote:
>
>> In D152741#4418831 , @tejohnson 
>> wrote:
>>
>>> I think I understand the motivation, but not sure I agree this is the right 
>>> approach - can you simply not pass -flto-unit and -fwhole-program-vtables 
>>> for these files?
>>
>> For our third-party libraries, they're pre-built into native files by GCC so 
>> that's unfortunately not an option.
>
> I'm confused - how would you pass this new option then? I was assuming you 
> were passing this option to some LLVM built files at the interface of those 
> libraries. In which case not passing -flto-unit and 
> -fwhole-program-visibility should have a similar effect (suppress the type 
> metadata).

Oh I see, I misunderstood. Yes, this is being passed to LLVM-built files. We 
want to avoid manual allowlists/blocklists because, as the code changes, they 
are less flexible and scalable than an automatic option. This can also be 
pretty tricky to do correctly since we can get type metadata from multiple TUs 
and all of them would need to be opted out for WPD to not kick in.
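
As a rough illustration of the automatic, path-based opt-out, here is a minimal Python sketch. It assumes the flag's pattern is searched (unanchored) against a file path, which is what the `p$` and `[^p]$` patterns in the CodeGenCXX test suggest. The helper names `is_skipped` and `partition` and the sample paths are hypothetical, and Python's `re` module is only a stand-in for the regex engine Clang actually uses.

```python
import re


def is_skipped(pattern: str, path: str) -> bool:
    """Return True if the (hypothetical) skip pattern matches the file path."""
    return re.search(pattern, path) is not None


def partition(pattern: str, paths: list[str]) -> tuple[list[str], list[str]]:
    """Split paths into (skipped, kept) according to the pattern."""
    skipped = [p for p in paths if is_skipped(pattern, p)]
    kept = [p for p in paths if not is_skipped(pattern, p)]
    return skipped, kept


if __name__ == "__main__":
    # Hypothetical source tree: one third-party file, one first-party file.
    paths = ["third-party/zlib/inflate.cpp", "service/handler.cpp"]
    skipped, kept = partition(r"^third-party/", paths)
    print("skipped:", skipped)
    print("kept:", kept)
```

With a single pattern, new files under the matched directory are picked up without touching any list, which is the scalability argument made above.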

>>> Also, isn't this hiding possibly necessary info from WPD that might be 
>>> needed for correct class hierarchy analysis affecting other IR modules? 
>>> I.e. in the type-metadata-skip-vtable-filepaths.cpp test, what if A was 
>>> derived from a struct B, which was also defined/used in another module 
>>> without this skipping option. We would lose information about the override 
>>> of f in A, and possibly do an incorrect devirtualization elsewhere. It 
>>> seems like a dangerous option to provide.
>>>
>>> It might be better to provide an option that can somehow mark vtables in a 
>>> given module as unsafe for devirt, and propagate that info to WPD.
>>
>> That would nicely side-step mismatched flags. `Public` `vcall_visibility` 
>> describes this case but with `--lto-whole-program-visibility` there's no 
>> distinction between `Public` because of deferred vs. `Public` because the 
>> type is known unsafe. Thoughts on an `unsafe` `vcall_visibility` to capture 
>> the latter notion?
>
> That would be better I think. Maybe "unknown".

Sounds good, I'll rework the patch.
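
To make the hazard raised above concrete, here is a toy Python model of single-implementation devirtualization. It is only a sketch under stated assumptions, not how WholeProgramDevirt.cpp is written: the record layout, the `vtable` and `single_impl_target` helpers, and the visibility strings are all hypothetical. It shows why dropping a derived class's type metadata can produce a wrong answer, while marking its vtable "unknown" stays conservative.

```python
def vtable(name, types, slots, visibility="hidden"):
    """Build a toy vtable record: compatible type ids, slot -> target, visibility."""
    return {"name": name, "types": set(types), "slots": dict(slots),
            "visibility": visibility}


def single_impl_target(vtables, type_id, slot):
    """Return the unique target of `slot` among vtables compatible with `type_id`.

    Returns None when there are several targets, no targets, or when any
    compatible vtable carries "unknown" visibility (refuse to devirtualize).
    """
    targets = set()
    for vt in vtables:
        if type_id not in vt["types"]:
            continue
        if vt["visibility"] == "unknown":
            return None
        targets.add(vt["slots"][slot])
    return targets.pop() if len(targets) == 1 else None


# B is the base class; A derives from B and overrides f.
VT_B = vtable("vtable for B", ["B"], {"f": "B::f"})
VT_A = vtable("vtable for A", ["B", "A"], {"f": "A::f"})
VT_A_UNKNOWN = vtable("vtable for A", ["B", "A"], {"f": "A::f"},
                      visibility="unknown")

if __name__ == "__main__":
    # Full information: two implementations, so no devirtualization.
    print(single_impl_target([VT_B, VT_A], "B", "f"))
    # A's metadata dropped: B::f looks like the only implementation (unsafe).
    print(single_impl_target([VT_B], "B", "f"))
    # A's vtable kept but marked unknown: devirtualization is blocked.
    print(single_impl_target([VT_B, VT_A_UNKNOWN], "B", "f"))
```

The second case is the incorrect devirtualization described in the review comment; the third is the behavior the reworked patch aims for.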


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152741: [WPD] implement -fskip-vtable-filepaths

2023-06-13 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D152741#4418831 , @tejohnson wrote:

> I think I understand the motivation, but not sure I agree this is the right 
> approach - can you simply not pass -flto-unit and -fwhole-program-vtables for 
> these files?

For our third-party libraries, they're pre-built into native files by GCC so 
that's unfortunately not an option.

> Also, isn't this hiding possibly necessary info from WPD that might be needed 
> for correct class hierarchy analysis affecting other IR modules? I.e. in the 
> type-metadata-skip-vtable-filepaths.cpp test, what if A was derived from a 
> struct B, which was also defined/used in another module without this skipping 
> option. We would lose information about the override of f in A, and possibly 
> do an incorrect devirtualization elsewhere. It seems like a dangerous option 
> to provide.
>
> It might be better to provide an option that can somehow mark vtables in a 
> given module as unsafe for devirt, and propagate that info to WPD.

That would nicely side-step mismatched flags. `Public` `vcall_visibility` 
describes this case but with `--lto-whole-program-visibility` there's no 
distinction between `Public` because of deferred vs. `Public` because the type 
is known unsafe. Thoughts on an `unsafe` `vcall_visibility` to capture the 
latter notion?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741



[PATCH] D152741: -fskip-vtable-filepaths

2023-06-12 Thread Di Mo via Phabricator via cfe-commits
modimo created this revision.
Herald added subscribers: hoy, wenlei.
Herald added a project: All.
modimo requested review of this revision.
Herald added subscribers: cfe-commits, MaskRay.
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D152741

Files:
  clang/include/clang/Basic/CodeGenOptions.def
  clang/include/clang/Basic/CodeGenOptions.h
  clang/include/clang/Driver/Options.td
  clang/lib/CodeGen/CodeGenModule.cpp
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Frontend/CompilerInvocation.cpp
  clang/test/CodeGenCXX/type-metadata-skip-vtable-filepaths.cpp
  clang/test/Driver/fskip-vtable-filepaths.c

Index: clang/test/Driver/fskip-vtable-filepaths.c
===
--- /dev/null
+++ clang/test/Driver/fskip-vtable-filepaths.c
@@ -0,0 +1,7 @@
+// RUN: %clang -target x86_64-unknown-linux -### %s -flto=thin 2>&1 | FileCheck --check-prefix=NOSKIP %s
+// RUN: %clang -target x86_64-unknown-linux -### %s -flto=thin -fwhole-program-vtables -fskip-vtable-filepaths=abc 2>&1 | FileCheck --check-prefix=SKIP %s
+// RUN: %clang -target x86_64-unknown-linux -### %s -flto=thin -fskip-vtable-filepaths=abc 2>&1 | FileCheck --check-prefix=ERROR1 %s
+
+// SKIP: "-fskip-vtable-filepaths=abc"
+// NOSKIP-NOT: "-fskip-vtable-filepaths=abc"
+// ERROR1: error: invalid argument '-fskip-vtable-filepaths' only allowed with '-fwhole-program-vtables'
Index: clang/test/CodeGenCXX/type-metadata-skip-vtable-filepaths.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/type-metadata-skip-vtable-filepaths.cpp
@@ -0,0 +1,20 @@
+// RUN: rm -rf %t
+// RUN: split-file %s %t
+// RUN: cd %t
+
+// RUN: %clang_cc1 -flto=thin -flto-unit -fwhole-program-vtables -triple x86_64-unknown-linux -fvisibility=hidden -emit-llvm -o - a.cpp | FileCheck %s
+// RUN: %clang_cc1 -flto=thin -flto-unit -fwhole-program-vtables -triple x86_64-unknown-linux -fskip-vtable-filepaths=[^p]$ -fvisibility=hidden -emit-llvm -o - a.cpp | FileCheck %s
+// RUN: %clang_cc1 -flto=thin -flto-unit -fwhole-program-vtables -triple x86_64-unknown-linux -fskip-vtable-filepaths=p$ -fvisibility=hidden -emit-llvm -o - a.cpp | FileCheck -check-prefix=SKIP-PATH %s
+
+// CHECK: !{i64 16, !"_ZTS1A"}
+// SKIP-PATH-NOT: !{i64 16, !"_ZTS1A"}
+//--- a.cpp
+
+struct A {
+  virtual int f() { return 1; }
+};
+
+int f() {
+  auto a = new A();
+  return a->f();
+}
Index: clang/lib/Frontend/CompilerInvocation.cpp
===
--- clang/lib/Frontend/CompilerInvocation.cpp
+++ clang/lib/Frontend/CompilerInvocation.cpp
@@ -1578,6 +1578,10 @@
   GenerateOptimizationRemark(Args, SA, OPT_Rpass_analysis_EQ, "pass-analysis",
  Opts.OptimizationRemarkAnalysis);
 
+  if (Opts.SkipVtableFilepaths.hasValidPattern())
+GenerateArg(Args, OPT_fskip_vtable_file_paths_EQ,
+Opts.SkipVtableFilepaths.Pattern, SA);
+
   GenerateArg(Args, OPT_fdiagnostics_hotness_threshold_EQ,
   Opts.DiagnosticsHotnessThreshold
   ? Twine(*Opts.DiagnosticsHotnessThreshold)
@@ -1989,6 +1993,19 @@
  Opts.OptimizationRemarkMissed.hasValidPattern() ||
  Opts.OptimizationRemarkAnalysis.hasValidPattern();
 
+  if (Arg *A = Args.getLastArg(OPT_fskip_vtable_file_paths_EQ)) {
+StringRef Val = A->getValue();
+std::string RegexError;
+std::shared_ptr<llvm::Regex> Pattern = std::make_shared<llvm::Regex>(Val);
+if (!Pattern->isValid(RegexError)) {
+  Diags.Report(diag::err_drv_optimization_remark_pattern)
+  << RegexError << A->getAsString(Args);
+  Pattern.reset();
+}
+Opts.SkipVtableFilepaths =
+CodeGenOptions::RegexWithPattern(std::string(Val), Pattern);
+  }
+
   bool UsingSampleProfile = !Opts.SampleProfileFile.empty();
   bool UsingProfile =
   UsingSampleProfile || !Opts.ProfileInstrumentUsePath.empty();
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -7245,6 +7245,16 @@
   if (SplitLTOUnit)
 CmdArgs.push_back("-fsplit-lto-unit");
 
+  for (const Arg *A : Args.filtered(options::OPT_fskip_vtable_file_paths_EQ)) {
+if (!WholeProgramVTables)
+  D.Diag(diag::err_drv_argument_only_allowed_with)
+  << "-fskip-vtable-filepaths"
+  << "-fwhole-program-vtables";
+StringRef Path = A->getValue();
+CmdArgs.push_back(Args.MakeArgString("-fskip-vtable-filepaths=" + Path));
+A->claim();
+  }
+
   if (Arg *A = Args.getLastArg(options::OPT_fglobal_isel,
options::OPT_fno_global_isel)) {
 CmdArgs.push_back("-mllvm");
Index: clang/lib/CodeGen/CodeGenModule.cpp
===
--- clang/lib/CodeGen/CodeGenModule.cpp
+++ 

[PATCH] D132186: Clang: Add a new flag Wmisnoinline for printing hot noinline functions

2022-09-20 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D132186#3802989 , @paulkirth wrote:

> @iamarchit123 I think the standard advice is to start w/ the llvm-test-suite 
> and then explore other benchmarks as needed. Also, Clang itself is often a 
> very good starting point.
>
> As for profiles, it probably won't be representative, but you could collect 
> the profile using your benchmark and then assess how often the mismatch w/ 
> inlining happens. If you want to do it w/ Clang itself, then a common 
> approach I've heard is to have Clang build your project and then use 
> ninja trace or equivalent to find the 5-10 TUs w/ the longest compile time. 
> Then stick them in the 
> https://github.com/llvm/llvm-project/tree/main/clang/utils/perf-training 
> directory, which will use them for PGO automatically. If you go that route, 
> you may need to preprocess the source files.

+1 Clang is the best starting point. I've been busy recently so haven't had a 
chance to run the HHVM experiments, starting a run today. Paul left some good 
review comments that you can address without requiring performance runs--I 
would recommend getting the patch updated so when the results come back 
everything will be ready to commit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132186/new/

https://reviews.llvm.org/D132186



[PATCH] D132186: Clang: Add a new flag Wmisnoinline for printing hot noinline functions

2022-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

Thanks for taking a look!

In D132186#3752150 , @paulkirth wrote:

> In D132186#3751985 , @tejohnson 
> wrote:
>
>> I have seen a few cases where noinline was used for performance, in addition 
>> to other cases like avoiding too much stack growth.
>
> Well, I stand corrected. I'm curious about what these cases are, but in any 
> case if there are cases where its done, then I agree that a diagnostic would 
> be helpful.

Same. The instances I've seen are in older codebases, written when compiler 
optimizations were not as powerful and/or written purposefully by engineers who 
didn't trust the compiler to do the right thing.

>> It is a little different than misexpect though in that the expect hints are 
>> pretty much only for performance, so it is more useful to be able to issue a 
>> strong warning that can be turned into an error if they are wrong. And also 
>> there was no way to report the misuse of expects earlier, unlike inlining 
>> where we already had the remarks plumbing.
>>
>> I haven't looked through the patch in detail, but is it possible to use your 
>> changes to emit a better missed opt remark from the inliner for these cases 
>> (I assume we will already emit a -Rpass-missed=inline for the noinline 
>> attribute case, just not highlighting that it is hot and would have been 
>> inlined for performance reasons otherwise)? I suppose one main reason for 
>> adding a warning is that the missed inline remarks can be really noisy and 
>> not really useful to the user vs a compiler optimization engineer doing 
>> inliner/compiler tuning, and therefore a warning would make it easier to 
>> turn on more widely as user feedback that can/should be addressed in user 
>> code.
>
> Yeah, I was thinking we could emit a new remark type for this to 
> differentiate, but it seems simpler and more user friendly to emit some 
> clear diagnostic directly.
>
> I think we’re starting to accumulate a few of these diagnostics now that are 
> trying to diagnose potential performance deficiencies based on profiling 
> information. Originally we had prototyped a tool for misexpect based on 
> libtooling that ran over the build based on the compile commands DB and 
> reported everything it found.  I wonder if reviving that would be useful in 
> these cases when you want to look for performance issues like this, 
> misexpect, and other cases? Making ORE diagnostic output queryable through a 
> tool may also be a good option, but I'm not too familiar with what already 
> exists in that area.

Currently a new ORE (`-pass-remarks=misnoinline`) is getting generated, which 
misexpect also does. Agreed, a warning is more familiar and friendlier for 
users so I lean towards that approach. For additional tooling, I think the 
first step will be to trial this on more real programs to see what cases are 
interesting. @iamarchit123 just finished his internship with us so I'll be 
evaluating these changes on HHVM to see if they can swing the performance 
needle.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D132186/new/

https://reviews.llvm.org/D132186



[PATCH] D124563: Drop '* text=auto' from .gitattributes and normalize

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D124563#3478978 , @aaronpuchert 
wrote:

> In D124563#3478968 , @modimo wrote:
>
>> I used `arc patch` and also saw the same thing.
>
> The patch does actually change the files to LF endings. So just applying the 
> patch with non-Git tools will make LF endings, but Git will apply the LF -> 
> CRLF transformation when it checks out itself. Git doesn't show the file as 
> modified because after cleaning the file (i.e. applying CRLF -> LF) it's the 
> same as in the index.

To confirm in main:

  ~/llvm-project2# git ls-files --eol clang-tools-extra/test/clang-apply-replacements/Inputs/crlf/crlf.cpp
  i/lf    w/crlf  attr/text eol=crlf      clang-tools-extra/test/clang-apply-replacements/Inputs/crlf/crlf.cpp

`i/lf` indicates the file is stored as LF in the index, and `w/crlf` that it is 
transformed to CRLF in the working directory.
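
For anyone scripting this check, here is a small Python sketch that splits one line of `git ls-files --eol` output into its fields. It assumes the layout described in the Git documentation (index info, working-tree info, attributes, then a tab and the path); the function names are hypothetical.

```python
def parse_eol_line(line: str) -> dict:
    """Parse one line of `git ls-files --eol` output into its fields."""
    # The path is separated from the eol information by a tab.
    info, _, path = line.rstrip("\n").partition("\t")
    fields = info.split()
    index = fields[0].removeprefix("i/")
    worktree = fields[1].removeprefix("w/")
    # Attributes may contain spaces, e.g. "text eol=crlf".
    attrs = " ".join(fields[2:]).removeprefix("attr/")
    return {"index": index, "worktree": worktree, "attr": attrs, "path": path}


def worktree_differs_from_index(entry: dict) -> bool:
    """True when the working tree uses a different line ending than the index."""
    return entry["index"] != entry["worktree"]


if __name__ == "__main__":
    line = "i/lf    w/crlf  attr/text eol=crlf      \tInputs/crlf/crlf.cpp"
    entry = parse_eol_line(line)
    print(entry)
    print(worktree_differs_from_index(entry))
```

For a file with `eol=crlf`, a differing index and working tree is the expected state; identical `lf`/`lf` values, as seen after `arc patch`, indicate the conversion was not applied on checkout.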

After running `arc patch` though:

  ~/llvm-project# llvm-arc patch D124563
  ~/llvm-project# git ls-files --eol clang-tools-extra/test/clang-apply-replacements/Inputs/crlf/crlf.cpp
  i/lf    w/lf    attr/text eol=crlf      clang-tools-extra/test/clang-apply-replacements/Inputs/crlf/crlf.cpp

So this confirms it was caused by how `arc patch` applies the diff. EOL is... tricky.

> Sorry for all the noise, I was just annoyed about this empty `test` directory 
> and thought we just need to move that file... well, it was a bit of an 
> adventure. Thanks for helping out here.

It happens, properly fixing the original diff to make it actually do something 
was definitely the right choice. If anything Git should warn loudly that the 
index needs to be refreshed if `.gitattributes` is modified or added.

Happy to help, I learned quite a lot about git internals digging into this :)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124563/new/

https://reviews.llvm.org/D124563



[PATCH] D124563: Drop '* text=auto' from .gitattributes and normalize

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D124563#3478951 , @aaronpuchert 
wrote:

> In D124563#3478937 , @modimo wrote:
>
>> Checking locally I'm seeing LF as the line ending in crlf.cpp in the working 
>> directory. Can you double check that everything matches up?
>
> I had this too, but checking out again seems to have fixed it. Maybe you used 
> `arc patch` like I did? I presume that applies just textually and doesn't 
> know about `.gitattributes`.
>
> And sorry for landing already, there seems to have been a race...

Ah that would explain it, I used `arc patch` and also saw the same thing. It 
was fixed after I switched off `arcpatch-D124563` and back. Anyways it's all 
good on main which is what matters.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124563/new/

https://reviews.llvm.org/D124563



[PATCH] D124563: Drop '* text=auto' from .gitattributes and normalize

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D124563#3478937 , @modimo wrote:

> Checking locally I'm seeing LF as the line ending in crlf.cpp in the working 
> directory. Can you double check that everything matches up?

Ah it was some strange setup on my end. Confirmed patch and commit in main are 
good with CRLF. Apologies for the noise.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124563/new/

https://reviews.llvm.org/D124563



[PATCH] D124563: Drop '* text=auto' from .gitattributes and normalize

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo requested changes to this revision.
modimo added a comment.
This revision now requires changes to proceed.

Checking locally I'm seeing LF as the line ending in crlf.cpp in the working 
directory. Can you double check that everything matches up?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124563/new/

https://reviews.llvm.org/D124563



[PATCH] D124563: Drop '* text=auto' from .gitattributes and normalize

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D124563#3478915 , @aaronpuchert 
wrote:

> Drop `* text=auto`, so that we renormalize only the files that need it.

Makes sense to me, thanks for putting it up. If you want to commandeer I can 
accept the change or you can accept your own revision?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124563/new/

https://reviews.llvm.org/D124563



[PATCH] D124563: Drop '* text=auto' from .gitattributes and normalize

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D124563#3478781 , @aaronpuchert 
wrote:

> In D124563#3478653 , @modimo wrote:
>
>> I think the way to go is to revert ac5f7be6a868 then land everything as a 
>> single stack to prevent this issue.
>
> Doesn't change anything about existing commits though. There is no way to fix 
> that now I'm afraid. We can only fix it for future commits.

Yeah, unfortunately. On Linux, since it has the timestamp check, landing this 
in one piece will be clean, but on Windows the existing commits will continue 
to have this issue.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124563/new/

https://reviews.llvm.org/D124563



[PATCH] D124563: Renormalize line endings after ac5f7be6a8688955a282becf00eebc542238a86b

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D124563#3478627 , @smeenai wrote:

> If I check out this commit and then check out the previous commit, 
> `clang-tools-extra/test/clang-apply-replacements/Inputs/crlf/crlf.cpp` and 
> `clang-tools-extra/test/clang-apply-replacements/Inputs/crlf/crlf.cpp.expected`
>  become modified in my working directory; their line endings are changed from 
> CRLF to LF. That seems undesirable.

Good catch, I see it as well. The `.gitattributes` change needs to be atomic 
with the renormalization change. If I checkout from main after the 
`.gitattributes` change (`git checkout ac5f7be6a868`) I'll see the files as 
dirty. However, if I checkout the revision before (`git checkout 
ac5f7be6a868~1`) this goes away. I think the way to go is to revert 
ac5f7be6a868 then land everything as a single stack to prevent this issue.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124563/new/

https://reviews.llvm.org/D124563



[PATCH] D124563: Renormalize line endings after ac5f7be6a8688955a282becf00eebc542238a86b

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D124563#3478561 , @smeenai wrote:

> The following files have their line endings (when checked out on disk) 
> changed from CRLF to LF by this patch. Seems harmless, but I just wanted to 
> confirm that it was expected:
>
>   clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include.h
>   clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include2.h
>   clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/system/sys/types.h
>   clang-tools-extra/test/clang-tidy/checkers/readability-else-after-return-if-constexpr.cpp
>   clang-tools-extra/test/modularize/Inputs/CompileError/module.modulemap
>   clang-tools-extra/test/modularize/Inputs/MissingHeader/module.modulemap
>   clang-tools-extra/test/pp-trace/Inputs/module.map
>
> (The files which should be CRLF according to `.gitattributes` remained CRLF.)

Good catch. According to the Git documentation 
(https://git-scm.com/docs/gitattributes#_text), with `* text=auto` applied the 
line endings are stored internally as LF and converted according to the system 
settings on checkout. On my Linux box I see LF endings, but checking this out 
on Windows I see CRLF endings, so this behavior is correct. This diff is 
displaying index changes, which do move from CRLF->LF. I didn't see any test 
failures on Linux, and these files will not have changed on Windows, so this 
should be good.
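
The normalization described here can be modelled with a short Python sketch. This is a simplification under stated assumptions: real Git also does binary detection and consults settings such as `core.autocrlf`, and the `clean`, `smudge`, and `is_modified` names are only borrowed loosely from Git's filter terminology.

```python
def clean(data: bytes) -> bytes:
    """Working tree -> index: normalize CRLF line endings to LF."""
    return data.replace(b"\r\n", b"\n")


def smudge(data: bytes, eol: str) -> bytes:
    """Index -> working tree: emit the requested line ending on checkout."""
    lf = clean(data)
    return lf.replace(b"\n", b"\r\n") if eol == "crlf" else lf


def is_modified(worktree: bytes, index: bytes) -> bool:
    """A file counts as modified only if its cleaned form differs from the index."""
    return clean(worktree) != index


if __name__ == "__main__":
    index_blob = b"// module.map\n\nmodule Level1A {\n"
    on_windows = smudge(index_blob, "crlf")
    on_linux = smudge(index_blob, "lf")
    print(on_windows)
    print(on_linux)
    # Both checkouts clean back to the same index content.
    print(is_modified(on_windows, index_blob), is_modified(on_linux, index_blob))
```

Under this model the index always holds LF, so the diff shows a CRLF->LF change even though a Windows checkout still ends up with CRLF on disk.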


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D124563/new/

https://reviews.llvm.org/D124563



[PATCH] D124563: renormalize

2022-04-27 Thread Di Mo via Phabricator via cfe-commits
modimo created this revision.
Herald added subscribers: hoy, wenlei.
Herald added a project: All.
modimo requested review of this revision.
Herald added a project: clang-tools-extra.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D124563

Files:
  clang-tools-extra/test/clang-apply-replacements/Inputs/crlf/crlf.cpp
  clang-tools-extra/test/clang-apply-replacements/Inputs/crlf/crlf.cpp.expected
  clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include.h
  clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include2.h
  clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/system/sys/types.h
  clang-tools-extra/test/clang-tidy/checkers/readability-else-after-return-if-constexpr.cpp
  clang-tools-extra/test/modularize/Inputs/CompileError/module.modulemap
  clang-tools-extra/test/modularize/Inputs/MissingHeader/module.modulemap
  clang-tools-extra/test/pp-trace/Inputs/module.map

Index: clang-tools-extra/test/pp-trace/Inputs/module.map
===
--- clang-tools-extra/test/pp-trace/Inputs/module.map
+++ clang-tools-extra/test/pp-trace/Inputs/module.map
@@ -1,18 +1,18 @@
-// module.map
-
-module Level1A {
-  header "Level1A.h"
-  export *
-}
-module Level1B {
-  header "Level1B.h"
-  export *
-  module Level2B {
-header "Level2B.h"
-export *
-  }
-}
-module Level2A {
-  header "Level2A.h"
-  export *
-}
+// module.map
+
+module Level1A {
+  header "Level1A.h"
+  export *
+}
+module Level1B {
+  header "Level1B.h"
+  export *
+  module Level2B {
+header "Level2B.h"
+export *
+  }
+}
+module Level2A {
+  header "Level2A.h"
+  export *
+}
Index: clang-tools-extra/test/modularize/Inputs/MissingHeader/module.modulemap
===
--- clang-tools-extra/test/modularize/Inputs/MissingHeader/module.modulemap
+++ clang-tools-extra/test/modularize/Inputs/MissingHeader/module.modulemap
@@ -1,10 +1,10 @@
-// module.map
-
-module Level1A {
-  header "Level1A.h"
-  export *
-}
-module Missing {
-  header "Missing.h"
-  export *
-}
+// module.map
+
+module Level1A {
+  header "Level1A.h"
+  export *
+}
+module Missing {
+  header "Missing.h"
+  export *
+}
Index: clang-tools-extra/test/modularize/Inputs/CompileError/module.modulemap
===
--- clang-tools-extra/test/modularize/Inputs/CompileError/module.modulemap
+++ clang-tools-extra/test/modularize/Inputs/CompileError/module.modulemap
@@ -1,10 +1,10 @@
-// module.map
-
-module Level1A {
-  header "Level1A.h"
-  export *
-}
-module HasError {
-  header "HasError.h"
-  export *
-}
+// module.map
+
+module Level1A {
+  header "Level1A.h"
+  export *
+}
+module HasError {
+  header "HasError.h"
+  export *
+}
Index: clang-tools-extra/test/clang-tidy/checkers/readability-else-after-return-if-constexpr.cpp
===
--- clang-tools-extra/test/clang-tidy/checkers/readability-else-after-return-if-constexpr.cpp
+++ clang-tools-extra/test/clang-tidy/checkers/readability-else-after-return-if-constexpr.cpp
@@ -1,22 +1,22 @@
-// RUN: %check_clang_tidy %s readability-else-after-return %t -- -- -std=c++17
-
-// Constexpr if is an exception to the rule, we cannot remove the else.
-void f() {
-  if (sizeof(int) > 4)
-return;
-  else
-return;
-  // CHECK-MESSAGES: [[@LINE-2]]:3: warning: do not use 'else' after 'return'
-
-  if constexpr (sizeof(int) > 4)
-return;
-  else
-return;
-
-  if constexpr (sizeof(int) > 4)
-return;
-  else if constexpr (sizeof(long) > 4)
-return;
-  else
-return;
-}
+// RUN: %check_clang_tidy %s readability-else-after-return %t -- -- -std=c++17
+
+// Constexpr if is an exception to the rule, we cannot remove the else.
+void f() {
+  if (sizeof(int) > 4)
+return;
+  else
+return;
+  // CHECK-MESSAGES: [[@LINE-2]]:3: warning: do not use 'else' after 'return'
+
+  if constexpr (sizeof(int) > 4)
+return;
+  else
+return;
+
+  if constexpr (sizeof(int) > 4)
+return;
+  else if constexpr (sizeof(long) > 4)
+return;
+  else
+return;
+}
Index: clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/system/sys/types.h
===
--- clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/system/sys/types.h
+++ clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/system/sys/types.h
@@ -1 +1 @@
-// This file is intentionally empty.
+// This file is intentionally empty.
Index: clang-tools-extra/test/clang-tidy/checkers/Inputs/readability-duplicate-include/readability-duplicate-include2.h
===
--- 

[PATCH] D113523: Add toggling for -fnew-infallible/-fno-new-infallible

2021-11-30 Thread Di Mo via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG47f230ba2c8f: Add toggling for 
-fnew-infallible/-fno-new-infallible (authored by modimo).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113523/new/

https://reviews.llvm.org/D113523

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCXX/new-infallible.cpp
  clang/test/Driver/new-infallible.cpp


Index: clang/test/Driver/new-infallible.cpp
===
--- /dev/null
+++ clang/test/Driver/new-infallible.cpp
@@ -0,0 +1,5 @@
+// RUN: %clang -### -S -fno-new-infallible -fnew-infallible %s 2>&1 | FileCheck --check-prefix=NEW-INFALLIBLE %s
+// NEW-INFALLIBLE: "-fnew-infallible"
+
+// RUN: %clang -### -S -fnew-infallible -fno-new-infallible %s 2>&1 | FileCheck --check-prefix=NO-NEW-INFALLIBLE %s
+// NO-NEW-INFALLIBLE-NOT: "-fnew-infallible"
\ No newline at end of file
Index: clang/test/CodeGenCXX/new-infallible.cpp
===
--- clang/test/CodeGenCXX/new-infallible.cpp
+++ clang/test/CodeGenCXX/new-infallible.cpp
@@ -1,7 +1,16 @@
 // RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fno-new-infallible -fnew-infallible -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fno-new-infallible -o - %s | FileCheck %s --check-prefix=NO-NEW-INFALLIBLE
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -fno-new-infallible -o - %s | FileCheck %s --check-prefix=NO-NEW-INFALLIBLE
 
 // CHECK: call noalias nonnull i8* @_Znwm(i64 4)
 
 // CHECK: ; Function Attrs: nobuiltin nounwind allocsize(0)
 // CHECK-NEXT: declare nonnull i8* @_Znwm(i64)
+
+// NO-NEW-INFALLIBLE: call noalias nonnull i8* @_Znwm(i64 4)
+
+// NO-NEW-INFALLIBLE: ; Function Attrs: nobuiltin allocsize(0)
+// NO-NEW-INFALLIBLE-NEXT: declare nonnull i8* @_Znwm(i64)
+
 int *new_infallible = new int;
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5821,9 +5821,12 @@
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_inlines_hidden_static_local_var,
                            options::OPT_fno_visibility_inlines_hidden_static_local_var);
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_global_new_delete_hidden);
-  Args.AddLastArg(CmdArgs, options::OPT_fnew_infallible);
   Args.AddLastArg(CmdArgs, options::OPT_ftlsmodel_EQ);
 
+  if (Args.hasFlag(options::OPT_fnew_infallible,
+                   options::OPT_fno_new_infallible, false))
+    CmdArgs.push_back("-fnew-infallible");
+
   if (Args.hasFlag(options::OPT_fno_operator_names,
                    options::OPT_foperator_names, false))
     CmdArgs.push_back("-fno-operator-names");
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2789,10 +2789,11 @@
 def fvisibility_global_new_delete_hidden : Flag<["-"], "fvisibility-global-new-delete-hidden">, Group,
   HelpText<"Give global C++ operator new and delete declarations hidden visibility">, Flags<[CC1Option]>,
   MarshallingInfoFlag>;
-def fnew_infallible : Flag<["-"], "fnew-infallible">, Group,
-  HelpText<"Treats throwing global C++ operator new as always returning valid memory "
-  "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">,
-  Flags<[CC1Option]>, MarshallingInfoFlag>;
+defm new_infallible : BoolFOption<"new-infallible",
+  LangOpts<"NewInfallible">, DefaultFalse,
+  PosFlag, NegFlag,
+  BothFlags<[CC1Option], " treating throwing global C++ operator new as always returning valid memory "
+  "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">>;
 defm whole_program_vtables : BoolFOption<"whole-program-vtables",
   CodeGenOpts<"WholeProgramVTables">, DefaultFalse,
   PosFlag,
Index: clang/docs/ClangCommandLineReference.rst
===
--- clang/docs/ClangCommandLineReference.rst
+++ clang/docs/ClangCommandLineReference.rst
@@ -1941,9 +1941,9 @@
 
 Specifies the largest alignment guaranteed by '::operator new(size\_t)'
 
-.. option:: -fnew-infallible
+.. option:: -fnew-infallible, -fno-new-infallible
 
-Treats throwing global C++ operator new as always returning valid memory (annotates with \_\_attribute\_\_((returns\_nonnull)) and throw()). This is detectable in source.
+Enable treating throwing global C++ operator new as always returning valid memory (annotates with 

[PATCH] D114130: [Clang] Add option to disable -mconstructor-aliases with -mno-constructor-aliases

2021-11-30 Thread Di Mo via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG9b704d31b54a: [Clang] Add option to disable 
-mconstructor-aliases with -mno-constructor… (authored by modimo).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114130/new/

https://reviews.llvm.org/D114130

Files:
  clang/include/clang/Driver/Options.td
  clang/test/CodeGenCXX/constructor-alias.cpp


Index: clang/test/CodeGenCXX/constructor-alias.cpp
===
--- clang/test/CodeGenCXX/constructor-alias.cpp
+++ clang/test/CodeGenCXX/constructor-alias.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mconstructor-aliases -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mno-constructor-aliases -mconstructor-aliases -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mconstructor-aliases -mno-constructor-aliases -o - %s | FileCheck %s --check-prefix=NO-ALIAS
 
 // The target attribute code used to get confused with aliases. Make sure
 // we don't crash when an alias is used.
@@ -10,3 +11,4 @@
 }
 
 // CHECK: @_ZN1BC1Ev ={{.*}} unnamed_addr alias void (%struct.B*), void (%struct.B*)* @_ZN1BC2Ev
+// NO-ALIAS-NOT: @_ZN1BC1Ev ={{.*}} unnamed_addr alias void (%struct.B*), void (%struct.B*)* @_ZN1BC2Ev
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -5065,9 +5065,10 @@
 def funwind_tables_EQ : Joined<["-"], "funwind-tables=">,
   HelpText<"Generate unwinding tables for all functions">,
   MarshallingInfoInt>;
-def mconstructor_aliases : Flag<["-"], "mconstructor-aliases">,
-  HelpText<"Emit complete constructors and destructors as aliases when possible">,
-  MarshallingInfoFlag>;
+defm constructor_aliases : BoolOption<"m", "constructor-aliases",
+  CodeGenOpts<"CXXCtorDtorAliases">, DefaultFalse,
+  PosFlag, NegFlag,
+  BothFlags<[CC1Option], " emitting complete constructors and destructors as aliases when possible">>;
 def mlink_bitcode_file : Separate<["-"], "mlink-bitcode-file">,
   HelpText<"Link the given bitcode file before performing optimizations.">;
 def mlink_builtin_bitcode : Separate<["-"], "mlink-builtin-bitcode">,
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits
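
The RUN lines in the test above pass `-mconstructor-aliases` and `-mno-constructor-aliases` in both orders and expect the later flag to decide whether the alias is emitted. A minimal standalone sketch of that last-one-wins rule follows; `resolveBoolFlag` is a hypothetical helper written for illustration and is not Clang's option-parsing API.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; not part of Clang.
// Resolves a boolean option that has a positive and a negative spelling:
// the last matching argument decides, and the default applies when neither
// spelling is present.
bool resolveBoolFlag(const std::vector<std::string> &args,
                     const std::string &pos, const std::string &neg,
                     bool defaultValue) {
  assert(pos != neg && "positive and negative spellings must differ");
  bool value = defaultValue;
  for (const std::string &arg : args) {
    if (arg == pos)
      value = true;   // a later positive flag overrides earlier ones
    else if (arg == neg)
      value = false;  // a later negative flag overrides earlier ones
  }
  return value;
}
```

Called with the negative spelling followed by the positive one, the helper returns true; with the order reversed it returns false, which mirrors the CHECK and NO-ALIAS expectations in the test.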


[PATCH] D114130: [Clang] Add option to disable -mconstructor-aliases with -mno-constructor-aliases

2021-11-18 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 388307.
modimo added a comment.

Condense lines


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114130/new/

https://reviews.llvm.org/D114130

Files:
  clang/include/clang/Driver/Options.td
  clang/test/CodeGenCXX/constructor-alias.cpp


Index: clang/test/CodeGenCXX/constructor-alias.cpp
===
--- clang/test/CodeGenCXX/constructor-alias.cpp
+++ clang/test/CodeGenCXX/constructor-alias.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mconstructor-aliases -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mno-constructor-aliases -mconstructor-aliases -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mconstructor-aliases -mno-constructor-aliases -o - %s | FileCheck %s --check-prefix=NO-ALIAS
 
 // The target attribute code used to get confused with aliases. Make sure
 // we don't crash when an alias is used.
@@ -10,3 +11,4 @@
 }
 
 // CHECK: @_ZN1BC1Ev ={{.*}} unnamed_addr alias void (%struct.B*), void (%struct.B*)* @_ZN1BC2Ev
+// NO-ALIAS-NOT: @_ZN1BC1Ev ={{.*}} unnamed_addr alias void (%struct.B*), void (%struct.B*)* @_ZN1BC2Ev
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -5054,9 +5054,10 @@
 def funwind_tables_EQ : Joined<["-"], "funwind-tables=">,
   HelpText<"Generate unwinding tables for all functions">,
   MarshallingInfoInt>;
-def mconstructor_aliases : Flag<["-"], "mconstructor-aliases">,
-  HelpText<"Emit complete constructors and destructors as aliases when possible">,
-  MarshallingInfoFlag>;
+defm constructor_aliases : BoolOption<"m", "constructor-aliases",
+  CodeGenOpts<"CXXCtorDtorAliases">, DefaultFalse,
+  PosFlag, NegFlag,
+  BothFlags<[CC1Option], " emitting complete constructors and destructors as aliases when possible">>;
 def mlink_bitcode_file : Separate<["-"], "mlink-bitcode-file">,
   HelpText<"Link the given bitcode file before performing optimizations.">;
 def mlink_builtin_bitcode : Separate<["-"], "mlink-builtin-bitcode">,


[PATCH] D114130: [Clang] Add option to disable -mconstructor-aliases with -mno-constructor-aliases

2021-11-17 Thread Di Mo via Phabricator via cfe-commits
modimo created this revision.
Herald added subscribers: jeroen.dobbelaere, hoy, wenlei, lxfind, dang.
modimo requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D114130

Files:
  clang/include/clang/Driver/Options.td
  clang/test/CodeGenCXX/constructor-alias.cpp


Index: clang/test/CodeGenCXX/constructor-alias.cpp
===
--- clang/test/CodeGenCXX/constructor-alias.cpp
+++ clang/test/CodeGenCXX/constructor-alias.cpp
@@ -1,4 +1,5 @@
-// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mconstructor-aliases -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mno-constructor-aliases -mconstructor-aliases -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple mipsel--linux-gnu -mconstructor-aliases -mno-constructor-aliases -o - %s | FileCheck %s --check-prefix=NO-ALIAS
 
 // The target attribute code used to get confused with aliases. Make sure
 // we don't crash when an alias is used.
@@ -10,3 +11,4 @@
 }
 
 // CHECK: @_ZN1BC1Ev ={{.*}} unnamed_addr alias void (%struct.B*), void (%struct.B*)* @_ZN1BC2Ev
+// NO-ALIAS-NOT: @_ZN1BC1Ev ={{.*}} unnamed_addr alias void (%struct.B*), void (%struct.B*)* @_ZN1BC2Ev
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -5054,9 +5054,12 @@
 def funwind_tables_EQ : Joined<["-"], "funwind-tables=">,
   HelpText<"Generate unwinding tables for all functions">,
   MarshallingInfoInt>;
-def mconstructor_aliases : Flag<["-"], "mconstructor-aliases">,
-  HelpText<"Emit complete constructors and destructors as aliases when possible">,
-  MarshallingInfoFlag>;
+defm constructor_aliases : BoolOption<"m", "constructor-aliases",
+  CodeGenOpts<"CXXCtorDtorAliases">,
+  DefaultFalse,
+  PosFlag,
+  NegFlag,
+  BothFlags<[CC1Option], " emitting complete constructors and destructors as aliases when possible">>;
 def mlink_bitcode_file : Separate<["-"], "mlink-bitcode-file">,
   HelpText<"Link the given bitcode file before performing optimizations.">;
 def mlink_builtin_bitcode : Separate<["-"], "mlink-builtin-bitcode">,


[PATCH] D113523: Add toggling for -fnew-infallible/-fno-new-infallible

2021-11-12 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

Gentle ping @bruno


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113523/new/

https://reviews.llvm.org/D113523



[PATCH] D113523: Add toggling for -fnew-infallible/-fno-new-infallible

2021-11-09 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 386059.
modimo added a comment.

Add driver test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113523/new/

https://reviews.llvm.org/D113523

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCXX/new-infallible.cpp
  clang/test/Driver/new-infallible.cpp


Index: clang/test/Driver/new-infallible.cpp
===
--- /dev/null
+++ clang/test/Driver/new-infallible.cpp
@@ -0,0 +1,5 @@
+// RUN: %clang -### -S -fno-new-infallible -fnew-infallible %s 2>&1 | FileCheck --check-prefix=NEW-INFALLIBLE %s
+// NEW-INFALLIBLE: "-fnew-infallible"
+
+// RUN: %clang -### -S -fnew-infallible -fno-new-infallible %s 2>&1 | FileCheck --check-prefix=NO-NEW-INFALLIBLE %s
+// NO-NEW-INFALLIBLE-NOT: "-fnew-infallible"
\ No newline at end of file
Index: clang/test/CodeGenCXX/new-infallible.cpp
===
--- clang/test/CodeGenCXX/new-infallible.cpp
+++ clang/test/CodeGenCXX/new-infallible.cpp
@@ -1,7 +1,16 @@
 // RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fno-new-infallible -fnew-infallible -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fno-new-infallible -o - %s | FileCheck %s --check-prefix=NO-NEW-INFALLIBLE
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -fno-new-infallible -o - %s | FileCheck %s --check-prefix=NO-NEW-INFALLIBLE
 
 // CHECK: call noalias nonnull i8* @_Znwm(i64 4)
 
 // CHECK: ; Function Attrs: nobuiltin nounwind allocsize(0)
 // CHECK-NEXT: declare nonnull i8* @_Znwm(i64)
+
+// NO-NEW-INFALLIBLE: call noalias nonnull i8* @_Znwm(i64 4)
+
+// NO-NEW-INFALLIBLE: ; Function Attrs: nobuiltin allocsize(0)
+// NO-NEW-INFALLIBLE-NEXT: declare nonnull i8* @_Znwm(i64)
+
 int *new_infallible = new int;
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5810,9 +5810,12 @@
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_inlines_hidden_static_local_var,
                            options::OPT_fno_visibility_inlines_hidden_static_local_var);
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_global_new_delete_hidden);
-  Args.AddLastArg(CmdArgs, options::OPT_fnew_infallible);
   Args.AddLastArg(CmdArgs, options::OPT_ftlsmodel_EQ);
 
+  if (Args.hasFlag(options::OPT_fnew_infallible,
+                   options::OPT_fno_new_infallible, false))
+    CmdArgs.push_back("-fnew-infallible");
+
   if (Args.hasFlag(options::OPT_fno_operator_names,
                    options::OPT_foperator_names, false))
     CmdArgs.push_back("-fno-operator-names");
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2786,10 +2786,11 @@
 def fvisibility_global_new_delete_hidden : Flag<["-"], "fvisibility-global-new-delete-hidden">, Group,
   HelpText<"Give global C++ operator new and delete declarations hidden visibility">, Flags<[CC1Option]>,
   MarshallingInfoFlag>;
-def fnew_infallible : Flag<["-"], "fnew-infallible">, Group,
-  HelpText<"Treats throwing global C++ operator new as always returning valid memory "
-  "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">,
-  Flags<[CC1Option]>, MarshallingInfoFlag>;
+defm new_infallible : BoolFOption<"new-infallible",
+  LangOpts<"NewInfallible">, DefaultFalse,
+  PosFlag, NegFlag,
+  BothFlags<[CC1Option], " treating throwing global C++ operator new as always returning valid memory "
+  "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">>;
 defm whole_program_vtables : BoolFOption<"whole-program-vtables",
   CodeGenOpts<"WholeProgramVTables">, DefaultFalse,
   PosFlag,
Index: clang/docs/ClangCommandLineReference.rst
===
--- clang/docs/ClangCommandLineReference.rst
+++ clang/docs/ClangCommandLineReference.rst
@@ -1941,9 +1941,9 @@
 
 Specifies the largest alignment guaranteed by '::operator new(size\_t)'
 
-.. option:: -fnew-infallible
+.. option:: -fnew-infallible, -fno-new-infallible
 
-Treats throwing global C++ operator new as always returning valid memory (annotates with \_\_attribute\_\_((returns\_nonnull)) and throw()). This is detectable in source.
+Enable treating throwing global C++ operator new as always returning valid memory (annotates with \_\_attribute\_\_((returns\_nonnull)) and throw()). This is detectable in source.
 
 .. option:: -fnext-runtime
 


Index: 
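
The option's help text notes that the change "is detectable in source". One way such a difference could be observed is through the `noexcept` operator, sketched below under the assumption that the `throw()` annotation applied by `-fnew-infallible` is visible to it; that assumption is not verified against this patch. The names `kThrowingNewIsNoexcept` and `kNothrowNewIsNoexcept` are introduced here for illustration only.

```cpp
#include <cstddef>
#include <new>

// noexcept(...) takes an unevaluated operand, so nothing is allocated here.
// In standard C++ the throwing form of global operator new is potentially
// throwing, so this constant is false. Assumption: when built with
// -fnew-infallible it may become true.
constexpr bool kThrowingNewIsNoexcept =
    noexcept(::operator new(std::size_t{1}));

// The non-throwing overload is declared noexcept regardless of the flag.
constexpr bool kNothrowNewIsNoexcept =
    noexcept(::operator new(std::size_t{1}, std::nothrow));
```

Code that branches on a constant like the first one could behave differently depending on the flag, which is presumably why the help text calls the effect out.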

[PATCH] D113523: Add toggling for -fnew-infallible/-fno-new-infallible

2021-11-09 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 385996.
modimo added a comment.

Remove whitespace change


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D113523/new/

https://reviews.llvm.org/D113523

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCXX/new-infallible.cpp


Index: clang/test/CodeGenCXX/new-infallible.cpp
===
--- clang/test/CodeGenCXX/new-infallible.cpp
+++ clang/test/CodeGenCXX/new-infallible.cpp
@@ -1,7 +1,16 @@
 // RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fno-new-infallible -fnew-infallible -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fno-new-infallible -o - %s | FileCheck %s --check-prefix=NO-NEW-INFALLIBLE
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -fno-new-infallible -o - %s | FileCheck %s --check-prefix=NO-NEW-INFALLIBLE
 
 // CHECK: call noalias nonnull i8* @_Znwm(i64 4)
 
 // CHECK: ; Function Attrs: nobuiltin nounwind allocsize(0)
 // CHECK-NEXT: declare nonnull i8* @_Znwm(i64)
+
+// NO-NEW-INFALLIBLE: call noalias nonnull i8* @_Znwm(i64 4)
+
+// NO-NEW-INFALLIBLE: ; Function Attrs: nobuiltin allocsize(0)
+// NO-NEW-INFALLIBLE-NEXT: declare nonnull i8* @_Znwm(i64)
+
 int *new_infallible = new int;
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5810,9 +5810,12 @@
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_inlines_hidden_static_local_var,
                            options::OPT_fno_visibility_inlines_hidden_static_local_var);
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_global_new_delete_hidden);
-  Args.AddLastArg(CmdArgs, options::OPT_fnew_infallible);
   Args.AddLastArg(CmdArgs, options::OPT_ftlsmodel_EQ);
 
+  if (Args.hasFlag(options::OPT_fnew_infallible,
+                   options::OPT_fno_new_infallible, false))
+    CmdArgs.push_back("-fnew-infallible");
+
   if (Args.hasFlag(options::OPT_fno_operator_names,
                    options::OPT_foperator_names, false))
     CmdArgs.push_back("-fno-operator-names");
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2786,10 +2786,11 @@
 def fvisibility_global_new_delete_hidden : Flag<["-"], "fvisibility-global-new-delete-hidden">, Group,
   HelpText<"Give global C++ operator new and delete declarations hidden visibility">, Flags<[CC1Option]>,
   MarshallingInfoFlag>;
-def fnew_infallible : Flag<["-"], "fnew-infallible">, Group,
-  HelpText<"Treats throwing global C++ operator new as always returning valid memory "
-  "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">,
-  Flags<[CC1Option]>, MarshallingInfoFlag>;
+defm new_infallible : BoolFOption<"new-infallible",
+  LangOpts<"NewInfallible">, DefaultFalse,
+  PosFlag, NegFlag,
+  BothFlags<[CC1Option], " treating throwing global C++ operator new as always returning valid memory "
+  "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">>;
 defm whole_program_vtables : BoolFOption<"whole-program-vtables",
   CodeGenOpts<"WholeProgramVTables">, DefaultFalse,
   PosFlag,
Index: clang/docs/ClangCommandLineReference.rst
===
--- clang/docs/ClangCommandLineReference.rst
+++ clang/docs/ClangCommandLineReference.rst
@@ -1941,9 +1941,9 @@
 
 Specifies the largest alignment guaranteed by '::operator new(size\_t)'
 
-.. option:: -fnew-infallible
+.. option:: -fnew-infallible, -fno-new-infallible
 
-Treats throwing global C++ operator new as always returning valid memory (annotates with \_\_attribute\_\_((returns\_nonnull)) and throw()). This is detectable in source.
+Enable treating throwing global C++ operator new as always returning valid memory (annotates with \_\_attribute\_\_((returns\_nonnull)) and throw()). This is detectable in source.
 
 .. option:: -fnext-runtime
 



[PATCH] D113523: Add toggling for -fnew-infallible/-fno-new-infallible

2021-11-09 Thread Di Mo via Phabricator via cfe-commits
modimo created this revision.
Herald added subscribers: hoy, wenlei, lxfind, dang.
modimo requested review of this revision.
Herald added a reviewer: jdoerfert.
Herald added subscribers: cfe-commits, sstefan1.
Herald added a project: clang.

Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D113523

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/test/CodeGenCXX/new-infallible.cpp


Index: clang/test/CodeGenCXX/new-infallible.cpp
===
--- clang/test/CodeGenCXX/new-infallible.cpp
+++ clang/test/CodeGenCXX/new-infallible.cpp
@@ -1,7 +1,16 @@
 // RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fno-new-infallible -fnew-infallible -o - %s | FileCheck %s
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fno-new-infallible -o - %s | FileCheck %s --check-prefix=NO-NEW-INFALLIBLE
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -fno-new-infallible -o - %s | FileCheck %s --check-prefix=NO-NEW-INFALLIBLE
 
 // CHECK: call noalias nonnull i8* @_Znwm(i64 4)
 
 // CHECK: ; Function Attrs: nobuiltin nounwind allocsize(0)
 // CHECK-NEXT: declare nonnull i8* @_Znwm(i64)
+
+// NO-NEW-INFALLIBLE: call noalias nonnull i8* @_Znwm(i64 4)
+
+// NO-NEW-INFALLIBLE: ; Function Attrs: nobuiltin allocsize(0)
+// NO-NEW-INFALLIBLE-NEXT: declare nonnull i8* @_Znwm(i64)
+
 int *new_infallible = new int;
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5810,9 +5810,12 @@
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_inlines_hidden_static_local_var,
                            options::OPT_fno_visibility_inlines_hidden_static_local_var);
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_global_new_delete_hidden);
-  Args.AddLastArg(CmdArgs, options::OPT_fnew_infallible);
   Args.AddLastArg(CmdArgs, options::OPT_ftlsmodel_EQ);
 
+  if (Args.hasFlag(options::OPT_fnew_infallible,
+                   options::OPT_fno_new_infallible, false))
+    CmdArgs.push_back("-fnew-infallible");
+
   if (Args.hasFlag(options::OPT_fno_operator_names,
                    options::OPT_foperator_names, false))
     CmdArgs.push_back("-fno-operator-names");
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2433,13 +2433,13 @@
   HelpText<"Enable debugging in the OpenMP offloading device RTL">;
 def fno_openmp_target_debug : Flag<["-"], "fno-openmp-target-debug">, Group, Flags<[NoArgumentUnused]>;
 def fopenmp_target_debug_EQ : Joined<["-"], "fopenmp-target-debug=">, Group, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
-def fopenmp_assume_teams_oversubscription : Flag<["-"], "fopenmp-assume-teams-oversubscription">, 
+def fopenmp_assume_teams_oversubscription : Flag<["-"], "fopenmp-assume-teams-oversubscription">,
   Group, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
-def fopenmp_assume_threads_oversubscription : Flag<["-"], "fopenmp-assume-threads-oversubscription">, 
+def fopenmp_assume_threads_oversubscription : Flag<["-"], "fopenmp-assume-threads-oversubscription">,
   Group, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
-def fno_openmp_assume_teams_oversubscription : Flag<["-"], "fno-openmp-assume-teams-oversubscription">, 
+def fno_openmp_assume_teams_oversubscription : Flag<["-"], "fno-openmp-assume-teams-oversubscription">,
   Group, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
-def fno_openmp_assume_threads_oversubscription : Flag<["-"], "fno-openmp-assume-threads-oversubscription">, 
+def fno_openmp_assume_threads_oversubscription : Flag<["-"], "fno-openmp-assume-threads-oversubscription">,
   Group, Flags<[CC1Option, NoArgumentUnused, HelpHidden]>;
 defm openmp_target_new_runtime: BoolFOption<"openmp-target-new-runtime",
   LangOpts<"OpenMPTargetNewRuntime">, DefaultFalse,
@@ -2786,10 +2786,11 @@
 def fvisibility_global_new_delete_hidden : Flag<["-"], "fvisibility-global-new-delete-hidden">, Group,
   HelpText<"Give global C++ operator new and delete declarations hidden visibility">, Flags<[CC1Option]>,
   MarshallingInfoFlag>;
-def fnew_infallible : Flag<["-"], "fnew-infallible">, Group,
-  HelpText<"Treats throwing global C++ operator new as always returning valid memory "
-  "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">,
-  Flags<[CC1Option]>, MarshallingInfoFlag>;
+defm new_infallible : BoolFOption<"new-infallible",
+  LangOpts<"NewInfallible">, DefaultFalse,
+  PosFlag, NegFlag,
+  BothFlags<[CC1Option], " treating throwing global C++ operator new as 

[PATCH] D36850: [ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation

2021-09-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

Thanks for the thorough review @tejohnson! I'll do additional validation on FB 
code that uses exceptions and, if that all looks good, I'll send up a change to 
turn this on by default, along with the findings.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D36850: [ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation

2021-09-27 Thread Di Mo via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rG20faf789199d: [ThinLTO] Add noRecurse and noUnwind thinlink 
function attribute propagation (authored by modimo).

Changed prior to commit:
  https://reviews.llvm.org/D36850?vs=374934&id=375367#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/include/llvm/Transforms/IPO/FunctionImport.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/lib/Transforms/IPO/FunctionImport.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/Bitcode/thinlto-function-summary-refgraph.ll
  llvm/test/Bitcode/thinlto-type-vcalls.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-maythrow.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-unknown.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-weak.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll

Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void @f(i32{{.*}}) [[ATTR:#[0-9]+]] !prof ![[PROF1:[0-9]+]]
 ; CHECK: define available_externally void 

[PATCH] D36850: [ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation

2021-09-24 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: clang/test/CodeGen/thinlto-funcattr-prop.ll:17
 
-; CHECK: ^2 = gv: (guid: 13959900437860518209, summaries: (function: (module: 
^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, 
live: 1, dsoLocal: 1, canAutoHide: 0), insts: 2, calls: ((callee: ^3)
-; CHECK: ^3 = gv: (guid: 14959766916849974397, summaries: (function: (module: 
^1, flags: (linkage: external, visibility: default, notEligibleToImport: 0, 
live: 1, dsoLocal: 0, canAutoHide: 0), insts: 1, funcFlags: (readNone: 0, 
readOnly: 0, noRecurse: 1, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, 
noUnwind: 1, mayThrow: 0, hasUnknownCall: 0
+;; Summary for call_extern. Note that llvm-lto2 writes out the index before 
+; CHECK-INDEX: ^2 = gv: (guid: 13959900437860518209, summaries: (function: 
(module: ^0, flags: (linkage: external, visibility: default, 
notEligibleToImport: 0, live: 1, dsoLocal: 1, canAutoHide: 0), insts: 2, calls: 
((callee: ^3)

tejohnson wrote:
> Incomplete sentence, seems to be missing the rest of the explanation about 
> when it is written.
Nice catch, sentence is now complete.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850



[PATCH] D36850: [ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation

2021-09-24 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 374934.
modimo marked an inline comment as done.
modimo added a comment.

Complete explanation in thinlto-funcattr-prop.ll, also fix up diff to contain 
all changes.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/include/llvm/Transforms/IPO/FunctionImport.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/lib/Transforms/IPO/FunctionImport.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/Bitcode/thinlto-function-summary-refgraph.ll
  llvm/test/Bitcode/thinlto-type-vcalls.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-maythrow.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-unknown.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-weak.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll

Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void @f(i32{{.*}}) [[ATTR:#[0-9]+]] !prof ![[PROF1:[0-9]+]]
 ; CHECK: define available_externally void @g() !prof ![[PROF2]]
 ; CHECK-DAG: ![[PROF1]] = !{!"synthetic_function_entry_count", i64 10}
 ; CHECK-DAG: ![[PROF2]] = 

[PATCH] D36850: [ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation

2021-09-23 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: clang/test/CodeGen/thinlto-funcattr-prop.ll:16
+
+; CHECK: ^2 = gv: (guid: 13959900437860518209, summaries: (function: (module: 
^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, 
live: 1, dsoLocal: 1, canAutoHide: 0), insts: 2, calls: ((callee: ^3)
+; CHECK: ^3 = gv: (guid: 14959766916849974397, summaries: (function: (module: 
^1, flags: (linkage: external, visibility: default, notEligibleToImport: 0, 
live: 1, dsoLocal: 0, canAutoHide: 0), insts: 1, funcFlags: (readNone: 0, 
readOnly: 0, noRecurse: 1, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, 
noUnwind: 1, mayThrow: 0, hasUnknownCall: 0

tejohnson wrote:
> I believe this corresponds to call_extern - why aren't we getting noRecurse 
> and noUnwind propagated here?
> 
> (also, suggest adding a comment above each of these summaries as to what 
> function name they correspond to)
Tracing through llvm-lto2, the index is written out by `CombinedIndexHook` 
before the rest of the thin link, including attribute propagation, takes place. 
The attributes do end up being propagated successfully; I'll add a check 
against `*1.promote.bc`, which shows the result of the propagation.

Good idea, added the function name that corresponds to each summary.



Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:480
+if (!CalleeSummary->fflags().NoUnwind ||
+CalleeSummary->fflags().MayThrow)
+  InferredFlags.NoUnwind = false;

tejohnson wrote:
> Please make sure one of the may throw propagation tests would fail without 
> this fix (i.e. when it was checking the caller's maythrow setting).
Thinking more on why this didn't manifest as strange behavior: because the call 
graph is traversed in bottom-up (BU) order, any callee that has mayThrow will 
already have had its inferred noUnwind set to false above. Checking again in the 
caller is redundant because the callee's contribution is determined solely by 
its noUnwind value. I think removing this check completely makes sense.

I can think of a scenario where there are mayThrow instructions but the 
function is still marked noUnwind (a noexcept function with a throw in it), but 
in that case it is safe to propagate upwards because no exception can escape 
this callee, so checking mayThrow would actually be a pessimization. 
I added a case in funcattrs-prop-maythrow.ll to illustrate this.
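
The reasoning above can be sketched in a few lines of standalone C++. This is 
only an illustration, not the code in FunctionAttrs.cpp: `FuncSummary`, `Index` 
and `propagateNoUnwind` are hypothetical names, the call graph is assumed to be 
acyclic (no SCC handling), and the caller is assumed to supply a callees-first 
visiting order.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical, simplified stand-in for a per-function summary.
struct FuncSummary {
  bool noUnwind = false;       // already known not to unwind (e.g. noexcept)
  bool mayThrow = false;       // body contains instructions that may throw
  bool hasUnknownCall = false; // some call is missing from the static call graph
  std::vector<std::string> callees;
};

using Index = std::unordered_map<std::string, FuncSummary>;

// Visit functions callees-first (bottom-up). A caller is inferred noUnwind only
// if it cannot throw itself, has no unknown calls, and every callee is noUnwind.
// The callee's mayThrow bit is deliberately not consulted: its noUnwind value
// already reflects whether an exception can escape it.
void propagateNoUnwind(Index &index,
                       const std::vector<std::string> &bottomUpOrder) {
  for (const std::string &name : bottomUpOrder) {
    assert(index.count(name) && "order must only name summarized functions");
    FuncSummary &caller = index[name];
    bool inferred = !caller.mayThrow && !caller.hasUnknownCall;
    for (const std::string &calleeName : caller.callees) {
      auto it = index.find(calleeName);
      if (it == index.end() || !it->second.noUnwind)
        inferred = false;
    }
    // Inference only ever adds the attribute; an existing noUnwind is kept.
    if (inferred)
      caller.noUnwind = true;
  }
}
```

With a callee that has `mayThrow` and `noUnwind` both set (the noexcept case), 
its caller still ends up noUnwind, whereas the caller of a plain throwing 
function does not.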



Comment at: llvm/lib/Transforms/IPO/FunctionImport.cpp:1110
+const auto &GV = DefinedGlobals.find(F.getGUID());
+if (GV == DefinedGlobals.end())
+  return;

tejohnson wrote:
> Can this be merged with updateLinkage so we only do the DefinedGlobals lookup 
> once per symbol?
Sure, merged.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850



[PATCH] D36850: [ThinLTO] Add noRecurse and noUnwind thinlink function attribute propagation

2021-09-23 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 374723.
modimo marked 3 inline comments as done.
modimo added a comment.

Address follow-ups


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

Files:
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/Transforms/IPO/FunctionImport.h
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/lib/Transforms/IPO/FunctionImport.cpp
  llvm/test/ThinLTO/X86/funcattrs-prop-maythrow.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll

Index: llvm/test/ThinLTO/X86/funcattrs-prop.ll
===
--- llvm/test/ThinLTO/X86/funcattrs-prop.ll
+++ llvm/test/ThinLTO/X86/funcattrs-prop.ll
@@ -3,8 +3,8 @@
 ; RUN: opt -module-summary %t/b.ll -o %t/b.bc
 ; RUN: opt -module-summary %t/c.ll -o %t/c.bc
 
-;; ThinLTO Function attribute propagation uses the prevailing symbol to propagate attributes to its callers. Interposable (linkonce and weak) linkages are fair game given we know the prevailing copy 
-;; will be used in the final binary.
+;; ThinLTO Function attribute propagation uses the prevailing symbol to propagate attributes to its callers. 
+;; Interposable (linkonce and weak) linkages are fair game given we know the prevailing copy will be used in the final binary.
 ; RUN: llvm-lto2 run -disable-thinlto-funcattrs=0 %t/a.bc %t/b.bc %t/c.bc -o %t1 -save-temps \
 ; RUN:   -r=%t/a.bc,call_extern,plx -r=%t/a.bc,call_linkonceodr,plx -r=%t/a.bc,call_weakodr,plx -r=%t/a.bc,call_linkonce,plx -r=%t/a.bc,call_weak,plx -r=%t/a.bc,call_linkonce_may_unwind,plx -r=%t/a.bc,call_weak_may_unwind,plx \
 ; RUN:   -r=%t/a.bc,extern, -r=%t/a.bc,linkonceodr, -r=%t/a.bc,weakodr, -r=%t/a.bc,linkonce, -r=%t/a.bc,weak, -r=%t/a.bc,linkonce_may_unwind, -r=%t/a.bc,weak_may_unwind, \
Index: llvm/test/ThinLTO/X86/funcattrs-prop-maythrow.ll
===
--- llvm/test/ThinLTO/X86/funcattrs-prop-maythrow.ll
+++ llvm/test/ThinLTO/X86/funcattrs-prop-maythrow.ll
@@ -2,9 +2,9 @@
 ; RUN: split-file %s %t
 ; RUN: opt -thinlto-bc %t/main.ll -thin-link-bitcode-file=%t1.thinlink.bc -o %t1.bc
 ; RUN: opt -thinlto-bc %t/callees.ll -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
-; RUN: llvm-lto2 run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -o %t.o -r %t1.bc,caller,px -r %t1.bc,caller1,px -r %t1.bc,caller2,px \
-; RUN:   -r %t1.bc,cleanupret,l -r %t1.bc,catchret,l -r %t1.bc,resume,l \
-; RUN:   -r %t2.bc,cleanupret,px -r %t2.bc,catchret,px -r %t2.bc,resume,px -r %t2.bc,nonThrowing,px -r %t2.bc,__gxx_personality_v0,px -save-temps
+; RUN: llvm-lto2 run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -o %t.o -r %t1.bc,caller,px -r %t1.bc,caller1,px -r %t1.bc,caller2,px -r %t1.bc,caller_nounwind,px  \
+; RUN:   -r %t1.bc,cleanupret,l -r %t1.bc,catchret,l -r %t1.bc,resume,l -r %t1.bc,cleanupret_nounwind,l \
+; RUN:   -r %t2.bc,cleanupret,px -r %t2.bc,catchret,px -r %t2.bc,resume,px -r %t2.bc,cleanupret_nounwind,px -r %t2.bc,nonThrowing,px -r %t2.bc,__gxx_personality_v0,px -save-temps
 ; RUN: llvm-dis -o - %t2.bc | FileCheck %s --check-prefix=SUMMARY
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
@@ -16,26 +16,37 @@
 declare void @catchret()
 declare void @resume()
 
+; Functions can have mayThrow instructions but also be marked noUnwind
+; if they have terminate semantics (e.g. noexcept). In such cases
+; propagation trusts the original noUnwind value in the function summary
+declare void @cleanupret_nounwind()
 
-; CHECK: define void @caller() [[ATTR:#[0-9]+]]
+; CHECK: define void @caller() [[ATTR_MAYTHROW:#[0-9]+]]
 define void @caller() {
   call void @cleanupret()
   ret void
 }
 
-; CHECK: define void @caller1() [[ATTR:#[0-9]+]]
+; CHECK: define void @caller1() [[ATTR_MAYTHROW:#[0-9]+]]
 define void @caller1() {
   call void @catchret()
   ret void
 }
 
-; CHECK: define void @caller2() [[ATTR:#[0-9]+]]
+; CHECK: define void @caller2() [[ATTR_MAYTHROW:#[0-9]+]]
 define void @caller2() {
   call void @resume()
   ret void
 }
 
-; CHECK-DAG: attributes [[ATTR]] = { norecurse }
+; CHECK: define void @caller_nounwind() [[ATTR_NOUNWIND:#[0-9]+]]
+define void @caller_nounwind() {
+call void @cleanupret_nounwind()
+ret void
+}
+
+; CHECK-DAG: attributes [[ATTR_NOUNWIND]] = { norecurse nounwind }
+; CHECK-DAG: attributes [[ATTR_MAYTHROW]] = { norecurse }
 
 ; SUMMARY-DAG: = gv: (name: "cleanupret", summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, notEligibleToImport: 0, live: 0, dsoLocal: 0, canAutoHide: 0), insts: 4, funcFlags: (readNone: 0, readOnly: 0, noRecurse: 0, returnDoesNotAlias: 0, noInline: 0, alwaysInline: 0, noUnwind: 0, mayThrow: 1, hasUnknownCall: 0), calls: ((callee: ^{{.*}})), refs: (^{{.*}}
 ; SUMMARY-DAG: = gv: (name: "resume", summaries: (function: (module: ^0, flags: (linkage: external, visibility: default, 

[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-09-22 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: clang/test/CodeGen/thinlto-funcattr-prop.ll:14
+
+; RUN: llvm-dis %t/b.bc -o - | FileCheck %s
+

tejohnson wrote:
> This is checking the summary generated by opt, not the result of the 
> llvm-lto2 run.
Fixed.



Comment at: llvm/include/llvm/IR/ModuleSummaryIndex.h:580
+// If there are calls to unknown targets (e.g. indirect)
+unsigned hasUnknownCall : 1;
+

tejohnson wrote:
> tejohnson wrote:
> > modimo wrote:
> > > tejohnson wrote:
> > > > Now that we have MayThrow, can we avoid a separate hasUnknownCall bool 
> > > > and just conservatively set MayThrow true in that case?
> > > hasUnknownCall is used for norecurse and other future flags as well to 
> > > stop propagation.
> > Ah that makes sense.
> nit, maybe change this to hasIndirectCall which I think is more specific?
My thinking is that the flag is a catch-all for blocking propagation and could 
conceivably be set for other reasons. It also matches the existing usage in 
FunctionAttrs.cpp for local propagation, which also sets this for functions that 
are `OptNone`.



Comment at: llvm/include/llvm/LTO/LTO.h:26
 #include "llvm/Support/thread.h"
+#include "llvm/Transforms/IPO/FunctionAttrs.h"
 #include "llvm/Transforms/IPO/FunctionImport.h"

tejohnson wrote:
> Is this needed?
Yeah, `thinLTOPropagateFunctionAttrs` resides in FunctionAttrs.h and 
`runThinLTO` calls it to propagate.



Comment at: llvm/lib/Analysis/ModuleSummaryAnalysis.cpp:379
   } else {
+HasUnknownCall = true;
 // Skip inline assembly calls.

tejohnson wrote:
> Should this be moved below the following checks for inline asm and direct 
> calls? (Not sure what the direct calls case is given that we handle direct 
> calls to "known functions" above though).
> 
> If it should stay where it is and treat the below cases as unknown, probably 
> should add tests for them.
Any call that isn't emitted to the summary CallGraphEdges is a hole in 
propagation knowledge.

The direct calls case comes from https://reviews.llvm.org/D40056, which handles:
```
; Test calls that aren't handled either as direct or indirect.
call void select (i1 icmp eq (i32* @global, i32* null), void ()* @f, void ()* @g)()
```
Neat that a select can be consolidated into a call, though I wonder if it should 
be allowed, given that it could be canonicalized into a separate IR instruction 
above the call, which might eliminate this edge case.

Tangent aside, since in all these cases the call isn't part of the static 
call graph, `HasUnknownCall` needs to be set for correctness. Tests added in 
funcattrs-prop-unknown.ll (replacing funcattrs-prop-indirect.ll since we're 
handling more than just indirect calls here).
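
As a rough illustration of the rule described above (any call that does not 
become a call-graph edge marks the function as having an unknown call), here is 
a minimal standalone sketch. The `CallKind` categories, `CallSite`, `Summary` 
and `summarize` are hypothetical and much simpler than the real logic in 
ModuleSummaryAnalysis.cpp.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical classification of call sites, for illustration only.
enum class CallKind { DirectKnown, Indirect, InlineAsm, ConstantExprCallee };

struct CallSite {
  CallKind kind;
  std::string callee; // only meaningful for DirectKnown
};

struct Summary {
  std::vector<std::string> edges; // calls recorded in the static call graph
  bool hasUnknownCall = false;    // set when any call is not recorded
};

// Record an edge for calls to known functions; everything else (indirect
// calls, inline asm, a select between functions used as the callee, ...) is a
// hole in propagation knowledge and conservatively sets hasUnknownCall.
Summary summarize(const std::vector<CallSite> &calls) {
  Summary s;
  for (const CallSite &cs : calls) {
    if (cs.kind == CallKind::DirectKnown)
      s.edges.push_back(cs.callee);
    else
      s.hasUnknownCall = true;
  }
  return s;
}
```

Keeping a single catch-all bit, rather than one bit per reason, means 
propagation only has to test one flag before trusting the recorded edges.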



Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:484
+if (!CalleeSummary->fflags().NoUnwind ||
+CallerSummary->fflags().MayThrow)
+  InferredFlags.NoUnwind = false;

tejohnson wrote:
> You've already set InferredFlags.NoUnwind to false above this loop in the 
> case where MayThrow was set on the CallerSummary.
Good catch, this case should be querying CalleeSummary MayThrow.



Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:499
+  ++NumThinLinkNoRecurse;
+  CachedAttributes[V]->setNoRecurse();
+}

tejohnson wrote:
> I think you can remove this and the below setNoUnwind() call on 
> CachedAttributes[V] since presumably this points to one of the function 
> summaries we update in the below loop.
Makes sense, removed. I like keeping the stats/debug tracking around though.



Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:537
+
+  for (Function &F : TheModule) {
+const auto &GV = DefinedGlobals.find(F.getGUID());

tejohnson wrote:
> Consider consolidating this function with thinLTOResolvePrevailingInModule, 
> to reduce the number of walks of the module and lookups into the 
> DefinedGlobals map.
Good idea, merged and renamed `thinLTOResolvePrevailingInModule` to 
`thinLTOFinalizeInModule`
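
A minimal sketch of the idea behind the merge, assuming simplified stand-in 
types: one walk over the module and one map lookup per function, with both the 
linkage update and the attribute update applied from that single lookup. 
`Func`, `ResolvedSummary` and `finalizeInModule` are hypothetical names and do 
not mirror the actual signatures in FunctionImport.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical result of the thin link for one defined global.
struct ResolvedSummary {
  std::string linkage;
  bool noRecurse;
  bool noUnwind;
};

// Hypothetical stand-in for a function in the module being finalized.
struct Func {
  std::uint64_t guid;
  std::string linkage;
  bool noRecurse;
  bool noUnwind;
};

using DefinedGlobalsMap = std::unordered_map<std::uint64_t, ResolvedSummary>;

// Single walk: look each function up once, then apply the resolved linkage
// and any propagated attributes. Returns the number of lookups performed.
int finalizeInModule(std::vector<Func> &module,
                     const DefinedGlobalsMap &definedGlobals) {
  int lookups = 0;
  for (Func &f : module) {
    ++lookups;
    auto gv = definedGlobals.find(f.guid);
    if (gv == definedGlobals.end())
      continue; // not defined in this module's summary; leave untouched
    f.linkage = gv->second.linkage;
    if (gv->second.noRecurse)
      f.noRecurse = true;
    if (gv->second.noUnwind)
      f.noUnwind = true;
  }
  return lookups;
}
```

Two separate passes would perform two lookups per function; here the count 
equals the number of functions in the module.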



Comment at: llvm/test/ThinLTO/X86/funcattrs-prop-indirect.ll:3
+; RUN: opt -thinlto-bc %s -thin-link-bitcode-file=%t1.thinlink.bc -o %t1.bc
+; RUN: llvm-lto2 run -disable-thinlto-funcattrs=0 %t1.bc -o %t.o -r 
%t1.bc,indirect,px -save-temps
+; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s

tejohnson wrote:
> Have a second version that tests with -thinlto-funcattrs-optimistic-indirect? 
> I don't see a test for that option anywhere. Or maybe just remove that option 
> - is it really needed?
Good point, option removed.



Comment at: llvm/test/ThinLTO/X86/funcattrs-prop.ll:8
+;; 1. If external, linkonce_odr or weak_odr. We capture the attributes of the 
prevailing symbol for propagation.
+;; 2. If linkonce or weak, individual callers 

[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-09-22 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 374431.
modimo marked 23 inline comments as done.
modimo added a comment.

Address feedback, rename funcattrs-prop-indirect.ll to funcattrs-prop-unknown.ll


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/include/llvm/Transforms/IPO/FunctionImport.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/lib/Transforms/IPO/FunctionImport.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/Bitcode/thinlto-function-summary-refgraph.ll
  llvm/test/Bitcode/thinlto-type-vcalls.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-maythrow.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-unknown.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-weak.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll

Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void @f(i32{{.*}}) [[ATTR:#[0-9]+]] !prof ![[PROF1:[0-9]+]]
 ; CHECK: define available_externally void @g() !prof ![[PROF2]]
 ; CHECK-DAG: ![[PROF1]] = !{!"synthetic_function_entry_count", i64 10}
 ; CHECK-DAG: ![[PROF2]] = 

[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-09-16 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D36850#3005254, @tejohnson wrote:

> Ok thanks. I need to go through the propagation code and tests again more 
> closely now, but one question/suggestion below in the meantime.

Thanks!




Comment at: llvm/include/llvm/IR/ModuleSummaryIndex.h:580
+// If there are calls to unknown targets (e.g. indirect)
+unsigned hasUnknownCall : 1;
+

tejohnson wrote:
> Now that we have MayThrow, can we avoid a separate hasUnknownCall bool and 
> just conservatively set MayThrow true in that case?
hasUnknownCall is used for norecurse and other future flags as well to stop 
propagation.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850



[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-09-13 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 372398.
modimo added a comment.

Add mayThrow flag


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/Bitcode/thinlto-function-summary-refgraph.ll
  llvm/test/Bitcode/thinlto-type-vcalls.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-indirect.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-maythrow.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-weak.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll

Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void @f(i32{{.*}}) [[ATTR:#[0-9]+]] !prof ![[PROF1:[0-9]+]]
 ; CHECK: define available_externally void @g() !prof ![[PROF2]]
 ; CHECK-DAG: ![[PROF1]] = !{!"synthetic_function_entry_count", i64 10}
 ; CHECK-DAG: ![[PROF2]] = !{!"synthetic_function_entry_count", i64 198}
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 target triple = "x86_64-unknown-linux-gnu"
 target datalayout = 

[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-09-09 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D36850#2990907 , @tejohnson wrote:

> In D36850#2990847 , @modimo wrote:
>
>> In D36850#2990771 , @tejohnson 
>> wrote:
>>
>>> In D36850#2968536 , @modimo wrote:
>>>
>>>> In D36850#2964293 , @tejohnson 
>>>> wrote:
>>>>
>>>>> Good point on indirect calls. Rather than add a bit to the summary, can 
>>>>> the flags just be set conservatively in any function containing an 
>>>>> indirect call when we build the summaries initially? I think that would 
>>>>> get the same effect.
>>>>
>>>> That could have an issue where A calls {indirect, B} and A gets propagated 
>>>> upon from B without knowledge that the indirect call exists. Right now 
>>>> I've got a FunFlags `hasUnknownCall` which marks these as non-propagatable.
>>>
>>> Ah, because there isn't a conservative setting of the flag...which raises a 
>>> larger issue (but maybe I am completely missing something) - how do we 
>>> distinguish between the NoUnwind summary flag not being set because we 
>>> don't know yet (in which case we want the propagation from callees), vs 
>>> because it cannot be NoUnwind (because there is a throw in the function)? 
>>> Do we need to have a second flag indicating that a function contains a 
>>> mayThrow instruction (other than calls, which are being handled by the 
>>> propagation)?
>>
>> Only call instructions can throw (what ultimately performs the throw 
>> operation is an opaque call to __cxa_throw()) which simplifies the problem. 
>> If all calls are known, we only need to examine the callees for accurate 
>> propagation.
>
> What about the other instruction checks done in Instruction::mayThrow i.e. 
> http://llvm-cs.pcc.me.uk/lib/IR/Instruction.cpp#592?

Good point! CleanupReturnInst and CatchSwitchInst are Windows SEH-specific 
representations for asynchronous exceptions, but they should definitely be 
covered for correctness. ResumeInst is the "return" of a landing pad, and AFAIK 
a landing pad is only reachable if an invoke exists, so that case is captured 
by the call graph. I'll add a scan for `Instruction::mayThrow` in summary 
building. Having a mayThrow flag or making NoUnwind a tri-state flag in the 
summary makes sense to me to capture this case.

As a side note on why there's a check for ResumeInst at all: an invoke 
instruction actually never has "mayThrow" set. I haven't delved too deep, but 
this could be changed, since a dead landing pad at attribute inference time can 
lead to pessimization of NoUnwind in cases I've seen (alternatively, CFG opts 
could run before this so that the dead landing pad is pruned away).
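
The tri-state idea can be sketched roughly as follows. This is a simplified standalone model, not LLVM code: the `Inst` and `Unwind` enums and `seedUnwindState` are hypothetical, and the sketch treats every cleanupret and catchswitch as possibly throwing, which is coarser than what `Instruction::mayThrow` checks.

```cpp
#include <cassert>
#include <vector>

// Simplified instruction kinds; a hypothetical model, not LLVM's Instruction.
enum class Inst { Call, Invoke, Resume, CleanupRet, CatchSwitch, Other };

// Tri-state result recorded when the summary is built.
enum class Unwind { Unknown, NoUnwind, MayUnwind };

// Non-call instructions that this sketch treats as possibly throwing. Resume
// is left out on the assumption, taken from the discussion, that it is only
// reachable via an invoke, which the call graph already covers.
bool nonCallMayThrow(Inst I) {
  return I == Inst::CleanupRet || I == Inst::CatchSwitch;
}

Unwind seedUnwindState(const std::vector<Inst> &Body) {
  bool HasCall = false;
  for (Inst I : Body) {
    if (nonCallMayThrow(I))
      return Unwind::MayUnwind; // final answer; propagation must not upgrade it
    if (I == Inst::Call || I == Inst::Invoke)
      HasCall = true;
  }
  // With calls present the answer depends on the callees, so stay Unknown.
  return HasCall ? Unwind::Unknown : Unwind::NoUnwind;
}
```

Only functions seeded as `Unknown` would then be candidates for propagation from their callees, which separates "not known yet" from "cannot be NoUnwind".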




[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-09-08 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D36850#2990771 , @tejohnson wrote:

> In D36850#2968536 , @modimo wrote:
>
>> In D36850#2964293 , @tejohnson 
>> wrote:
>>
>>> Good point on indirect calls. Rather than add a bit to the summary, can the 
>>> flags just be set conservatively in any function containing an indirect 
>>> call when we build the summaries initially? I think that would get the same 
>>> effect.
>>
>> That could have an issue where A calls {indirect, B} and A gets propagated 
>> upon from B without knowledge that the indirect call exists. Right now I've 
>> got a FunFlags `hasUnknownCall` which marks these as non-propagatable.
>
> Ah, because there isn't a conservative setting of the flag...which raises a 
> larger issue (but maybe I am completely missing something) - how do we 
> distinguish between the NoUnwind summary flag not being set because we don't 
> know yet (in which case we want the propagation from callees), vs because it 
> cannot be NoUnwind (because there is a throw in the function)? Do we need to 
> have a second flag indicating that a function contains a mayThrow instruction 
> (other than calls, which are being handled by the propagation)?

Only call instructions can throw (what ultimately performs the throw operation 
is an opaque call to __cxa_throw()) which simplifies the problem. If all calls 
are known, we only need to examine the callees for accurate propagation.




[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-09-02 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

gentle ping @tejohnson




[PATCH] D108905: [ItaniumCXXABI] Make __cxa_end_catch calls unconditionally nounwind

2021-08-30 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D108905#2973099 , @rjmccall wrote:

> Yeah, I think this is the most natural interpretation of the current 
> standard.  But that would be a very unfortunate rule, because people who 
> write `catch (...) {}` do reasonably expect that that code will never throw.  
>  In fact, it would be quite difficult — perhaps impossible — to write code 
> that actually swallowed all exceptions: even `try { try { foo() } catch(...) 
> {} } catch (...) {}` wouldn't work, because you could throw an exception 
> whose destructor throws an exception whose destructor throws an exception ad 
> infinitum.

Yeah, it's not great, and it's also something that will practically never 
happen. I think terminate guards are the only thing that really swallows all 
exceptions, except here you can't guard the catch variable's destructor unless 
you want to change things up and depend on the library implementation. My 
immediate thought is something like `catch(...) noexcept {}` to express this, 
but it's a solution to a problem that really shouldn't exist.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108905/new/

https://reviews.llvm.org/D108905



[PATCH] D108905: [ItaniumCXXABI] Make __cxa_end_catch calls unconditionally nounwind

2021-08-30 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D108905#2971861 , @rjmccall wrote:

> I'm not really sure what the standard expects to happen if an exception 
> destructor throws.  The standard tells us when the destruction happens — the 
> latest point it can — but that clause doesn't mention exceptions from the 
> destructor.  If there's a specific clause on this, I can't find it.  
> [except.throw]p7 says that "if the exception handling mechanism handling an 
> uncaught exception directly invokes a function that exits via an exception, 
> the function std::terminate is called", which is meant to cover exceptions 
> thrown when initializing a catch variable.  Once the catch clause is entered, 
> though, the exception is considered caught unless it's rethrown, so this 
> clause doesn't apply when destroying the exception at the end of the clause.

Scanning through the standard, this also looks to me like an overlooked corner 
case (to be fair, it is very much a corner case).

> If the catch variable's destructor throws, that seems to be specified to 
> unwind normally (including destroying the exception, and if the destructor 
> throws at that point then std::terminate gets called, as normal for 
> exceptions during unwinding).

In this case, the destructor is throwing during normal scope exit, so I don't 
think the terminate behavior from [except.throw]p7 is enforced, since we're not 
currently handling an uncaught exception. The handler is already active since 
we're past the initialization of the catch. Given that, I think this is akin to 
the example in [except.ctor]p2: if the destructor is `noexcept(false)`, it 
should trigger a proper unwind, just as an exception thrown from the destructor 
of a simple automatic variable inside the handler scope would.
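
For comparison, the automatic-variable case mentioned above can be written as a small standalone program fragment. It only demonstrates the well-defined analogue (a local object inside the handler whose destructor throws), not the catch-variable or exception-object destructor case under discussion; `ThrowingDtor` and `runHandlerScopeExample` are names made up for this sketch.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// A type whose destructor may throw (destructors are noexcept by default).
struct ThrowingDtor {
  ~ThrowingDtor() noexcept(false) {
    throw std::runtime_error("from destructor");
  }
};

std::string runHandlerScopeExample() {
  try {
    try {
      throw 42;
    } catch (...) {
      // The int exception is already caught here, so no exception is
      // uncaught when Local is destroyed at normal scope exit.
      ThrowingDtor Local;
    }
  } catch (const std::runtime_error &E) {
    return E.what(); // the destructor's exception unwound out of the handler
  }
  return "nothing escaped";
}
```

Calling `runHandlerScopeExample()` yields the destructor's message, showing that `catch (...)` with a body that looks empty of throwing code can still be exited by an exception.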




[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 369158.
modimo added a comment.

Check for CachedAttributes count in map rather than value so conservative 
results are not re-calculated
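
A rough sketch of the distinction, assuming the conservative result is stored as a null pointer: the names `AttrCache`, `Summary`, and `compute` are hypothetical and only model the idea, and `find` is used here as the equivalent of a `count` check followed by a lookup.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical cached result: nullptr stands for the conservative answer.
struct Summary {
  bool NoUnwind;
};

struct AttrCache {
  std::unordered_map<int, const Summary *> Cached;
  int Computations = 0;

  // Stand-in for the expensive lookup; nullptr means "be conservative".
  const Summary *compute(int Id, const Summary *Known) {
    (void)Id;
    ++Computations;
    return Known;
  }

  const Summary *get(int Id, const Summary *Known) {
    // Test for presence of the key. Testing the stored pointer instead would
    // treat a cached nullptr as a miss and recompute it on every query.
    auto It = Cached.find(Id);
    if (It != Cached.end())
      return It->second;
    const Summary *Result = compute(Id, Known);
    Cached.emplace(Id, Result);
    return Result;
  }
};
```

Querying the same id twice runs `compute` once, whether the cached answer is a real summary or the conservative null.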


Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/Bitcode/thinlto-function-summary-refgraph.ll
  llvm/test/Bitcode/thinlto-type-vcalls.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-indirect.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-weak.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll

Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void @f(i32{{.*}}) [[ATTR:#[0-9]+]] !prof ![[PROF1:[0-9]+]]
 ; CHECK: define available_externally void @g() !prof ![[PROF2]]
 ; CHECK-DAG: ![[PROF1]] = !{!"synthetic_function_entry_count", i64 10}
 ; CHECK-DAG: ![[PROF2]] = !{!"synthetic_function_entry_count", i64 198}
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 target triple = "x86_64-unknown-linux-gnu"
 target datalayout = 

[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 369024.
modimo marked 2 inline comments as done.
modimo added a comment.

Use prevailing for linkonce/weak. Add hasUnknownCall to model virtual and 
indirect calls


Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/Bitcode/thinlto-function-summary-refgraph.ll
  llvm/test/Bitcode/thinlto-type-vcalls.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-indirect.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-weak.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll

Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void @f(i32{{.*}}) [[ATTR:#[0-9]+]] !prof ![[PROF1:[0-9]+]]
 ; CHECK: define available_externally void @g() !prof ![[PROF2]]
 ; CHECK-DAG: ![[PROF1]] = !{!"synthetic_function_entry_count", i64 10}
 ; CHECK-DAG: ![[PROF2]] = !{!"synthetic_function_entry_count", i64 198}
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 target triple = "x86_64-unknown-linux-gnu"
 target datalayout = 

[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo marked 3 inline comments as done.
modimo added a comment.

In D36850#2964293 , @tejohnson wrote:

> Good point on indirect calls. Rather than add a bit to the summary, can the 
> flags just be set conservatively in any function containing an indirect call 
> when we build the summaries initially? I think that would get the same effect.

That could have an issue where A calls {indirect, B} and A gets propagated upon 
from B without knowledge that the indirect call exists. Right now I've got a 
FunFlags `hasUnknownCall` which marks these as non-propagatable.
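
The A calls {indirect, B} scenario can be modelled with a toy call graph. This is a sketch under stated assumptions, not the patch's implementation: `Node` and `propagateNoUnwind` are hypothetical, the graph is assumed acyclic, every callee is assumed to be present in the map, and only calls are assumed to throw.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-function record; not LLVM's FunctionSummary.
struct Node {
  bool LocallyNoUnwind = false;     // what the summary says before propagation
  bool HasUnknownCall = false;      // indirect or virtual call, target unknown
  std::vector<std::string> Callees; // known direct callees only
};

// Returns true if F can be marked nounwind after looking at its callees.
// Assumes an acyclic call graph to keep the sketch short.
bool propagateNoUnwind(const std::map<std::string, Node> &Graph,
                       const std::string &F) {
  const Node &N = Graph.at(F);
  if (N.LocallyNoUnwind)
    return true;
  if (N.HasUnknownCall)
    return false; // the edge list is incomplete, so callees prove nothing
  for (const std::string &Callee : N.Callees)
    if (!propagateNoUnwind(Graph, Callee))
      return false;
  return true;
}
```

Without the `HasUnknownCall` check, A would inherit nounwind from B alone, because the indirect call leaves no edge in the summary's callee list.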

> For speculative devirtualization aka ICP, we will still be left with a 
> fallback indirect call, so it would still need to be treated conservatively. 
> The extra edges added for ICP promotion candidates shouldn't be a problem or 
> affect this.

Ah, good point. I was thinking it might pessimize the propagation because all 
of these edges have to be processed, but that is moot: the fallback indirect 
call makes it a no-go anyway.

> Note that with class hierarchy analysis we can do better for virtual calls, 
> e.g. when -fwhole-program-vtables is enabled for whole program 
> devirtualization and we have the necessary whole program visibility on 
> vtables. We could potentially integrate WPD decision here. Even if we can't 
> find a single devirtualization target, we can compute the set of all possible 
> targets of virtual calls during the thin link and could use that information 
> to basically merge the attributes from all possible targets. But probably 
> best to leave that as a future refinement as it will take some additional 
> work to get that hooked up. We'd still need to be conservative for virtual 
> calls when we don't have the necessary type info (when 
> -fwhole-program-vtables not enabled or we don't have whole program visibility 
> on the vtable defs), or for non-virtual indirect calls.

Agreed, it's an engineering problem more than anything. I ran an optimistic 
build (same revisions as before, release + noinline) where indirect and virtual 
calls were assumed to always propagate (thinlto_prop_optimistic) and the effect 
in Clang self-build at least is not too large:

thinlto_base/

  "dwarfehprepare.NumCleanupLandingPadsRemaining": 217515,
  "dwarfehprepare.NumNoUnwind": 299126,
  "dwarfehprepare.NumUnwind": 332785,

thinlto_prop/

  "dwarfehprepare.NumCleanupLandingPadsRemaining": 158372,
  "dwarfehprepare.NumNoUnwind": 420918,
  "dwarfehprepare.NumUnwind": 210870,

thinlto_prop_optimistic/

  "dwarfehprepare.NumCleanupLandingPadsRemaining": 154958,
  "dwarfehprepare.NumNoUnwind": 425893,
  "dwarfehprepare.NumUnwind": 205889,

(425893-420918)/(420918-299126) = 4% boost over being conservative and correct. 
This may change in real workloads though so I added a 
`thinlto-funcattrs-optimistic-indirect` flag for easy measurement.
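
The 4% figure is the additional `NumNoUnwind` gain from the optimistic build, measured relative to the gain that conservative propagation already achieves over the baseline. A small sketch of that arithmetic, using only the counts quoted above (the constant and function names are made up for illustration):

```cpp
#include <cassert>
#include <cmath>

// dwarfehprepare.NumNoUnwind values quoted above.
constexpr double Base = 299126;       // thinlto_base
constexpr double Prop = 420918;       // thinlto_prop
constexpr double Optimistic = 425893; // thinlto_prop_optimistic

// Extra nounwind results found optimistically, relative to the gain of
// conservative propagation over the baseline: 4975 / 121792.
constexpr double relativeBoost() {
  return (Optimistic - Prop) / (Prop - Base);
}
```

The ratio comes out at roughly 0.041, which matches the 4% quoted.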




Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:383
+  //   occur in a couple of circumstances:
+  //  a. An internal function gets imported due to its caller getting
+  //  imported, it becomes AvailableExternally

tejohnson wrote:
> I'm not sure how this case could be happening as we haven't actually done the 
> importing that would create these new available externally copies yet - that 
> happens in the LTO backends, during the thin link we just add them to import 
> lists.
I added the test funcattrs-prop-exported-internal.ll that demonstrates this. 
The summary has its internal linkage converted to external in 
[thinLTOResolvePrevailingInIndex](https://github.com/llvm/llvm-project/blob/92ce6db/llvm/lib/LTO/LTO.cpp#L436)
 then converted to AvailableExternally in 
[thinLTOResolvePrevailingGUID](https://github.com/llvm/llvm-project/blob/92ce6db/llvm/lib/LTO/LTO.cpp#L370).
 Currently being handled conservatively since a prevailing copy doesn't exist.



Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:388
+  // propagating on its caller.
+  //  b. C++11 [temp.explicit]p10 can generate AvailableExternally
+  //  definitions of explicitly instanced template declarations

tejohnson wrote:
> There is no prevailing copy presumably because the prevailing copy is in a 
> native library being linked? I think these cases can be handled 
> conservatively.
Yeah the prevailing copy is in the native binary.

This is a [C++ specific 
feature](https://github.com/llvm/llvm-project/blob/92ce6db/clang/lib/AST/ASTContext.cpp#L10755)
 which has ODR semantics, and these definitions are already being propagated 
and inlined from in pre-link. The current thinlink propagation implementation 
handles them conservatively because a prevailing copy doesn't exist.



Comment at: llvm/lib/Transforms/IPO/FunctionAttrs.cpp:419
+// Virtual calls are unknown so go conservative
+if (!FS || FS->getTypeIdInfo())
+  return CachedAttributes[VI];


[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-08-24 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

Checking build timing in release mode Clang self-build looking at purely 
thinlink timing:

| Mode                   | Time (s) |
| ---------------------- | -------- |
| base                   | 2.254    |
| base + propagation     | 2.556    |
| noinline               | 8.870    |
| noinline + propagation | 10.215   |

So thinlink time goes up by 13% in base and 15% with noinline, which seems 
reasonable for what it's doing.




[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-08-23 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 368178.
modimo added a comment.

Remove llvm/test/ThinLTO/X86/weak_externals.ll from diffs


Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-indirect.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll

Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void @f(i32{{.*}}) [[ATTR:#[0-9]+]] !prof ![[PROF1:[0-9]+]]
 ; CHECK: define available_externally void @g() !prof ![[PROF2]]
 ; CHECK-DAG: ![[PROF1]] = !{!"synthetic_function_entry_count", i64 10}
 ; CHECK-DAG: ![[PROF2]] = !{!"synthetic_function_entry_count", i64 198}
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 target triple = "x86_64-unknown-linux-gnu"
 target datalayout = "e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
Index: llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
===
--- 

[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-08-23 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 368174.
modimo added a comment.

Minor test fixups


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-indirect.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
  llvm/test/ThinLTO/X86/weak_externals.ll

Index: llvm/test/ThinLTO/X86/weak_externals.ll
===
--- llvm/test/ThinLTO/X86/weak_externals.ll
+++ llvm/test/ThinLTO/X86/weak_externals.ll
@@ -40,4 +40,3 @@
 define linkonce_odr dso_local dereferenceable(16) %struct.S* @_ZN9SingletonI1SE11getInstanceEv() #0 comdat align 2 {
   ret %struct.S* @_ZZN9SingletonI1SE11getInstanceEvE8instance
 }
-
Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void @f(i32{{.*}}) [[ATTR:#[0-9]+]] !prof ![[PROF1:[0-9]+]]
 ; CHECK: define available_externally void @g() !prof ![[PROF2]]
 ; CHECK-DAG: ![[PROF1]] = !{!"synthetic_function_entry_count", i64 10}
 ; 

[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-08-20 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

@tejohnson Indirect calls are not captured in the FunctionSummaries in the
call graph, nor is there a flag recording that they exist. It also looks like
speculative candidates for ICP do make it onto the graph. For this analysis we
need to bail out on indirect calls, so I think the simplest way is to add a
flag indicating their presence (in FunFlags?). As for the speculative
candidates, it's probably not too big of a deal.
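
As a rough illustration of the bail-out described above, here is a minimal
sketch; it is not the patch's actual code. `FuncSummary`, `HasIndirectCall`,
`HasDefinition`, and `canMarkNoRecurse` are hypothetical names standing in for
the real summary types. The sketch assumes a closed call graph and gives up as
soon as it reaches an indirect call or a function without a visible definition,
since either could call back into the root.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified per-function summary (not LLVM's FunctionSummary).
struct FuncSummary {
  std::vector<std::size_t> Callees; // indices of directly called functions
  bool HasIndirectCall = false;     // assumed flag: body contains an indirect call
  bool HasDefinition = true;        // false when the body is not visible
};

// Returns true when Root provably never re-enters itself: no call path from
// Root leads back to Root, and nothing reachable makes an unknown call.
bool canMarkNoRecurse(const std::vector<FuncSummary> &Graph, std::size_t Root) {
  assert(Root < Graph.size() && "Root must index into Graph");
  std::vector<bool> Visited(Graph.size(), false);
  std::vector<std::size_t> Worklist{Root};
  Visited[Root] = true;
  while (!Worklist.empty()) {
    std::size_t Cur = Worklist.back();
    Worklist.pop_back();
    const FuncSummary &S = Graph[Cur];
    // Unknown callees might call Root, so bail out conservatively.
    if (!S.HasDefinition || S.HasIndirectCall)
      return false;
    for (std::size_t Callee : S.Callees) {
      if (Callee == Root)
        return false; // found a cycle through Root
      if (!Visited[Callee]) {
        Visited[Callee] = true;
        Worklist.push_back(Callee);
      }
    }
  }
  return true;
}
```

A real implementation would likely propagate over SCCs rather than re-walk the
graph per function, and may be more conservative than this sketch.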

In D36850#2940598 , @tejohnson wrote:

> Thanks - I know you are still working on this, but I had a few comments so 
> far. I haven't had a chance to test it yet. Unfortunately, the nounwind 
> propagation shouldn't do much on our side as we disable exceptions internally.

I've been iterating on this with Clang self-build with exceptions enabled. Once 
this gets into a good state logically I'll start testing on some of our 
internal workloads which generally enable exceptions and report back on the 
findings.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D36850: [ThinLTO] Add norecurse function attribute propagation

2021-08-20 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 367937.
modimo marked 3 inline comments as done.
modimo edited the summary of this revision.
modimo added a comment.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Added more test cases and changed the logic around weak linkages.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D36850/new/

https://reviews.llvm.org/D36850

Files:
  clang/test/CodeGen/thinlto-distributed-cfi-devirt.ll
  clang/test/CodeGen/thinlto-distributed-cfi.ll
  clang/test/CodeGen/thinlto-funcattr-prop.ll
  llvm/include/llvm/AsmParser/LLToken.h
  llvm/include/llvm/IR/GlobalValue.h
  llvm/include/llvm/IR/ModuleSummaryIndex.h
  llvm/include/llvm/LTO/LTO.h
  llvm/include/llvm/Transforms/IPO/FunctionAttrs.h
  llvm/lib/Analysis/ModuleSummaryAnalysis.cpp
  llvm/lib/AsmParser/LLLexer.cpp
  llvm/lib/AsmParser/LLParser.cpp
  llvm/lib/Bitcode/Reader/BitcodeReader.cpp
  llvm/lib/Bitcode/Writer/BitcodeWriter.cpp
  llvm/lib/IR/AsmWriter.cpp
  llvm/lib/IR/ModuleSummaryIndex.cpp
  llvm/lib/LTO/LTO.cpp
  llvm/lib/LTO/LTOBackend.cpp
  llvm/lib/LTO/ThinLTOCodeGenerator.cpp
  llvm/lib/Transforms/IPO/FunctionAttrs.cpp
  llvm/test/Assembler/thinlto-summary.ll
  llvm/test/ThinLTO/X86/deadstrip.ll
  llvm/test/ThinLTO/X86/dot-dumper.ll
  llvm/test/ThinLTO/X86/dot-dumper2.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-exported-internal.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-indirect.ll
  llvm/test/ThinLTO/X86/funcattrs-prop-undefined.ll
  llvm/test/ThinLTO/X86/funcattrs-prop.ll
  llvm/test/ThinLTO/X86/funcimport_alwaysinline.ll
  llvm/test/ThinLTO/X86/function_entry_count.ll
  llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
  llvm/test/ThinLTO/X86/weak_externals.ll

Index: llvm/test/ThinLTO/X86/weak_externals.ll
===
--- llvm/test/ThinLTO/X86/weak_externals.ll
+++ llvm/test/ThinLTO/X86/weak_externals.ll
@@ -40,4 +40,3 @@
 define linkonce_odr dso_local dereferenceable(16) %struct.S* @_ZN9SingletonI1SE11getInstanceEv() #0 comdat align 2 {
   ret %struct.S* @_ZZN9SingletonI1SE11getInstanceEvE8instance
 }
-
Index: llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
===
--- llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
+++ llvm/test/ThinLTO/X86/linkonce_resolution_comdat.ll
@@ -3,15 +3,17 @@
 ; verification error.
 ; RUN: opt -module-summary %s -o %t1.bc
 ; RUN: opt -module-summary %p/Inputs/linkonce_resolution_comdat.ll -o %t2.bc
-; RUN: llvm-lto -thinlto-action=run %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 %t1.bc %t2.bc -exported-symbol=f -exported-symbol=g -thinlto-save-temps=%t3.
 
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT1
 ; RUN: llvm-dis %t3.1.3.imported.bc -o - | FileCheck %s --check-prefix=IMPORT2
 ; Copy from first module is prevailing and converted to weak_odr, copy
 ; from second module is preempted and converted to available_externally and
 ; removed from comdat.
-; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr comdat($c1) {
-; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr {
+; IMPORT1: define weak_odr i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] comdat($c1) {
+; IMPORT2: define available_externally i32 @f(i8* %0) unnamed_addr [[ATTR:#[0-9]+]] {
+
+; CHECK-DAG: attributes [[ATTR]] = { norecurse nounwind }
 
 ; RUN: llvm-nm -o - < %t1.bc.thinlto.o | FileCheck %s --check-prefix=NM1
 ; NM1: W f
Index: llvm/test/ThinLTO/X86/function_entry_count.ll
===
--- llvm/test/ThinLTO/X86/function_entry_count.ll
+++ llvm/test/ThinLTO/X86/function_entry_count.ll
@@ -2,7 +2,7 @@
 ; RUN: opt -thinlto-bc %p/Inputs/function_entry_count.ll -write-relbf-to-summary -thin-link-bitcode-file=%t2.thinlink.bc -o %t2.bc
 
 ; First perform the thin link on the normal bitcode file.
-; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -thinlto-synthesize-entry-counts \
+; RUN: llvm-lto2 run %t1.bc %t2.bc -o %t.o -save-temps -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts \
 ; RUN: -r=%t1.bc,g, \
 ; RUN: -r=%t1.bc,f,px \
 ; RUN: -r=%t1.bc,h,px \
@@ -10,15 +10,16 @@
 ; RUN: -r=%t2.bc,g,px
 ; RUN: llvm-dis -o - %t.o.1.3.import.bc | FileCheck %s
 
-; RUN: llvm-lto -thinlto-action=run -thinlto-synthesize-entry-counts -exported-symbol=f \
+; RUN: llvm-lto -thinlto-action=run -disable-thinlto-funcattrs=0 -thinlto-synthesize-entry-counts -exported-symbol=f \
 ; RUN: -exported-symbol=g -exported-symbol=h -thinlto-save-temps=%t3. %t1.bc %t2.bc
 ; RUN: llvm-dis %t3.0.3.imported.bc -o - | FileCheck %s
 
-; CHECK: define void @h() !prof ![[PROF2:[0-9]+]]
-; CHECK: define void @f(i32{{.*}}) !prof ![[PROF1:[0-9]+]]
+; CHECK: define void @h() [[ATTR:#[0-9]+]] !prof ![[PROF2:[0-9]+]]
+; CHECK: define void 

[PATCH] D105225: [clang] Add support for optional flag -fnew-infallible to restrict exception propagation

2021-08-03 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D105225#2921471 , @rsmith wrote:

> Sorry I'm late to the party here... is there any work ongoing to add this to 
> GCC too? If not, would it make sense to send a quick note to the GCC 
> development list pointing this out to reduce the chance that they add a 
> similar feature with a different flag name?

No ongoing work to add this to GCC. Pinged mailing list here: 
https://gcc.gnu.org/pipermail/gcc/2021-August/236969.html


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105225/new/

https://reviews.llvm.org/D105225



[PATCH] D105225: [clang] Add support for optional flag -fnew-infallible to restrict exception propagation

2021-08-02 Thread Di Mo via Phabricator via cfe-commits
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
Closed by commit rGb40a2a533a9d: [clang] Add support for optional flag 
-fnew-infallible to restrict exception… (authored by modimo).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D105225/new/

https://reviews.llvm.org/D105225

Files:
  clang/docs/ClangCommandLineReference.rst
  clang/include/clang/Basic/LangOptions.def
  clang/include/clang/Driver/Options.td
  clang/lib/Driver/ToolChains/Clang.cpp
  clang/lib/Sema/SemaExprCXX.cpp
  clang/test/CodeGenCXX/new-infallible.cpp


Index: clang/test/CodeGenCXX/new-infallible.cpp
===
--- /dev/null
+++ clang/test/CodeGenCXX/new-infallible.cpp
@@ -0,0 +1,7 @@
+// RUN: %clang_cc1 -emit-llvm -triple x86_64-linux-gnu -fnew-infallible -o - %s | FileCheck %s
+
+// CHECK: call noalias nonnull i8* @_Znwm(i64 4)
+
+// CHECK: ; Function Attrs: nobuiltin nounwind allocsize(0)
+// CHECK-NEXT: declare nonnull i8* @_Znwm(i64)
+int *new_infallible = new int;
Index: clang/lib/Sema/SemaExprCXX.cpp
===
--- clang/lib/Sema/SemaExprCXX.cpp
+++ clang/lib/Sema/SemaExprCXX.cpp
@@ -3049,6 +3049,9 @@
   EPI.ExceptionSpec.Type = EST_Dynamic;
   EPI.ExceptionSpec.Exceptions = llvm::makeArrayRef(BadAllocType);
 }
+if (getLangOpts().NewInfallible) {
+  EPI.ExceptionSpec.Type = EST_DynamicNone;
+}
   } else {
 EPI.ExceptionSpec =
 getLangOpts().CPlusPlus11 ? EST_BasicNoexcept : EST_DynamicNone;
@@ -3064,6 +3067,10 @@
 // Global allocation functions should always be visible.
 Alloc->setVisibleDespiteOwningModule();
 
+if (HasBadAllocExceptionSpec && getLangOpts().NewInfallible)
+  Alloc->addAttr(
+  ReturnsNonNullAttr::CreateImplicit(Context, Alloc->getLocation()));
+
 Alloc->addAttr(VisibilityAttr::CreateImplicit(
 Context, LangOpts.GlobalAllocationFunctionVisibilityHidden
  ? VisibilityAttr::Hidden
Index: clang/lib/Driver/ToolChains/Clang.cpp
===
--- clang/lib/Driver/ToolChains/Clang.cpp
+++ clang/lib/Driver/ToolChains/Clang.cpp
@@ -5711,7 +5711,7 @@
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_inlines_hidden_static_local_var,
                            options::OPT_fno_visibility_inlines_hidden_static_local_var);
   Args.AddLastArg(CmdArgs, options::OPT_fvisibility_global_new_delete_hidden);
-
+  Args.AddLastArg(CmdArgs, options::OPT_fnew_infallible);
   Args.AddLastArg(CmdArgs, options::OPT_ftlsmodel_EQ);
 
   if (Args.hasFlag(options::OPT_fno_operator_names,
Index: clang/include/clang/Driver/Options.td
===
--- clang/include/clang/Driver/Options.td
+++ clang/include/clang/Driver/Options.td
@@ -2730,6 +2730,10 @@
 def fvisibility_global_new_delete_hidden : Flag<["-"], "fvisibility-global-new-delete-hidden">, Group,
   HelpText<"Give global C++ operator new and delete declarations hidden visibility">, Flags<[CC1Option]>,
   MarshallingInfoFlag>;
+def fnew_infallible : Flag<["-"], "fnew-infallible">, Group,
+  HelpText<"Treats throwing global C++ operator new as always returning valid memory "
+  "(annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.">,
+  Flags<[CC1Option]>, MarshallingInfoFlag>;
 defm whole_program_vtables : BoolFOption<"whole-program-vtables",
   CodeGenOpts<"WholeProgramVTables">, DefaultFalse,
   PosFlag,
Index: clang/include/clang/Basic/LangOptions.def
===
--- clang/include/clang/Basic/LangOptions.def
+++ clang/include/clang/Basic/LangOptions.def
@@ -280,6 +280,7 @@
"hidden visibility for static local variables in inline C++ "
"methods when -fvisibility-inlines hidden is enabled")
 LANGOPT(GlobalAllocationFunctionVisibilityHidden , 1, 0, "hidden visibility for global operator new and delete declaration")
+LANGOPT(NewInfallible , 1, 0, "Treats throwing global C++ operator new as always returning valid memory (annotates with __attribute__((returns_nonnull)) and throw()). This is detectable in source.")
 BENIGN_LANGOPT(ParseUnknownAnytype, 1, 0, "__unknown_anytype")
 BENIGN_LANGOPT(DebuggerSupport , 1, 0, "debugger support")
 BENIGN_LANGOPT(DebuggerCastResultToId, 1, 0, "for 'po' in the debugger, cast the result to id if it is of unknown type")
Index: clang/docs/ClangCommandLineReference.rst
===
--- clang/docs/ClangCommandLineReference.rst
+++ clang/docs/ClangCommandLineReference.rst
@@ -1941,6 +1941,10 @@
 
 Specifies the largest alignment guaranteed by '::operator 

[PATCH] D102820: [Clang] Check for returns_nonnull when deciding to add allocation null checks

2021-06-23 Thread Di Mo via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG42b99e094c4f: [Clang] Check for returns_nonnull when 
deciding to add allocation null checks (authored by modimo).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102820/new/

https://reviews.llvm.org/D102820

Files:
  clang/lib/AST/ExprCXX.cpp
  clang/test/CodeGenCXX/new.cpp


Index: clang/test/CodeGenCXX/new.cpp
===
--- clang/test/CodeGenCXX/new.cpp
+++ clang/test/CodeGenCXX/new.cpp
@@ -176,6 +176,7 @@
 struct Alloc{
   int x;
   void* operator new[](size_t size);
+  __attribute__((returns_nonnull)) void *operator new[](size_t size, const std::nothrow_t &) throw();
   void operator delete[](void* p);
   ~Alloc();
 };
@@ -186,6 +187,10 @@
   // CHECK: call void @_ZN5AllocD1Ev(
   // CHECK: call void @_ZN5AllocdaEPv(i8*
   delete[] new Alloc[10][20];
+  // CHECK: [[P:%.*]] = call nonnull i8* @_ZN5AllocnaEmRKSt9nothrow_t(i64 808, {{.*}}) [[ATTR_NOUNWIND:#[^ ]*]]
+  // CHECK-NOT: icmp eq i8* [[P]], null
+  // CHECK: store i64 200
+  delete[] new (nothrow) Alloc[10][20];
   // CHECK: call noalias nonnull i8* @_Znwm
   // CHECK: call void @_ZdlPv(i8*
   delete new bool;
@@ -328,7 +333,7 @@
 // CHECK: call void @_ZdaPv({{.*}}) [[ATTR_BUILTIN_DELETE]]
 delete[] p; // expected-warning {{'delete[]' applied to a pointer that was allocated with 'new'; did you mean 'delete'?}}
 
-// CHECK: call noalias i8* @_ZnamRKSt9nothrow_t(i64 3, {{.*}}) [[ATTR_BUILTIN_NOTHROW_NEW:#[^ ]*]]
+// CHECK: call noalias i8* @_ZnamRKSt9nothrow_t(i64 3, {{.*}}) [[ATTR_NOBUILTIN_NOUNWIND_ALLOCSIZE:#[^ ]*]]
 (void) new (nothrow) S[3];
 
 // CHECK: call i8* @_Znwm15MyPlacementType(i64 4){{$}}
Index: clang/lib/AST/ExprCXX.cpp
===
--- clang/lib/AST/ExprCXX.cpp
+++ clang/lib/AST/ExprCXX.cpp
@@ -275,7 +275,8 @@
 }
 
 bool CXXNewExpr::shouldNullCheckAllocation() const {
-  return getOperatorNew()
+  return !getOperatorNew()->hasAttr<ReturnsNonNullAttr>() &&
+         getOperatorNew()
              ->getType()
              ->castAs<FunctionProtoType>()
              ->isNothrow() &&


[PATCH] D94333: [Inliner] Change inline remark format and update ReplayInlineAdvisor to use it

2021-06-20 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D94333#2829233 , @sylvestre.ledru 
wrote:

>> I think this change broke apt.llvm.org
>
> Confirmed: reverting this change fixed the link issue

What exact commit/download package and build command repros this? 
M68kSubtarget.cpp is completely untouched by this change, so I'm surprised that 
reverting it fixes the issue. Did you bisect it down to this commit?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94333/new/

https://reviews.llvm.org/D94333



[PATCH] D102820: [Clang] Check for returns_nonnull when deciding to add allocation null checks

2021-06-08 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

Ping.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102820/new/

https://reviews.llvm.org/D102820



[PATCH] D102820: [Clang] Check for returns_nonnull when deciding to add allocation null checks

2021-05-26 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

@rsmith @lebedev.ri thoughts on adding this directly to FE generation? As 
mentioned, this isn't strictly needed since the BE can elide the check, but 
not emitting it in the first place also saves AST/IR processing cost.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102820/new/

https://reviews.llvm.org/D102820



[PATCH] D102820: [Clang] Check for returns_nonnull when deciding to add allocation null checks

2021-05-20 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D102820#2772184 , @bruno wrote:

> Sounds reasonable to me! Can you double check whether this attribute gets 
> correctly serialized/deserialized in face of `CXXNewExpr`? An example of how 
> to test that would be in `clang/test/PCH/cxx-method.cpp`.

Piggybacking on that test case:

Inputs/cxx-method.h:

  typedef __typeof__(sizeof(0)) size_t;

  void *operator new(size_t size)
  {
    return ::operator new(size);
  }

cxx-method.cpp:

  int * foo()
  {
    return new int;
  }

Testing:

  ~/llvm-project/clang/test/PCH# ~/llvm-project/build-rel/bin/clang++ -cc1 -x c++ -emit-pch Inputs/cxx-method.h -o test.pch
  ~/llvm-project/clang/test/PCH# ~/llvm-project/build-rel/bin/clang++ -cc1 -x c++ cxx-method.cpp -include-pch test.pch -emit-llvm
  ~/llvm-project/clang/test/PCH# grep _Znwm cxx-method.ll
  define dso_local nonnull i8* @_Znwm(i64 %size) #0 {
    %call = call noalias nonnull i8* @_Znwm(i64 %0) #2
    %call = call noalias nonnull i8* @_Znwm(i64 4) #3
Assuming the question is whether returns_nonnull is preserved through a PCH, 
the answer is yes.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102820/new/

https://reviews.llvm.org/D102820



[PATCH] D102820: [Clang] Check for returns_nonnull when deciding to add allocation null checks

2021-05-20 Thread Di Mo via Phabricator via cfe-commits
modimo added a subscriber: urnathan.
modimo added a comment.

After discussing with @urnathan, it makes more sense for the BE to handle this 
by eliding the generated null check when optimizing. I confirmed that this is 
indeed what happens, so removing the generation in the FE isn't really needed.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102820/new/

https://reviews.llvm.org/D102820



[PATCH] D102820: [Clang] Check for returns_nonnull when deciding to add allocation null checks

2021-05-19 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 346610.
modimo added a comment.

Go back to a single-statement return. Fix up the ATTR_BUILTIN_NOTHROW_NEW test 
label, which pointed to nothing, to use the correct 
ATTR_NOBUILTIN_NOUNWIND_ALLOCSIZE label.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D102820/new/

https://reviews.llvm.org/D102820

Files:
  clang/lib/AST/ExprCXX.cpp
  clang/test/CodeGenCXX/new.cpp


Index: clang/test/CodeGenCXX/new.cpp
===
--- clang/test/CodeGenCXX/new.cpp
+++ clang/test/CodeGenCXX/new.cpp
@@ -176,6 +176,7 @@
 struct Alloc{
   int x;
   void* operator new[](size_t size);
+  __attribute__((returns_nonnull)) void *operator new[](size_t size, const std::nothrow_t &) throw();
   void operator delete[](void* p);
   ~Alloc();
 };
@@ -186,6 +187,10 @@
   // CHECK: call void @_ZN5AllocD1Ev(
   // CHECK: call void @_ZN5AllocdaEPv(i8*
   delete[] new Alloc[10][20];
+  // CHECK: [[P:%.*]] = call nonnull i8* @_ZN5AllocnaEmRKSt9nothrow_t(i64 808, {{.*}}) [[ATTR_NOUNWIND:#[^ ]*]]
+  // CHECK-NOT: icmp eq i8* [[P]], null
+  // CHECK: store i64 200
+  delete[] new (nothrow) Alloc[10][20];
   // CHECK: call noalias nonnull i8* @_Znwm
   // CHECK: call void @_ZdlPv(i8*
   delete new bool;
@@ -328,7 +333,7 @@
 // CHECK: call void @_ZdaPv({{.*}}) [[ATTR_BUILTIN_DELETE]]
 delete[] p; // expected-warning {{'delete[]' applied to a pointer that was allocated with 'new'; did you mean 'delete'?}}
 
-// CHECK: call noalias i8* @_ZnamRKSt9nothrow_t(i64 3, {{.*}}) [[ATTR_BUILTIN_NOTHROW_NEW:#[^ ]*]]
+// CHECK: call noalias i8* @_ZnamRKSt9nothrow_t(i64 3, {{.*}}) [[ATTR_NOBUILTIN_NOUNWIND_ALLOCSIZE:#[^ ]*]]
 (void) new (nothrow) S[3];
 
 // CHECK: call i8* @_Znwm15MyPlacementType(i64 4){{$}}
Index: clang/lib/AST/ExprCXX.cpp
===
--- clang/lib/AST/ExprCXX.cpp
+++ clang/lib/AST/ExprCXX.cpp
@@ -275,7 +275,8 @@
 }
 
 bool CXXNewExpr::shouldNullCheckAllocation() const {
-  return getOperatorNew()
+  return !getOperatorNew()->hasAttr<ReturnsNonNullAttr>() &&
+         getOperatorNew()
              ->getType()
              ->castAs<FunctionProtoType>()
              ->isNothrow() &&


[PATCH] D102820: [Clang] Check for returns_nonnull when deciding to add allocation null checks

2021-05-19 Thread Di Mo via Phabricator via cfe-commits
modimo created this revision.
modimo added reviewers: bruno, lebedev.ri, rsmith.
Herald added subscribers: hoy, wenlei, lxfind.
modimo requested review of this revision.
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Non-throwing allocators currently always get null-check code. However, if the 
non-throwing allocator is explicitly annotated with returns_nonnull, the null 
check should be elided.

Testing:
ninja check-all
added test case correctly elides the null check
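
For readers unfamiliar with the pattern, the following is a minimal sketch of
the kind of allocator the summary refers to; it is not taken from the patch.
`Widget` and `makeWidget` are hypothetical names, and the abort-on-failure
policy is an assumption chosen so that the returns_nonnull promise actually
holds.

```cpp
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical type for illustration only.
struct Widget {
  int Value = 0;

  // Non-throwing allocator that never returns null: it aborts on failure,
  // which is what makes the returns_nonnull annotation truthful.
  __attribute__((returns_nonnull)) void *
  operator new(std::size_t Size, const std::nothrow_t &) noexcept {
    void *Ptr = std::malloc(Size);
    if (!Ptr)
      std::abort();
    return Ptr;
  }

  void operator delete(void *Ptr) noexcept { std::free(Ptr); }
};

// Because the selected operator new is both non-throwing and annotated
// returns_nonnull, the compiler has no need for a null check before
// running the constructor here.
Widget *makeWidget() { return new (std::nothrow) Widget; }
```

Without the attribute, a non-throwing `operator new` is allowed to return null,
so the new-expression has to test the pointer before initializing the object.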


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D102820

Files:
  clang/lib/AST/ExprCXX.cpp
  clang/test/CodeGenCXX/new.cpp


Index: clang/test/CodeGenCXX/new.cpp
===
--- clang/test/CodeGenCXX/new.cpp
+++ clang/test/CodeGenCXX/new.cpp
@@ -176,6 +176,7 @@
 struct Alloc{
   int x;
   void* operator new[](size_t size);
+  __attribute__((returns_nonnull)) void *operator new[](size_t size, const std::nothrow_t &) throw();
   void operator delete[](void* p);
   ~Alloc();
 };
@@ -186,6 +187,10 @@
   // CHECK: call void @_ZN5AllocD1Ev(
   // CHECK: call void @_ZN5AllocdaEPv(i8*
   delete[] new Alloc[10][20];
+  // CHECK: [[P:%.*]] = call nonnull i8* @_ZN5AllocnaEmRKSt9nothrow_t(i64 808, {{.*}}) [[ATTR_NOBUILTIN_NOUNWIND_ALLOCSIZE:#[^ ]*]]
+  // CHECK-NOT: icmp eq i8* [[P]], null
+  // CHECK: store i64 200
+  delete[] new (nothrow) Alloc[10][20];
   // CHECK: call noalias nonnull i8* @_Znwm
   // CHECK: call void @_ZdlPv(i8*
   delete new bool;
Index: clang/lib/AST/ExprCXX.cpp
===
--- clang/lib/AST/ExprCXX.cpp
+++ clang/lib/AST/ExprCXX.cpp
@@ -275,7 +275,8 @@
 }
 
 bool CXXNewExpr::shouldNullCheckAllocation() const {
-  return getOperatorNew()
+  return !getOperatorNew()->hasAttr<ReturnsNonNullAttr>() &&
+         getOperatorNew()
              ->getType()
              ->castAs<FunctionProtoType>()
              ->isNothrow() &&
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94333: [Inliner] Change inline remark format and update ReplayInlineAdvisor to use it

2021-01-21 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: llvm/lib/Analysis/InlineAdvisor.cpp:412
+
+  Remark << ";";
 }

wenlei wrote:
> modimo wrote:
> > wenlei wrote:
> > > nit: any special reason for adding this? doesn't seem consistent with 
> > > other remarks we have.
> > If you grab the remark outputs via `-Rpass=inline` you'll get additional 
> > suffix information:
> > ```
> > inline.cpp:8:12: remark: _Z3foov inlined into main with (cost=0, 
> > threshold=375) at callsite main:2:12; [-Rpass=inline]
> > return foo();
> > ```
> > 
> > The semicolon is to separate the remark from any additional output at the 
> > end so when replaying we can match the correct callsite. Something like 
> > this would be unneeded for yaml replay but for the current implementation 
> > it's necessary for correctness.
> > 
> By correctness, did you mean the fact that you rely on `split(";")` in 
> parsing, or something else?
> 
> This is not a big deal, but if no other remarks end with `;`, it would be 
> good to be consistent. Using `split(";")` for parsing is just one way of 
> implementing it, and IMO could be changed to favor consistency in remarks 
> output.

> By correctness, did you mean the fact that you rely on `split(";")` in 
> parsing, or something else?

Yeah, without that we would store the callsite from remarks as
`main:2:12 [-Rpass=inline]`, which would not match the actual callsite string
`main:2:12` that we query the map with, so replay would never inline.
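
To make the parsing concern concrete, here is a minimal sketch of cutting the callsite out of a remark line at a terminator character. `extractCallsite` and its `" at callsite "` marker search are hypothetical illustrations, not the actual ReplayInlineAdvisor code, and the sketch uses `std::string` rather than LLVM's string utilities so it builds on its own.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper (not the ReplayInlineAdvisor implementation): pulls the
// callsite string out of one line of -Rpass=inline output.
// Returns an empty string when the line has no " at callsite " marker.
std::string extractCallsite(const std::string &Line, char Terminator) {
  const std::string Marker = " at callsite ";
  const std::string::size_type Pos = Line.find(Marker);
  if (Pos == std::string::npos)
    return std::string();
  std::string Rest = Line.substr(Pos + Marker.size());
  // Cut at the terminator so trailing text such as " [-Rpass=inline]" is
  // dropped; without a terminator the suffix stays attached to the callsite.
  const std::string::size_type End = Rest.find(Terminator);
  if (End != std::string::npos)
    Rest.erase(End);
  return Rest;
}
```

With the `;` present the helper yields `main:2:12`; on a line without it, the result keeps the ` [-Rpass=inline]` suffix and would miss a lookup keyed on `main:2:12`.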

> This is not a big deal, but if no other remarks end with `;`, it would be 
> good to be consistent. Using `split(";")` for parsing is just one way of 
> implementing it, and IMO could be changed to favor consistency in remarks 
> output.

Searching for `OptimizationRemarkAnalysis`, I see that the vectorizer ORE uses
"." as its terminator, so switching to that is more consistent. I'll make
the change in an upcoming patch.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94333/new/

https://reviews.llvm.org/D94333


[PATCH] D94333: [Inliner] Change inline remark format and update ReplayInlineAdvisor to use it

2021-01-14 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: llvm/lib/Analysis/InlineAdvisor.cpp:412
+
+  Remark << ";";
 }

wenlei wrote:
> nit: any special reason for adding this? doesn't seem consistent with other 
> remarks we have.
If you grab the remark outputs via `-Rpass=inline` you'll get additional suffix 
information:
```
inline.cpp:8:12: remark: _Z3foov inlined into main with (cost=0, threshold=375) 
at callsite main:2:12; [-Rpass=inline]
return foo();
```

The semicolon is to separate the remark from any additional output at the end 
so when replaying we can match the correct callsite. Something like this would 
be unneeded for yaml replay but for the current implementation it's necessary 
for correctness.



Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94333/new/

https://reviews.llvm.org/D94333


[PATCH] D94333: [Inliner] Change inline remark format and update ReplayInlineAdvisor to use it

2021-01-12 Thread Di Mo via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2a49b7c64a33: [Inliner] Change inline remark format and 
update ReplayInlineAdvisor to use it (authored by modimo).
Herald added a project: clang.
Herald added a subscriber: cfe-commits.

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94333/new/

https://reviews.llvm.org/D94333

Files:
  clang/test/Frontend/optimization-remark-with-hotness-new-pm.c
  clang/test/Frontend/optimization-remark-with-hotness.c
  llvm/include/llvm/Analysis/InlineAdvisor.h
  llvm/include/llvm/Analysis/ReplayInlineAdvisor.h
  llvm/lib/Analysis/InlineAdvisor.cpp
  llvm/lib/Analysis/ReplayInlineAdvisor.cpp
  llvm/lib/Transforms/IPO/SampleProfile.cpp
  llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll
  llvm/test/Transforms/SampleProfile/Inputs/inline-replay.txt
  llvm/test/Transforms/SampleProfile/inline-replay.ll
  llvm/test/Transforms/SampleProfile/remarks-hotness.ll
  llvm/test/Transforms/SampleProfile/remarks.ll

Index: llvm/test/Transforms/SampleProfile/remarks.ll
===
--- llvm/test/Transforms/SampleProfile/remarks.ll
+++ llvm/test/Transforms/SampleProfile/remarks.ll
@@ -21,8 +21,8 @@
 
 ; We are expecting foo() to be inlined in main() (almost all the cycles are
 ; spent inside foo).
-; CHECK: remark: remarks.cc:13:21: _Z3foov inlined into main to match profiling context with (cost=130, threshold=225) at callsite main:0
-; CHECK: remark: remarks.cc:9:19: rand inlined into main to match profiling context with (cost=always): always inline attribute at callsite _Z3foov:6 @ main:0
+; CHECK: remark: remarks.cc:13:21: _Z3foov inlined into main to match profiling context with (cost=130, threshold=225) at callsite main:0:21;
+; CHECK: remark: remarks.cc:9:19: rand inlined into main to match profiling context with (cost=always): always inline attribute at callsite _Z3foov:6:19 @ main:0:21;
 
 ; The back edge for the loop is the hottest edge in the loop subgraph.
 ; CHECK: remark: remarks.cc:6:9: most popular destination for conditional branches at remarks.cc:5:3
@@ -53,6 +53,9 @@
 ;YAML-NEXT:- String:  main
 ;YAML-NEXT:- String:  ':'
 ;YAML-NEXT:- Line:'0'
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Column:  '21'
+;YAML-NEXT:- String:  ';'
 ;YAML-NEXT:  ...
 ;YAML:   --- !Passed
 ;YAML-NEXT:  Pass:sample-profile-inline
@@ -74,10 +77,15 @@
 ;YAML-NEXT:- String:  _Z3foov
 ;YAML-NEXT:- String:  ':'
 ;YAML-NEXT:- Line:'6'
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Column:  '19'
 ;YAML-NEXT:- String:  ' @ '
 ;YAML-NEXT:- String:  main
 ;YAML-NEXT:- String:  ':'
 ;YAML-NEXT:- Line:'0'
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Column:  '21'
+;YAML-NEXT:- String:  ';'
 ;YAML:  --- !Analysis
 ;YAML-NEXT:  Pass:sample-profile
 ;YAML-NEXT:  Name:AppliedSamples
Index: llvm/test/Transforms/SampleProfile/remarks-hotness.ll
===
--- llvm/test/Transforms/SampleProfile/remarks-hotness.ll
+++ llvm/test/Transforms/SampleProfile/remarks-hotness.ll
@@ -36,7 +36,7 @@
 ; YAML-MISS-NEXT: Function:_Z7caller2v
 ; YAML-MISS-NEXT: Hotness: 2
 
-; CHECK-RPASS: _Z7callee1v inlined into _Z7caller1v with (cost=-30, threshold=4500) at callsite _Z7caller1v:1 (hotness: 401)
+; CHECK-RPASS: _Z7callee1v inlined into _Z7caller1v with (cost=-30, threshold=4500) at callsite _Z7caller1v:1:10; (hotness: 401)
 ; CHECK-RPASS-NOT: _Z7callee2v not inlined into _Z7caller2v because it should never be inlined (cost=never): noinline function attribute (hotness: 2)
 
 ; ModuleID = 'remarks-hotness.cpp'
@@ -93,4 +93,3 @@
 !17 = distinct !DISubprogram(name: "caller2", linkageName: "_Z7caller2v", scope: !1, file: !1, line: 13, type: !8, scopeLine: 13, flags: DIFlagPrototyped, spFlags: DISPFlagDefinition, unit: !0, retainedNodes: !2)
 !18 = !DILocation(line: 14, column: 10, scope: !17)
 !19 = !DILocation(line: 14, column: 3, scope: !17)
-
Index: llvm/test/Transforms/SampleProfile/inline-replay.ll
===
--- llvm/test/Transforms/SampleProfile/inline-replay.ll
+++ llvm/test/Transforms/SampleProfile/inline-replay.ll
@@ -119,4 +119,4 @@
 
 ; REPLAY: _Z3sumii inlined into main
 ; REPLAY: _Z3subii inlined into main
-; REPLA-NOT: _Z3subii inlined into _Z3sumii
+; REPLAY-NOT: _Z3subii inlined into _Z3sumii
Index: llvm/test/Transforms/SampleProfile/Inputs/inline-replay.txt
===
--- llvm/test/Transforms/SampleProfile/Inputs/inline-replay.txt
+++ llvm/test/Transforms/SampleProfile/Inputs/inline-replay.txt
@@ -1,2 +1,2 @@
-remark: 

[PATCH] D86156: [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults

2020-09-16 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

Thanks @asbirlea!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156


[PATCH] D86156: [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults

2020-09-15 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 292044.
modimo added a comment.

Rebase #2


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,10 +260,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,10 +279,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; 

[PATCH] D86156: [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults

2020-09-11 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 291353.
modimo added a comment.

Rebase


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -153,6 +155,8 @@
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Function Alias Analysis Results
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -255,10 +259,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -172,6 +174,8 @@
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Function Alias Analysis Results
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -274,10 +278,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy 

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-28 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D86156#2245103, @nikic wrote:

> I have no familiarity with BFI, so possibly stupid question: There is already 
> some similar handling as part of BFIImpl here: 
> https://github.com/llvm/llvm-project/blob/0f14b2e6cbb54c84ed3b00b0db521f5ce2d1e3f2/llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h#L1043-L1058
>  What is the difference to that / why are both needed?

Well this is a funny situation: it looks like someone else saw the same problem
we observed and pushed a fix that also uses VH callbacks (D75341). Both of us
arriving at the same solution is a good sign that it's a good fit.

I'm glad you pointed this out! Looking at my diff with both callbacks
incorporated, erase() gets called twice for the same node, but by the second
call the node is no longer in the DenseMap, so that erase is a no-op. This
explains why the two changes didn't catastrophically collide with each other.
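
As a small illustration of why the double erase is harmless, the sketch below erases the same key twice from a map. It assumes `std::unordered_map` as a stand-in for `llvm::DenseMap` so that it builds without LLVM, and `NodeMap`/`forgetBlock` are hypothetical names rather than anything in BlockFrequencyInfoImpl.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for the block-to-node map; std::unordered_map is used
// here instead of llvm::DenseMap so the sketch has no LLVM dependency.
struct NodeMap {
  std::unordered_map<int, int> Nodes; // block id -> node index

  // Mimics a "block deleted" callback: drops the entry for BlockId and
  // reports how many entries were actually removed (0 or 1).
  std::size_t forgetBlock(int BlockId) { return Nodes.erase(BlockId); }
};
```

Calling `forgetBlock` a second time for the same block removes nothing and leaves the other entries untouched, which matches the behaviour described above for two callbacks firing on one deleted block.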

Given all that, the changes here to pass along the BFI information in the loop
passes and allow its use in LICM still seem meaningful enough to commit. I've
removed the redundant callback I added and am updating the description to
reflect the latest changes.




Comment at: llvm/lib/Transforms/Scalar/LoopDistribute.cpp:1062
+LoopStandardAnalysisResults AR = {AA,  AC,  DT,  LI, SE,
+  TLI, TTI, nullptr, nullptr};
 return LAM.getResult<LoopAccessAnalysis>(L, AR);

nikic wrote:
> Huh, surprised that clang-format allows this.
I also thought that was a mis-format at first but the combination of rules 
turns out to prefer this.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-28 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288719.
modimo added a comment.

Remove redundant VH callback as @nikic helpfully pointed out!


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,10 +260,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,10 +279,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; 

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: llvm/include/llvm/Transforms/Scalar/LoopPassManager.h:273
   : nullptr;
+BlockFrequencyInfo *BFI = UseBlockFrequencyInfo
+  ? ((F))

asbirlea wrote:
> Add `&& F.hasProfileData()` check here, in the NPM as well?
Makes sense, added.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288477.
modimo added a comment.

Condition usage of BFI on PGO in the new PM as well


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,10 +260,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,10 +279,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D86156#2242872, @asbirlea wrote:

> As a general note, it may make sense to include BFI in the set of loop passes 
> always preserved (`getLoopPassPreservedAnalyses()`), if its nature is to 
> always be preserved (with some potential info loss) due to the callbacks 
> deleting blocks. But since we're only looking at LICM effect for now, this 
> can be a follow up when/if needed.

Certainly, that makes a lot of sense and makes it easier for any other loop
optimization to take advantage of this data. I definitely want to make that
happen. Process-wise, I agree that doing it as a follow-up makes more sense
than tacking it onto this patch.




Comment at: llvm/lib/Transforms/Scalar/LICM.cpp:220
   
().getDomTree(),
+  ().getBFI(),
   ().getTLI(

nikic wrote:
> I believe that to make this actually lazy the getBFI() call needs to be 
> guarded by some other check for presence of profiling data, otherwise it will 
> be computed unconditionally at this point. Typically something like 
> F.hasProfileData() or PSI.hasProfileSummary().
Appreciate the diligence here; your website is great!

Your assessment is correct: the ".getBFI()" call on the lazy pass will
calculate BFI. I've guarded it under hasProfileData now.
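
The guard can be pictured with the following minimal sketch, which only runs an expensive computation when profile data is present. `FakeFunction`, `FakeBFI` and `getBFIIfProfiled` are hypothetical stand-ins for illustration; they are not LLVM types and this is not the code in LICM.cpp.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-ins; none of these are LLVM types.
struct FakeFunction {
  bool HasProfileData;
};
struct FakeBFI {
  int Dummy;
};

// Returns the frequency info only when the function carries profile data.
// When it does not, the (potentially expensive) Compute callback is never
// invoked and the caller receives nullptr.
const FakeBFI *
getBFIIfProfiled(const FakeFunction &F,
                 const std::function<const FakeBFI &()> &Compute) {
  return F.HasProfileData ? &Compute() : nullptr;
}
```

The conditional operator evaluates only the selected branch, so the callback stays untouched on the no-profile path; callers then have to treat a null result as "no BFI available".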

I see in the "about" section that I can use your setup to test TP changes 
myself, here's my fork of llvm-project if that option is still available: 
https://github.com/modiking/llvm-project



Comment at: llvm/test/Other/opt-O2-pipeline.ll:281
 ; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis

asbirlea wrote:
> Mark LICM to preserve these passes so they get moved above LICM rather than 
> recomputed here (same as they are preserved in unswitch).
Marked, these are no longer calculated again.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288445.
modimo added a comment.

Only use BFI when profiling is enabled, and have LICM mark BFI as preserved


Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,10 +260,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,10 

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: llvm/lib/Passes/PassBuilder.cpp:522
   FPM.addPass(createFunctionToLoopPassAdaptor(
-  std::move(LPM2), /*UseMemorySSA=*/false, DebugLogging));
+  std::move(LPM2), /*UseMemorySSA=*/false, /*UseBlockFrequencyInfo=*/true,
+  DebugLogging));

asbirlea wrote:
> It doesn't look like `UseBlockFrequencyInfo` is used for LPM2 here and below. 
> Would it make sense to set it to `false` at this point?
> 
Good catch, changed to false for LPM2.



Comment at: llvm/test/Other/pass-pipelines.ll:57
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and

asbirlea wrote:
> Add `; CHECK-O2: Loop Pass Manager` along with this comment.
> 
> 
> Please consider if splitting the loop pass pipeline has any effects on 
> optimizations. This is for the legacyPM only, so those who switched to the 
> newPM will not be affected.
> The solution may be to mark the analyses preserved in loop unswitch.
To check my understanding here, with the split loop pass pipeline the phases 
look like the following:

```
Lazy Branch Probability Analysis
Lazy Block Frequency Analysis
Loop Pass Manager
  Loop Invariant Code Motion
Loop Pass Manager
  Unswitch loops
```
Walking through the code, Loop Pass Manager by itself doesn't recalculate or 
produce additional analyses. Thus the difference appears to arise as follows:
1. combined loop pass: loop opts are run one after the other per loop, so if 
you have loops L1 and L2 the order would be L1(LICM, Unswitch) -> L2 (LICM, 
Unswitch)
2. split loop pass: in this case it would be L1(LICM)->L2(LICM) into 
L1(Unswitch)->L2(Unswitch)
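
The two orderings above can be sketched with a toy model. This is only an 
illustration under simplifying assumptions: `runManager`, `runCombined` and 
`runSplit` are hypothetical names, loops are visited in the order given, and 
nothing here models how the legacy pass manager actually schedules loops.

```cpp
#include <string>
#include <vector>

// One "manager": runs every pass it holds over a loop before moving on to
// the next loop, and records the order as a trace string.
std::string runManager(const std::vector<std::string> &Loops,
                       const std::vector<std::string> &Passes) {
  std::string Trace;
  for (const std::string &L : Loops)
    for (const std::string &P : Passes)
      Trace += L + "(" + P + ") ";
  return Trace;
}

// Combined: a single manager holds both passes.
std::string runCombined(const std::vector<std::string> &Loops) {
  return runManager(Loops, {"LICM", "Unswitch"});
}

// Split: two managers run back to back, each over all loops.
std::string runSplit(const std::vector<std::string> &Loops) {
  return runManager(Loops, {"LICM"}) + runManager(Loops, {"Unswitch"});
}
```

With loops L1 and L2, the combined trace interleaves the passes per loop, 
while the split trace finishes LICM on every loop before any unswitching 
starts.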

My qualitative assessment is that the impact here is quite minimal. The main 
scenario I can think of where differences could occur is that having all LICM 
completed early can change the costing in unswitching for nested loops. 

Unswitching only occurs once in the legacyPM and always after LICM, so it seems 
fairly tame to build in this cross-pass dependence by marking lazy BFI/BPI 
preserved. I'd like to know if this level of cross-pass dependence is 
potentially an issue and if there's precedent for doing so, if you have that 
context.

I've made the changes in the latest commit so that they exist in the same pass 
pipeline again by marking lazy BFI/BPI as preserved in unswitching. The value 
is definitely there to prevent unwanted side-effects.




[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288172.
modimo added a comment.

Remove the need for BFI in LPM2 and set unswitching to preserve lazy BPI/BFI 
so it can remain in the same loop pass manager as LICM


Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,6 +260,8 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
 ; CHECK-NEXT:   Loop Pass Manager
 ; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,6 +279,8 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
+; 

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo added a comment.

In D86156#2231710 , @nikic wrote:

> This change adds three PDT calculations to the standard pipeline. Please try 
> to avoid the PDT calculations if PGO is not used, possibly by using LazyBPI.

Good catch, changed to use LazyBFI, which eliminates the extra PDT calculations 
unless BFI is actually accessed under PGO.
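
To make the "lazy" idea concrete, here is a small sketch of an analysis 
wrapper that defers its work until the first access. `LazyAnalysis` and 
`ExpensiveResult` are hypothetical names for this example only; this is not 
how LLVM's lazy BFI pass is implemented, just the general pattern.

```cpp
#include <cassert>

// Hypothetical result of an expensive analysis.
struct ExpensiveResult {
  int Value = 0;
};

class LazyAnalysis {
  bool Computed = false;
  int Computations = 0; // counts how often the expensive step actually ran
  ExpensiveResult Result;

public:
  // Computes on first access and reuses the stored result afterwards.
  const ExpensiveResult &get() {
    if (!Computed) {
      ++Computations;
      Result.Value = 42; // placeholder for the real computation
      Computed = true;
    }
    return Result;
  }
  int computations() const { return Computations; }
};

// Returns the number of computations performed: the analysis is only
// accessed when UsePGO is true, and a second access reuses the first result.
int demoLazy(bool UsePGO) {
  LazyAnalysis LA;
  if (UsePGO) {
    const ExpensiveResult &R = LA.get();
    assert(R.Value == 42);
    (void)R;
    LA.get();
  }
  return LA.computations();
}
```

Scheduling such a wrapper is cheap, since the cost is only paid on the path 
that actually asks for the result.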




[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-26 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 288125.
modimo added a comment.

Change to LazyBFI for legacy pass manager to prevent rebuilding the 
post-dominator tree


Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/pass-pipelines.ll
===
--- llvm/test/Other/pass-pipelines.ll
+++ llvm/test/Other/pass-pipelines.ll
@@ -54,7 +54,7 @@
 ; CHECK-O2-NOT: Manager
 ; CHECK-O2: Loop Pass Manager
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and
 ; causing new loop pass managers.
 ; CHECK-O2: Simplify the CFG
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,8 +97,11 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
+; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Unswitch loops
 ; CHECK-NEXT: Simplify the CFG
 ; CHECK-NEXT: Dominator Tree Construction
@@ -154,6 +157,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,6 +261,8 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
+; CHECK-NEXT:   Lazy Branch Probability Analysis
+; CHECK-NEXT:   Lazy Block Frequency Analysis
 ; CHECK-NEXT:   Loop Pass Manager
 ; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,8 +116,11 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-21 Thread Di Mo via Phabricator via cfe-commits
modimo added inline comments.



Comment at: llvm/include/llvm/Transforms/Scalar/LoopPassManager.h:408
 FunctionToLoopPassAdaptor
 createFunctionToLoopPassAdaptor(LoopPassT Pass, bool UseMemorySSA = false,
+bool UseBlockFrequencyInfo = false,

@asbirlea Assuming this change matches expectations of making BFI only present 
when LICM, or a loop pass manager containing LICM, is used: I'm looking at the 
users of this and they seem to fall into 2 categories:
1. Those that specify the optional flags
2. Those that only pass in the LoopPassT

While updating callers for the new flag, I found that, due to C++ 
default-argument behavior, a call site that wasn't updated to pass all 3 
optional parameters will bind DebugLogging to UseBlockFrequencyInfo, which is a 
nasty error. I think a way around it is to enforce overloaded functions with 
either 0 additional parameters or every "optional" parameter specified, so the 
previous scenario becomes a build-time error rather than a runtime issue.
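
The pitfall and the proposed workaround can be sketched as follows. 
`AdaptorConfig`, `makeWithDefaults`, `makeStrict` and `staleCallSite` are 
hypothetical names for this example; the real 
`createFunctionToLoopPassAdaptor` has a different signature and return type.

```cpp
// Records which flag each positional argument ended up bound to.
struct AdaptorConfig {
  bool UseMemorySSA;
  bool UseBlockFrequencyInfo;
  bool DebugLogging;
};

// Defaulted version: inserting a new flag before DebugLogging silently
// changes the meaning of the second positional argument at stale call sites.
AdaptorConfig makeWithDefaults(bool UseMemorySSA = false,
                               bool UseBlockFrequencyInfo = false,
                               bool DebugLogging = false) {
  return {UseMemorySSA, UseBlockFrequencyInfo, DebugLogging};
}

// Overload-based version: either no flags at all, or all three.
AdaptorConfig makeStrict() { return {false, false, false}; }
AdaptorConfig makeStrict(bool UseMemorySSA, bool UseBlockFrequencyInfo,
                         bool DebugLogging) {
  return {UseMemorySSA, UseBlockFrequencyInfo, DebugLogging};
}

// A call site written before the new flag existed, intending to pass
// (UseMemorySSA, DebugLogging). It still compiles against the defaulted
// version, but DebugLogging now lands in UseBlockFrequencyInfo.
AdaptorConfig staleCallSite(bool DebugLogging) {
  return makeWithDefaults(/*UseMemorySSA=*/false, DebugLogging);
  // makeStrict(false, DebugLogging) would not compile: there is no
  // two-argument overload, so the stale call is caught at build time.
}
```

The overload set trades a little convenience for the guarantee that adding a 
flag forces every partially specified call site to be revisited.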



Comment at: llvm/include/llvm/Transforms/Scalar/LoopPassManager.h:278
AM.getResult(F),
+   
(F),
MSSA};

asbirlea wrote:
> This should not be unconditional. See MSSA approach.
Fixed it up to resemble MSSA



Comment at: llvm/test/Transforms/LoopRotate/pr35210.ll:51
+; MSSA-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; MSSA-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f

asbirlea wrote:
> e.g. there's no use of creating these for LoopRotate.
Changed up so LoopRotate no longer has a dependency on BFI




[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-21 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 287152.
modimo added a comment.
Herald added subscribers: lxfind, nikic.

@asbirlea Thanks for taking a look!

I updated the BFI handling to resemble MSSA as recommended, so BFI is no longer 
calculated unless LICM is invoked. I also removed the spurious formatting diffs.


Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/pass-pipelines.ll
===
--- llvm/test/Other/pass-pipelines.ll
+++ llvm/test/Other/pass-pipelines.ll
@@ -54,7 +54,7 @@
 ; CHECK-O2-NOT: Manager
 ; CHECK-O2: Loop Pass Manager
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and
 ; causing new loop pass managers.
 ; CHECK-O2: Simplify the CFG
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -96,9 +96,13 @@
 ; CHECK-NEXT: Scalar Evolution Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
+; CHECK-NEXT: Post-Dominator Tree Construction
+; CHECK-NEXT: Branch Probability Analysis
+; CHECK-NEXT: Block Frequency Analysis
 ; CHECK-NEXT: Memory SSA
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
+; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Unswitch loops
 ; CHECK-NEXT: Simplify the CFG
 ; CHECK-NEXT: Dominator Tree Construction
@@ -147,13 +151,17 @@
 ; CHECK-NEXT: Phi Values Analysis
 ; CHECK-NEXT: Memory Dependence Analysis
 ; CHECK-NEXT: Dead Store Elimination
+; CHECK-NEXT: Natural Loop Information
+; CHECK-NEXT: Post-Dominator Tree Construction
+; CHECK-NEXT: Branch Probability Analysis
+; CHECK-NEXT: Block Frequency Analysis
 ; CHECK-NEXT: Function Alias Analysis Results
 ; CHECK-NEXT: Memory SSA
-; CHECK-NEXT: Natural Loop Information
 ; CHECK-NEXT: Canonicalize natural loops
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -251,11 +259,15 @@
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Combine redundant instructions
+; 

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-18 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 286392.
modimo added a comment.

Commit my changes (crazy I know) so that the diff is actually updated for 
linting


Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/loop-pm-invalidation.ll
  llvm/test/Other/new-pass-manager.ll
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/test/Transforms/LoopRotate/pr35210.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Transforms/LoopRotate/pr35210.ll
===
--- llvm/test/Transforms/LoopRotate/pr35210.ll
+++ llvm/test/Transforms/LoopRotate/pr35210.ll
@@ -19,12 +19,15 @@
 ; CHECK-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; CHECK-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; CHECK-NEXT: Running analysis: TargetIRAnalysis on f
+; CHECK-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; CHECK-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; CHECK-NEXT: Starting Loop pass manager run.
 ; CHECK-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; CHECK-NEXT: Folding loop latch bb4 into bb
 ; CHECK-NEXT: Finished Loop pass manager run.
 ; CHECK-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; CHECK-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running pass: ADCEPass on f
 ; CHECK-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; CHECK-NEXT: Finished llvm::Function pass manager run.
@@ -44,12 +47,15 @@
 ; MSSA-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; MSSA-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; MSSA-NEXT: Running analysis: TargetIRAnalysis on f
+; MSSA-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; MSSA-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; MSSA-NEXT: Starting Loop pass manager run.
 ; MSSA-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; MSSA-NEXT: Folding loop latch bb4 into bb
 ; MSSA-NEXT: Finished Loop pass manager run.
 ; MSSA-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; MSSA-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running pass: ADCEPass on f
 ; MSSA-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; MSSA-NEXT: Finished llvm::Function pass manager run.
Index: llvm/test/Other/pass-pipelines.ll
===
--- llvm/test/Other/pass-pipelines.ll
+++ llvm/test/Other/pass-pipelines.ll
@@ -54,7 +54,7 @@
 ; CHECK-O2-NOT: Manager
 ; CHECK-O2: Loop Pass Manager
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and
 ; causing new loop pass managers.
 ; CHECK-O2: Simplify the CFG
Index: llvm/test/Other/new-pm-thinlto-defaults.ll

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-18 Thread Di Mo via Phabricator via cfe-commits
modimo updated this revision to Diff 286390.
modimo added a comment.

Linting


Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/loop-pm-invalidation.ll
  llvm/test/Other/new-pass-manager.ll
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/test/Transforms/LoopRotate/pr35210.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -10,6 +10,9 @@
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Transforms/LoopRotate/pr35210.ll
===
--- llvm/test/Transforms/LoopRotate/pr35210.ll
+++ llvm/test/Transforms/LoopRotate/pr35210.ll
@@ -19,12 +19,15 @@
 ; CHECK-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; CHECK-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; CHECK-NEXT: Running analysis: TargetIRAnalysis on f
+; CHECK-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; CHECK-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; CHECK-NEXT: Starting Loop pass manager run.
 ; CHECK-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; CHECK-NEXT: Folding loop latch bb4 into bb
 ; CHECK-NEXT: Finished Loop pass manager run.
 ; CHECK-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; CHECK-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running pass: ADCEPass on f
 ; CHECK-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; CHECK-NEXT: Finished llvm::Function pass manager run.
@@ -44,12 +47,15 @@
 ; MSSA-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; MSSA-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; MSSA-NEXT: Running analysis: TargetIRAnalysis on f
+; MSSA-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; MSSA-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; MSSA-NEXT: Starting Loop pass manager run.
 ; MSSA-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; MSSA-NEXT: Folding loop latch bb4 into bb
 ; MSSA-NEXT: Finished Loop pass manager run.
 ; MSSA-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; MSSA-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; MSSA-NEXT: Running pass: ADCEPass on f
 ; MSSA-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; MSSA-NEXT: Finished llvm::Function pass manager run.
Index: llvm/test/Other/pass-pipelines.ll
===
--- llvm/test/Other/pass-pipelines.ll
+++ llvm/test/Other/pass-pipelines.ll
@@ -54,7 +54,7 @@
 ; CHECK-O2-NOT: Manager
 ; CHECK-O2: Loop Pass Manager
 ; CHECK-O2: Loop Pass Manager
-; CHECK-O2-NOT: Manager
+; Requiring block frequency for LICM will place ICM and rotation under separate Loop Pass Manager
 ; FIXME: We shouldn't be pulling out to simplify-cfg and instcombine and
 ; causing new loop pass managers.
 ; CHECK-O2: Simplify the CFG
Index: llvm/test/Other/new-pm-thinlto-defaults.ll
===
--- llvm/test/Other/new-pm-thinlto-defaults.ll
+++ 

[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-18 Thread Di Mo via Phabricator via cfe-commits
modimo created this revision.
modimo added reviewers: wenlei, asbirlea, vsk.
modimo added a project: LLVM.
Herald added subscribers: llvm-commits, cfe-commits, dexonsmith, steven_wu, 
hiraditya.
Herald added a project: clang.
modimo requested review of this revision.

D65060 uncovered that using BFI in loop passes can lead to non-deterministic 
behavior when a block is re-used while old BFI data for it is still retained.

To make sure BFI is preserved through loop passes, a Value Handle (VH) callback 
is registered on the blocks themselves. When a block is freed, the callback 
also wipes out the accompanying BFI entry, so stale BFI data can no longer 
persist. This resolves the determinism issue.
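
The mechanism above can be illustrated with a minimal standalone sketch. It 
does not use LLVM's real classes: `Block`, `FrequencyTable`, and 
`staleEntryIsDropped` are hypothetical names made up for illustration, and the 
callback list stored on the block only stands in for what a VH callback 
provides. It assumes the table outlives every block it tracks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <memory>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a basic block: it notifies observers on destruction.
class Block {
public:
  using DeletionCallback = std::function<void(const Block *)>;

  Block() = default;
  Block(const Block &) = delete;
  Block &operator=(const Block &) = delete;

  ~Block() {
    // Let every observer drop whatever it keyed on this block's address.
    for (auto &CB : Callbacks)
      CB(this);
  }

  void onDeleted(DeletionCallback CB) { Callbacks.push_back(std::move(CB)); }

private:
  std::vector<DeletionCallback> Callbacks;
};

// Hypothetical stand-in for BFI: a side table keyed by block address.
// Assumption: the table outlives all blocks it tracks, because the
// registered callback captures `this`.
class FrequencyTable {
public:
  void setFrequency(Block &B, std::uint64_t Freq) {
    bool IsNew = Freqs.find(&B) == Freqs.end();
    Freqs[&B] = Freq;
    // Register the erase-on-delete callback once per tracked block.
    if (IsNew)
      B.onDeleted([this](const Block *Dead) { Freqs.erase(Dead); });
  }

  bool hasFrequency(const Block *B) const { return Freqs.count(B) != 0; }
  std::size_t size() const { return Freqs.size(); }

private:
  std::unordered_map<const Block *, std::uint64_t> Freqs;
};

// Returns true if freeing a block also removes its entry, so a later block
// allocated at the same address cannot inherit the old frequency.
inline bool staleEntryIsDropped() {
  FrequencyTable Table;
  auto B = std::make_unique<Block>();
  const Block *Addr = B.get();
  Table.setFrequency(*B, 42);
  bool TrackedBefore = Table.hasFrequency(Addr);
  B.reset(); // Destroys the block; the callback erases its entry.
  return TrackedBefore && !Table.hasFrequency(Addr);
}
```

Without the callback, the entry keyed on the freed address would survive, and 
a later block that happened to be allocated at that address would pick up the 
old value. That is the kind of allocation-dependent behavior described above.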

An optimistic approach would be to incrementally update BFI information 
throughout the loop passes rather than only invalidating it for removed 
blocks. The issues with that are:

1. It is not clear how BFI information should be incrementally updated: if a 
block is duplicated, does its BFI information come with it? What if the block 
is split, modified, or moved around?
2. Even assuming we can address these problems, the implementation would be a 
massive undertaking.

There's a known need for BFI in LICM analysis, which requires correct but not 
incrementally updated BFI data. Other loop passes can now also use this level 
of information, and richer updates can be added as needed.

This diff also moves BFI into LoopStandardAnalysisResults, since the previous 
method of using getCachedResults now (correctly!) statically asserts (D72893) 
that this data isn't static through the loop passes.

Testing:
ninja check


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D86156

Files:
  clang/test/CodeGen/thinlto-distributed-newpm.ll
  llvm/include/llvm/Analysis/BlockFrequencyInfo.h
  llvm/include/llvm/Analysis/BlockFrequencyInfoImpl.h
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Analysis/BlockFrequencyInfo.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/loop-pm-invalidation.ll
  llvm/test/Other/new-pass-manager.ll
  llvm/test/Other/new-pm-defaults.ll
  llvm/test/Other/new-pm-thinlto-defaults.ll
  llvm/test/Other/pass-pipelines.ll
  llvm/test/Transforms/LoopRotate/pr35210.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -10,6 +10,9 @@
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Transforms/LoopRotate/pr35210.ll
===
--- llvm/test/Transforms/LoopRotate/pr35210.ll
+++ llvm/test/Transforms/LoopRotate/pr35210.ll
@@ -19,12 +19,15 @@
 ; CHECK-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; CHECK-NEXT: Running analysis: ScalarEvolutionAnalysis on f
 ; CHECK-NEXT: Running analysis: TargetIRAnalysis on f
+; CHECK-NEXT: Running analysis: BlockFrequencyAnalysis on f
+; CHECK-NEXT: Running analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running analysis: InnerAnalysisManagerProxy{{.*}} on f
 ; CHECK-NEXT: Starting Loop pass manager run.
 ; CHECK-NEXT: Running pass: LoopRotatePass on Loop at depth 1 containing: %bb,%bb4
 ; CHECK-NEXT: Folding loop latch bb4 into bb
 ; CHECK-NEXT: Finished Loop pass manager run.
 ; CHECK-NEXT: Invalidating analysis: PostDominatorTreeAnalysis on f
+; CHECK-NEXT: Invalidating analysis: BranchProbabilityAnalysis on f
 ; CHECK-NEXT: Running pass: ADCEPass on f
 ; CHECK-NEXT: Running analysis: PostDominatorTreeAnalysis on f
 ; CHECK-NEXT: Finished llvm::Function pass manager run.
@@ -44,12 +47,15 @@
 ; MSSA-NEXT: Running analysis: TargetLibraryAnalysis on f
 ; MSSA-NEXT: