[PATCH] D157518: Avoid running optimization passes in frontend test

2023-09-05 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.

lgtm, thanks.




Comment at: clang/test/CodeGenCXX/attr-likelihood-if-branch-weights.cpp:147-148
 
-// CHECK: ![[PROF_LIKELY]] = !{!"branch_weights", i32 [[UNLIKELY]], i32 [[LIKELY]]}
-// CHECK: ![[PROF_UNLIKELY]] = !{!"branch_weights", i32 [[LIKELY]], i32 [[UNLIKELY]]}

MatzeB wrote:
> aeubanks wrote:
> > wenlei wrote:
> > > I thought this was the purpose of the test, which is no longer checked
> > > after the change?
> > those should be covered by LowerExpectIntrinsic tests
> The frontend rewrites `[[likely]]` and `[[unlikely]]` into calls to 
> `llvm.expect`. And this is still tested here. Running 
> `lower-expect-intrinsic` pass to lower the `llvm.expect` to `!prof` 
> annotations doesn't need to be tested here as we have separate tests for that 
> pass.
Ah, makes sense. I see that we have existing coverage for lowering 
`llvm.expect` to `!prof` in `expect-*.ll`, so we're good. 
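For reference, a minimal illustration of the construct under discussion (not
taken from the test itself): the attribute on the branch is what the frontend
turns into an `llvm.expect` call, and only the `lower-expect-intrinsic` pass
turns that into `!prof` branch_weights metadata.

```
// Illustration only; the actual test uses its own functions.
bool is_small(int n) {
  if (n < 16) [[likely]]   // frontend: lowered to a call to llvm.expect
    return true;           // lower-expect pass: llvm.expect becomes !prof metadata
  return false;
}
```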


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157518/new/

https://reviews.llvm.org/D157518

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D158668: RFC: Add getLikelyBranchWeight helper function

2023-09-05 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D158668#4611988 , @MatzeB wrote:

> And as another strawman / discussion-starter I put up D158680 
>  where I use `!{"branch_weights", i32 1, 
> i32 0}` to represent likely branches and the actual "LikelyWeight" mostly 
> becomes an internal implementation detail of the BranchProbabilityInfo 
> class... This seemed like a neat way to get an abstract "likely", "unlikely" 
> notation, but not sure of the effects if we no longer would have "truly zero" 
> weights because they would be interpreted differently now...

Commented on that patch - I'm not convinced that canonicalizing {X,0} -> {1,0} is
a good idea, mostly because: 1) we don't attempt to canonicalize {X*Z,Y*Z} ->
{X,Y}; 2) there are places where the absolute values matter; 3) canonicalizing
branch_weights as a side effect of BPI doesn't seem right.

> Remove -unlikely-branch-weight option and just use 1.

Seems fine, I don't have a strong opinion.

> Add getLikelyBranchWeight() function returning the value of the 
> -likely-branch-weight option defaulting to 2000.

This makes sense to me.

> prefer pass-specific weights rather than a unified notion of 
> "likely"/"unlikely"... So with the latest round of patches it's probably best 
> to abandon this for now.

I don't see them as mutually exclusive. In fact, when I look at your other
patches, I feel that we need a default likely/unlikely for cases where we just
don't have a good answer and hence fall back to the default. So the default
likely/unlikely probability is a layer below the pass-specific settings.
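To make the layering concrete, here is a small sketch (the names are
hypothetical, not from D158668 or the related patches): a pass-specific weight,
when available, sits on top of the shared likely-branch-weight default.

```
#include <cstdint>
#include <optional>

// Hypothetical illustration of the layering described above.
static uint32_t getDefaultLikelyWeight() { return 2000; } // stand-in for getLikelyBranchWeight()

static uint32_t getLikelyWeightForPass(std::optional<uint32_t> PassSpecificWeight) {
  // Pass-specific tuning wins; otherwise fall back to the shared default.
  return PassSpecificWeight ? *PassSpecificWeight : getDefaultLikelyWeight();
}
```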


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D158668/new/

https://reviews.llvm.org/D158668

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D157518: Avoid running optimization passes in frontend test

2023-09-05 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: clang/test/CodeGenCXX/attr-likelihood-if-branch-weights.cpp:147-148
 
-// CHECK: ![[PROF_LIKELY]] = !{!"branch_weights", i32 [[UNLIKELY]], i32 [[LIKELY]]}
-// CHECK: ![[PROF_UNLIKELY]] = !{!"branch_weights", i32 [[LIKELY]], i32 [[UNLIKELY]]}

I thought this was the purpose of the test, which is no longer checked after
the change?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D157518/new/

https://reviews.llvm.org/D157518

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152741: [WPD] implement -funknown-vtable-visibility-filepaths

2023-06-23 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> For concrete data are you talking about between the different solutions e.g. 
> --lto-whole-program-visibility, -funknown-vtable-visibility-filepaths, RTTI 
> based, FatLTO based etc or something else?

Right, between the different solutions. The RTTI-based solution doesn't exist yet,
so maybe first compare using `-fwhole-program-vtables` on a known-safe set of
files vs. using `-funknown-vtable-visibility-filepaths` on a known-unsafe set of
files.

> Some data would help quantify the difference, I'll hack up a module 
> granularity implementation and compare to the current one.

I wasn't asking about implementing `-funknown-vtable-visibility-filepaths` with 
module (instead of header) granularity, but just the comparison mentioned above.

> The ordering for conflicts is embedded in the logic for 
> CodeGenModule::GetVCallVisibilityLevel which has priority order of

I was thinking about the different sources of visibility rather than the absolute
order of visibility itself - i.e., what is the rule if
`__attribute__((visibility("...")))` conflicts with the
`-funknown-vtable-visibility-filepaths` setting for a specific type? This may not
be an immediately important question, but it is an example of the knock-on
effects of the added complexity, which may or may not be justified depending on
the benefit - which goes back to data from experiments.

We have `-wholeprogramdevirt-skip`; with this patch, we will have
`-funknown-vtable-visibility-filepaths`; later on, we will have another
RTTI-based solution, and then there's the FatObj solution. It feels like a lot of
machinery trying to solve one problem, so I'm wondering whether this addition
will provide enough value in the end state.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D152741: [WPD] implement -funknown-vtable-visibility-filepaths

2023-06-23 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> The big advantage of doing this in the FE is that we know which types are 
> actually coming from the native headers. Blocking all types in the TU is 
> overly conservative and also less stable as header changes can effectively 
> turn on/off unrelated large chunks of WPD.

This is clearly more selective about punting on unsafe devirtualization; however,
do you have data comparing the effectiveness of the two (module granularity vs.
header granularity)?

I also think introducing an unknown visibility is a good idea for this to work,
but it is going to be exposed to users (not just a hidden implementation detail),
so we would probably need a spec/rule for handling conflicting visibility from
different sources and make that explicit here:
https://clang.llvm.org/docs/LTOVisibility.html.

There's a spectrum of solutions we could use to make WPD safer, but we need to
be careful not to make the whole thing too convoluted, with multiple solutions
implemented but little differentiation in their incremental value (extra perf,
extra safety). So having concrete data backing the incremental value of this
solution would be helpful.




Comment at: clang/include/clang/Basic/CodeGenOptions.h:191
+  struct RegexWithPattern {
+std::string Pattern;
+std::shared_ptr Regex;

The Pattern string doesn't seem to be used anywhere; can we simplify this by
using `llvm::Regex` directly instead of `RegexWithPattern`?
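For illustration, a minimal sketch of what using `llvm::Regex` directly could
look like, assuming only a compiled regex plus a validity check is needed (this
is not the patch's code):

```
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Regex.h"
#include <string>

// Sketch only: validate the pattern once, then match file paths against it.
static bool matchesUnknownVisibilityPath(llvm::StringRef Pattern,
                                         llvm::StringRef Path,
                                         std::string &Error) {
  llvm::Regex RE(Pattern);
  if (!RE.isValid(Error)) // would be reported via Diags in CompilerInvocation
    return false;
  return RE.match(Path);
}
```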



Comment at: clang/include/clang/Basic/CodeGenOptions.h:198
+
+/// Returns true iff the optimization remark holds a valid regular
+/// expression.

Please update the comment - it still refers to optimization remarks.



Comment at: clang/include/clang/Driver/Options.td:3160
   NegFlag, BothFlags<[CoreOption]>>;
+def funknown_vtable_visibility_filepaths_EQ : Joined<["-"], 
"funknown-vtable-visibility-filepaths=">, Group,
+  HelpText<"Skip types from participating in whole-program vtable optimization 
defined under these filepaths. Pass empty string to unset">,

nit: `filepaths` -> `filepath` - this does not accept a comma-separated list of
paths (regexes).



Comment at: clang/include/clang/Driver/Options.td:3161
+def funknown_vtable_visibility_filepaths_EQ : Joined<["-"], 
"funknown-vtable-visibility-filepaths=">, Group,
+  HelpText<"Skip types from participating in whole-program vtable optimization 
defined under these filepaths. Pass empty string to unset">,
+  Flags<[CC1Option]>;

> Pass empty string to unset 

Remove this to be concise?



Comment at: clang/lib/Frontend/CompilerInvocation.cpp:2001
+if (!Pattern->isValid(RegexError)) {
+  Diags.Report(diag::err_drv_optimization_remark_pattern)
+  << RegexError << A->getAsString(Args);

rename `err_drv_optimization_remark_pattern` so it's not specific to remarks. 



Comment at: llvm/include/llvm/IR/GlobalObject.h:41
+// Type is unknown and should always be treated conservatively.
+VCallVisibilityUnknown = 3,
   };

This probably needs a doc change too:
https://clang.llvm.org/docs/LTOVisibility.html.

We also need to define the priority order when flag-derived visibility conflicts
with attribute-derived visibility.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D152741/new/

https://reviews.llvm.org/D152741

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D83906: [CodeGen] Emit a call instruction instead of an invoke if the called llvm function is marked nounwind

2023-03-08 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> I wonder if this is just a single example, where there could be various other 
> (header-related) peepholes that cause similar problems for stable output. 
> IIRC, the usual Clang approach is to make as-close-to-optimal IR up front, 
> but maybe in some situations it's desirable to delay optimizations to improve 
> stability. Another application where that could be useful is caching.

I think this nounwind propagation is a classic IPA problem, where you need a
proper per-function summary first and then propagate it through the call graph to
get the final per-function attribute (as the Attributor does). The frontend is
not the right place to do this kind of IPA/IPO.
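As a rough illustration of the IPA shape being described - a toy bottom-up fixed
point over a call graph, not the actual FunctionAttrs/Attributor implementation:

```
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model: each function lists its callees; LocallyThrows holds functions
// that contain a potentially-throwing construct themselves.
using CallGraph = std::map<std::string, std::vector<std::string>>;

std::set<std::string>
computeNoUnwind(const CallGraph &CG, const std::set<std::string> &LocallyThrows) {
  std::set<std::string> NoUnwind;
  for (const auto &KV : CG)
    if (!LocallyThrows.count(KV.first))
      NoUnwind.insert(KV.first);

  // A caller stays nounwind only if every callee (including ones outside the
  // graph, which are treated conservatively) is nounwind.
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (const auto &KV : CG) {
      if (!NoUnwind.count(KV.first))
        continue;
      for (const std::string &Callee : KV.second) {
        if (!NoUnwind.count(Callee)) {
          NoUnwind.erase(KV.first);
          Changed = true;
          break;
        }
      }
    }
  }
  return NoUnwind;
}
```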

> Do we want IRGen to prefer stable IR, or optimized IR? Should there be a -cc1 
> flag to decide (which AutoFDO could set)?

Unstable IR is a side effect of trying to do IPA in the frontend, which is
naturally going to be half-complete.

I'm not sure I follow why the invoke -> call optimization cannot be done in the
mid-end. If it can, I think it should be deferred to the mid-end.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D83906/new/

https://reviews.llvm.org/D83906

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D142994: [UsersManual] Add llvm-profgen as an alternative tool for AutoFDO profile generation.

2023-01-31 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a subscriber: davidxl.
wenlei added a comment.

Given that llvm-profgen is in the LLVM tree, I'm wondering if it makes sense to
treat it as the first option instead of as an alternative to the out-of-tree
create_llvm_prof. wdyt @davidxl


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D142994/new/

https://reviews.llvm.org/D142994

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136846: [Driver] Add -fsample-profile-use-profi

2022-11-05 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D136846#3910272 , @MaskRay wrote:

> In D136846#3890699 , @wenlei wrote:
>
>> Did you see measurable perf boost with profi on autofdo? Or what motivated 
>> you to turn on profi? In most cases, profi helps when csspgo is used 
>> (instead of traditional autofdo).
>
> LG in my view, but @wenlei may have something to add.

lgtm.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136846/new/

https://reviews.llvm.org/D136846

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136846: [Driver] Add -fsample-profile-use-profi

2022-10-27 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

Did you see measurable perf boost with profi on autofdo? Or what motivated you 
to turn on profi? In most cases, profi helps when csspgo is used (instead of 
traditional autofdo).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136846/new/

https://reviews.llvm.org/D136846

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D136846: [Driver] Enable profi flag as user-facing flag

2022-10-27 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

I'm curious why you need profi to be a driver flag. There are many similar
tuning flags that don't have a corresponding driver flag, and profi is narrower
than most of them, as it's specific to one component (profile inference) unique
to sample PGO.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D136846/new/

https://reviews.llvm.org/D136846

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D134456: [PGO] Consider parent context when weighing branches with likelyhood.

2022-09-27 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a subscriber: davidxl.
wenlei added a comment.

For instrumentation PGO, changing the behavior may have perf implications. cc @davidxl


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134456/new/

https://reviews.llvm.org/D134456

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D134456: [PGO] Consider parent context when weighing branches with likelyhood.

2022-09-26 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

There's also a subtle difference between the different variants of PGO. For
instrumentation PGO, where profile quality is higher, profile counts always take
precedence. For sampling PGO (AutoFDO, CSSPGO), it tries not to overwrite any
existing branch weights, which means the weights derived from hints earlier are
honored. However, there's a count smoothing algorithm that tries to make CFG
counts consistent, which may occasionally overwrite branch weights that seem
really off based on the surrounding counts/flow.

The current default looks more like a hint than a directive.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D134456/new/

https://reviews.llvm.org/D134456

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D125291: Introduce @llvm.threadlocal.address intrinsic to access TLS variable (1/3)

2022-05-25 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a subscriber: weiwang.
wenlei added a comment.

+@weiwang


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D125291/new/

https://reviews.llvm.org/D125291

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D121969: Pass split-machine-functions to code generator when flto is used

2022-03-22 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

The linter warning seems legit, otherwise looks good.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121969/new/

https://reviews.llvm.org/D121969

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D121969: Pass split-machine-functions to code generator when flto is used

2022-03-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: clang/test/Driver/split-machine-functions.c:2
+// Split machine functions only work for ELF, so disable the test on Windows
+// UNSUPPORTED: system-windows
+

My understanding is that if you specify a target, e.g. `-target
x86_64-unknown-linux`, the test should still be runnable on a Windows host
platform.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D121969/new/

https://reviews.llvm.org/D121969

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120610: [DebugInfo] Include DW_TAG_skeleton_unit when looking for parent UnitDie

2022-03-12 Thread Wenlei He via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG4f320ca4ba15: [DebugInfo] Include DW_TAG_skeleton_unit when 
looking for parent UnitDie (authored by wenlei).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120610/new/

https://reviews.llvm.org/D120610

Files:
  llvm/lib/CodeGen/AsmPrinter/DIE.cpp
  llvm/test/DebugInfo/X86/fission-template.ll


Index: llvm/test/DebugInfo/X86/fission-template.ll
===
--- /dev/null
+++ llvm/test/DebugInfo/X86/fission-template.ll
@@ -0,0 +1,75 @@
+; Check that we handle template types for split-dwarf-inlining correctly.
+; RUN: llc -split-dwarf-file=%t.dwo -O2 < %s -dwarf-version=5 -mtriple=x86_64-unknown-linux-gnu -filetype=obj -o - | llvm-dwarfdump - | FileCheck %s
+;
+; The test case is generated from the following code
+; clang -cc1 -emit-llvm -fdebug-info-for-profiling -fsplit-dwarf-inlining -debug-info-kind=constructor -dwarf-version=5 -split-dwarf-file temp.dwo -O2
+;
+; void f1();
+;
+; template 
+; void f2() {
+;   f1();
+; }
+;
+; void f3() {
+;   f2();
+; }
+
+; CHECK:  .debug_info contents:
+; CHECK:DW_TAG_skeleton_unit
+; CHECK:  DW_TAG_subprogram
+; CHECK-NEXT: DW_AT_linkage_name   ("_Z2f2IiEvv")
+; CHECK-NEXT: DW_AT_name   ("f2")
+; CHECK:  DW_TAG_template_type_parameter
+; CHECK-NEXT:   DW_AT_type (0x{{.*}} "int")
+; CHECK-NEXT:   DW_AT_name ("T")
+; CHECK:  .debug_info.dwo contents:
+; CHECK:DW_TAG_compile_unit
+; CHECK:  DW_TAG_subprogram
+; CHECK-NEXT: DW_AT_linkage_name   ("_Z2f2IiEvv")
+; CHECK-NEXT: DW_AT_name   ("f2")
+; CHECK:  DW_TAG_template_type_parameter
+; CHECK-NEXT:   DW_AT_type (0x{{.*}} "int")
+; CHECK-NEXT:   DW_AT_name ("T")
+
+; ModuleID = 'split-debug-inlining-template.cpp'
+source_filename = 
"llvm-project/clang/test/CodeGen/split-debug-inlining-template.cpp"
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-redhat-linux-gnu"
+
+; Function Attrs: mustprogress nounwind
+define dso_local void @_Z2f3v() local_unnamed_addr #0 !dbg !6 {
+entry:
+  tail call void @_Z2f1v() #2, !dbg !11
+  ret void, !dbg !17
+}
+
+declare !dbg !18 void @_Z2f1v() local_unnamed_addr #1
+
+attributes #0 = { mustprogress nounwind "frame-pointer"="none" 
"min-legal-vector-width"="0" "no-trapping-math"="true" 
"stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" 
}
+attributes #1 = { "frame-pointer"="none" "no-trapping-math"="true" 
"stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" 
}
+attributes #2 = { nounwind }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4}
+!llvm.ident = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, 
producer: "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 
8b2b9c3fa91ebe583f8c634482885a669b82a1f0)", isOptimized: true, runtimeVersion: 
0, splitDebugFilename: "split-debug-inlining-template.c.tmp.dwo", emissionKind: 
FullDebug, debugInfoForProfiling: true, nameTableKind: None)
+!1 = !DIFile(filename: "llvm-project/clang/test/CodeGen/", directory: 
"build-debug", checksumkind: CSK_MD5, checksum: 
"0fb39b3bb5a60928b5d9c251b2d91b2c")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git 
8b2b9c3fa91ebe583f8c634482885a669b82a1f0)"}
+!6 = distinct !DISubprogram(name: "f3", linkageName: "_Z2f3v", scope: !7, 
file: !7, line: 11, type: !8, scopeLine: 11, flags: DIFlagPrototyped | 
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: 
!0, retainedNodes: !10)
+!7 = !DIFile(filename: 
"llvm-project/clang/test/CodeGen/split-debug-inlining-template.cpp", directory: 
"", checksumkind: CSK_MD5, checksum: "0fb39b3bb5a60928b5d9c251b2d91b2c")
+!8 = !DISubroutineType(types: !9)
+!9 = !{null}
+!10 = !{}
+!11 = !DILocation(line: 8, column: 3, scope: !12, inlinedAt: !16)
+!12 = distinct !DISubprogram(name: "f2", linkageName: "_Z2f2IiEvv", 
scope: !7, file: !7, line: 7, type: !8, scopeLine: 7, flags: DIFlagPrototyped | 
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: 
!0, templateParams: !13, retainedNodes: !10)
+!13 = !{!14}
+!14 = !DITemplateTypeParameter(name: "T", type: !15)
+!15 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!16 = distinct !DILocation(line: 12, column: 3, scope: !6)
+!17 = !DILocation(line: 13, column: 1, scope: !6)
+!18 = !DISubprogram(name: "f1", linkageName: "_Z2f1v", scope: !7, file: !7, 
line: 4, type: !8, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, 
retainedNodes: !10)
Index: llvm/lib/CodeGen/AsmPrinter/DIE.cpp
===
--- llvm/lib/Co

[PATCH] D120610: [DebugInfo] Include DW_TAG_skeleton_unit when looking for parent UnitDie

2022-03-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D120610#3376002 , @dblaikie wrote:

> It'd be good to include some testing beyond "does not crash" - like what was 
> the specific debug info we were trying to create when the crash was hit? 
> Perhaps we should be testing that (since the crash demonstrates we weren't 
> testing that anywhere else)

Sure, added a few checks around `DW_TAG_template_type_parameter` for both
`DW_TAG_skeleton_unit` and `DW_TAG_compile_unit`. Let me know whether what I
have here aligns with the conventions of similar tests.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120610/new/

https://reviews.llvm.org/D120610

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120610: [DebugInfo] Include DW_TAG_skeleton_unit when looking for parent UnitDie

2022-03-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 414737.
wenlei added a comment.

Add test check for template type parameters


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120610/new/

https://reviews.llvm.org/D120610

Files:
  llvm/lib/CodeGen/AsmPrinter/DIE.cpp
  llvm/test/DebugInfo/X86/fission-template.ll


Index: llvm/test/DebugInfo/X86/fission-template.ll
===
--- /dev/null
+++ llvm/test/DebugInfo/X86/fission-template.ll
@@ -0,0 +1,75 @@
+; Check that we handle template types for split-dwarf-inlining correctly.
+; RUN: llc -split-dwarf-file=%t.dwo -O2 < %s -dwarf-version=5 -mtriple=x86_64-unknown-linux-gnu -filetype=obj -o - | llvm-dwarfdump - | FileCheck %s
+;
+; The test case is generated from the following code
+; clang -cc1 -emit-llvm -fdebug-info-for-profiling -fsplit-dwarf-inlining -debug-info-kind=constructor -dwarf-version=5 -split-dwarf-file temp.dwo -O2
+;
+; void f1();
+;
+; template 
+; void f2() {
+;   f1();
+; }
+;
+; void f3() {
+;   f2();
+; }
+
+; CHECK:  .debug_info contents:
+; CHECK:DW_TAG_skeleton_unit
+; CHECK:  DW_TAG_subprogram
+; CHECK-NEXT: DW_AT_linkage_name   ("_Z2f2IiEvv")
+; CHECK-NEXT: DW_AT_name   ("f2")
+; CHECK:  DW_TAG_template_type_parameter
+; CHECK-NEXT:   DW_AT_type (0x{{.*}} "int")
+; CHECK-NEXT:   DW_AT_name ("T")
+; CHECK:  .debug_info.dwo contents:
+; CHECK:DW_TAG_compile_unit
+; CHECK:  DW_TAG_subprogram
+; CHECK-NEXT: DW_AT_linkage_name   ("_Z2f2IiEvv")
+; CHECK-NEXT: DW_AT_name   ("f2")
+; CHECK:  DW_TAG_template_type_parameter
+; CHECK-NEXT:   DW_AT_type (0x{{.*}} "int")
+; CHECK-NEXT:   DW_AT_name ("T")
+
+; ModuleID = 'split-debug-inlining-template.cpp'
+source_filename = 
"llvm-project/clang/test/CodeGen/split-debug-inlining-template.cpp"
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-redhat-linux-gnu"
+
+; Function Attrs: mustprogress nounwind
+define dso_local void @_Z2f3v() local_unnamed_addr #0 !dbg !6 {
+entry:
+  tail call void @_Z2f1v() #2, !dbg !11
+  ret void, !dbg !17
+}
+
+declare !dbg !18 void @_Z2f1v() local_unnamed_addr #1
+
+attributes #0 = { mustprogress nounwind "frame-pointer"="none" 
"min-legal-vector-width"="0" "no-trapping-math"="true" 
"stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" 
}
+attributes #1 = { "frame-pointer"="none" "no-trapping-math"="true" 
"stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" 
}
+attributes #2 = { nounwind }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4}
+!llvm.ident = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, 
producer: "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 
8b2b9c3fa91ebe583f8c634482885a669b82a1f0)", isOptimized: true, runtimeVersion: 
0, splitDebugFilename: "split-debug-inlining-template.c.tmp.dwo", emissionKind: 
FullDebug, debugInfoForProfiling: true, nameTableKind: None)
+!1 = !DIFile(filename: "llvm-project/clang/test/CodeGen/", directory: 
"build-debug", checksumkind: CSK_MD5, checksum: 
"0fb39b3bb5a60928b5d9c251b2d91b2c")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git 
8b2b9c3fa91ebe583f8c634482885a669b82a1f0)"}
+!6 = distinct !DISubprogram(name: "f3", linkageName: "_Z2f3v", scope: !7, 
file: !7, line: 11, type: !8, scopeLine: 11, flags: DIFlagPrototyped | 
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: 
!0, retainedNodes: !10)
+!7 = !DIFile(filename: 
"llvm-project/clang/test/CodeGen/split-debug-inlining-template.cpp", directory: 
"", checksumkind: CSK_MD5, checksum: "0fb39b3bb5a60928b5d9c251b2d91b2c")
+!8 = !DISubroutineType(types: !9)
+!9 = !{null}
+!10 = !{}
+!11 = !DILocation(line: 8, column: 3, scope: !12, inlinedAt: !16)
+!12 = distinct !DISubprogram(name: "f2", linkageName: "_Z2f2IiEvv", 
scope: !7, file: !7, line: 7, type: !8, scopeLine: 7, flags: DIFlagPrototyped | 
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: 
!0, templateParams: !13, retainedNodes: !10)
+!13 = !{!14}
+!14 = !DITemplateTypeParameter(name: "T", type: !15)
+!15 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!16 = distinct !DILocation(line: 12, column: 3, scope: !6)
+!17 = !DILocation(line: 13, column: 1, scope: !6)
+!18 = !DISubprogram(name: "f1", linkageName: "_Z2f1v", scope: !7, file: !7, 
line: 4, type: !8, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, 
retainedNodes: !10)
Index: llvm/lib/CodeGen/AsmPrinter/DIE.cpp
===
--- llvm/lib/CodeGen/AsmPrinter/DIE.cpp
+++ llvm/lib/CodeGen/AsmPrinter/DIE.cpp
@@ -204,6 +204,7 @@
   

[PATCH] D120610: [DebugInfo] Include DW_TAG_skeleton_unit when looking for parent UnitDie

2022-03-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D120610#3373042 , @dblaikie wrote:

> Fixes in LLVM require tests in LLVM - probably taking the clang test and 
> compiling that to llvm IR (include the original C++ source in a comment in 
> the IR test case) and then testing it in LLVM instead of clang.
>
> Also looks like the test could be simplified a bit more:
>
>   void f1();
>   
>   template 
>   void f2() {
> f1();
>   }
>   
>   void f3() {
> f2();
>   }

Done, thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120610/new/

https://reviews.llvm.org/D120610

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120610: [DebugInfo] Include DW_TAG_skeleton_unit when looking for parent UnitDie

2022-03-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 414606.
wenlei added a comment.

update test case


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120610/new/

https://reviews.llvm.org/D120610

Files:
  llvm/lib/CodeGen/AsmPrinter/DIE.cpp
  llvm/test/DebugInfo/X86/fission-template.ll


Index: llvm/test/DebugInfo/X86/fission-template.ll
===
--- /dev/null
+++ llvm/test/DebugInfo/X86/fission-template.ll
@@ -0,0 +1,58 @@
+; Check that we handle template types for split-dwarf-inlining without ICE.
+; RUN: llc -split-dwarf-file=%t.dwo -O2 -dwarf-version=5 -mtriple=x86_64-unknown-linux-gnu -filetype=obj -o %t.o < %s
+;
+; The test case is generated from the following code
+; clang -cc1 -emit-llvm -fdebug-info-for-profiling -fsplit-dwarf-inlining -debug-info-kind=constructor -dwarf-version=5 -split-dwarf-file temp.dwo -O2
+;
+; void f1();
+;
+; template 
+; void f2() {
+;   f1();
+; }
+;
+; void f3() {
+;   f2();
+; }
+
+; ModuleID = 'split-debug-inlining-template.cpp'
+source_filename = 
"llvm-project/clang/test/CodeGen/split-debug-inlining-template.cpp"
+target datalayout = 
"e-m:e-p270:32:32-p271:32:32-p272:64:64-i64:64-f80:128-n8:16:32:64-S128"
+target triple = "x86_64-redhat-linux-gnu"
+
+; Function Attrs: mustprogress nounwind
+define dso_local void @_Z2f3v() local_unnamed_addr #0 !dbg !6 {
+entry:
+  tail call void @_Z2f1v() #2, !dbg !11
+  ret void, !dbg !17
+}
+
+declare !dbg !18 void @_Z2f1v() local_unnamed_addr #1
+
+attributes #0 = { mustprogress nounwind "frame-pointer"="none" 
"min-legal-vector-width"="0" "no-trapping-math"="true" 
"stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" 
}
+attributes #1 = { "frame-pointer"="none" "no-trapping-math"="true" 
"stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" 
}
+attributes #2 = { nounwind }
+
+!llvm.dbg.cu = !{!0}
+!llvm.module.flags = !{!2, !3, !4}
+!llvm.ident = !{!5}
+
+!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus_14, file: !1, 
producer: "clang version 15.0.0 (https://github.com/llvm/llvm-project.git 
8b2b9c3fa91ebe583f8c634482885a669b82a1f0)", isOptimized: true, runtimeVersion: 
0, splitDebugFilename: "split-debug-inlining-template.c.tmp.dwo", emissionKind: 
FullDebug, debugInfoForProfiling: true, nameTableKind: None)
+!1 = !DIFile(filename: "llvm-project/clang/test/CodeGen/", directory: 
"build-debug", checksumkind: CSK_MD5, checksum: 
"0fb39b3bb5a60928b5d9c251b2d91b2c")
+!2 = !{i32 7, !"Dwarf Version", i32 5}
+!3 = !{i32 2, !"Debug Info Version", i32 3}
+!4 = !{i32 1, !"wchar_size", i32 4}
+!5 = !{!"clang version 15.0.0 (https://github.com/llvm/llvm-project.git 
8b2b9c3fa91ebe583f8c634482885a669b82a1f0)"}
+!6 = distinct !DISubprogram(name: "f3", linkageName: "_Z2f3v", scope: !7, 
file: !7, line: 11, type: !8, scopeLine: 11, flags: DIFlagPrototyped | 
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: 
!0, retainedNodes: !10)
+!7 = !DIFile(filename: 
"llvm-project/clang/test/CodeGen/split-debug-inlining-template.cpp", directory: 
"", checksumkind: CSK_MD5, checksum: "0fb39b3bb5a60928b5d9c251b2d91b2c")
+!8 = !DISubroutineType(types: !9)
+!9 = !{null}
+!10 = !{}
+!11 = !DILocation(line: 8, column: 3, scope: !12, inlinedAt: !16)
+!12 = distinct !DISubprogram(name: "f2", linkageName: "_Z2f2IiEvv", 
scope: !7, file: !7, line: 7, type: !8, scopeLine: 7, flags: DIFlagPrototyped | 
DIFlagAllCallsDescribed, spFlags: DISPFlagDefinition | DISPFlagOptimized, unit: 
!0, templateParams: !13, retainedNodes: !10)
+!13 = !{!14}
+!14 = !DITemplateTypeParameter(name: "T", type: !15)
+!15 = !DIBasicType(name: "int", size: 32, encoding: DW_ATE_signed)
+!16 = distinct !DILocation(line: 12, column: 3, scope: !6)
+!17 = !DILocation(line: 13, column: 1, scope: !6)
+!18 = !DISubprogram(name: "f1", linkageName: "_Z2f1v", scope: !7, file: !7, 
line: 4, type: !8, flags: DIFlagPrototyped, spFlags: DISPFlagOptimized, 
retainedNodes: !10)
Index: llvm/lib/CodeGen/AsmPrinter/DIE.cpp
===
--- llvm/lib/CodeGen/AsmPrinter/DIE.cpp
+++ llvm/lib/CodeGen/AsmPrinter/DIE.cpp
@@ -204,6 +204,7 @@
   const DIE *p = this;
   while (p) {
 if (p->getTag() == dwarf::DW_TAG_compile_unit ||
+p->getTag() == dwarf::DW_TAG_skeleton_unit ||
 p->getTag() == dwarf::DW_TAG_type_unit)
   return p;
 p = p->getParent();


Index: llvm/test/DebugInfo/X86/fission-template.ll
===
--- /dev/null
+++ llvm/test/DebugInfo/X86/fission-template.ll
@@ -0,0 +1,58 @@
+; Check that we handle template types for split-dwarf-inlining without ICE.
+; RUN: llc -split-dwarf-file=%t.dwo -O2 -dwarf-version=5 -mtriple=x86_64-unknown-linux-gnu -filetype=obj -o %t.o < %s
+;
+; The test case is generated from the following code
+; clang -cc1 -emit-llvm -fdebug-info-for-profiling -fsplit-

[PATCH] D120610: [DebugInfo] Include DW_TAG_skeleton_unit when looking for parent UnitDie

2022-03-09 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D120610#3348997 , @dblaikie wrote:

> This should include a test - and could you describe more about where this 
> issue came up/what motivated the change?

Sure, sorry for the delay; I just got back to this today. I sent the fix first
hoping to get a sanity check, as we were a bit puzzled about how nobody had hit
this in the past.

Here's what happened with the ICE: we needed to add template types when
constructing the subprogram DIE for template functions (`addTemplateParams`) in
the skeleton CU. This is done through `getOrCreateContextDIE`, which involves
finding/creating parent DIEs. For that, we sometimes need to retrieve the
top-level DIE (`getOrCreateContextDIE` -> `getUnitDie`). Before this fix, we
ended up with a nullptr from `getUnitDie`, so some DIEs created in that process
had `getUnit()` being null, which led to an ICE on access.

I added a test case that would ICE without this fix (the set of flags there is
the minimum needed).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120610/new/

https://reviews.llvm.org/D120610

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D120610: [DebugInfo] Include DW_TAG_skeleton_unit when looking for parent UnitDie

2022-03-09 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 414251.
wenlei added a comment.
Herald added projects: clang, All.
Herald added a subscriber: cfe-commits.

Add test case.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D120610/new/

https://reviews.llvm.org/D120610

Files:
  clang/test/CodeGen/split-debug-inlining-template.c
  llvm/lib/CodeGen/AsmPrinter/DIE.cpp


Index: llvm/lib/CodeGen/AsmPrinter/DIE.cpp
===
--- llvm/lib/CodeGen/AsmPrinter/DIE.cpp
+++ llvm/lib/CodeGen/AsmPrinter/DIE.cpp
@@ -204,6 +204,7 @@
   const DIE *p = this;
   while (p) {
 if (p->getTag() == dwarf::DW_TAG_compile_unit ||
+p->getTag() == dwarf::DW_TAG_skeleton_unit ||
 p->getTag() == dwarf::DW_TAG_type_unit)
   return p;
 p = p->getParent();
Index: clang/test/CodeGen/split-debug-inlining-template.c
===
--- /dev/null
+++ clang/test/CodeGen/split-debug-inlining-template.c
@@ -0,0 +1,16 @@
+// Check that we handle template types for split-dwarf-inlining without ICE.
+// RUN: %clang_cc1 -emit-obj -fdebug-info-for-profiling -fsplit-dwarf-inlining -debug-info-kind=constructor -dwarf-version=5 -split-dwarf-file %t.dwo -O2 -o %t.o -x c++ %s
+
+struct container {
+  int size();
+};
+
+template 
+int Range(T &x) {
+  return x.size();
+}
+
+int main() {
+  container v;
+  return Range(v);
+}


Index: llvm/lib/CodeGen/AsmPrinter/DIE.cpp
===
--- llvm/lib/CodeGen/AsmPrinter/DIE.cpp
+++ llvm/lib/CodeGen/AsmPrinter/DIE.cpp
@@ -204,6 +204,7 @@
   const DIE *p = this;
   while (p) {
 if (p->getTag() == dwarf::DW_TAG_compile_unit ||
+p->getTag() == dwarf::DW_TAG_skeleton_unit ||
 p->getTag() == dwarf::DW_TAG_type_unit)
   return p;
 p = p->getParent();
Index: clang/test/CodeGen/split-debug-inlining-template.c
===
--- /dev/null
+++ clang/test/CodeGen/split-debug-inlining-template.c
@@ -0,0 +1,16 @@
+// Check that we handle template types for split-dwarf-inlining without ICE.
+// RUN: %clang_cc1 -emit-obj -fdebug-info-for-profiling -fsplit-dwarf-inlining -debug-info-kind=constructor -dwarf-version=5 -split-dwarf-file %t.dwo -O2 -o %t.o -x c++ %s
+
+struct container {
+  int size();
+};
+
+template 
+int Range(T &x) {
+  return x.size();
+}
+
+int main() {
+  container v;
+  return Range(v);
+}
___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D114130: [Clang] Add option to disable -mconstructor-aliases with -mno-constructor-aliases

2021-11-18 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

lgtm, thanks


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114130/new/

https://reviews.llvm.org/D114130

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D114130: [Clang] Add option to disable -mconstructor-aliases with -mno-constructor-aliases

2021-11-17 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: clang/include/clang/Driver/Options.td:5058-5061
+  CodeGenOpts<"CXXCtorDtorAliases">,
+  DefaultFalse,
+  PosFlag,
+  NegFlag,

nit: use two lines for consistency. check `sanitize_stats`, 
`sanitize_cfi_cross_dso` and others. 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D114130/new/

https://reviews.llvm.org/D114130

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110891: [inliner] Mandatory inlining decisions produce remarks

2021-10-01 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

lgtm, thanks.




Comment at: llvm/lib/Analysis/InlineAdvisor.cpp:52
 
+namespace {
+using namespace llvm::ore;

mtrofin wrote:
> wenlei wrote:
> > curious why do we need anonymous namespace here?
> iiuc it's preferred we place file-local types inside an anonymous namespace. 
> 
> Looking now at the [[ 
> https://llvm.org/docs/CodingStandards.html#anonymous-namespaces | style 
> guideline ]], it touts their benefits but also says I should have only placed 
> the decl there and the impl of those members out... but the members are quite 
> trivial. Happy to move them out though.
Thanks for the pointer. I don't have a strong opinion, but I'm slightly leaning
towards moving it out of the anonymous namespace to be consistent with how the
other InlineAdvice classes are organized (DefaultInlineAdvice and MLInlineAdvice
are not in an anonymous namespace).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110891/new/

https://reviews.llvm.org/D110891

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110891: [inliner] Mandatory inlining decisions produce remarks

2021-10-01 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: llvm/lib/Analysis/InlineAdvisor.cpp:52
 
+namespace {
+using namespace llvm::ore;

curious why do we need anonymous namespace here?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110891/new/

https://reviews.llvm.org/D110891

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D110209: [CSSPGO] Set PseudoProbeInserter as a default pass.

2021-09-21 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

Thanks for the change; controlling this through metadata is more reliable than
through LTO-time flags. Scheduling a no-op pass shouldn't incur overhead either.
lgtm.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D110209/new/

https://reviews.llvm.org/D110209

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109638: [CSSPGO][llvm-profgen] Truncate stack samples with invalid return address.

2021-09-14 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

lgtm except two nits.




Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:772
 
+void PerfReaderBase::emitAccumulatedWarnings() {
+  for (auto Address : InvalidReturnAddresses) {

nit: emitAccumulatedWarnings -> warnTruncatedStack



Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:774
+  for (auto Address : InvalidReturnAddresses) {
+WithColor::warning() << "Invalid return address "
+ << format("%" PRIx64, Address) << "\n";

Nit: make the message more meaningful. `Truncated stack sample due to invalid 
return address at 0xabcd, likely caused by frame pointer omission.`
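A sketch of the suggested wording applied to the quoted loop, reusing the
WithColor/format calls already in the code (the message text is only the
suggestion above, not what the patch currently emits):

```
for (auto Address : InvalidReturnAddresses) {
  WithColor::warning() << "Truncated stack sample due to invalid return address at "
                       << format("0x%" PRIx64, Address)
                       << ", likely caused by frame pointer omission\n";
}
```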


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109638/new/

https://reviews.llvm.org/D109638

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109531: [CSSPGO] Enable pseudo probe instrumentation in O0 mode.

2021-09-14 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.

lgtm, thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109531/new/

https://reviews.llvm.org/D109531

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109638: [CSSPGO][llvm-profgen] Truncate stack samples with invalid return address.

2021-09-14 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> It isn't common when the program is built with the frame pointer omission 
> disabled, but can still happen with third-party static libs built with frame 
> pointer omitted.

Did you mean disabled -> enabled?

> This could happen to frame-pointer-based unwinding and the callee functions 
> that do not have the frame pointer chain set up.

So the frame pointer register gets used as a volatile register, and assuming it
still contains a frame pointer leads to a bad frame address when unwinding the
stack during profiling - is that what you observed?




Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:523-531
+  auto I = Binary->getIndexForAddr(FrameAddr);
+  FrameAddr = I ? Binary->getAddressforIndex(I - 1) : 0;
+  // Stop at an invalid return address caused by bad unwinding. This could
+  // happen to frame-pointer-based unwinding and the callee functions that
+  // do not have the frame pointer chain set up.
+  if (!FrameAddr || !Binary->addressIsCall(FrameAddr)) {
+NumStackSamplesWithInvalidReturnAddress++;

Can we hide all this complexity into `getCallAddrFromFrameAddr`? i.e. we could 
use `bool getCallAddrFromFrameAddr(uint64_t FrameAddr, uint64_t &CallAddr)`. 



Comment at: llvm/tools/llvm-profgen/PerfReader.cpp:529
+  if (!FrameAddr || !Binary->addressIsCall(FrameAddr)) {
+NumStackSamplesWithInvalidReturnAddress++;
+break;

Should we be using a warning here too? For context truncated due to a missing
probe at a merged callsite, we used a warning. I think these are of a similar
nature, so should we also handle them in a similar/consistent fashion - i.e.,
with a warning instead of a stat?

And if we expect duplicated warnings, it would be good to deduplicate them before
printing.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109638/new/

https://reviews.llvm.org/D109638

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109531: [CSSPGO] Enable pseudo probe instrumentation in O0 mode.

2021-09-12 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> More specifically, with the following command, both cc1 and lld will run in 
> default mode, which is -O0 for cc1 and -O2 for lld.
>
>   clang -flto 1.cpp -v -fuse-ld=lld

I'm wondering whether this is the expected behavior or an oversight in the pass
pipeline setup? In what scenario would an O0 prelink + O2 postlink make sense?

Btw, just to double check - the O2 you mentioned here is not LLD's O2 for
linking, but the postlink LLVM O2, right?




Comment at: llvm/lib/Passes/PassBuilder.cpp:1930
+  // mixed with an O0 prelink and an O2 postlink. Loading a sample profile in
+  // the postlink will require pseudo probe instrumentation in the prelink.
+  if (PGOOpt && PGOOpt->PseudoProbeForProfiling)

> Loading a sample profile in the postlink will require pseudo probe 
> instrumentation in the prelink. 

Even with this change, is it still possible that the prelink compile for some
module doesn't have `-fpseudo-probe-for-profiling` while it's on for the LTO
postlink? We could contrive such a case, and it could happen in reality too,
right? Would we have the same problem when trying to load profiles for functions
from modules compiled without pseudo probes in prelink?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109531/new/

https://reviews.llvm.org/D109531

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109531: [CSSPGO] Enable pseudo probe instrumentation in O0 mode.

2021-09-09 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

The change makes sense given that instrumentation PGO also happens at O0. But
practically, if a file is being built at O0, do we care about its profile given
that we're not really optimizing it anyway? Functions from O0 modules are not
supposed to be inlined into O1+ modules either.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109531/new/

https://reviews.llvm.org/D109531

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109386: Fix use-after-free from GlobalCtors associated data

2021-09-08 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

Yes, this looks like the kind of thing ValueHandle (VH) is designed for. @wlei,
here's an example of how VH is used to handle deletion of basic blocks tracked by
BFI: https://reviews.llvm.org/D75341.
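For reference, a minimal sketch of the ValueHandle pattern (not the D109386 code
itself; see D75341 for a real use with BFI): a `WeakTrackingVH` nulls itself out
when the tracked value is deleted, so associated data can be skipped instead of
being read through a stale pointer.

```
#include "llvm/IR/ValueHandle.h"

// Sketch only: associated data keyed by a handle rather than a raw pointer.
struct CtorRecord {
  llvm::WeakTrackingVH Ctor; // becomes null if the tracked Value is deleted
  int Priority = 65535;
};

static bool isStillLive(const CtorRecord &R) { return R.Ctor != nullptr; }
```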


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109386/new/

https://reviews.llvm.org/D109386

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D109175: [openmp] Add clang cc1 option -fopenmp-skip-deferred-diags

2021-09-02 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D109175#2980446 , @lebedev.ri 
wrote:

> In D109175#2980333 , @weiwang wrote:
>
>> Our internal codebase never uses the target directive. Once the deferred 
>> diags is bypassed, we observed 18% e2e build time improvement.
>
> Is that with `-fopenmp` or without?
> That seems, kinda a lot more than i would have expected,
> perhaps there are some other ways to reduce the overhead other than this 
> approach?

Yeah, though the slowdown in build time is blocking us from moving to a newer
LLVM for many projects. For now I think it makes sense to give users the option
to decide whether they want faster builds or better diagnostics. Of course, if
the slowdown is fully addressed in the future, the switch could be removed.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D109175/new/

https://reviews.llvm.org/D109175

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D108905: [ItaniumCXXABI] Set nounwind on __cxa_begin_catch/__cxa_end_catch

2021-08-30 Thread Wenlei He via Phabricator via cfe-commits
wenlei added subscribers: modimo, wenlei.
wenlei added a comment.

+@modimo


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D108905/new/

https://reviews.llvm.org/D108905

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D107878: [SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader

2021-08-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: llvm/test/CodeGen/X86/fsafdo_test2.ll:3
+; RUN: llvm-profdata merge --sample -profile-isfs -o %t.afdo 
%S/Inputs/fsloader.afdo
+; RUN: llc -enable-fs-discriminator -fs-profile-file=%t.afdo 
-show-fs-branchprob -disable-ra-fsprofile-loader=false 
-disable-layout-fsprofile-loader=false < %s 2>&1 | FileCheck %s 
--check-prefix=LOADER
+;

lenary wrote:
> I think this needs `REQUIRES: asserts` if you're checking the debug output?
Should be fixed in D108364


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107878/new/

https://reviews.llvm.org/D107878

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D107876: [CSSPGO] Allow the use of debug-info-for-profiling and pseudo-probe-for-profiling together

2021-08-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

lgtm, thanks.




Comment at: clang/lib/Driver/ToolChains/Clang.cpp:3896
-  // These two forms of profiling info can't be used together.
-  if (const Arg *A1 = 
Args.getLastArg(options::OPT_fpseudo_probe_for_profiling))
-if (const Arg *A2 = 
Args.getLastArg(options::OPT_fdebug_info_for_profiling))

hoy wrote:
> wenlei wrote:
> > Would a warning be useful to catch accidental misuse? 
> I was thinking about that. It may be useful, but downstream may have to 
> suppress that warning to be treated as error for a clean build. What do you 
> think?
Fair point. We could add the warning after the transition.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107876/new/

https://reviews.llvm.org/D107876

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D107876: [CSSPGO] Allow the use of debug-info-for-profiling and pseudo-probe-for-profiling together

2021-08-10 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: clang/lib/Driver/ToolChains/Clang.cpp:3896
-  // These two forms of profiling info can't be used together.
-  if (const Arg *A1 = 
Args.getLastArg(options::OPT_fpseudo_probe_for_profiling))
-if (const Arg *A2 = 
Args.getLastArg(options::OPT_fdebug_info_for_profiling))

Would a warning be useful to catch accidental misuse? 


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D107876/new/

https://reviews.llvm.org/D107876

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106193: [CSSPGO] Turn on unique linkage name by default for pseudo probe.

2021-07-16 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

lgtm, thanks.




Comment at: clang/test/Driver/pseudo-probe.c:4
 // RUN: %clang -### -fpseudo-probe-for-profiling -fdebug-info-for-profiling %s 
2>&1 | FileCheck %s --check-prefix=CONFLICT
+// RUN: %clang -### -fpseudo-probe-for-profiling 
-funique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=YESPROBE
+// RUN: %clang -### -fpseudo-probe-for-profiling 
-fno-unique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=NONAME

hoy wrote:
> hoy wrote:
> > wenlei wrote:
> > > hoy wrote:
> > > > wenlei wrote:
> > > > > how do we test unique linkage name is implied with probe on?
> > > > These two test lines below corresponding to line 1 in the test, which 
> > > > show that -funique-internal-linkage-names is passed to cc1.
> > > > 
> > > > ```
> > > > 
> > > > // YESPROBE: -fpseudo-probe-for-profiling
> > > > // YESPROBE: -funique-internal-linkage-names
> > > > ```
> > > > 
> > > > The change being made here is a driver change, so a driver test should 
> > > > be good.
> > > The change is about making `-funique-internal-linkage-names` implied, but 
> > > the test specified that switch explicitly. I was asking how to test the 
> > > "implicit" part? Not sure what line 4 is testing wrt to this change. 
> > Line 1 is the -funique-internal-linkage-names switch , and we are checking 
> > `YESPROBE` for it.
> > Line 1 is the -funique-internal-linkage-names switch , and we are checking 
> > `YESPROBE` for it.
> 
> Line is w/o the -funique-internal-linkage-names switch.
Ah, sorry, I missed that.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106193/new/

https://reviews.llvm.org/D106193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106193: [CSSPGO] Turn on unique linkage name by default for pseudo probe.

2021-07-16 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: clang/test/Driver/pseudo-probe.c:4
 // RUN: %clang -### -fpseudo-probe-for-profiling -fdebug-info-for-profiling %s 
2>&1 | FileCheck %s --check-prefix=CONFLICT
+// RUN: %clang -### -fpseudo-probe-for-profiling 
-funique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=YESPROBE
+// RUN: %clang -### -fpseudo-probe-for-profiling 
-fno-unique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=NONAME

hoy wrote:
> wenlei wrote:
> > how do we test unique linkage name is implied with probe on?
> These two test lines below corresponding to line 1 in the test, which show 
> that -funique-internal-linkage-names is passed to cc1.
> 
> ```
> 
> // YESPROBE: -fpseudo-probe-for-profiling
> // YESPROBE: -funique-internal-linkage-names
> ```
> 
> The change being made here is a driver change, so a driver test should be 
> good.
The change is about making `-funique-internal-linkage-names` implied, but the
test specifies that switch explicitly. I was asking how to test the "implicit"
part. Not sure what line 4 is testing wrt this change.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106193/new/

https://reviews.llvm.org/D106193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D106193: [CSSPGO] Turn on unique linkage name by default for pseudo probe.

2021-07-16 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: clang/test/Driver/pseudo-probe.c:4
 // RUN: %clang -### -fpseudo-probe-for-profiling -fdebug-info-for-profiling %s 
2>&1 | FileCheck %s --check-prefix=CONFLICT
+// RUN: %clang -### -fpseudo-probe-for-profiling 
-funique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=YESPROBE
+// RUN: %clang -### -fpseudo-probe-for-profiling 
-fno-unique-internal-linkage-names %s 2>&1 | FileCheck %s --check-prefix=NONAME

how do we test unique linkage name is implied with probe on?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D106193/new/

https://reviews.llvm.org/D106193

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D104099: [NewPM] Remove SpeculateAroundPHIs pass from pipeline

2021-06-12 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D104099#2814167 , @davidxl wrote:

> Adding Wei to help measure performance impact on our internal workloads.  
> Also add Wenlei to help measure impact with FB's workloads.

Measured perf using an FB internal workload w/ and w/o this pass; the result
is neutral.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D104099/new/

https://reviews.llvm.org/D104099

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D103909: [CSSPGO] Emit mangled dwarf names for line tables debug option under -fpseudo-probe-for-profiling

2021-06-08 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

lgtm, thanks for the fix.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D103909/new/

https://reviews.llvm.org/D103909

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D99554: [ThinLTO] During module importing, close one source module before open another one for distributed mode.

2021-04-07 Thread Wenlei He via Phabricator via cfe-commits
wenlei added subscribers: weiwang, wenlei.
wenlei added a comment.

Does this help non-distributed ThinLTO as well? cc: @weiwang


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D99554/new/

https://reviews.llvm.org/D99554

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D95271: [CSSPGO] Passing the clang driver switch -fpseudo-probe-for-profiling to the linker.

2021-01-24 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

lgtm, thanks.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95271/new/

https://reviews.llvm.org/D95271

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D95271: [CSSPGO] Passing the clang driver switch -fpseudo-probe-for-profiling to the linker.

2021-01-23 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: clang/lib/Driver/ToolChains/CommonArgs.cpp:609
+  // Pass an option to enable pseudo probe emission.
+  if (Args.hasArg(options::OPT_fpseudo_probe_for_profiling))
+CmdArgs.push_back("-plugin-opt=pseudo-probe-for-profiling");

`getLastArg(options::OPT_fpseudo_probe_for_profiling, 
options::OPT_fno_pseudo_probe_for_profiling` instead?
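
To illustrate, a minimal sketch of the suggested pattern (assuming the same
`Args`/`CmdArgs` driver context as the surrounding code; not the actual patch):

```
// Honor whichever of -fpseudo-probe-for-profiling /
// -fno-pseudo-probe-for-profiling comes last on the command line.
if (const Arg *A =
        Args.getLastArg(options::OPT_fpseudo_probe_for_profiling,
                        options::OPT_fno_pseudo_probe_for_profiling))
  if (A->getOption().matches(options::OPT_fpseudo_probe_for_profiling))
    CmdArgs.push_back("-plugin-opt=pseudo-probe-for-profiling");
```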


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D95271/new/

https://reviews.llvm.org/D95271

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D93747: Rename debug linkage name with -funique-internal-linkage-names

2021-01-20 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a subscriber: lxfind.
wenlei added a comment.

In D93747#2475852 , @dblaikie wrote:

> In D93747#2470504 , @hoy wrote:
>
>> In D93747#2470387 , @dblaikie wrote:
>>
>>> Please remove the clang test change - if this is an LLVM change with LLVM 
>>> test coverage, that's enough. (we don't generally test every LLVM change 
>>> from both LLVM and Clang)
>>
>> Sounds good.
>>
>>> I'd still be curious if you could look around to see whether there are 
>>> other cases of function splitting/cloning/etc to see how they deal with 
>>> updating the DISubprograms, to see if there's some prior art that should be 
>>> used here.
>>
>> To the best of my knowledge, existing code does not change the linkage name 
>> field of a DISubprogram once created. You can create a new DISubprogram 
>> record with any linkage name you want but bear in mind how the debugger will 
>> consume the record. Looks like we are the first one to change existing 
>> record which will confuse the debugger.
>
> Sure enough - do any other transforms have similar requirements (of creating 
> a record for something that looks like the same function but isn't quite) but 
> take a different approach? Be good to know if other approaches were 
> chosen/how/why they are applicable there but not here, etc. (conversely 
> perhaps other passes worked around the issue in less than ideal ways and 
> could benefit from using this new approach).
>
> Looks like some places that could use this functionality aren't quite there 
> yet - 
> The Attributor has an internalizing feature (added in 
> 87a85f3d57f55f5652ec44f77816c7c9457545fa 
>  ) that 
> produces invalid IR (ends up with two !dbg attachments of the same 
> DISubprogram) if the function it's internalizing has a DISubprogram - but if 
> it succeeded it'd have the same problem that the linkage name on the 
> DISubprogram wouldn't match the actual symbol/function name. (@jdoerfert 
> @bbn).
> The IR Outliner (@AndrewLitteken ) seems to be only a few months old and 
> appears to have no debug info handling - probably results in the same problem.
> The Partial Inliner does clone a function into a new name & so would have an 
> invalid DISubprogram, though it only uses the clone for inlining (this 
> probably then goes on to produce the desired debug info, where the inlining 
> appears to come from the original function)
> ThinLTO does some function cloning within a module for aliases, but it then 
> renames the clone to the aliasees name so I think the name works out to match 
> again.

Another place where a mismatch between the linkage name and the DISubprogram
happens is the cloning done for coroutines (.resume, .destroy and cleanup
funclets). We found that out again through a different kind of AutoFDO profile
fidelity issue (cc: @lxfind).

> If this is an invariant, that the linkage name on the DISubprogram should 
> match the actual llvm::Function name (@aprantl @JDevlieghere - verifier 
> check, perhaps?) - it'd be nice to make that more reliable, either by 
> removing the name and relying on the llvm::Function name (perhaps with a 
> boolean on the DISubprogram as to whether the linkage name should be emitted 
> or not - I think currently Clang's IRGen makes choices about which 
> DISubprograms will get linkage names) or a verifier and utilities to keep 
> them in sync.

Agreed, clarity on the rules and a sanity check built into the verifier would
be beneficial.

> But I'll leave that alone for now/for this review, unless someone else wants 
> to chime in on it.
>
> In D93747#2470178 , @tmsriram wrote:
>
>> In D93747#2469556 , @hoy wrote:
>>
 In D93656 , @dblaikie wrote:
 Though the C case is interesting - it means you'll end up with C functions 
 with the same DWARF 'name' but different linkage name. I don't know what 
 that'll do to DWARF consumers - I guess they'll probably be OK-ish with 
 it, as much as they are with C++ overloading. I think there are some cases 
 of C name mangling so it's probably supported/OK-ish with DWARF Consumers. 
 Wouldn't hurt for you to take a look/see what happens in that case with a 
 debugger like gdb/check other cases of C name mangling to see what DWARF 
 they end up creating (with both GCC and Clang).
>>>
>>> I did a quick experiment with C name managing with GCC and -flto. It turns 
>>> out the `DW_AT_linkage_name` field of `DW_TAG_subprogram` is never set for 
>>> C programs. If set, the gdb debugger will use that field to match the user 
>>> input and set breakpoints. Therefore, giving `DW_AT_linkage_name` a 
>>> uniquefied name prevents the debugger from setting a breakpoint based on 
>>> source names unless the user specifies a dec

[PATCH] D94333: [Inliner] Change inline remark format and update ReplayInlineAdvisor to use it

2021-01-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: llvm/lib/Analysis/InlineAdvisor.cpp:412
+
+  Remark << ";";
 }

modimo wrote:
> wenlei wrote:
> > nit: any special reason for adding this? doesn't seem consistent with other 
> > remarks we have.
> If you grab the remark outputs via `-Rpass=inline` you'll get additional 
> suffix information:
> ```
> inline.cpp:8:12: remark: _Z3foov inlined into main with (cost=0, 
> threshold=375) at callsite main:2:12; [-Rpass=inline]
> return foo();
> ```
> 
> The semicolon is to separate the remark from any additional output at the end 
> so when replaying we can match the correct callsite. Something like this 
> would be unneeded for yaml replay but for the current implementation it's 
> necessary for correctness.
> 
By correctness, did you mean the fact that you rely on `split(";")` in parsing, 
or something else?

This is not a big deal, but if no other remarks end with `;`, it would be good 
to be consistent. Using `split(";")` for parsing is just one way of 
implementing it, and IMO could be changed to favor consistency in remarks 
output.
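
For what it's worth, a tiny sketch of the parsing concern (the remark text
below is an assumed example, not output from this patch):

```
#include "llvm/ADT/StringRef.h"
using llvm::StringRef;

// With -Rpass=inline the printed line carries a trailing " [-Rpass=inline]"
// suffix, so a replay parser can take everything before the first ';' as the
// remark proper.
StringRef Line = "_Z3foov inlined into main with (cost=0, threshold=375) "
                 "at callsite main:2:12; [-Rpass=inline]";
StringRef RemarkText = Line.split(';').first; // drops the suffix after ';'
```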


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94333/new/

https://reviews.llvm.org/D94333

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D94333: [Inliner] Change inline remark format and update ReplayInlineAdvisor to use it

2021-01-12 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: llvm/lib/Analysis/InlineAdvisor.cpp:412
+
+  Remark << ";";
 }

nit: any special reason for adding this? doesn't seem consistent with other 
remarks we have.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D94333/new/

https://reviews.llvm.org/D94333

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D90507: Adding DWARF64 clang flag

2020-11-13 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

This change covers the non-LTO cases. For LTO, I think we would need to pass
it from the driver to LTO, something like: tools::addLTOOptions -> lld ->
lto::Config (Config->TargetOptions->MCTargetOptions) -> LTO backend.
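
As a strawman only (both the option check and the plugin-opt spelling below
are hypothetical, not what the driver or lld actually define), the driver-side
forwarding could look roughly like:

```
// Hypothetical sketch in tools::addLTOOptions(): forward the DWARF64 request
// to the LTO backend via a plugin option.
if (Args.hasArg(options::OPT_gdwarf64))     // assumed option id
  CmdArgs.push_back("-plugin-opt=dwarf64"); // assumed plugin option name
```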


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D90507/new/

https://reviews.llvm.org/D90507

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D89066: [Coroutine][Sema] Only tighten the suspend call temp lifetime for final awaiter

2020-10-12 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

LGTM. I think we can get this in first to address the fall out from 
https://reviews.llvm.org/D87470, while investigating ASAN failure.




Comment at: clang/lib/Sema/SemaCoroutine.cpp:401
 
-  // The coroutine handle used to obtain the address is no longer needed
-  // at this point, clean it up to avoid unnecessarily long lifetime which
-  // could lead to unnecessary spilling.
-  JustAddress = S.MaybeCreateExprWithCleanups(JustAddress);
+  // After the suspend call on the final awaiter, the coroutine may have
+  // been destroyed. In that case, we can not store anything to the frame

Can you add a comment explaining why we don't clean up for all awaits,
probably as a TODO?
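
For context, a self-contained sketch (mine, not code from the patch) of why
the final awaiter is special:

```
#include <coroutine>

// For a final awaiter, await_suspend() may destroy the coroutine frame, so
// nothing may be stored back into the frame once that call has been made.
struct FinalAwaiter {
  bool await_ready() noexcept { return false; }
  void await_suspend(std::coroutine_handle<> h) noexcept {
    h.destroy(); // the frame may be gone by the time this returns
  }
  void await_resume() noexcept {}
};
```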


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D89066/new/

https://reviews.llvm.org/D89066

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults

2020-09-15 Thread Wenlei He via Phabricator via cfe-commits
wenlei added subscribers: asbirlea, modimo.
wenlei added a comment.

Just saw this. Thanks @Alina Sbirlea for fixing the tests, much appreciated!

On 9/15/20, 6:44 PM, "Nico Weber via Phabricator"  
wrote:

  thakis added a comment.
  
  Looks like this breaks tests: http://45.33.8.238/linux/27942/step_12.txt
  
  Ptal, and revert for now if it takes a while to fix.
  
  
  Repository:
rG LLVM Github Monorepo
  
  CHANGES SINCE LAST ACTION

  https://reviews.llvm.org/D86156/new/
  
  
  https://reviews.llvm.org/D86156


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults

2020-09-15 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

Committed on behalf of @modimo.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Make BFI information available through loop passes inside LoopStandardAnalysisResults

2020-09-15 Thread Wenlei He via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG2ea4c2c598b7: [BFI] Make BFI information available through 
loop passes inside… (authored by wenlei).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

Files:
  llvm/include/llvm/Analysis/LoopAnalysisManager.h
  llvm/include/llvm/Transforms/Scalar/LoopPassManager.h
  llvm/lib/Passes/PassBuilder.cpp
  llvm/lib/Transforms/Scalar/LICM.cpp
  llvm/lib/Transforms/Scalar/LoopDistribute.cpp
  llvm/lib/Transforms/Scalar/LoopLoadElimination.cpp
  llvm/lib/Transforms/Scalar/LoopUnswitch.cpp
  llvm/lib/Transforms/Utils/LoopVersioning.cpp
  llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
  llvm/test/Other/opt-O2-pipeline.ll
  llvm/test/Other/opt-O3-pipeline-enable-matrix.ll
  llvm/test/Other/opt-O3-pipeline.ll
  llvm/test/Other/opt-Os-pipeline.ll
  llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp

Index: llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
===
--- llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
+++ llvm/unittests/Transforms/Scalar/LoopPassManagerTest.cpp
@@ -9,7 +9,10 @@
 #include "llvm/Transforms/Scalar/LoopPassManager.h"
 #include "llvm/Analysis/AliasAnalysis.h"
 #include "llvm/Analysis/AssumptionCache.h"
+#include "llvm/Analysis/BlockFrequencyInfo.h"
+#include "llvm/Analysis/BranchProbabilityInfo.h"
 #include "llvm/Analysis/MemorySSA.h"
+#include "llvm/Analysis/PostDominators.h"
 #include "llvm/Analysis/ScalarEvolution.h"
 #include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/Analysis/TargetTransformInfo.h"
@@ -294,6 +297,9 @@
 // those.
 FAM.registerPass([&] { return AAManager(); });
 FAM.registerPass([&] { return AssumptionAnalysis(); });
+FAM.registerPass([&] { return BlockFrequencyAnalysis(); });
+FAM.registerPass([&] { return BranchProbabilityAnalysis(); });
+FAM.registerPass([&] { return PostDominatorTreeAnalysis(); });
 FAM.registerPass([&] { return MemorySSAAnalysis(); });
 FAM.registerPass([&] { return ScalarEvolutionAnalysis(); });
 FAM.registerPass([&] { return TargetLibraryAnalysis(); });
Index: llvm/test/Other/opt-Os-pipeline.ll
===
--- llvm/test/Other/opt-Os-pipeline.ll
+++ llvm/test/Other/opt-Os-pipeline.ll
@@ -97,6 +97,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -154,6 +156,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -256,10 +260,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Lazy Branch Probability Analysis
 ; CHECK-NEXT:   Lazy Block Frequency Analysis
+; CHECK-NEXT:   Loop Pass Manager
+; CHECK-NEXT: Loop Invariant Code Motion
 ; CHECK-NEXT:   Optimization Remark Emitter
 ; CHECK-NEXT:   Warn about non-applied transformations
 ; CHECK-NEXT:   Alignment from assumptions
Index: llvm/test/Other/opt-O3-pipeline.ll
===
--- llvm/test/Other/opt-O3-pipeline.ll
+++ llvm/test/Other/opt-O3-pipeline.ll
@@ -116,6 +116,8 @@
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Rotate Loops
 ; CHECK-NEXT: Memory SSA
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT:   Unswitch loops
@@ -173,6 +175,8 @@
 ; CHECK-NEXT: LCSSA Verifier
 ; CHECK-NEXT: Loop-Closed SSA Form Pass
 ; CHECK-NEXT: Scalar Evolution Analysis
+; CHECK-NEXT: Lazy Branch Probability Analysis
+; CHECK-NEXT: Lazy Block Frequency Analysis
 ; CHECK-NEXT: Loop Pass Manager
 ; CHECK-NEXT:   Loop Invariant Code Motion
 ; CHECK-NEXT: Post-Dominator Tree Construction
@@ -275,10 +279,10 @@
 ; CHECK-NEXT:   LCSSA Verifier
 ; CHECK-NEXT:   Loop-Closed SSA Form Pass
 ; CHECK-NEXT:   Scalar Evolution Analysis
-; CHECK-NEXT:   Loop Pass Manager
-; CHECK-NEXT

[PATCH] D77925: Revert "[TLI] Per-function fveclib for math library used for vectorization"

2020-08-27 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77925#2236519 , @tejohnson wrote:

> In D77925#2229484 , @mehdi_amini 
> wrote:
>
>> Overall that would likely work for XLA. Something I'd like to mention though 
>> in response to:
>>
>>> The veclib type is also tied to the accepted values for -fveclib, which is 
>>> a list of supported lib,
>>
>> `-fveclib` is a Clang thing, it shouldn't limit what LLVM does. Of course 
>> LLVM needs to support Clang, but does not have to inherit the limitation of 
>> map 1:1 to Clang UI.
>> In particular as a library, it isn't clear why we would make the choice to 
>> write LLVM VecLib support this way.

Fair point. I was trying to differentiate the accepted veclibs from any other
custom lib. I guess it's somewhat like a namespace: even though the built-in
ones are different from user-defined ones, the underlying support doesn't have
to differentiate them.

> Is there any benefit to keeping a closed list like this in LLVM? If not (and 
> presumably clang is checking for valid values of -fveclib), then I think I 
> agree with @mehdi_amini. Unless there is an efficiency reason for doing it 
> via an enum. It's been awhile since I looked through this code in detail...

I think performance should be fine; the attributes on functions are in string
form already. The TLI compatibility check will involve a string compare, but a
short string compare shouldn't be disastrous. I was mainly trying to let LLVM
match clang's supported list.

Will get back to this hopefully next week.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77925/new/

https://reviews.llvm.org/D77925

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-27 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D86156#2243393 , @asbirlea wrote:

> Diff looks reasonable at this point. Thank you for the patch!
> Please wait on @nikic for compile-time impact or additional feedback.
>
> Just out of curiosity, in D65060 , it was 
> mentioned that using BFI got you ~7% improvement for a CPU related metric 
> (@wenlei). Are you seeing benefits from this patch? And which pass manager 
> are you using?

Thanks for the quick review. We got the perf improvement from using BFI to
prevent a bad hoisting in a critical loop. Originally it was with the legacy
pass manager, hence the invalidation issue with the new pass manager wasn't
caught by our usage. This change was made as an internal patch for perf parity
when we moved to the new pass manager, and we've been using it for a while now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D86156: [BFI] Preserve BFI information through loop passes via VH callbacks inside LoopStandardAnalysisResults

2020-08-22 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> If only LICM benefits from it, I'd consider creating a BFI instance for the 
> duration of the LICM pass

Currently only LICM uses BFI, but I think it'd be beneficial for other loop
passes to be able to use BFI too. BFI not being accessible to loop passes
seems to be a non-trivial limitation (compared to other compilers). I would
think it's not that other loop passes wouldn't benefit from BFI, but that
because of the constraints imposed by the pass manager, we couldn't reap any
potential benefit.

I'm hoping that we could find a way to lift that limitation (and of course 
without breaking the invariants we need to keep for pass manager). This change 
is an attempt towards that. It's not ideal, as it only covers invalidation (for 
removal of blocks), but not incremental update for BFI to reflect arbitrary CFG 
changes, which can be really hard. Still, we think it's a practical solution to 
expose BFI to loop passes.
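
As a small sketch of what this enables (the `BFI` member is the one this
change makes available in `LoopStandardAnalysisResults`; the helper itself is
made up):

```
// A loop pass can consult profile counts when BFI happens to be available.
static bool isProvablyColdLoop(const Loop &L,
                               const LoopStandardAnalysisResults &AR) {
  if (!AR.BFI) // BFI is only populated for pipelines that request it
    return false;
  auto Count = AR.BFI->getBlockProfileCount(L.getHeader());
  return Count && *Count == 0;
}
```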

> or only enabling it in loop pipelines that have LICM in them (see example 
> comment in one of the tests).

Makes sense for now.


CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D86156/new/

https://reviews.llvm.org/D86156

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77925: Revert "[TLI] Per-function fveclib for math library used for vectorization"

2020-08-20 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77925#2220177 , @wenlei wrote:

> In D77925#2220169 , @mehdi_amini 
> wrote:
>
>> I rather have a non closed list of veclib: if you just have a map keyed by 
>> string instead of an enum it should just work out?
>
> The veclib type is also tied to the accepted values for `-fveclib`, which is 
> a list of supported lib, anything outside of the list is rejected with error. 
> If you want to allow arbitrary strings as key, it's inconsistent with the 
> restricted/closed values for `-fveclib`. So basically there's no openness 
> from clang/llvm tool, while there was some openness through the libraries. I 
> think by introducing a "Custom" veclib type, we can solve the impedance 
> mismatch there. And in the XLA case, conceptually it's an indeed a custom 
> veclib, right? Functionality-wise, `Custom` should just work for XLA usage 
> too.

@mehdi_amini @tejohnson thoughts on the above?


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77925/new/

https://reviews.llvm.org/D77925

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77925: Revert "[TLI] Per-function fveclib for math library used for vectorization"

2020-08-15 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77925#2220169 , @mehdi_amini wrote:

> I rather have a non closed list of veclib: if you just have a map keyed by 
> string instead of an enum it should just work out?

The veclib type is also tied to the accepted values for `-fveclib`, which is a 
list of supported lib, anything outside of the list is rejected with error. If 
you want to allow arbitrary strings as key, it's inconsistent with the 
restricted/closed values for `-fveclib`. So basically there's no openness from 
clang/llvm tool, while there was some openness through the libraries. I think 
by introducing a "Custom" veclib type, we can solve the impedance mismatch 
there. And in the XLA case, conceptually it's an indeed a custom veclib, right? 
Functionality-wise, `Custom` should just work for XLA usage too.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77925/new/

https://reviews.llvm.org/D77925

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77925: Revert "[TLI] Per-function fveclib for math library used for vectorization"

2020-08-15 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77925#2016326 , @tejohnson wrote:

> In D77925#2016299 , @wenlei wrote:
>
>> @mehdi_amini @tejohnson When can we re-land this (with tweaks)? I'm under 
>> the impression that a test case demonstrating the 3rd party usage will be 
>> added very soon after this revert, then we can tweak the original patch to 
>> accommodate that usage, and re-land asap. Or am I missing something there? 
>> I'd like to get this unblocked asap. Currently we have to keep this as a 
>> private patch on our end which is a bit cumbersome, and I think this one can 
>> be useful for others too. Thanks..
>
> @bkramer can you work with Wenlei on this (original patch is D77632 
> ).
>
> @wenlei, in the meantime you can see the use case here:
> https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/xla/service/cpu/compiler_functor.cc#L198
> for revising the patch.

Thanks for the pointer, @tejohnson. Looks like we need a way for others to
provide a set of vector functions. How about we introduce a dedicated VecLib
type `Custom`, in addition to the existing ones (Accelerate, SVML and MASSV),
and expose a public API `addCustomVectorizableFunctions(ArrayRef Fns)` on TLII
to allow registering a custom function list. This way we preserve the openness
through `Custom`, but also keep it clean and structured.

Then for XLA, you just need to specify `-fveclib=Custom` and call 
`addCustomVectorizableFunctions` instead of `addVectorizableFunctions`.
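
Roughly, the usage I have in mind is sketched below; the
`addCustomVectorizableFunctions` hook does not exist yet, and the entry is a
made-up placeholder:

```
// VecDesc is TLII's existing {scalar name, vector name, vectorization factor}
// descriptor, the same type addVectorizableFunctions() takes.
TargetLibraryInfoImpl TLII(TargetTriple);
static const VecDesc CustomFns[] = {
    {"expf", "xla_vec_expf4", 4}, // placeholder custom vector math entry
};
TLII.addCustomVectorizableFunctions(CustomFns); // proposed API, not existing
```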


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77925/new/

https://reviews.llvm.org/D77925

___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82213: [Remarks] Add callsite locations to inline remarks

2020-06-24 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D82213#2110941 , @fhahn wrote:

> That's interesting. We are also using something similar for the matrix 
> lowering remarks [1]: we traverse the inlining chain bottom up and emit a 
> remark at each step which contains the expression available at that level. I 
> think those approaches could be useful in general to surface remarks at the 
> right level and it might be worth moving them somewhere so they can be 
> shared. What do you think?
>
> [1] 
> https://github.com/llvm/llvm-project/blob/master/llvm/lib/Transforms/Scalar/LowerMatrixIntrinsics.cpp#L1783


That's indeed similar, though it seems like what you're doing is more than just
showing the full inline stack as the location. Agreed that if we start to do
this in more places for optimization remarks, it'd make sense to build it into
the remarks infra. But we may not always want full inline-stack names as the
location (considering deep inlining with long template instantiation names
that can "pollute" the remark messages), so I'm guessing what we could do is
move that into the remarks infra, but still use a separate switch to control
whether we show inline locations (just like how `-fdiagnostics-show-hotness`
controls whether we show hotness for remarks)?
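
Something as simple as the sketch below is what I have in mind (the flag name
is made up):

```
#include "llvm/Support/CommandLine.h"
using namespace llvm;

// Hypothetical knob gating the inline-location suffix in remarks, analogous
// to how hotness printing is opt-in.
static cl::opt<bool> RemarksShowInlineLocations(
    "remarks-show-inline-locations", cl::Hidden, cl::init(false),
    cl::desc("Append the inline call chain to optimization remark locations"));
```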


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82213/new/

https://reviews.llvm.org/D82213



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82213: [Remarks] Add callsite locations to inline remarks

2020-06-20 Thread Wenlei He via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG7c8a6936bf6b: [Remarks] Add callsite locations to inline 
remarks (authored by wenlei).

Changed prior to commit:
  https://reviews.llvm.org/D82213?vs=272263&id=272289#toc

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82213/new/

https://reviews.llvm.org/D82213

Files:
  clang/test/Frontend/optimization-remark-with-hotness-new-pm.c
  clang/test/Frontend/optimization-remark-with-hotness.c
  llvm/include/llvm/Analysis/InlineAdvisor.h
  llvm/lib/Analysis/InlineAdvisor.cpp
  llvm/lib/Transforms/IPO/SampleProfile.cpp
  llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll
  llvm/test/Transforms/SampleProfile/Inputs/remarks.prof
  llvm/test/Transforms/SampleProfile/remarks.ll

Index: llvm/test/Transforms/SampleProfile/remarks.ll
===
--- llvm/test/Transforms/SampleProfile/remarks.ll
+++ llvm/test/Transforms/SampleProfile/remarks.ll
@@ -21,7 +21,8 @@
 
 ; We are expecting foo() to be inlined in main() (almost all the cycles are
 ; spent inside foo).
-; CHECK: remark: remarks.cc:13:21: inlined callee '_Z3foov' into 'main'
+; CHECK: remark: remarks.cc:13:21: _Z3foov inlined into main to match profiling context with (cost=130, threshold=225) at callsite main:0
+; CHECK: remark: remarks.cc:9:19: rand inlined into main to match profiling context with (cost=always): always inline attribute at callsite _Z3foov:6 @ main:0
 
 ; The back edge for the loop is the hottest edge in the loop subgraph.
 ; CHECK: remark: remarks.cc:6:9: most popular destination for conditional branches at remarks.cc:5:3
@@ -32,18 +33,51 @@
 ; Checking to see if YAML file is generated and contains remarks
 ;YAML:   --- !Passed
 ;YAML-NEXT:  Pass:sample-profile-inline
-;YAML-NEXT:  Name:InlineSuccess
+;YAML-NEXT:  Name:Inlined
 ;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 13, Column: 21 }
 ;YAML-NEXT:  Function:main
 ;YAML-NEXT:  Args:
-;YAML-NEXT:- String:  'inlined callee '''
 ;YAML-NEXT:- Callee:  _Z3foov
 ;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 3, Column: 0 }
-;YAML-NEXT:- String:  ''' into '''
+;YAML-NEXT:- String:  ' inlined into '
 ;YAML-NEXT:- Caller:  main
 ;YAML-NEXT:DebugLoc:{ File: remarks.cc, Line: 13, Column: 0 }
-;YAML-NEXT:- String:  
+;YAML-NEXT:- String:  ' to match profiling context'
+;YAML-NEXT:- String:  ' with '
+;YAML-NEXT:- String:  '(cost='
+;YAML-NEXT:- Cost:'130'
+;YAML-NEXT:- String:  ', threshold='
+;YAML-NEXT:- Threshold:   '225'
+;YAML-NEXT:- String:  ')'
+;YAML-NEXT:- String:  ' at callsite '
+;YAML-NEXT:- String:  main
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'0'
 ;YAML-NEXT:  ...
+;YAML:   --- !Passed
+;YAML-NEXT:  Pass:sample-profile-inline
+;YAML-NEXT:  Name:AlwaysInline
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 9, Column: 19 }
+;YAML-NEXT:  Function:main
+;YAML-NEXT:  Args:
+;YAML-NEXT:- Callee:  rand
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 90, Column: 0 }
+;YAML-NEXT:- String:  ' inlined into '
+;YAML-NEXT:- Caller:  main
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 13, Column: 0 }
+;YAML-NEXT:- String:  ' to match profiling context'
+;YAML-NEXT:- String:  ' with '
+;YAML-NEXT:- String:  '(cost=always)'
+;YAML-NEXT:- String:  ': '
+;YAML-NEXT:- Reason:  always inline attribute
+;YAML-NEXT:- String:  ' at callsite '
+;YAML-NEXT:- String:  _Z3foov
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'6'
+;YAML-NEXT:- String:  ' @ '
+;YAML-NEXT:- String:  main
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'0'
 ;YAML:  --- !Analysis
 ;YAML-NEXT:  Pass:sample-profile
 ;YAML-NEXT:  Name:AppliedSamples
@@ -139,7 +173,9 @@
 declare void @llvm.dbg.declare(metadata, metadata, metadata) #2
 
 ; Function Attrs: nounwind
-declare i32 @rand() #3
+define i32 @rand() #3 !dbg !59 {
+  ret i32 1
+}
 
 ; Function Attrs: nounwind argmemonly
 declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture) #1
@@ -158,7 +194,7 @@
 attributes #0 = { nounwind uwtable "disable-tail-calls"="false" "less-precise-fpmad"="false" "frame-pointer"="none" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" "use-sample-profile" }
 attributes #1 = { nounwind argmemonly }
 attributes #2 = { nou

[PATCH] D82213: [Remarks] Add callsite locations to inline remarks

2020-06-20 Thread Wenlei He via Phabricator via cfe-commits
wenlei marked an inline comment as done.
wenlei added inline comments.



Comment at: llvm/lib/Analysis/InlineAdvisor.cpp:392
+if (ProfileGuidedInline)
+  Remark << " by profile guided inliner";
+Remark << " with " << IC;

davidxl wrote:
> Perhaps reword it to " to match profiling context" ..
Sounds good, updated.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82213/new/

https://reviews.llvm.org/D82213



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82213: [Remarks] Add callsite locations to inline remarks

2020-06-20 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 272263.
wenlei added a comment.

Update remark message.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82213/new/

https://reviews.llvm.org/D82213

Files:
  clang/test/Frontend/optimization-remark-with-hotness-new-pm.c
  clang/test/Frontend/optimization-remark-with-hotness.c
  llvm/include/llvm/Analysis/InlineAdvisor.h
  llvm/lib/Analysis/InlineAdvisor.cpp
  llvm/lib/Transforms/IPO/SampleProfile.cpp
  llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll
  llvm/test/Transforms/SampleProfile/Inputs/remarks.prof
  llvm/test/Transforms/SampleProfile/remarks.ll

Index: llvm/test/Transforms/SampleProfile/remarks.ll
===
--- llvm/test/Transforms/SampleProfile/remarks.ll
+++ llvm/test/Transforms/SampleProfile/remarks.ll
@@ -21,7 +21,8 @@
 
 ; We are expecting foo() to be inlined in main() (almost all the cycles are
 ; spent inside foo).
-; CHECK: remark: remarks.cc:13:21: inlined callee '_Z3foov' into 'main'
+; CHECK: remark: remarks.cc:13:21: _Z3foov inlined into main to match profiling context with (cost=130, threshold=225) at callsite main:0
+; CHECK: remark: remarks.cc:9:19: rand inlined into main to match profiling context with (cost=always): always inline attribute at callsite _Z3foov:6 @ main:0
 
 ; The back edge for the loop is the hottest edge in the loop subgraph.
 ; CHECK: remark: remarks.cc:6:9: most popular destination for conditional branches at remarks.cc:5:3
@@ -32,18 +33,51 @@
 ; Checking to see if YAML file is generated and contains remarks
 ;YAML:   --- !Passed
 ;YAML-NEXT:  Pass:sample-profile-inline
-;YAML-NEXT:  Name:InlineSuccess
+;YAML-NEXT:  Name:Inlined
 ;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 13, Column: 21 }
 ;YAML-NEXT:  Function:main
 ;YAML-NEXT:  Args:
-;YAML-NEXT:- String:  'inlined callee '''
 ;YAML-NEXT:- Callee:  _Z3foov
 ;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 3, Column: 0 }
-;YAML-NEXT:- String:  ''' into '''
+;YAML-NEXT:- String:  ' inlined into '
 ;YAML-NEXT:- Caller:  main
 ;YAML-NEXT:DebugLoc:{ File: remarks.cc, Line: 13, Column: 0 }
-;YAML-NEXT:- String:  
+;YAML-NEXT:- String:  ' to match profiling context'
+;YAML-NEXT:- String:  ' with '
+;YAML-NEXT:- String:  '(cost='
+;YAML-NEXT:- Cost:'130'
+;YAML-NEXT:- String:  ', threshold='
+;YAML-NEXT:- Threshold:   '225'
+;YAML-NEXT:- String:  ')'
+;YAML-NEXT:- String:  ' at callsite '
+;YAML-NEXT:- String:  main
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'0'
 ;YAML-NEXT:  ...
+;YAML:   --- !Passed
+;YAML-NEXT:  Pass:sample-profile-inline
+;YAML-NEXT:  Name:AlwaysInline
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 9, Column: 19 }
+;YAML-NEXT:  Function:main
+;YAML-NEXT:  Args:
+;YAML-NEXT:- Callee:  rand
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 90, Column: 0 }
+;YAML-NEXT:- String:  ' inlined into '
+;YAML-NEXT:- Caller:  main
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 13, Column: 0 }
+;YAML-NEXT:- String:  ' to match profiling context'
+;YAML-NEXT:- String:  ' with '
+;YAML-NEXT:- String:  '(cost=always)'
+;YAML-NEXT:- String:  ': '
+;YAML-NEXT:- Reason:  always inline attribute
+;YAML-NEXT:- String:  ' at callsite '
+;YAML-NEXT:- String:  _Z3foov
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'6'
+;YAML-NEXT:- String:  ' @ '
+;YAML-NEXT:- String:  main
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'0'
 ;YAML:  --- !Analysis
 ;YAML-NEXT:  Pass:sample-profile
 ;YAML-NEXT:  Name:AppliedSamples
@@ -139,7 +173,9 @@
 declare void @llvm.dbg.declare(metadata, metadata, metadata) #2
 
 ; Function Attrs: nounwind
-declare i32 @rand() #3
+define i32 @rand() #3 !dbg !59 {
+  ret i32 1
+}
 
 ; Function Attrs: nounwind argmemonly
 declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture) #1
@@ -158,7 +194,7 @@
 attributes #0 = { nounwind uwtable "disable-tail-calls"="false" "less-precise-fpmad"="false" "frame-pointer"="none" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" "use-sample-profile" }
 attributes #1 = { nounwind argmemonly }
 attributes #2 = { nounwind readnone }
-attributes #3 = { nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "frame-pointer"="none" "no-infs-fp-math"="false" "no-nans-fp-math"="

[PATCH] D82213: [Remarks] Add callsite locations to inline remarks

2020-06-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 272240.
wenlei added a comment.

Address David's comments, add a test for nested inlining.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82213/new/

https://reviews.llvm.org/D82213

Files:
  clang/test/Frontend/optimization-remark-with-hotness-new-pm.c
  clang/test/Frontend/optimization-remark-with-hotness.c
  llvm/include/llvm/Analysis/InlineAdvisor.h
  llvm/lib/Analysis/InlineAdvisor.cpp
  llvm/lib/Transforms/IPO/SampleProfile.cpp
  llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll
  llvm/test/Transforms/SampleProfile/Inputs/remarks.prof
  llvm/test/Transforms/SampleProfile/remarks.ll

Index: llvm/test/Transforms/SampleProfile/remarks.ll
===
--- llvm/test/Transforms/SampleProfile/remarks.ll
+++ llvm/test/Transforms/SampleProfile/remarks.ll
@@ -21,7 +21,8 @@
 
 ; We are expecting foo() to be inlined in main() (almost all the cycles are
 ; spent inside foo).
-; CHECK: remark: remarks.cc:13:21: inlined callee '_Z3foov' into 'main'
+; CHECK: remark: remarks.cc:13:21: _Z3foov inlined into main by profile guided inliner with (cost=130, threshold=225) at callsite main:0
+; CHECK: remark: remarks.cc:9:19: rand inlined into main by profile guided inliner with (cost=always): always inline attribute at callsite _Z3foov:6 @ main:0
 
 ; The back edge for the loop is the hottest edge in the loop subgraph.
 ; CHECK: remark: remarks.cc:6:9: most popular destination for conditional branches at remarks.cc:5:3
@@ -32,18 +33,51 @@
 ; Checking to see if YAML file is generated and contains remarks
 ;YAML:   --- !Passed
 ;YAML-NEXT:  Pass:sample-profile-inline
-;YAML-NEXT:  Name:InlineSuccess
+;YAML-NEXT:  Name:Inlined
 ;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 13, Column: 21 }
 ;YAML-NEXT:  Function:main
 ;YAML-NEXT:  Args:
-;YAML-NEXT:- String:  'inlined callee '''
 ;YAML-NEXT:- Callee:  _Z3foov
 ;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 3, Column: 0 }
-;YAML-NEXT:- String:  ''' into '''
+;YAML-NEXT:- String:  ' inlined into '
 ;YAML-NEXT:- Caller:  main
 ;YAML-NEXT:DebugLoc:{ File: remarks.cc, Line: 13, Column: 0 }
-;YAML-NEXT:- String:  
+;YAML-NEXT:- String:  ' by profile guided inliner'
+;YAML-NEXT:- String:  ' with '
+;YAML-NEXT:- String:  '(cost='
+;YAML-NEXT:- Cost:'130'
+;YAML-NEXT:- String:  ', threshold='
+;YAML-NEXT:- Threshold:   '225'
+;YAML-NEXT:- String:  ')'
+;YAML-NEXT:- String:  ' at callsite '
+;YAML-NEXT:- String:  main
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'0'
 ;YAML-NEXT:  ...
+;YAML:   --- !Passed
+;YAML-NEXT:  Pass:sample-profile-inline
+;YAML-NEXT:  Name:AlwaysInline
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 9, Column: 19 }
+;YAML-NEXT:  Function:main
+;YAML-NEXT:  Args:
+;YAML-NEXT:- Callee:  rand
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 90, Column: 0 }
+;YAML-NEXT:- String:  ' inlined into '
+;YAML-NEXT:- Caller:  main
+;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 13, Column: 0 }
+;YAML-NEXT:- String:  ' by profile guided inliner'
+;YAML-NEXT:- String:  ' with '
+;YAML-NEXT:- String:  '(cost=always)'
+;YAML-NEXT:- String:  ': '
+;YAML-NEXT:- Reason:  always inline attribute
+;YAML-NEXT:- String:  ' at callsite '
+;YAML-NEXT:- String:  _Z3foov
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'6'
+;YAML-NEXT:- String:  ' @ '
+;YAML-NEXT:- String:  main
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'0'
 ;YAML:  --- !Analysis
 ;YAML-NEXT:  Pass:sample-profile
 ;YAML-NEXT:  Name:AppliedSamples
@@ -139,7 +173,9 @@
 declare void @llvm.dbg.declare(metadata, metadata, metadata) #2
 
 ; Function Attrs: nounwind
-declare i32 @rand() #3
+define i32 @rand() #3 !dbg !59 {
+  ret i32 1
+}
 
 ; Function Attrs: nounwind argmemonly
 declare void @llvm.lifetime.end.p0i8(i64, i8* nocapture) #1
@@ -158,7 +194,7 @@
 attributes #0 = { nounwind uwtable "disable-tail-calls"="false" "less-precise-fpmad"="false" "frame-pointer"="none" "no-infs-fp-math"="false" "no-nans-fp-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="x86-64" "target-features"="+fxsr,+mmx,+sse,+sse2" "unsafe-fp-math"="false" "use-soft-float"="false" "use-sample-profile" }
 attributes #1 = { nounwind argmemonly }
 attributes #2 = { nounwind readnone }
-attributes #3 = { nounwind "disable-tail-calls"="false" "less-precise-fpmad"="false" "frame-pointer"="none" "no-infs-fp-ma

[PATCH] D82213: [Remarks] Add callsite locations to inline remarks

2020-06-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei marked an inline comment as done.
wenlei added inline comments.



Comment at: llvm/lib/Analysis/InlineAdvisor.cpp:391
+Remark << ore::NV("Caller", &Caller);
+if (ProfileGuidedInline)
+  Remark << " by profile guided inliner";

davidxl wrote:
> is this necessary? User should know if their build has profile or not.
> 
> What is more useful is when PGO is on, but some callsite does not have 
> profile data, then it is worth reporting.
> is this necessary? User should know if their build has profile or not.

This was used to differentiate SampleProfileLoader inlining from CGSCC
inlining. Maybe the message `by profile guided inliner` isn't great, but I
can't think of a better yet still concise wording.

With the differentiation in the message, the inlinee tree recovered through 
some parsing is what I'm looking for (`[P]` for SampleProfileLoader inline, 
`[C]` for CGSCC inline):

```
Inlinees for main
[P]  _ZN15largesolidarrayIP6regobjEixEi @ 369
[P]  _Z7random1i @ 363
[C]_Z8myrandomv @ 2
[P]  _Z7random1i @ 364
[C]_Z8myrandomv @ 2
[P]  _ZN15largesolidarrayIP6regobjEixEi @ 366
[P]  _ZN6wayobj9createwayERP8point16tRi @ 327
[P]_ZN6wayobj11createwayarEiiRP8point16tRi @ 37.1
[P]  _ZN6wayobj5indexEii @ 143
[P]  _ZN6wayobj5indexEii @ 130
[P]  _ZN6wayobj6indexxEi @ 31
[P]  _ZN6wayobj6indexyEi @ 32
[C]  _ZN8point16tC2Ess @ 2
[C]  _ZN8point16tC2Ess @ 2.1
```

> What is more useful is when PGO is on, but some callsite does not have 
> profile data, then it is worth reporting.

That can be useful. I was also looking for a way to get the call site count
printed (if we have a count), but it looks like it's not available from
`InlineCost`. I'm going to defer that for now if that's ok.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D82213/new/

https://reviews.llvm.org/D82213



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D82213: [Remarks] Add callsite locations to inline remarks

2020-06-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei created this revision.
wenlei added reviewers: wmi, davidxl, hoy.
Herald added subscribers: llvm-commits, cfe-commits, hiraditya.
Herald added projects: clang, LLVM.

Add call site location info to inline remarks so we can differentiate inline
sites. This can be useful for inliner tuning. We can also reconstruct the full
hierarchical inline tree by parsing such remarks. The message of the inline
remark is also tweaked so we can differentiate SampleProfileLoader inlining
from CGSCC inlining.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D82213

Files:
  clang/test/Frontend/optimization-remark-with-hotness-new-pm.c
  clang/test/Frontend/optimization-remark-with-hotness.c
  llvm/include/llvm/Analysis/InlineAdvisor.h
  llvm/lib/Analysis/InlineAdvisor.cpp
  llvm/lib/Transforms/IPO/SampleProfile.cpp
  llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll
  llvm/test/Transforms/SampleProfile/remarks.ll

Index: llvm/test/Transforms/SampleProfile/remarks.ll
===
--- llvm/test/Transforms/SampleProfile/remarks.ll
+++ llvm/test/Transforms/SampleProfile/remarks.ll
@@ -21,7 +21,7 @@
 
 ; We are expecting foo() to be inlined in main() (almost all the cycles are
 ; spent inside foo).
-; CHECK: remark: remarks.cc:13:21: inlined callee '_Z3foov' into 'main'
+; CHECK: remark: remarks.cc:13:21: _Z3foov inlined into main by profile guided inliner with (cost=130, threshold=225) at callsite main:0
 
 ; The back edge for the loop is the hottest edge in the loop subgraph.
 ; CHECK: remark: remarks.cc:6:9: most popular destination for conditional branches at remarks.cc:5:3
@@ -32,17 +32,26 @@
 ; Checking to see if YAML file is generated and contains remarks
 ;YAML:   --- !Passed
 ;YAML-NEXT:  Pass:sample-profile-inline
-;YAML-NEXT:  Name:InlineSuccess
+;YAML-NEXT:  Name:Inlined
 ;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 13, Column: 21 }
 ;YAML-NEXT:  Function:main
 ;YAML-NEXT:  Args:
-;YAML-NEXT:- String:  'inlined callee '''
 ;YAML-NEXT:- Callee:  _Z3foov
 ;YAML-NEXT:  DebugLoc:{ File: remarks.cc, Line: 3, Column: 0 }
-;YAML-NEXT:- String:  ''' into '''
+;YAML-NEXT:- String:  ' inlined into '
 ;YAML-NEXT:- Caller:  main
 ;YAML-NEXT:DebugLoc:{ File: remarks.cc, Line: 13, Column: 0 }
-;YAML-NEXT:- String:  
+;YAML-NEXT:- String:  ' by profile guided inliner'
+;YAML-NEXT:- String:  ' with '
+;YAML-NEXT:- String:  '(cost='
+;YAML-NEXT:- Cost:'130'
+;YAML-NEXT:- String:  ', threshold='
+;YAML-NEXT:- Threshold:   '225'
+;YAML-NEXT:- String:  ')'
+;YAML-NEXT:- String:  ' at callsite '
+;YAML-NEXT:- String:  main
+;YAML-NEXT:- String:  ':'
+;YAML-NEXT:- Line:'0'
 ;YAML-NEXT:  ...
 ;YAML:  --- !Analysis
 ;YAML-NEXT:  Pass:sample-profile
Index: llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll
===
--- llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll
+++ llvm/test/Transforms/Inline/optimization-remarks-passed-yaml.ll
@@ -22,7 +22,7 @@
 ;  4   return foo();
 ;  5 }
 
-; CHECK: remark: /tmp/s.c:4:10: foo inlined into bar with (cost={{[0-9\-]+}}, threshold={{[0-9]+}}) (hotness: 30)
+; CHECK: remark: /tmp/s.c:4:10: foo inlined into bar with (cost={{[0-9\-]+}}, threshold={{[0-9]+}}) at callsite bar:1 (hotness: 30)
 
 ; YAML:  --- !Passed
 ; YAML-NEXT: Pass:inline
@@ -42,6 +42,10 @@
 ; YAML-NEXT:   - String: ', threshold='
 ; YAML-NEXT:   - Threshold: '{{[0-9]+}}'
 ; YAML-NEXT:   - String: ')'
+; YAML-NEXT:   - String:  ' at callsite '
+; YAML-NEXT:   - String:  bar
+; YAML-NEXT:   - String:  ':'
+; YAML-NEXT:   - Line:'1'
 ; YAML-NEXT: ...
 
 ; ModuleID = '/tmp/s.c'
Index: llvm/lib/Transforms/IPO/SampleProfile.cpp
===
--- llvm/lib/Transforms/IPO/SampleProfile.cpp
+++ llvm/lib/Transforms/IPO/SampleProfile.cpp
@@ -37,6 +37,7 @@
 #include "llvm/Analysis/AssumptionCache.h"
 #include "llvm/Analysis/CallGraph.h"
 #include "llvm/Analysis/CallGraphSCCPass.h"
+#include "llvm/Analysis/InlineAdvisor.h"
 #include "llvm/Analysis/InlineCost.h"
 #include "llvm/Analysis/LoopInfo.h"
 #include "llvm/Analysis/OptimizationRemarkEmitter.h"
@@ -916,9 +917,8 @@
   InlineFunctionInfo IFI(nullptr, GetAC);
   if (InlineFunction(CB, IFI).isSuccess()) {
 // The call to InlineFunction erases I, so we can't pass it here.
-ORE->emit(OptimizationRemark(CSINLINE_DEBUG, "InlineSuccess", DLoc, BB)
-  << "inlined callee '" << ore::NV("Callee", CalledFunction)
-  << "' into '" << ore::NV("Caller", BB->getParent()) << "'");
+emitInl

[PATCH] D77925: Revert "[TLI] Per-function fveclib for math library used for vectorization"

2020-05-02 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.
Herald added a project: LLVM.
Herald added a subscriber: llvm-commits.

@mehdi_amini @tejohnson When can we re-land this (with tweaks)? I'm under the 
impression that a test case demonstrating the 3rd party usage will be added 
very soon after this revert, then we can tweak the original patch to 
accommodate that usage, and re-land asap. Or am I missing something there? I'd 
like to get this unblocked asap. Currently we have to keep this as a private 
patch on our end which is a bit cumbersome, and I think this one can be useful 
for others too. Thanks..


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77925/new/

https://reviews.llvm.org/D77925



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77952: [TLII] Reduce copies of TLII for TLA

2020-04-12 Thread Wenlei He via Phabricator via cfe-commits
wenlei marked 2 inline comments as done.
wenlei added inline comments.



Comment at: clang/lib/CodeGen/BackendUtil.cpp:689
   // Set up the per-function pass manager.
-  FPM.add(new TargetLibraryInfoWrapperPass(*TLII));
+  FPM.add(new TargetLibraryInfoWrapperPass(TargetTriple));
   if (CodeGenOpts.VerifyModule)

tejohnson wrote:
> wenlei wrote:
> > tejohnson wrote:
> > > These changes mean we now construct a new TLII multiple times (e.g. both 
> > > when we add the TargetLibraryInfoWrapperPass to the MPM earlier and to 
> > > the FPM here, rather that just copying. Is this actually faster? It seems 
> > > like it would be slower overall.
> > Oops, this one isn't intentional... changed it back. Though for other 
> > instances where TLII isn't reused, similar change turns extra copy into 
> > move.
> I suppose you could std::move the TLII here to avoid this second copy.
> 
> Do you know how much difference this patch makes on the compile time 
> instruction count regressions seen with the original patch? It seems like it 
> shouldn't be huge given that this is just during the pipeline setup for the 
> module. But if it does explain the instruction count increases that's 
> probably why it didn't regress the actual wall time measurements.
Yeah, the 2nd use can be a move, though I'm inclined to not do that, as TLII 
isn't immediately out of scope yet.

I didn't set up a measurement for this. I was basically playing with some
tests as a sanity check, just to make sure we're not doing things
unexpectedly, e.g. that we don't do per-function copies unexpectedly in
`TargetLibraryAnalysis::run`. Other than a few extra copies on the setup path,
I didn't see anything unusual. Since the original patch can make any copies
more expensive, I thought it'd be good to reduce the copies where I see them.
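
Concretely, the trade-off is something like this sketch (using the move
overload this patch adds; not a change I'm proposing here):

```
TargetLibraryInfoImpl TLII(TargetTriple);
MPM.add(new TargetLibraryInfoWrapperPass(TLII));            // copies TLII
// The later use could move instead of copy, but TLII is still in scope
// afterwards, which is why I'd rather keep the copy:
FPM.add(new TargetLibraryInfoWrapperPass(std::move(TLII)));
```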



Comment at: llvm/lib/Analysis/TargetLibraryInfo.cpp:596
   memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));
   return *this;
 }

tejohnson wrote:
> This is missing copying of the VecLibDescs (looks like I missed this as well 
> originally).
Now I remember why this was missed on my side: before my patch, `VecLibDescs`
wasn't copied here either, which seems like a bug? Same for the move variant
below.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77952/new/

https://reviews.llvm.org/D77952



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77952: [TLII] Reduce copies of TLII for TLA

2020-04-12 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77952#1976803 , @tejohnson wrote:

> Also, just a nit, because TLI is sometimes used to refer to the 
> TargetLibraryInfo and occasionally to the TargetLibraryInfoImpl, and the 
> latter is frequently referred to as TLII, could you change the description to 
> say TLII or TLI Impl? Since this doesn't affect TargetLibraryInfo/TLI object 
> copies.


Sure, done. Changed the title and made the description more explicit.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77952/new/

https://reviews.llvm.org/D77952



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77952: [TLI] Reduce copies for TLI and TLA

2020-04-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 256806.
wenlei added a comment.

address feedback


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77952/new/

https://reviews.llvm.org/D77952

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/TargetLibraryInfo.cpp

Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -583,8 +583,8 @@
   ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
   std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),
 AvailableArray);
-  for (unsigned i = 0; i < NumVecLibs; i++)
-VecLibDescs[i] = TLI.VecLibDescs[i];
+  std::move(std::begin(TLI.VecLibDescs), std::end(TLI.VecLibDescs),
+VecLibDescs);
 }
 
 TargetLibraryInfoImpl &TargetLibraryInfoImpl::operator=(const TargetLibraryInfoImpl &TLI) {
@@ -593,6 +593,8 @@
   ShouldExtI32Return = TLI.ShouldExtI32Return;
   ShouldSignExtI32Param = TLI.ShouldSignExtI32Param;
   memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));
+  for (unsigned i = 0; i < NumVecLibs; i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
   return *this;
 }
 
@@ -603,6 +605,8 @@
   ShouldSignExtI32Param = TLI.ShouldSignExtI32Param;
   std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),
 AvailableArray);
+  std::move(std::begin(TLI.VecLibDescs), std::end(TLI.VecLibDescs),
+VecLibDescs);
   return *this;
 }
 
@@ -1677,6 +1681,12 @@
   initializeTargetLibraryInfoWrapperPassPass(*PassRegistry::getPassRegistry());
 }
 
+TargetLibraryInfoWrapperPass::TargetLibraryInfoWrapperPass(
+TargetLibraryInfoImpl &&TLIImpl)
+: ImmutablePass(ID), TLA(std::move(TLIImpl)) {
+  initializeTargetLibraryInfoWrapperPassPass(*PassRegistry::getPassRegistry());
+}
+
 AnalysisKey TargetLibraryAnalysis::Key;
 
 // Register the basic pass.
Index: llvm/include/llvm/Analysis/TargetLibraryInfo.h
===
--- llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -452,7 +452,11 @@
   /// Construct a library analysis with baseline Module-level info.
   ///
   /// This will be supplemented with Function-specific info in the Result.
-  TargetLibraryAnalysis(TargetLibraryInfoImpl BaselineInfoImpl)
+  TargetLibraryAnalysis(const Triple &T)
+  : BaselineInfoImpl(TargetLibraryInfoImpl(T)) {}
+  TargetLibraryAnalysis(const TargetLibraryInfoImpl &BaselineInfoImpl)
+  : BaselineInfoImpl(BaselineInfoImpl) {}
+  TargetLibraryAnalysis(TargetLibraryInfoImpl &&BaselineInfoImpl)
   : BaselineInfoImpl(std::move(BaselineInfoImpl)) {}
 
   TargetLibraryInfo run(const Function &F, FunctionAnalysisManager &);
@@ -475,6 +479,7 @@
   TargetLibraryInfoWrapperPass();
   explicit TargetLibraryInfoWrapperPass(const Triple &T);
   explicit TargetLibraryInfoWrapperPass(const TargetLibraryInfoImpl &TLI);
+  explicit TargetLibraryInfoWrapperPass(TargetLibraryInfoImpl &&TLI);
 
   TargetLibraryInfo &getTLI(const Function &F) {
 FunctionAnalysisManager DummyFAM;
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -352,10 +352,6 @@
   PM.add(createStackSafetyGlobalInfoWrapperPass(/*SetMetadata=*/true));
 }
 
-static TargetLibraryInfoImpl *createTLII(llvm::Triple &TargetTriple) {
-  return new TargetLibraryInfoImpl(TargetTriple);
-}
-
 static void addSymbolRewriterPass(const CodeGenOptions &Opts,
   legacy::PassManager *MPM) {
   llvm::SymbolRewriter::RewriteDescriptorList DL;
@@ -541,12 +537,12 @@
   if (CodeGenOpts.DisableLLVMPasses)
 return;
 
-  // Figure out TargetLibraryInfo.  This needs to be added to MPM and FPM
+  // Figure out TargetLibraryInfo. This needs to be added to MPM and FPM
   // manually (and not via PMBuilder), since some passes (eg. InstrProfiling)
   // are inserted before PMBuilder ones - they'd get the default-constructed
   // TLI with an unknown target otherwise.
   Triple TargetTriple(TheModule->getTargetTriple());
-  std::unique_ptr TLII(createTLII(TargetTriple));
+  TargetLibraryInfoImpl TLII(TargetTriple);
 
   // If we reached here with a non-empty index file name, then the index file
   // was empty and we are not performing ThinLTO backend compilation (used in
@@ -591,7 +587,7 @@
   PMBuilder.PrepareForLTO = CodeGenOpts.PrepareForLTO;
   PMBuilder.RerollLoops = CodeGenOpts.RerollLoops;
 
-  MPM.add(new TargetLibraryInfoWrapperPass(*TLII));
+  MPM.add(new TargetLibraryInfoWrapperPass(TLII));
 
   if (TM)
 TM->adjustPassManager(PMBuilder);
@@ -692,7 +688,7 @@
   }
 
   // Set up the per-function pass manager.
-  FPM.add(new TargetLibraryInfoWrapperPas

[PATCH] D77952: [TLI] Reduce copies for TLI and TLA

2020-04-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei marked an inline comment as done.
wenlei added a comment.

In D77952#1976336 , @tejohnson wrote:

> Some parts of this are dependent on the patch that got reverted, but I have 
> some other questions below about the changes in BackendUtil.cpp.


Thanks for the quick review. I will remove the changes that depend on the reverted
change before committing.




Comment at: clang/lib/CodeGen/BackendUtil.cpp:689
   // Set up the per-function pass manager.
-  FPM.add(new TargetLibraryInfoWrapperPass(*TLII));
+  FPM.add(new TargetLibraryInfoWrapperPass(TargetTriple));
   if (CodeGenOpts.VerifyModule)

tejohnson wrote:
> These changes mean we now construct a new TLII multiple times (e.g. both when 
> we add the TargetLibraryInfoWrapperPass to the MPM earlier and to the FPM 
> here, rather that just copying. Is this actually faster? It seems like it 
> would be slower overall.
Oops, this one wasn't intentional... changed it back. For other instances where
the TLII isn't reused, a similar change turns an extra copy into a move.
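
A minimal sketch of the copy-vs-move distinction (illustrative only, not taken
verbatim from the patch; both wrapper-pass overloads exist after this change,
and the usual LLVM includes are assumed):

  TargetLibraryInfoImpl TLII(TargetTriple);
  // Lvalue argument: selects the const& overload and deep-copies the tables.
  auto *ByCopy = new TargetLibraryInfoWrapperPass(TLII);
  // Last use: std::move selects the && overload and steals the tables instead;
  // TLII must not be used afterwards.
  auto *ByMove = new TargetLibraryInfoWrapperPass(std::move(TLII));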


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77952/new/

https://reviews.llvm.org/D77952



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77632#1975619 , @mehdi_amini wrote:

> The existing TLI provides a very convenient way to define a VecLib without 
> LLVM knowing about it ahead of time. This feature is important for any 
> embedded use of LLVM as a library out-of-tree (I'll add a unit-test in-tree).
>  I don't think it is a big change to this patch to preserve the current 
> ability but I wanted to check first (and in the meantime I reverted in 
> temporarily in https://reviews.llvm.org/D77925 to avoid the feature 
> regression).
>
> At the moment the place where you seem to use this knowledge is with the 
> `enum VectorLibrary`  in the `TargetLibraryInfoImpl` class, and the 
> `VecLibDescs` array which statically contains the known VecLib.
>  It seems to me that if we replace this enum with a string instead to 
> identify the VecLib everything should still hold together and this would fit 
> with minor changes to this path. The `VecLibDescs` could just be a 
> `StringMap` in this case.
>
> That was a third-party (in my case the XLA compiler) can still register its 
> own "XLA" VecLib and add all the descriptors.
>
> How does it sound?


Thanks for the explanation about the revert. The proposal of using a StringMap 
to maintain the openness sounds good to me. And I agree with @tejohnson: if the 
openness is a feature, it should be covered in tests; otherwise it can feel 
somewhat like a loophole and be prone to breakage, though I can see how it can 
be useful. Hope this patch can be restored with tweaks soon (we have workloads 
with very visible vectorization that rely on this).
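
For illustration, a rough sketch of what such a name-keyed registration could
look like (hypothetical API, not part of any patch here; `VecDesc` is the
existing descriptor type, and the usual LLVM ADT headers are assumed):

  llvm::StringMap<std::vector<VecDesc>> VecLibDescsByName;

  // Out-of-tree users (e.g. an "XLA" library) could register their own
  // descriptors without LLVM knowing the library ahead of time.
  void registerVecLib(llvm::StringRef Name, llvm::ArrayRef<VecDesc> Fns) {
    auto &Descs = VecLibDescsByName[Name]; // entry is created on first use
    Descs.insert(Descs.end(), Fns.begin(), Fns.end());
  }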


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77952: [TLI] Reduce copies for TLI and TLA

2020-04-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei created this revision.
wenlei added reviewers: tejohnson, nikic, mehdi_amini, hoyFB.
Herald added subscribers: cfe-commits, hiraditya.
Herald added a project: clang.

Minor changes to reduce the copying needed for TLI and TLA by using move 
whenever possible.


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D77952

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/TargetLibraryInfo.cpp

Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -583,8 +583,8 @@
   ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
   std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),
 AvailableArray);
-  for (unsigned i = 0; i < NumVecLibs; i++)
-VecLibDescs[i] = TLI.VecLibDescs[i];
+  std::move(std::begin(TLI.VecLibDescs), std::end(TLI.VecLibDescs),
+VecLibDescs);
 }
 
 TargetLibraryInfoImpl &TargetLibraryInfoImpl::operator=(const TargetLibraryInfoImpl &TLI) {
@@ -1677,6 +1677,12 @@
   initializeTargetLibraryInfoWrapperPassPass(*PassRegistry::getPassRegistry());
 }
 
+TargetLibraryInfoWrapperPass::TargetLibraryInfoWrapperPass(
+TargetLibraryInfoImpl &&TLIImpl)
+: ImmutablePass(ID), TLA(std::move(TLIImpl)) {
+  initializeTargetLibraryInfoWrapperPassPass(*PassRegistry::getPassRegistry());
+}
+
 AnalysisKey TargetLibraryAnalysis::Key;
 
 // Register the basic pass.
Index: llvm/include/llvm/Analysis/TargetLibraryInfo.h
===
--- llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -452,7 +452,11 @@
   /// Construct a library analysis with baseline Module-level info.
   ///
   /// This will be supplemented with Function-specific info in the Result.
-  TargetLibraryAnalysis(TargetLibraryInfoImpl BaselineInfoImpl)
+  TargetLibraryAnalysis(const Triple &T)
+  : BaselineInfoImpl(TargetLibraryInfoImpl(T)) {}
+  TargetLibraryAnalysis(const TargetLibraryInfoImpl &BaselineInfoImpl)
+  : BaselineInfoImpl(BaselineInfoImpl) {}
+  TargetLibraryAnalysis(TargetLibraryInfoImpl &&BaselineInfoImpl)
   : BaselineInfoImpl(std::move(BaselineInfoImpl)) {}
 
   TargetLibraryInfo run(const Function &F, FunctionAnalysisManager &);
@@ -475,6 +479,7 @@
   TargetLibraryInfoWrapperPass();
   explicit TargetLibraryInfoWrapperPass(const Triple &T);
   explicit TargetLibraryInfoWrapperPass(const TargetLibraryInfoImpl &TLI);
+  explicit TargetLibraryInfoWrapperPass(TargetLibraryInfoImpl &&TLI);
 
   TargetLibraryInfo &getTLI(const Function &F) {
 FunctionAnalysisManager DummyFAM;
Index: clang/lib/CodeGen/BackendUtil.cpp
===
--- clang/lib/CodeGen/BackendUtil.cpp
+++ clang/lib/CodeGen/BackendUtil.cpp
@@ -352,10 +352,6 @@
   PM.add(createStackSafetyGlobalInfoWrapperPass(/*SetMetadata=*/true));
 }
 
-static TargetLibraryInfoImpl *createTLII(llvm::Triple &TargetTriple) {
-  return new TargetLibraryInfoImpl(TargetTriple);
-}
-
 static void addSymbolRewriterPass(const CodeGenOptions &Opts,
   legacy::PassManager *MPM) {
   llvm::SymbolRewriter::RewriteDescriptorList DL;
@@ -541,13 +537,6 @@
   if (CodeGenOpts.DisableLLVMPasses)
 return;
 
-  // Figure out TargetLibraryInfo.  This needs to be added to MPM and FPM
-  // manually (and not via PMBuilder), since some passes (eg. InstrProfiling)
-  // are inserted before PMBuilder ones - they'd get the default-constructed
-  // TLI with an unknown target otherwise.
-  Triple TargetTriple(TheModule->getTargetTriple());
-  std::unique_ptr TLII(createTLII(TargetTriple));
-
   // If we reached here with a non-empty index file name, then the index file
   // was empty and we are not performing ThinLTO backend compilation (used in
   // testing in a distributed build environment). Drop any the type test
@@ -558,6 +547,7 @@
  /*ImportSummary=*/nullptr,
  /*DropTypeTests=*/true));
 
+  Triple TargetTriple(TheModule->getTargetTriple());
   PassManagerBuilderWrapper PMBuilder(TargetTriple, CodeGenOpts, LangOpts);
 
   // At O0 and O1 we only run the always inliner which is more efficient. At
@@ -591,7 +581,11 @@
   PMBuilder.PrepareForLTO = CodeGenOpts.PrepareForLTO;
   PMBuilder.RerollLoops = CodeGenOpts.RerollLoops;
 
-  MPM.add(new TargetLibraryInfoWrapperPass(*TLII));
+  // Figure out TargetLibraryInfo. This needs to be added to MPM and FPM
+  // manually (and not via PMBuilder), since some passes (eg. InstrProfiling)
+  // are inserted before PMBuilder ones - they'd get the default-constructed
+  // TLI with an unknown target otherwise.
+  MPM.add(new TargetLibraryInfoWrapperPass(TargetTriple));
 
   if (TM)
 TM->adjustPassManager(PMBuilder)

[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-11 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77632#1974409 , @tejohnson wrote:

> In D77632#1974363 , @wenlei wrote:
>
> > In D77632#1974015 , @nikic wrote:
> >
> > > This change causes a ~0.5% compile-time regressions: 
> > > http://llvm-compile-time-tracker.com/compare.php?from=5b18b6e9a84d985c0a907009fb71de7c1943bc88&to=60c642e74be6af86906d9f3d982728be7bd4329f&stat=instructions
> > >  This is quite a lot as these things go, so it would be great if you 
> > > could double check if there's any optimization potential here. In 
> > > particular I'm wondering why this affects normal builds so much, even 
> > > though they (presumably?) don't use any veclib at all.
> >
> >
> > Thanks for the heads-up. This is surprising but there is a change even when 
> > veclib is not used - in order to allow each function to use different 
> > veclib without duplicating the work of populating vector function list for 
> > each function, we now always pre-populate vector function list for three 
> > supported vector libraries for each module. However 0.5% compile-time for 
> > that work given it's per-module is not expected. I suspect we may be 
> > passing/copying TLII around more than we anticipated (now we always have 
> > more stuff to copy). I will take a look. We could also turn this into a 
> > lazy initialization - only populate the needed list for module level TLII 
> > when it's first queried by a function level TLI.
>
>
> Hmm, yeah that is surprising, because the TLII should be built once per 
> module per TLI analysis, which is never invalidated. We've gone from 
> populating one set of vec libs to 3, I wouldn't have thought that was 
> particularly expensive, so it would be good to see what is going on here and 
> confirm we are only building this once as expected.
>
> Looking at the compile time data at that link, interestingly the 
> "instructions" metric increased, but not wall time or cycles or task clock - 
> they were all neutral.


Turns out there are a few places where we call the copy ctor for TLI 
unnecessarily. I made some changes in D77952  to use move when possible. In 
addition, I should have used move for `TLI.VecLibDescs` in the move ctor of 
`TargetLibraryInfoImpl` too.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-10 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77632#1974015 , @nikic wrote:

> This change causes a ~0.5% compile-time regressions: 
> http://llvm-compile-time-tracker.com/compare.php?from=5b18b6e9a84d985c0a907009fb71de7c1943bc88&to=60c642e74be6af86906d9f3d982728be7bd4329f&stat=instructions
>  This is quite a lot as these things go, so it would be great if you could 
> double check if there's any optimization potential here. In particular I'm 
> wondering why this affects normal builds so much, even though they 
> (presumably?) don't use any veclib at all.


Thanks for the heads-up. This is surprising, but there is a change even when no 
veclib is used: to allow each function to use a different veclib without 
duplicating the work of populating the vector-function list for each function, 
we now always pre-populate the vector-function lists for the three supported 
vector libraries for each module. However, 0.5% compile time is more than 
expected for work that is done once per module. I suspect we may be 
passing/copying the TLII around more than we anticipated (there is now more 
state to copy). I will take a look. We could also turn this into lazy 
initialization: only populate the needed list for the module-level TLII when it 
is first queried by a function-level TLI.
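
For illustration, a rough sketch of the lazy variant (hypothetical, not part of
the patch; `VectorLibraryDescriptors`, `VecLibDescs` and
`addVectorizableFunctionsFromVecLib` are the names used in the current diff,
while the `Populated` flag is made up):

  const VectorLibraryDescriptors &getVecLibDescs(VectorLibrary VecLib) {
    if (!VecLibDescs[VecLib].Populated) {
      // Populate this library's table only when a function-level TLI first
      // asks for it.
      addVectorizableFunctionsFromVecLib(VecLib, VecLibDescs[VecLib]);
      VecLibDescs[VecLib].Populated = true;
    }
    return VecLibDescs[VecLib];
  }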


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-09 Thread Wenlei He via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rG60c642e74be6: [TLI] Per-function fveclib for math library 
used for vectorization (authored by wenlei).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/libcalls-veclib.c
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/InlineCost.cpp
  llvm/lib/Analysis/TargetLibraryInfo.cpp
  llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
  llvm/test/Transforms/Inline/veclib-compat.ll

Index: llvm/test/Transforms/Inline/veclib-compat.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/veclib-compat.ll
@@ -0,0 +1,48 @@
+; RUN: opt < %s -inline -inline-caller-superset-tli=true -S | FileCheck %s --check-prefixes=COMMON
+; RUN: opt < %s -passes='cgscc(inline)' -inline-caller-superset-tli=true -S | FileCheck %s --check-prefixes=COMMON
+; RUN: opt < %s -inline -inline-caller-superset-tli=false -S | FileCheck %s --check-prefixes=NOSUPERSET,COMMON
+; RUN: opt < %s -passes='cgscc(inline)' -inline-caller-superset-tli=false -S | FileCheck %s --check-prefixes=NOSUPERSET,COMMON
+
+
+
+define i32 @callee_svml(i8 %X) #0 {
+entry:
+  ret i32 1
+}
+
+define i32 @callee_massv(i8 %X) #1 {
+entry:
+  ret i32 1
+}
+
+define i32 @callee_nolibrary(i8 %X) {
+entry:
+  ret i32 1
+}
+
+define i32 @caller_svml() #0 {
+; COMMON-LABEL: define i32 @caller_svml()
+entry:
+  %rslt = call i32 @callee_massv(i8 123)
+; COMMON: call i32 @callee_massv
+  %tmp1 = call i32 @callee_nolibrary(i8 123)
+; NOSUPERSET: call i32 @callee_nolibrary
+  %tmp2 = call i32 @callee_svml(i8 123)
+; COMMON-NOT: call
+  ret i32 %rslt
+}
+
+define i32 @caller_nolibrary() {
+; COMMON-LABEL: define i32 @caller_nolibrary()
+entry:
+  %rslt = call i32 @callee_svml(i8 123)
+; COMMON: call i32 @callee_svml
+  %tmp1 = call i32 @callee_massv(i8 123)
+; COMMON: call i32 @callee_massv
+  %tmp2 = call i32 @callee_nolibrary(i8 123)
+; COMMON-NOT: call
+  ret i32 %rslt
+}
+
+attributes #0 = { "veclib"="SVML" }
+attributes #1 = { "veclib"="MASSV" }
\ No newline at end of file
Index: llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
===
--- llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
+++ llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
@@ -3,8 +3,8 @@
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s
 
 ; Make sure we don't inline callees into a caller with a superset of the
-; no builtin attributes when -inline-caller-superset-nobuiltin=false.
-; RUN: opt < %s -inline-caller-superset-nobuiltin=false -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s --check-prefix=NOSUPERSET
+; no builtin attributes when -inline-caller-superset-tli=false.
+; RUN: opt < %s -inline-caller-superset-tli=false -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s --check-prefix=NOSUPERSET
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -550,7 +550,7 @@
 TLI.setUnavailable(LibFunc_nvvm_reflect);
   }
 
-  TLI.addVectorizableFunctionsFromVecLib(ClVectorLibrary);
+  TLI.addAllVectorizableFunctions();
 }
 
 TargetLibraryInfoImpl::TargetLibraryInfoImpl() {
@@ -572,8 +572,8 @@
   ShouldExtI32Return(TLI.ShouldExtI32Return),
   ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
   memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < NumVecLibs; i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
 }
 
 TargetLibraryInfoImpl::TargetLibraryInfoImpl(TargetLibraryInfoImpl &&TLI)
@@ -583,8 +583,8 @@
   ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
   std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),
 AvailableArray);
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < NumVecLibs; i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
 }
 
 TargetLibraryInfoImpl &TargetLibraryInfoImpl::operator=(const TargetLibraryInfoImpl &TLI) {
@@ -1520,7 +1520,16 @@
   return LHS.VectorFnName < S;
 }
 
-void TargetLibraryInfoImpl::addVectorizableFunctions(ArrayRef Fns) {
+void TargetLibraryInfoImpl::addAllVectorizableFunctions() {
+  addVectorizableFunctionsFromVecLib(Accelerate, VecLibDescs[Accelerate]);
+  addVectorizableFunctionsFromVecLib(MASSV, VecLibDescs[MASSV]);
+  addVectorizableFunctionsFromVecLib(SVML, VecLibDescs[SVML])

[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-09 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 256465.
wenlei marked an inline comment as done.
wenlei added a comment.

rebase, update test


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/libcalls-veclib.c
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/InlineCost.cpp
  llvm/lib/Analysis/TargetLibraryInfo.cpp
  llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
  llvm/test/Transforms/Inline/veclib-compat.ll

Index: llvm/test/Transforms/Inline/veclib-compat.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/veclib-compat.ll
@@ -0,0 +1,48 @@
+; RUN: opt < %s -inline -inline-caller-superset-tli=true -S | FileCheck %s --check-prefixes=COMMON
+; RUN: opt < %s -passes='cgscc(inline)' -inline-caller-superset-tli=true -S | FileCheck %s --check-prefixes=COMMON
+; RUN: opt < %s -inline -inline-caller-superset-tli=false -S | FileCheck %s --check-prefixes=NOSUPERSET,COMMON
+; RUN: opt < %s -passes='cgscc(inline)' -inline-caller-superset-tli=false -S | FileCheck %s --check-prefixes=NOSUPERSET,COMMON
+
+
+
+define i32 @callee_svml(i8 %X) #0 {
+entry:
+  ret i32 1
+}
+
+define i32 @callee_massv(i8 %X) #1 {
+entry:
+  ret i32 1
+}
+
+define i32 @callee_nolibrary(i8 %X) {
+entry:
+  ret i32 1
+}
+
+define i32 @caller_svml() #0 {
+; COMMON-LABEL: define i32 @caller_svml()
+entry:
+  %rslt = call i32 @callee_massv(i8 123)
+; COMMON: call i32 @callee_massv
+  %tmp1 = call i32 @callee_nolibrary(i8 123)
+; NOSUPERSET: call i32 @callee_nolibrary
+  %tmp2 = call i32 @callee_svml(i8 123)
+; COMMON-NOT: call
+  ret i32 %rslt
+}
+
+define i32 @caller_nolibrary() {
+; COMMON-LABEL: define i32 @caller_nolibrary()
+entry:
+  %rslt = call i32 @callee_svml(i8 123)
+; COMMON: call i32 @callee_svml
+  %tmp1 = call i32 @callee_massv(i8 123)
+; COMMON: call i32 @callee_massv
+  %tmp2 = call i32 @callee_nolibrary(i8 123)
+; COMMON-NOT: call
+  ret i32 %rslt
+}
+
+attributes #0 = { "veclib"="SVML" }
+attributes #1 = { "veclib"="MASSV" }
\ No newline at end of file
Index: llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
===
--- llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
+++ llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
@@ -3,8 +3,8 @@
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s
 
 ; Make sure we don't inline callees into a caller with a superset of the
-; no builtin attributes when -inline-caller-superset-nobuiltin=false.
-; RUN: opt < %s -inline-caller-superset-nobuiltin=false -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s --check-prefix=NOSUPERSET
+; no builtin attributes when -inline-caller-superset-tli=false.
+; RUN: opt < %s -inline-caller-superset-tli=false -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s --check-prefix=NOSUPERSET
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -550,7 +550,7 @@
 TLI.setUnavailable(LibFunc_nvvm_reflect);
   }
 
-  TLI.addVectorizableFunctionsFromVecLib(ClVectorLibrary);
+  TLI.addAllVectorizableFunctions();
 }
 
 TargetLibraryInfoImpl::TargetLibraryInfoImpl() {
@@ -572,8 +572,8 @@
   ShouldExtI32Return(TLI.ShouldExtI32Return),
   ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
   memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < NumVecLibs; i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
 }
 
 TargetLibraryInfoImpl::TargetLibraryInfoImpl(TargetLibraryInfoImpl &&TLI)
@@ -583,8 +583,8 @@
   ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
   std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),
 AvailableArray);
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < NumVecLibs; i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
 }
 
 TargetLibraryInfoImpl &TargetLibraryInfoImpl::operator=(const TargetLibraryInfoImpl &TLI) {
@@ -1520,7 +1520,16 @@
   return LHS.VectorFnName < S;
 }
 
-void TargetLibraryInfoImpl::addVectorizableFunctions(ArrayRef Fns) {
+void TargetLibraryInfoImpl::addAllVectorizableFunctions() {
+  addVectorizableFunctionsFromVecLib(Accelerate, VecLibDescs[Accelerate]);
+  addVectorizableFunctionsFromVecLib(MASSV, VecLibDescs[MASSV]);
+  addVectorizableFunctionsFromVecLib(SVML, VecLibDescs[SVML]);
+}
+
+void TargetLibraryInfoImpl::addVectorizableFunctions(
+  

[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-09 Thread Wenlei He via Phabricator via cfe-commits
wenlei marked 4 inline comments as done.
wenlei added inline comments.



Comment at: llvm/include/llvm/Analysis/TargetLibraryInfo.h:272
+else
+  VectLibrary = Impl.ClVectorLibrary;
+

tejohnson wrote:
> wenlei wrote:
> > tejohnson wrote:
> > > Suggest moving the implementation of this constructor to the .cpp file, 
> > > in which case you can just set VectLibrary directly from ClVectorLibrary 
> > > there and remove the member on the Impl object.
> > There're utilities that use `TargetLibraryInfo`, but don't link with 
> > `TargetLibraryInfo.o`. And looking at `TargetLibraryInfo`, all of the 
> > functions are in this header, so I assumed it's intentional to keep this 
> > type self-contained in this header, as it's public API, which is why I add 
> > `ClVectorLibrary` to Impl to pass it back to `TargetLibraryInfo`. For 
> > `TargetLibraryInfoImpl`, it's ok to have the implementation outside of the 
> > header. I can give it a try if keeping the class implementation/definition 
> > self-contained in the header isn't important.  
> I don't think there should be anything using TLI without linking with 
> libAnalysis, which contains TargetLibraryInfo.o. I don't think it should be 
> important to keep the implementation in the header, any more so than for 
> other headers in the Analysis library.
Ok, I moved it to .cpp, thanks!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-09 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 256437.
wenlei marked an inline comment as done.
wenlei added a comment.
Herald added subscribers: haicheng, eraman.

address feedback


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/libcalls-veclib.c
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/InlineCost.cpp
  llvm/lib/Analysis/TargetLibraryInfo.cpp
  llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
  llvm/test/Transforms/Inline/veclib-compat.ll

Index: llvm/test/Transforms/Inline/veclib-compat.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/veclib-compat.ll
@@ -0,0 +1,50 @@
+; RUN: opt < %s -inline -inline-caller-superset-tli=true -S | FileCheck %s --check-prefixes=SUPERSET,COMMON
+; RUN: opt < %s -passes='cgscc(inline)' -inline-caller-superset-tli=true -S | FileCheck %s --check-prefixes=SUPERSET,COMMON
+; RUN: opt < %s -inline -inline-caller-superset-tli=false -S | FileCheck %s --check-prefixes=NOSUPERSET,COMMON
+; RUN: opt < %s -passes='cgscc(inline)' -inline-caller-superset-tli=false -S | FileCheck %s --check-prefixes=NOSUPERSET,COMMON
+
+
+
+define i32 @callee_svml(i8 %X) #0 {
+entry:
+  ret i32 1
+}
+
+define i32 @callee_massv(i8 %X) #1 {
+entry:
+  ret i32 1
+}
+
+define i32 @callee_nolibrary(i8 %X) {
+entry:
+  ret i32 1
+}
+
+define i32 @caller_svml() #0 {
+; SUPERSET-LABEL: define i32 @caller_svml()
+; NOSUPERSET-LABEL: define i32 @caller_svml()
+entry:
+  %rslt = call i32 @callee_massv(i8 123)
+; SUPERSET: call i32 @callee_massv
+  %tmp1 = call i32 @callee_nolibrary(i8 123)
+; NOSUPERSET: call i32 @callee_nolibrary
+  %tmp2 = call i32 @callee_svml(i8 123)
+; SUPERSET-NOT: call
+; NOSUPERSET-NOT: call
+  ret i32 %rslt
+}
+
+define i32 @caller_nolibrary() {
+; COMMON-LABEL: define i32 @caller_nolibrary()
+entry:
+  %rslt = call i32 @callee_svml(i8 123)
+; COMMON: call i32 @callee_svml
+  %tmp1 = call i32 @callee_massv(i8 123)
+; COMMON: call i32 @callee_massv
+  %tmp2 = call i32 @callee_nolibrary(i8 123)
+; COMMON-NOT: call
+  ret i32 %rslt
+}
+
+attributes #0 = { "veclib"="SVML" }
+attributes #1 = { "veclib"="MASSV" }
\ No newline at end of file
Index: llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
===
--- llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
+++ llvm/test/Transforms/Inline/inline-no-builtin-compatible.ll
@@ -3,8 +3,8 @@
 ; RUN: opt < %s -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s
 
 ; Make sure we don't inline callees into a caller with a superset of the
-; no builtin attributes when -inline-caller-superset-nobuiltin=false.
-; RUN: opt < %s -inline-caller-superset-nobuiltin=false -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s --check-prefix=NOSUPERSET
+; no builtin attributes when -inline-caller-superset-tli=false.
+; RUN: opt < %s -inline-caller-superset-tli=false -mtriple=x86_64-unknown-linux-gnu -S -passes='cgscc(inline)' | FileCheck %s --check-prefix=NOSUPERSET
 
 target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
 target triple = "x86_64-unknown-linux-gnu"
Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -550,7 +550,7 @@
 TLI.setUnavailable(LibFunc_nvvm_reflect);
   }
 
-  TLI.addVectorizableFunctionsFromVecLib(ClVectorLibrary);
+  TLI.addAllVectorizableFunctions();
 }
 
 TargetLibraryInfoImpl::TargetLibraryInfoImpl() {
@@ -572,8 +572,8 @@
   ShouldExtI32Return(TLI.ShouldExtI32Return),
   ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
   memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < NumVecLibs; i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
 }
 
 TargetLibraryInfoImpl::TargetLibraryInfoImpl(TargetLibraryInfoImpl &&TLI)
@@ -583,8 +583,8 @@
   ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
   std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),
 AvailableArray);
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < NumVecLibs; i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
 }
 
 TargetLibraryInfoImpl &TargetLibraryInfoImpl::operator=(const TargetLibraryInfoImpl &TLI) {
@@ -1520,7 +1520,16 @@
   return LHS.VectorFnName < S;
 }
 
-void TargetLibraryInfoImpl::addVectorizableFunctions(ArrayRef Fns) {
+void TargetLibraryInfoImpl::addAllVectorizableFunctions() {
+  addVectorizableFunctionsFromVecLib(Accelerate, VecLibDescs[Accelerate]);
+  addVectorizableFunctionsFromVecLib(MASSV, VecLibDescs[

[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-09 Thread Wenlei He via Phabricator via cfe-commits
wenlei marked 4 inline comments as done.
wenlei added inline comments.



Comment at: llvm/include/llvm/Analysis/TargetLibraryInfo.h:272
+else
+  VectLibrary = Impl.ClVectorLibrary;
+

tejohnson wrote:
> Suggest moving the implementation of this constructor to the .cpp file, in 
> which case you can just set VectLibrary directly from ClVectorLibrary there 
> and remove the member on the Impl object.
There are utilities that use `TargetLibraryInfo` but don't link with 
`TargetLibraryInfo.o`. And looking at `TargetLibraryInfo`, all of its functions 
are in this header, so I assumed it's intentional to keep this type 
self-contained in the header since it's a public API, which is why I added 
`ClVectorLibrary` to the Impl to pass it back to `TargetLibraryInfo`. For 
`TargetLibraryInfoImpl`, it's fine to have the implementation outside of the 
header. I can give it a try if keeping the class implementation/definition 
self-contained in the header isn't important.



Comment at: llvm/lib/Analysis/TargetLibraryInfo.cpp:577
   memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < sizeof(TLI.VecLibDescs) / 
sizeof(TLI.VecLibDescs[0]);
+   i++)

tejohnson wrote:
> Why not just have "i < NumVecLibs"?
Good catch, thanks..


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-09 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 256350.
wenlei added a comment.

Address feedback: allow functions within a module to have different veclib 
settings; add a test case for the inline compatibility check.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632

Files:
  clang/lib/CodeGen/BackendUtil.cpp
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/libcalls-veclib.c
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/TargetLibraryInfo.cpp
  llvm/test/Transforms/Inline/veclib-compat.ll

Index: llvm/test/Transforms/Inline/veclib-compat.ll
===
--- /dev/null
+++ llvm/test/Transforms/Inline/veclib-compat.ll
@@ -0,0 +1,43 @@
+; RUN: opt < %s -inline -S | FileCheck %s
+; RUN: opt < %s -passes='cgscc(inline)' -S | FileCheck %s
+
+define i32 @callee_svml(i8 %X) #0 {
+entry:
+  ret i32 1
+}
+
+define i32 @callee_massv(i8 %X) #1 {
+entry:
+  ret i32 1
+}
+
+define i32 @callee_nolibrary(i8 %X) {
+entry:
+  ret i32 1
+}
+
+define i32 @caller_svml() #0 {
+; CHECK-LABEL: define i32 @caller_svml()
+entry:
+  %rslt = call i32 @callee_massv(i8 123)
+; CHECK: call i32 @callee_massv
+  %tmp1 = call i32 @callee_svml(i8 123)
+  %tmp2 = call i32 @callee_nolibrary(i8 123)
+; CHECK-NOT: call
+  ret i32 %rslt
+}
+
+define i32 @caller_nolibrary() {
+; CHECK-LABEL: define i32 @caller_nolibrary()
+entry:
+  %rslt = call i32 @callee_svml(i8 123)
+; CHECK: call i32 @callee_svml
+  %tmp1 = call i32 @callee_massv(i8 123)
+; CHECK: call i32 @callee_massv
+  %tmp2 = call i32 @callee_nolibrary(i8 123)
+; CHECK-NOT: call
+  ret i32 %rslt
+}
+
+attributes #0 = { "veclib"="SVML" }
+attributes #1 = { "veclib"="MASSV" }
\ No newline at end of file
Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -550,7 +550,8 @@
 TLI.setUnavailable(LibFunc_nvvm_reflect);
   }
 
-  TLI.addVectorizableFunctionsFromVecLib(ClVectorLibrary);
+  TLI.addAllVectorizableFunctions();
+  TLI.setClVectorLibrary(ClVectorLibrary);
 }
 
 TargetLibraryInfoImpl::TargetLibraryInfoImpl() {
@@ -570,21 +571,25 @@
 TargetLibraryInfoImpl::TargetLibraryInfoImpl(const TargetLibraryInfoImpl &TLI)
 : CustomNames(TLI.CustomNames), ShouldExtI32Param(TLI.ShouldExtI32Param),
   ShouldExtI32Return(TLI.ShouldExtI32Return),
-  ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
+  ShouldSignExtI32Param(TLI.ShouldSignExtI32Param),
+  ClVectorLibrary(TLI.ClVectorLibrary) {
   memcpy(AvailableArray, TLI.AvailableArray, sizeof(AvailableArray));
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < sizeof(TLI.VecLibDescs) / sizeof(TLI.VecLibDescs[0]);
+   i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
 }
 
 TargetLibraryInfoImpl::TargetLibraryInfoImpl(TargetLibraryInfoImpl &&TLI)
 : CustomNames(std::move(TLI.CustomNames)),
   ShouldExtI32Param(TLI.ShouldExtI32Param),
   ShouldExtI32Return(TLI.ShouldExtI32Return),
-  ShouldSignExtI32Param(TLI.ShouldSignExtI32Param) {
+  ShouldSignExtI32Param(TLI.ShouldSignExtI32Param),
+  ClVectorLibrary(TLI.ClVectorLibrary) {
   std::move(std::begin(TLI.AvailableArray), std::end(TLI.AvailableArray),
 AvailableArray);
-  VectorDescs = TLI.VectorDescs;
-  ScalarDescs = TLI.ScalarDescs;
+  for (unsigned i = 0; i < sizeof(TLI.VecLibDescs) / sizeof(TLI.VecLibDescs[0]);
+   i++)
+VecLibDescs[i] = TLI.VecLibDescs[i];
 }
 
 TargetLibraryInfoImpl &TargetLibraryInfoImpl::operator=(const TargetLibraryInfoImpl &TLI) {
@@ -1520,7 +1525,16 @@
   return LHS.VectorFnName < S;
 }
 
-void TargetLibraryInfoImpl::addVectorizableFunctions(ArrayRef Fns) {
+void TargetLibraryInfoImpl::addAllVectorizableFunctions() {
+  addVectorizableFunctionsFromVecLib(Accelerate, VecLibDescs[Accelerate]);
+  addVectorizableFunctionsFromVecLib(MASSV, VecLibDescs[MASSV]);
+  addVectorizableFunctionsFromVecLib(SVML, VecLibDescs[SVML]);
+}
+
+void TargetLibraryInfoImpl::addVectorizableFunctions(ArrayRef Fns,
+VectorLibraryDescriptors &VecLibDesc) {
+  auto &VectorDescs = VecLibDesc.VectorDescs;
+  auto &ScalarDescs = VecLibDesc.ScalarDescs;
   VectorDescs.insert(VectorDescs.end(), Fns.begin(), Fns.end());
   llvm::sort(VectorDescs, compareByScalarFnName);
 
@@ -1529,14 +1543,14 @@
 }
 
 void TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
-enum VectorLibrary VecLib) {
+enum VectorLibrary VecLib, VectorLibraryDescriptors &VetLibDesc) {
   switch (VecLib) {
   case Accelerate: {
 const VecDesc VecFuncs[] = {
 #define TLI_DEFINE_ACCELERATE_VECFUNCS
 #include "llvm/Analysis/VecFuncs.def"
 };
-addVectorizableFunctions(VecFuncs);
+addVectorizableFunctions(VecFuncs, VetLibDesc);
 break;
   }
   case MASSV: {
@@ -1544,7 +1558,7 @@
 #d

[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-08 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

Thanks for taking a look and for the suggestions, @tejohnson. Yeah, you're right 
about the potential conflict of attributes. Initially I thought that even though 
we now allow this to be per-function, since it comes from a per-module switch, 
module-level consistency could still be expected, which simplifies things a bit 
(hence the assertion). But I overlooked the combined module from LTO.
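
As a hypothetical sketch of that situation (attribute spelling as in the later
revision's test), an LTO-merged module can end up holding both settings at once:

  define i32 @from_tu_a() #0 {
  entry:
    ret i32 0
  }

  define i32 @from_tu_b() #1 {
  entry:
    ret i32 0
  }

  attributes #0 = { "veclib"="SVML" }
  attributes #1 = { "veclib"="MASSV" }

so an assertion that the whole module agrees on one library cannot hold there.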

I will get back to this later in the week - the change will be a bit more 
involved, as there are a few other places where we populate the module-level 
function list directly for the TLII (clang and opt).


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-07 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 255703.
wenlei added a comment.

rebase


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/libcalls-veclib.c
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/TargetLibraryInfo.cpp

Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -1528,8 +1528,29 @@
   llvm::sort(ScalarDescs, compareByVectorFnName);
 }
 
+void TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
+const StringRef &VecLibName) {
+  VectorLibrary VecLib = NoLibrary;
+  if (VecLibName == "Accelerate")
+VecLib = Accelerate;
+  else if (VecLibName == "MASSV")
+VecLib = MASSV;
+  else if (VecLibName == "SVML")
+VecLib = SVML;
+  else
+return;
+  addVectorizableFunctionsFromVecLib(VecLib);
+}
+
 void TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
 enum VectorLibrary VecLib) {
+  if (VectLibrary != NoLibrary) {
+assert(VectLibrary == VecLib && 
+   "Conflicting VectorLibrary detected");
+return;
+  }
+
+  VectLibrary = VecLib;
   switch (VecLib) {
   case Accelerate: {
 const VecDesc VecFuncs[] = {
@@ -1604,6 +1625,11 @@
   if (!BaselineInfoImpl)
 BaselineInfoImpl =
 TargetLibraryInfoImpl(Triple(F.getParent()->getTargetTriple()));
+
+  StringRef VectLibName = F.getFnAttribute("vect-lib").getValueAsString();
+  if (!VectLibName.empty())
+BaselineInfoImpl->addVectorizableFunctionsFromVecLib(VectLibName);
+
   return TargetLibraryInfo(*BaselineInfoImpl, &F);
 }
 
Index: llvm/include/llvm/Analysis/TargetLibraryInfo.h
===
--- llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -48,6 +48,22 @@
 class TargetLibraryInfoImpl {
   friend class TargetLibraryInfo;
 
+public:
+  /// List of known vector-functions libraries.
+  ///
+  /// The vector-functions library defines, which functions are vectorizable
+  /// and with which factor. The library can be specified by either frontend,
+  /// or a commandline option, and then used by
+  /// addVectorizableFunctionsFromVecLib for filling up the tables of
+  /// vectorizable functions.
+  enum VectorLibrary {
+NoLibrary,  // Don't use any vector library.
+Accelerate, // Use Accelerate framework.
+MASSV,  // IBM MASS vector library.
+SVML// Intel short vector math library.
+  };
+
+private:
   unsigned char AvailableArray[(NumLibFuncs+3)/4];
   llvm::DenseMap CustomNames;
   static StringLiteral const StandardNames[NumLibFuncs];
@@ -71,6 +87,8 @@
   /// Scalarization descriptors - same content as VectorDescs but sorted based
   /// on VectorFnName rather than ScalarFnName.
   std::vector ScalarDescs;
+  /// Vector library available for vectorization.
+  VectorLibrary VectLibrary = NoLibrary;
 
   /// Return true if the function type FTy is valid for the library function
   /// F, regardless of whether the function is available.
@@ -78,20 +96,6 @@
   const DataLayout *DL) const;
 
 public:
-  /// List of known vector-functions libraries.
-  ///
-  /// The vector-functions library defines, which functions are vectorizable
-  /// and with which factor. The library can be specified by either frontend,
-  /// or a commandline option, and then used by
-  /// addVectorizableFunctionsFromVecLib for filling up the tables of
-  /// vectorizable functions.
-  enum VectorLibrary {
-NoLibrary,  // Don't use any vector library.
-Accelerate, // Use Accelerate framework.
-MASSV,  // IBM MASS vector library.
-SVML// Intel short vector math library.
-  };
-
   TargetLibraryInfoImpl();
   explicit TargetLibraryInfoImpl(const Triple &T);
 
@@ -148,6 +152,7 @@
   /// Calls addVectorizableFunctions with a known preset of functions for the
   /// given vector library.
   void addVectorizableFunctionsFromVecLib(enum VectorLibrary VecLib);
+  void addVectorizableFunctionsFromVecLib(const StringRef &VecLibName);
 
   /// Return true if the function F has a vector equivalent with vectorization
   /// factor VF.
@@ -261,18 +266,20 @@
   }
 
   /// Determine whether a callee with the given TLI can be inlined into
-  /// caller with this TLI, based on 'nobuiltin' attributes. When requested,
-  /// allow inlining into a caller with a superset of the callee's nobuiltin
-  /// attributes, which is conservatively correct.
+  /// caller with this TLI, based on 'nobuiltin', `vect-lib` attributes.
+  /// When requested, allow inlining into a caller with a superset of the
+  /// callee's attributes, which is conservatively correct.
   bool areInlineCompatible(const TargetLibraryInfo &CalleeTLI,
bool 

[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-07 Thread Wenlei He via Phabricator via cfe-commits
wenlei updated this revision to Diff 255679.
wenlei added a comment.

update


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77632/new/

https://reviews.llvm.org/D77632

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/libcalls-veclib.c
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/TargetLibraryInfo.cpp

Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -1528,8 +1528,29 @@
   llvm::sort(ScalarDescs, compareByVectorFnName);
 }
 
+void TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
+const StringRef &VecLibName) {
+  VectorLibrary VecLib = NoLibrary;
+  if (VecLibName == "Accelerate")
+VecLib = Accelerate;
+  else if (VecLibName == "MASSV")
+VecLib = MASSV;
+  else if (VecLibName == "SVML")
+VecLib = SVML;
+  else
+return;
+  addVectorizableFunctionsFromVecLib(VecLib);
+}
+
 void TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
 enum VectorLibrary VecLib) {
+  if (VectLibrary != NoLibrary) {
+assert(VectLibrary == VecLib && 
+   "Conflicting VectorLibrary detected");
+return;
+  }
+
+  VectLibrary = VecLib;
   switch (VecLib) {
   case Accelerate: {
 const VecDesc VecFuncs[] = {
@@ -1604,6 +1625,11 @@
   if (!BaselineInfoImpl)
 BaselineInfoImpl =
 TargetLibraryInfoImpl(Triple(F.getParent()->getTargetTriple()));
+
+  StringRef VectLibName = F.getFnAttribute("vect-lib").getValueAsString();
+  if (!VectLibName.empty())
+BaselineInfoImpl->addVectorizableFunctionsFromVecLib(VectLibName);
+
   return TargetLibraryInfo(*BaselineInfoImpl, &F);
 }
 
Index: llvm/include/llvm/Analysis/TargetLibraryInfo.h
===
--- llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -48,6 +48,22 @@
 class TargetLibraryInfoImpl {
   friend class TargetLibraryInfo;
 
+public:
+  /// List of known vector-functions libraries.
+  ///
+  /// The vector-functions library defines, which functions are vectorizable
+  /// and with which factor. The library can be specified by either frontend,
+  /// or a commandline option, and then used by
+  /// addVectorizableFunctionsFromVecLib for filling up the tables of
+  /// vectorizable functions.
+  enum VectorLibrary {
+NoLibrary,  // Don't use any vector library.
+Accelerate, // Use Accelerate framework.
+MASSV,  // IBM MASS vector library.
+SVML// Intel short vector math library.
+  };
+
+private:
   unsigned char AvailableArray[(NumLibFuncs+3)/4];
   llvm::DenseMap CustomNames;
   static StringLiteral const StandardNames[NumLibFuncs];
@@ -71,6 +87,8 @@
   /// Scalarization descriptors - same content as VectorDescs but sorted based
   /// on VectorFnName rather than ScalarFnName.
   std::vector ScalarDescs;
+  /// Vector library available for vectorization.
+  VectorLibrary VectLibrary = NoLibrary;
 
   /// Return true if the function type FTy is valid for the library function
   /// F, regardless of whether the function is available.
@@ -78,20 +96,6 @@
   const DataLayout *DL) const;
 
 public:
-  /// List of known vector-functions libraries.
-  ///
-  /// The vector-functions library defines, which functions are vectorizable
-  /// and with which factor. The library can be specified by either frontend,
-  /// or a commandline option, and then used by
-  /// addVectorizableFunctionsFromVecLib for filling up the tables of
-  /// vectorizable functions.
-  enum VectorLibrary {
-NoLibrary,  // Don't use any vector library.
-Accelerate, // Use Accelerate framework.
-MASSV,  // IBM MASS vector library.
-SVML// Intel short vector math library.
-  };
-
   TargetLibraryInfoImpl();
   explicit TargetLibraryInfoImpl(const Triple &T);
 
@@ -148,6 +152,7 @@
   /// Calls addVectorizableFunctions with a known preset of functions for the
   /// given vector library.
   void addVectorizableFunctionsFromVecLib(enum VectorLibrary VecLib);
+  void addVectorizableFunctionsFromVecLib(const StringRef &VecLibName);
 
   /// Return true if the function F has a vector equivalent with vectorization
   /// factor VF.
@@ -261,18 +266,20 @@
   }
 
   /// Determine whether a callee with the given TLI can be inlined into
-  /// caller with this TLI, based on 'nobuiltin' attributes. When requested,
-  /// allow inlining into a caller with a superset of the callee's nobuiltin
-  /// attributes, which is conservatively correct.
+  /// caller with this TLI, based on 'nobuiltin', `vect-lib` attributes.
+  /// When requested, allow inlining into a caller with a superset of the
+  /// callee's attributes, which is conservatively correct.
   bool areInlineCompatible(const TargetLibraryInfo &CalleeTLI,
bool 

[PATCH] D77484: [Vector] Pass VectLib to LTO backend so TLI build correct vector function list

2020-04-07 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77484#1965976 , @tejohnson wrote:

> In D77484#1965629 , @wenlei wrote:
>
> > > Ok then it does sound like these could be handled on a per-function 
> > > basis, similar to how -fno-builtin* are handled. I.e. a function 
> > > attribute to indicate the veclib, which would then be naturally preserved 
> > > during LTO even after merging/importing across modules. Similar to how 
> > > -fno-builtin* are handled, these would need to be examined when inlining 
> > > (see the new TargetLibraryInfo::areInlineCompatible). Presumably we would 
> > > want to block inlining between functions with different veclib attributes 
> > > in the LTO backends.
> >
> > @tejohnson, we could do that. But then on the other hand, technically 
> > almost everything for module or whole program can be passed as a function 
> > attribute, and yet we have switches passed to backend for many of those 
> > things. Wondering what's the convention or rule (if there's one) we want to 
> > follow? Specifically, if we only use function attributes for stuff that's 
> > indeed going to be different between functions, then vectlib isn't in that 
> > category; or if we use function attributes for the greatest flexibility 
> > whenever we can, then many other things should be function attributes too 
> > (though it's essentially duplication in IR, and probably not the most 
> > efficient).
>
>
> Passing the option through the driver to the linker is the legacy approach. 
> But it isn't really scalable and has other issues, so we've been moving 
> towards having all the necessary info in the IR itself. For one, this helps 
> deal with cases where different options were specified for different source 
> files. For another, it keeps the same build behavior with LTO and non-LTO. 
> I.e. for this option, if the build system specified it for the cc compiles 
> but not the links, it would work for O2 
>  but not for LTO if it had to be 
> propagated via the linker. It would work for LTO if it was propagated via the 
> IR.


Makes sense, thanks for the clarification. I created D77632  to make the veclib 
setting a per-function attribute.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77484/new/

https://reviews.llvm.org/D77484



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77632: [TLI] Per-function fveclib for math library used for vectorization

2020-04-07 Thread Wenlei He via Phabricator via cfe-commits
wenlei created this revision.
Herald added subscribers: cfe-commits, dexonsmith, hiraditya.
Herald added a project: clang.
wenlei edited the summary of this revision.
wenlei added reviewers: tejohnson, hoyFB, spatel, gchatelet.

Encode the `-fveclib` setting as a per-function attribute so it can be threaded 
through to LTO backends. Accordingly, the per-function TLI now reads the 
attribute and populates the available vector-function list based on it. Note 
that we expect functions within the same module to share the `fveclib` setting, 
so the vector-function list is still shared between functions, as part of the 
shared `TargetLibraryInfoImpl`. Inlining between functions with different 
veclib attributes is now blocked.
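
As an example (illustrative only; the attribute name in this revision's diff is
`vect-lib`, and the function is made up), a function compiled with
`-fveclib=SVML` would carry the setting like so:

  define float @foo(float %x) #0 {
  entry:
    ret float %x
  }

  attributes #0 = { "vect-lib"="SVML" }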


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D77632

Files:
  clang/lib/CodeGen/CGCall.cpp
  clang/test/CodeGen/libcalls-veclib.c
  llvm/include/llvm/Analysis/TargetLibraryInfo.h
  llvm/lib/Analysis/TargetLibraryInfo.cpp

Index: llvm/lib/Analysis/TargetLibraryInfo.cpp
===
--- llvm/lib/Analysis/TargetLibraryInfo.cpp
+++ llvm/lib/Analysis/TargetLibraryInfo.cpp
@@ -1528,8 +1528,29 @@
   llvm::sort(ScalarDescs, compareByVectorFnName);
 }
 
+void TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
+const StringRef &VecLibName) {
+  VectorLibrary VecLib = NoLibrary;
+  if (VecLibName == "Accelerate")
+VecLib = Accelerate;
+  else if (VecLibName == "MASSV")
+VecLib = MASSV;
+  else if (VecLibName == "SVML")
+VecLib = SVML;
+  else
+return;
+  addVectorizableFunctionsFromVecLib(VecLib);
+}
+
 void TargetLibraryInfoImpl::addVectorizableFunctionsFromVecLib(
 enum VectorLibrary VecLib) {
+  if (VectLibrary != NoLibrary) {
+assert(VectLibrary == VecLib && 
+   "Conflicting VectorLibrary detected");
+return;
+  }
+
+  VectLibrary = VecLib;
   switch (VecLib) {
   case Accelerate: {
 const VecDesc VecFuncs[] = {
@@ -1604,6 +1625,11 @@
   if (!BaselineInfoImpl)
 BaselineInfoImpl =
 TargetLibraryInfoImpl(Triple(F.getParent()->getTargetTriple()));
+
+  StringRef VectLibName = F.getFnAttribute("vect-lib").getValueAsString();
+  if (!VectLibName.empty())
+BaselineInfoImpl->addVectorizableFunctionsFromVecLib(VectLibName);
+
   return TargetLibraryInfo(*BaselineInfoImpl, &F);
 }
 
Index: llvm/include/llvm/Analysis/TargetLibraryInfo.h
===
--- llvm/include/llvm/Analysis/TargetLibraryInfo.h
+++ llvm/include/llvm/Analysis/TargetLibraryInfo.h
@@ -48,6 +48,21 @@
 class TargetLibraryInfoImpl {
   friend class TargetLibraryInfo;
 
+public:
+  /// List of known vector-functions libraries.
+  ///
+  /// The vector-functions library defines, which functions are vectorizable
+  /// and with which factor. The library can be specified by either frontend,
+  /// or a commandline option, and then used by
+  /// addVectorizableFunctionsFromVecLib for filling up the tables of
+  /// vectorizable functions.
+  enum VectorLibrary {
+NoLibrary,  // Don't use any vector library.
+Accelerate, // Use Accelerate framework.
+MASSV,  // IBM MASS vector library.
+SVML// Intel short vector math library.
+  };
+
   unsigned char AvailableArray[(NumLibFuncs+3)/4];
   llvm::DenseMap CustomNames;
   static StringLiteral const StandardNames[NumLibFuncs];
@@ -71,6 +86,8 @@
   /// Scalarization descriptors - same content as VectorDescs but sorted based
   /// on VectorFnName rather than ScalarFnName.
   std::vector ScalarDescs;
+  /// Vector library available for vectorization.
+  VectorLibrary VectLibrary = NoLibrary;
 
   /// Return true if the function type FTy is valid for the library function
   /// F, regardless of whether the function is available.
@@ -78,20 +95,6 @@
   const DataLayout *DL) const;
 
 public:
-  /// List of known vector-functions libraries.
-  ///
-  /// The vector-functions library defines, which functions are vectorizable
-  /// and with which factor. The library can be specified by either frontend,
-  /// or a commandline option, and then used by
-  /// addVectorizableFunctionsFromVecLib for filling up the tables of
-  /// vectorizable functions.
-  enum VectorLibrary {
-NoLibrary,  // Don't use any vector library.
-Accelerate, // Use Accelerate framework.
-MASSV,  // IBM MASS vector library.
-SVML// Intel short vector math library.
-  };
-
   TargetLibraryInfoImpl();
   explicit TargetLibraryInfoImpl(const Triple &T);
 
@@ -148,6 +151,7 @@
   /// Calls addVectorizableFunctions with a known preset of functions for the
   /// given vector library.
   void addVectorizableFunctionsFromVecLib(enum VectorLibrary VecLib);
+  void addVectorizableFunctionsFromVecLib(const StringRef &VecLibName);
 
   /// Return true if the function F has a vector equivalent with vectorization
   /// factor VF.
@@ -261,18 +265,20 @@
   }
 
   /// Determine whether a ca

[PATCH] D77484: [Vector] Pass VectLib to LTO backend so TLI build correct vector function list

2020-04-06 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> Ok then it does sound like these could be handled on a per-function basis, 
> similar to how -fno-builtin* are handled. I.e. a function attribute to 
> indicate the veclib, which would then be naturally preserved during LTO even 
> after merging/importing across modules. Similar to how -fno-builtin* are 
> handled, these would need to be examined when inlining (see the new 
> TargetLibraryInfo::areInlineCompatible). Presumably we would want to block 
> inlining between functions with different veclib attributes in the LTO 
> backends.

@tejohnson, we could do that. On the other hand, technically almost everything
that applies to a whole module or program can be passed as a function
attribute, and yet for many of those things we still pass switches to the
backend. What's the convention or rule (if there is one) we want to follow?
Specifically, if we only use function attributes for settings that can actually
differ between functions, then veclib isn't in that category; if instead we use
function attributes whenever we can for maximum flexibility, then many other
things should become function attributes too (though that's essentially
duplication in the IR, and probably not the most efficient encoding).
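
For reference, this is roughly how -fno-builtin* ends up per-function today (a
sketch from memory, not part of this patch; the function name is made up):

  define double @foo(double %x) #0 {
    ...
  }
  attributes #0 = { "no-builtins" "no-builtin-memcpy" }

A veclib setting could in principle be encoded the same way; it would just
carry the same value on every function in the module.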


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77484/new/

https://reviews.llvm.org/D77484



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77484: [Vector] Pass VectLib to LTO backend so TLI build correct vector function list

2020-04-05 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

> Linking against two vectlibs may cause name conflicts or other issues.

Across the three supported math libraries, all Accelerate functions are
prefixed with `v`, all MASS library functions are suffixed with `_massv`, and
all SVML functions are prefixed with `__svml_` (see `VecFuncs.def`). So at
least the exported functions don't have name conflicts.
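
For illustration, the entries in `VecFuncs.def` are roughly of this shape
(paraphrased from memory, not copied verbatim):

  TLI_DEFINE_VECFUNC("ceilf", "vceilf", 4)        // Accelerate
  TLI_DEFINE_VECFUNC("sinf", "__sinf4_massv", 4)  // MASSV
  TLI_DEFINE_VECFUNC("sinf", "__svml_sinf4", 4)   // SVML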

> What happens today in a non-LTO build if conflicting veclib options are 
> provided to different TUs?

I think it will work, as long as all of the math libraries being used are
provided to the linker. Even if not, I'm not sure this is something we want to
actively prevent through the -fveclib switch. -fveclib controls the
codegen/optimizer, and whether the resulting codegen leads to a linking problem
(if any) is largely orthogonal, and probably no different from a regular
linking/symbol-resolution problem when the provided libraries conflict.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77484/new/

https://reviews.llvm.org/D77484



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77484: [Vector] Pass VectLib to LTO backend so TLI build correct vector function list

2020-04-05 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

In D77484#1962445 , @tejohnson wrote:

> We're trying to move towards encoding all of this in the IR. And in fact, I 
> recently implemented a series of patches to make the TLI to be built 
> per-function, and along with some patches from @gchatelet to encode 
> -fno-builtin* as function attributes, we now handle that part of the TLII 
> with IR. See D67923  which is the last patch 
> in the series. We should encode the vectlib as function attributes similarly, 
> and just thread that through to the TLI.


Thanks for the quick look, @tejohnson. Yeah, making veclib a function attribute
would be more consistent with the way other TLI attributes are handled.
However, unlike some other attributes that really need to be per-function, the
veclib setting conceptually won't differ from function to function within a
module. Implementation-wise, even though TLI is now built per-function, there's
a `BaselineInfoImpl` within each `TargetLibraryInfo` that is still shared among
functions, and `VectorDescs`, which is populated based on the veclib setting,
belongs to that shared baseline. We could also move `VectorDescs` and its
population into the per-function part of TLI, but that felt like overkill,
though I don't have a strong opinion on it. So I followed how other
module-level settings, like `lto-sample-profile`, are passed to the backend.

Alternatively, we could choose to always run `InjectTLIMappings` in the pre-LTO
pipeline, even though we don't run the vectorizer pre-link for ThinLTO, so the
per-function `vector-function-abi-variant` attribute is always populated and
threaded through to the LTO-time vectorizer. We would have to make sure the
vector-function list in TLI (based on the veclib setting) has no other use
case, though.
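
For reference, what `InjectTLIMappings` attaches is a call-site attribute along
these lines (a sketch for SVML at VF 4; the exact mangled name is from memory):

  %r = call float @sinf(float %x) #0
  ...
  attributes #0 = { "vector-function-abi-variant"="_ZGV_LLVM_N4v_sinf(__svml_sinf4)" }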

(More on TLI: while this change (or a variant of it) enables math-library-based
vectorization for ThinLTO with the legacy PM, there's a separate issue with the
new PM, where `InjectTLIMappings` is not scheduled before vectorization in the
LTO pipeline. I will send a separate patch for that once we settle on a good
way to handle this one.)


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D77484/new/

https://reviews.llvm.org/D77484



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D77484: [Vector] Pass VectLib to LTO backend so TLI build correct vector function list

2020-04-04 Thread Wenlei He via Phabricator via cfe-commits
wenlei created this revision.
wenlei added reviewers: hoyFB, spatel.
Herald added subscribers: cfe-commits, dang, dexonsmith, steven_wu, MaskRay, 
hiraditya, arichardson, inglorion, emaste.
Herald added a reviewer: espindola.
Herald added a project: clang.

The -fveclib switch is not propagated to LTO backends, and as a result LTO
populates the vectorizable-function list as if no math library were in use.
This change fixes propagation of the -fveclib setting from the driver to lld,
and from lld to the LTO backend.
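
Usage sketch of the intended flow (-fveclib and -flto are the existing driver
options; the linker option spellings are the ones added in this patch):

  clang++ -O2 -flto=thin -fveclib=SVML -c a.cpp
  clang++ -flto=thin -fveclib=SVML a.o -o a.out
  # the driver forwards the setting to the LTO link, e.g. via
  #   -plugin-opt=vector-library=SVML   (alias for --lto-vector-library=SVML)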


Repository:
  rG LLVM Github Monorepo

https://reviews.llvm.org/D77484

Files:
  clang/lib/Driver/ToolChains/CommonArgs.cpp
  lld/ELF/Config.h
  lld/ELF/Driver.cpp
  lld/ELF/LTO.cpp
  lld/ELF/Options.td
  llvm/include/llvm/Target/TargetOptions.h
  llvm/lib/LTO/LTOBackend.cpp

Index: llvm/lib/LTO/LTOBackend.cpp
===
--- llvm/lib/LTO/LTOBackend.cpp
+++ llvm/lib/LTO/LTOBackend.cpp
@@ -279,6 +279,7 @@
 
   PassManagerBuilder PMB;
   PMB.LibraryInfo = new TargetLibraryInfoImpl(Triple(TM->getTargetTriple()));
+  PMB.LibraryInfo->addVectorizableFunctionsFromVecLib(Conf.Options.VectLib);
   PMB.Inliner = createFunctionInliningPass();
   PMB.ExportSummary = ExportSummary;
   PMB.ImportSummary = ImportSummary;
Index: llvm/include/llvm/Target/TargetOptions.h
===
--- llvm/include/llvm/Target/TargetOptions.h
+++ llvm/include/llvm/Target/TargetOptions.h
@@ -15,6 +15,7 @@
 #define LLVM_TARGET_TARGETOPTIONS_H
 
 #include "llvm/ADT/FloatingPointMode.h"
+#include "llvm/Analysis/TargetLibraryInfo.h"
 #include "llvm/MC/MCTargetOptions.h"
 
 #include 
@@ -320,6 +321,10 @@
 /// the value of this option.
 FPOpFusion::FPOpFusionMode AllowFPOpFusion = FPOpFusion::Standard;
 
+/// Vector library used for vectorizable math function calls
+TargetLibraryInfoImpl::VectorLibrary VectLib =
+TargetLibraryInfoImpl::NoLibrary;
+
 /// ThreadModel - This flag specifies the type of threading model to assume
 /// for things like atomics
 ThreadModel::Model ThreadModel = ThreadModel::POSIX;
Index: lld/ELF/Options.td
===
--- lld/ELF/Options.td
+++ lld/ELF/Options.td
@@ -488,6 +488,8 @@
 def lto_obj_path_eq: J<"lto-obj-path=">;
 def lto_sample_profile: J<"lto-sample-profile=">,
   HelpText<"Sample profile file path">;
+def lto_vector_library: J<"lto-vector-library=">,
+  HelpText<"Vector functions library">;
 def lto_whole_program_visibility: F<"lto-whole-program-visibility">,
   HelpText<"Asserts that the LTO link has whole program visibility">;
 def disable_verify: F<"disable-verify">;
@@ -552,6 +554,9 @@
 def: J<"plugin-opt=thinlto-prefix-replace=">,
   Alias,
   HelpText<"Alias for --thinlto-prefix-replace=">;
+def: J<"plugin-opt=vector-library=">,
+  Alias,
+  HelpText<"Alias for -lto-vector-library=">;
 
 // Ignore LTO plugin-related options.
 // clang -flto passes -plugin and -plugin-opt to the linker. This is required
Index: lld/ELF/LTO.cpp
===
--- lld/ELF/LTO.cpp
+++ lld/ELF/LTO.cpp
@@ -110,6 +110,7 @@
   c.UseNewPM = config->ltoNewPassManager;
   c.DebugPassManager = config->ltoDebugPassManager;
   c.DwoDir = std::string(config->dwoDir);
+  c.Options.VectLib = config->VectLib;
 
   c.HasWholeProgramVisibility = config->ltoWholeProgramVisibility;
 
Index: lld/ELF/Driver.cpp
===
--- lld/ELF/Driver.cpp
+++ lld/ELF/Driver.cpp
@@ -704,6 +704,20 @@
   return OrphanHandlingPolicy::Place;
 }
 
+static llvm::TargetLibraryInfoImpl::VectorLibrary
+getVectLib(opt::InputArgList &args) {
+  StringRef s = args.getLastArgValue(OPT_lto_vector_library);
+  if (s == "Accelerate")
+return llvm::TargetLibraryInfoImpl::Accelerate;
+  if (s == "MASSV")
+return llvm::TargetLibraryInfoImpl::MASSV;
+  if (s == "SVML")
+return llvm::TargetLibraryInfoImpl::SVML;
+  if (!s.empty() && s != "none")
+error("unknown --vector-library type: " + s);
+  return llvm::TargetLibraryInfoImpl::NoLibrary;
+}
+
 // Parse --build-id or --build-id=

[PATCH] D74814: IR printing for single function with the new pass manager.

2020-02-23 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

Committed on behalf of @hoyFB per request.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74814/new/

https://reviews.llvm.org/D74814



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74814: IR printing for single function with the new pass manager.

2020-02-23 Thread Wenlei He via Phabricator via cfe-commits
This revision was automatically updated to reflect the committed changes.
Closed by commit rGbae33a7c5a1f: IR printing for single function with the new 
pass manager. (authored by hoyFB, committed by wenlei).

Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74814/new/

https://reviews.llvm.org/D74814

Files:
  llvm/lib/Passes/StandardInstrumentations.cpp
  llvm/test/Other/module-pass-printer.ll


Index: llvm/test/Other/module-pass-printer.ll
===
--- llvm/test/Other/module-pass-printer.ll
+++ llvm/test/Other/module-pass-printer.ll
@@ -1,13 +1,43 @@
 ; Check pass name is only printed once.
-; RUN: opt < %s 2>&1 -forceattrs -disable-output -print-after-all | FileCheck 
%s
-; RUN: opt < %s 2>&1 -forceattrs -disable-output -print-after-all 
-filter-print-funcs=foo,bar | FileCheck %s
+; Check only one function is printed
+; RUN: opt < %s 2>&1 -forceattrs -disable-output -print-after-all 
-filter-print-funcs=foo | FileCheck %s  -check-prefix=FOO
+; RUN: opt < %s 2>&1 -passes=forceattrs -disable-output -print-after-all 
-filter-print-funcs=foo | FileCheck %s  -check-prefix=FOO
+
+; Check pass name is only printed once.
+; Check both functions are printed
+; RUN: opt < %s 2>&1 -forceattrs -disable-output -print-after-all 
-filter-print-funcs=foo,bar | FileCheck %s -check-prefix=BOTH
+; RUN: opt < %s 2>&1 -passes=forceattrs -disable-output -print-after-all 
-filter-print-funcs=foo,bar | FileCheck %s -check-prefix=BOTH
 
 ; Check pass name is not printed if a module doesn't include any function 
specified in -filter-print-funcs.
 ; RUN: opt < %s 2>&1 -forceattrs -disable-output -print-after-all 
-filter-print-funcs=baz | FileCheck %s -allow-empty -check-prefix=EMPTY
+; RUN: opt < %s 2>&1 -passes=forceattrs -disable-output -print-after-all 
-filter-print-funcs=baz | FileCheck %s -allow-empty -check-prefix=EMPTY
+
+; Check whole module is printed with user-specified wildcast switch 
-filter-print-funcs=* or -print-module-scope
+; RUN: opt < %s 2>&1 -forceattrs -disable-output -print-after-all | FileCheck 
%s -check-prefix=ALL
+; RUN: opt < %s 2>&1 -forceattrs -disable-output  -print-after-all 
-filter-print-funcs=* | FileCheck %s -check-prefix=ALL
+; RUN: opt < %s 2>&1 -forceattrs -disable-output -print-after-all 
-filter-print-funcs=foo -print-module-scope | FileCheck %s -check-prefix=ALL
+; RUN: opt < %s 2>&1 -passes=forceattrs -disable-output -print-after-all | 
FileCheck %s -check-prefix=ALL
+; RUN: opt < %s 2>&1 -passes=forceattrs -disable-output -print-after-all 
-filter-print-funcs=* | FileCheck %s -check-prefix=ALL
+; RUN: opt < %s 2>&1 -passes=forceattrs -disable-output -print-after-all 
-filter-print-funcs=foo -print-module-scope | FileCheck %s -check-prefix=ALL
+
+; FOO:  IR Dump After {{Force set function 
attributes|ForceFunctionAttrsPass}}
+; FOO:  define void @foo
+; FOO-NOT:  define void @bar
+; FOO-NOT:  IR Dump After {{Force set function 
attributes|ForceFunctionAttrsPass}}
+
+; BOTH: IR Dump After {{Force set function 
attributes|ForceFunctionAttrsPass}}
+; BOTH: define void @foo
+; BOTH: define void @bar
+; BOTH-NOT: IR Dump After {{Force set function 
attributes|ForceFunctionAttrsPass}}
+; BOTH-NOT: ModuleID =
+
+; EMPTY-NOT: IR Dump After {{Force set function 
attributes|ForceFunctionAttrsPass}}
 
-; CHECK: *** IR Dump After Force set function attributes ***
-; CHECK-NOT: *** IR Dump After Force set function attributes ***
-; EMPTY-NOT: *** IR Dump After Force set function attributes ***
+; ALL:  IR Dump After {{Force set function attributes|ForceFunctionAttrsPass}}
+; ALL:  ModuleID =
+; ALL:  define void @foo
+; ALL:  define void @bar
+; ALL-NOT: IR Dump After {{Force set function 
attributes|ForceFunctionAttrsPass}}
 
 define void @foo() {
   ret void
Index: llvm/lib/Passes/StandardInstrumentations.cpp
===
--- llvm/lib/Passes/StandardInstrumentations.cpp
+++ llvm/lib/Passes/StandardInstrumentations.cpp
@@ -70,16 +70,24 @@
   llvm_unreachable("Unknown IR unit");
 }
 
-void printIR(const Module *M, StringRef Banner, StringRef Extra = StringRef()) 
{
-  dbgs() << Banner << Extra << "\n";
-  M->print(dbgs(), nullptr, false);
-}
 void printIR(const Function *F, StringRef Banner,
  StringRef Extra = StringRef()) {
   if (!llvm::isFunctionInPrintList(F->getName()))
 return;
   dbgs() << Banner << Extra << "\n" << static_cast(*F);
 }
+
+void printIR(const Module *M, StringRef Banner, StringRef Extra = StringRef()) 
{
+  if (llvm::isFunctionInPrintList("*") || llvm::forcePrintModuleIR()) {
+dbgs() << Banner << Extra << "\n";
+M->print(dbgs(), nullptr, false);
+  } else {
+for (const auto &F : M->functions()) {
+  printIR(&F, Banner, Extra);
+}
+  }
+}
+
 void printIR(const LazyCallGraph::SCC *C, StringRef Banner,
  StringRef Extra = StringRef()) {
   bool BannerPrint

[PATCH] D74814: IR printing for single function with the new pass manager.

2020-02-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

Thanks for making the changes. It would be nice to have some consistency (the
same structure) between the test cases for the legacy PM and the new PM: e.g.
`EMPTY` is only tested for the legacy PM but not the new one, and the
multi-function filter is also only tested for the legacy PM. It would be nice
to cover both and unify the check-prefixes. Otherwise LGTM.


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74814/new/

https://reviews.llvm.org/D74814



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74814: IR printing for single function with the new pass manager.

2020-02-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.



> Sounds good. I was trying to figure out how to invoke the new pass manager 
> with OPT. Is there a command line switch to do that?

If you just do `opt -passes=`, it will invoke the new pass manager
for `opt`. See https://reviews.llvm.org/D66560 for an example - that was for
the legacy PM; the equivalent for the new PM would be something like
`opt < %s 2>&1 -passes=forceattrs ...`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74814/new/

https://reviews.llvm.org/D74814



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D74814: IR printing for single function with the new pass manager.

2020-02-19 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

Thanks for fixing this. Could you use a `.ll` IR file as the test case and
place it alongside `test/Other/module-pass-printer.ll`, or just reuse that one?
We can also use `opt` to invoke the new pass manager, so it doesn't need to be
a clang test.
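
Something along these lines should work as the test (sketch):

  ; RUN: opt < %s -passes=forceattrs -disable-output -print-after-all -filter-print-funcs=foo | FileCheck %s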


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D74814/new/

https://reviews.llvm.org/D74814



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D73060: [Clang] Fix expansion of response files in -Wp after integrated-cc1 change

2020-01-20 Thread Wenlei He via Phabricator via cfe-commits
wenlei accepted this revision.
wenlei added a comment.
This revision is now accepted and ready to land.

> Is that the desired behavior? This was the previous behavior but I'm not sure 
> it's the right behavior.

Good point. It might be safer to keep that previous behavior though.

Thanks for the quick fix, LGTM!


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D73060/new/

https://reviews.llvm.org/D73060



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69825: [Clang][Driver] Re-use the calling process instead of creating a new process for the cc1 invocation

2020-01-18 Thread Wenlei He via Phabricator via cfe-commits
wenlei added a comment.

@aganea, @hans This patch broke response-file expansion for the preprocessor
via `-Wp` - all our internal builds failed with this patch because we invoke
clang with a response file passed to the preprocessor, e.g. `clang++ ...
-Wp,@pp.rsp @cc.rsp`. It looks like we can't just call `ExecuteCC1Tool`
directly, because response-file expansion happens in `main`, so with this patch
`-Wp,@pp.rsp` won't be expanded properly.

Repro:

  CLANG_SPAWN_CC1=0  clang++ -c  -Wp,@a.rsp  @a.rsp test.cpp
  
  error: error reading '@rr.rsp'
  1 error generated.

and it works with `CLANG_SPAWN_CC1=1`


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69825/new/

https://reviews.llvm.org/D69825



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D71903: [Coroutines][6/6] Clang schedules new passes

2019-12-30 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: clang/lib/CodeGen/BackendUtil.cpp:1228
+  MPM.addPass(createModuleToFunctionPassAdaptor(CoroElidePass()));
+  
MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(CoroSplitPass()));
+  MPM.addPass(createModuleToFunctionPassAdaptor(CoroCleanupPass()));

Manually scheduling the second coro-split pass in the same CGSCC pipeline would
make the resume/suspend funclets ineligible for the bulk of the CGSCC opts,
given that the split point is relatively early. The implication would be
discouraging the use of coroutines in performance-critical paths. I would love
to see this addressed before we claim coroutines are ready for the new PM.

As commented in the 2nd patch, we may not need the devirt trick used with the
legacy PM; instead, for the new PM we could try to leverage `CGSCCUpdateResult`
without involving the artificial indirect call and devirtualization (or follow
how outlining is handled by the new PM).
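
As a very rough sketch of that direction (illustrative only, not a worked-out
proposal; the actual re-queuing of funclets through `CGSCCUpdateResult` would
still need to be implemented inside CoroSplit), CoroSplit would sit inside the
main CGSCC pass manager rather than in its own adaptor:

  CGSCCPassManager CGPM;
  CGPM.addPass(InlinerPass());
  CGPM.addPass(CoroSplitPass());
  MPM.addPass(createModuleToPostOrderCGSCCPassAdaptor(std::move(CGPM)));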


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D71903/new/

https://reviews.llvm.org/D71903



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


[PATCH] D69732: [WIP][LTO] Apply SamplePGO pipeline tunes for ThinLTO pre-link to full LTO

2019-11-04 Thread Wenlei He via Phabricator via cfe-commits
wenlei added inline comments.



Comment at: llvm/lib/Transforms/IPO/PassManagerBuilder.cpp:614
   // during ThinLTO and perform the rest of the optimizations afterward.
   if (PrepareForThinLTO) {
 // Ensure we perform any last passes, but do so before renaming anonymous

does this also need to be `PrepareForThinLTO || PrepareForLTO` for the old PM?
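
i.e. something like (sketch):

  if (PrepareForThinLTO || PrepareForLTO) {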


Repository:
  rG LLVM Github Monorepo

CHANGES SINCE LAST ACTION
  https://reviews.llvm.org/D69732/new/

https://reviews.llvm.org/D69732



___
cfe-commits mailing list
cfe-commits@lists.llvm.org
https://lists.llvm.org/cgi-bin/mailman/listinfo/cfe-commits


  1   2   >